A scalable, mature and versatile web crawler based on Apache Storm
Updated Sep 16, 2024 - HTML
Resources for running StormCrawler with Docker services
A suite of benchmark applications for distributed data stream processing systems
Run Python in Apache Storm topologies. Pythonic API, CLI tooling, and a topology DSL.
Fast Advanced Spam Analysis Tool
Docker image packaging for Apache Storm
Apache Pulsar Adapters
News crawling with StormCrawler - stores content as WARC
Process web archives (WARC format) with StormCrawler and index content into Elasticsearch or Solr
Stream Processing Abstraction Framework for Java.
A stream processing project that analyzes the chat stream of a Twitch channel. Built on Apache Storm for the CS442 Distributed Systems course.
Sentiment Classifier of Tweets, based on Lambda Architecture.
Term project for the Parallel Computing exam at the University of Florence. Implements Twitter sentiment analysis, parallelized with Hadoop, Apache Storm and HBase.
Basic Apache Topology Example
A Storm-based Tag Cloud Platform for Multiple SNS Users
Battle-tested Apache Storm Multi-Lang implementation for Python
[PROJECT IS NO LONGER MAINTAINED] Code examples that show how to integrate Apache Kafka 0.8+ with Apache Storm 0.9+ and Apache Spark Streaming 1.1+, while using Apache Avro as the data serialization format.