storm-crawler
A scalable, mature and versatile web crawler based on Apache Storm
How to download and set up storm-crawler
Open a terminal and run the command:
git clone https://github.com/DigitalPebble/storm-crawler.git
git clone creates a local copy (a clone) of the storm-crawler repository.
You pass git clone a repository URL. It supports several network protocols and the corresponding URL formats.
Alternatively, you can download storm-crawler as a zip file: https://github.com/DigitalPebble/storm-crawler/archive/master.zip
Or clone storm-crawler over SSH:
git clone git@github.com:DigitalPebble/storm-crawler.git
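The download options differ only in the URL they use. As a rough sketch, the small shell helper below picks the URL for a given method. The function name repo_url and its https/ssh/zip arguments are hypothetical, introduced only for illustration. The SSH address follows GitHub's usual git@github.com:owner/repo.git form.

```shell
#!/bin/sh
# Hypothetical helper (not part of storm-crawler): print the download URL
# for the requested method. Returns a non-zero status for unknown methods.
repo_url() {
    case "$1" in
        https) echo "https://github.com/DigitalPebble/storm-crawler.git" ;;
        ssh)   echo "git@github.com:DigitalPebble/storm-crawler.git" ;;
        zip)   echo "https://github.com/DigitalPebble/storm-crawler/archive/master.zip" ;;
        *)     echo "usage: repo_url https|ssh|zip" >&2; return 1 ;;
    esac
}

# Example usage (needs network access, so it is left commented out):
# git clone "$(repo_url https)"
```

Cloning over SSH generally requires an SSH key registered with your GitHub account. The HTTPS URL and the zip archive work without one for a public repository.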
If you have problems with storm-crawler
You can open an issue on the storm-crawler issue tracker: https://github.com/DigitalPebble/storm-crawler/issues
Repositories similar to storm-crawler
Here are some storm-crawler alternatives and similar projects:
tensorflow scrapy CNTK diaspora Qix handson-ml Sasila Price-monitor infinit diplomat olric qTox LightGBM h2o-3 catboost distributed tns webmagic colly headless-chrome-crawler scrapy-cluster Lulu newcrawler scrapple goose-parser arachnid gopa scrapy-zyte-smartproxy EvaEngine.js dgraph