
storm-crawler

A scalable, mature and versatile web crawler based on Apache Storm

How to download and set up storm-crawler

Open a terminal and run the command:
git clone https://github.com/DigitalPebble/storm-crawler.git
git clone creates a local copy (clone) of the storm-crawler repository. You pass git clone a repository URL;
it supports several network protocols and their corresponding URL formats.

You may also download storm-crawler as a zip archive: https://github.com/DigitalPebble/storm-crawler/archive/master.zip

Or clone storm-crawler over SSH:
git clone git@github.com:DigitalPebble/storm-crawler.git
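
The HTTPS and SSH options differ only in the URL passed to git clone. The sketch below shows the two URL forms side by side. The helper name clone_url is hypothetical and used only for illustration; the owner and repository names are taken from the URLs on this page. The actual clone is left commented out because it needs network access, and the SSH form additionally assumes an SSH key registered with GitHub.

```shell
#!/bin/sh
# Hypothetical helper (not part of git or storm-crawler):
# print the clone URL of the storm-crawler repository for a given protocol.
clone_url() {
    owner="DigitalPebble"
    repo="storm-crawler"
    case "$1" in
        https) printf '%s\n' "https://github.com/${owner}/${repo}.git" ;;
        ssh)   printf '%s\n' "git@github.com:${owner}/${repo}.git" ;;
        *)     printf 'unknown protocol: %s\n' "$1" >&2; return 1 ;;
    esac
}

# Show both URL forms; either one can be passed to git clone.
clone_url https
clone_url ssh

# Example clone (requires network access, so it is left commented out):
# git clone "$(clone_url https)"
```

Either URL produces the same working copy; HTTPS is usually enough for read-only access, while SSH is convenient if you plan to push changes to your own fork.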

If you have problems with storm-crawler

You may open an issue on the storm-crawler issue tracker: https://github.com/DigitalPebble/storm-crawler/issues

Repositories similar to storm-crawler

Alternatives and analogs to storm-crawler:

tensorflow, scrapy, CNTK, diaspora, Qix, handson-ml, Sasila, Price-monitor, infinit, diplomat, olric, qTox, LightGBM, h2o-3, catboost, distributed, tns, webmagic, colly, headless-chrome-crawler, scrapy-cluster, Lulu, newcrawler, scrapple, goose-parser, arachnid, gopa, scrapy-zyte-smartproxy, EvaEngine.js, dgraph