webhunger
WebHunger is an extensible, full-scale crawler framework that supports distributed crawling. It aims to let users focus on parsing web pages without having to worry about the crawling process.
How to download and set up webhunger
Open a terminal and run the command:
git clone https://github.com/jerrycshen/webhunger.git
git clone creates a local copy (a clone) of the webhunger repository.
You pass git clone a repository URL. It supports several network protocols and their corresponding URL formats.
You can also download webhunger as a zip file: https://github.com/jerrycshen/webhunger/archive/master.zip
Or clone webhunger over SSH, using this URL:
git@github.com:jerrycshen/webhunger.git
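The HTTPS, SSH, and zip locations all follow the same pattern, built from the repository owner and name. The sketch below is only an illustration of that pattern: the function name repo_urls and the default branch argument are assumptions made for this example, not part of webhunger or git.

```python
def repo_urls(owner: str, repo: str, branch: str = "master") -> dict:
    """Build the three download locations for a GitHub repository.

    repo_urls is a hypothetical helper written for illustration only.
    """
    base = f"https://github.com/{owner}/{repo}"
    return {
        # URL for `git clone` over HTTPS
        "https": f"{base}.git",
        # URL for `git clone` over SSH (scp-like syntax)
        "ssh": f"git@github.com:{owner}/{repo}.git",
        # Zip archive of the given branch
        "zip": f"{base}/archive/{branch}.zip",
    }


# Print each location for the webhunger repository
for kind, url in repo_urls("jerrycshen", "webhunger").items():
    print(f"{kind}: {url}")
```

Either of the first two URLs can be passed to git clone. The zip archive contains the files only, without the git history.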
If you have problems with webhunger
You can open an issue on the webhunger issue tracker: https://github.com/jerrycshen/webhunger/issues
Repositories similar to webhunger
Alternatives and analogs to webhunger:
tensorflow scrapy CNTK diaspora Qix handson-ml Sasila Price-monitor infinit diplomat olric qTox LightGBM h2o-3 catboost distributed tns webmagic colly headless-chrome-crawler scrapy-cluster Lulu newcrawler scrapple goose-parser arachnid gopa scrapy-zyte-smartproxy EvaEngine.js dgraph