66 Forks
351 Stars
351 Watchers

supercrawler

A web crawler. Supercrawler automatically crawls websites. Define custom handlers to parse content. Obeys robots.txt, rate limits and concurrency limits.
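
To make "obeys robots.txt" and "rate limits" concrete, here is a minimal Python sketch of the general idea. It is not supercrawler's own code or API: the function names, the `example-bot` user agent, and the sample robots.txt are all assumptions made for illustration, and it relies only on the standard library's `urllib.robotparser`.

```python
import time
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the sketch needs no network access.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def make_parser(robots_txt):
    """Build a robots.txt parser from the file's text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def filter_allowed(urls, parser, user_agent="example-bot", interval=0.0):
    """Return the URLs robots.txt permits, pausing `interval` seconds between them.

    A real crawler would fetch each allowed URL here; this sketch only
    records which URLs would be fetched.
    """
    allowed = []
    for url in urls:
        if not parser.can_fetch(user_agent, url):
            continue  # disallowed by robots.txt, so skip it
        allowed.append(url)
        time.sleep(interval)  # simple rate limit between requests
    return allowed


parser = make_parser(ROBOTS_TXT)
candidates = ["http://example.com/", "http://example.com/private/page"]
result = filter_allowed(candidates, parser)
```

With the sample rules above, only the first URL is kept, because the second falls under the disallowed `/private/` path.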

How to download and set up supercrawler

Open a terminal and run the following command:
git clone https://github.com/brendonboshell/supercrawler.git
git clone creates a local copy of the supercrawler repository. You pass git clone a repository URL; it supports several network protocols and their corresponding URL formats.

You can also download supercrawler as a zip archive: https://github.com/brendonboshell/supercrawler/archive/master.zip

Or clone supercrawler over SSH:
git clone git@github.com:brendonboshell/supercrawler.git
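
The three download options differ only in URL format. As a rough illustration, the Python sketch below derives all three from the repository owner and name. The `github_urls` helper is hypothetical, and the `master` branch name is taken from the zip link on this page.

```python
def github_urls(owner, repo, branch="master"):
    """Return the HTTPS, SSH, and zip-archive URLs for a GitHub repository."""
    base = f"https://github.com/{owner}/{repo}"
    return {
        "https": f"{base}.git",                       # git clone over HTTPS
        "ssh": f"git@github.com:{owner}/{repo}.git",  # git clone over SSH
        "zip": f"{base}/archive/{branch}.zip",        # plain download, no git needed
    }


urls = github_urls("brendonboshell", "supercrawler")
for kind, url in urls.items():
    print(f"{kind}: {url}")
```

The HTTPS and SSH forms are passed to git clone; the zip link can be opened in a browser.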

If you have problems with supercrawler

You can open an issue on the supercrawler issue tracker: https://github.com/brendonboshell/supercrawler/issues

Repositories similar to supercrawler

Here are some alternatives to supercrawler:

 scrapy    Sasila    Price-monitor    webmagic    colly    headless-chrome-crawler    Lulu    newcrawler    scrapple    goose-parser    arachnid    gopa    scrapy-zyte-smartproxy    node-crawler    arachni    newspaper    webster    spidy    N2H4    easy-scraping-tutorial    antch    pomp    talospider    podcastcrawler    FileMasta    lux    scrapy-redis    haipproxy    DotnetSpider    TumblThree