10 Forks
35 Stars
35 Watchers

ArticleSpider

Crawls Zhihu, Jobbole, and Lagou with Scrapy, and uses Elasticsearch and Django to build a search engine website. See README_zh.md for the implementation roadmap, the distributed crawler, and strategies for coping with anti-crawling measures.

How to download and set up ArticleSpider

Open a terminal and run the command:
git clone https://github.com/hackfengJam/ArticleSpider.git
git clone creates a local copy (clone) of the ArticleSpider repository from the repository URL you pass it.
It supports several network protocols, each with its own URL format.

You may also download ArticleSpider as a zip file: https://github.com/hackfengJam/ArticleSpider/archive/master.zip

Or clone ArticleSpider over SSH:
git clone git@github.com:hackfengJam/ArticleSpider.git
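
The three download routes above differ only in the URL format. As a rough illustration, the Python sketch below builds all three for a given owner and repository name. `RepoUrls` and `github_urls` are hypothetical helper names, not part of ArticleSpider. The `master` branch default is taken from the zip link above, and the SSH form assumes GitHub's usual `git@github.com:owner/repo.git` pattern.

```python
from typing import NamedTuple


class RepoUrls(NamedTuple):
    """The three URL forms discussed above (hypothetical container)."""
    https: str
    ssh: str
    zip_archive: str


def github_urls(owner: str, repo: str, branch: str = "master") -> RepoUrls:
    """Build HTTPS, SSH, and zip-archive URLs for a GitHub repository.

    Assumes GitHub's standard URL layout; `branch` is only used for
    the zip archive link.
    """
    base = f"https://github.com/{owner}/{repo}"
    return RepoUrls(
        https=f"{base}.git",
        ssh=f"git@github.com:{owner}/{repo}.git",
        zip_archive=f"{base}/archive/{branch}.zip",
    )


if __name__ == "__main__":
    urls = github_urls("hackfengJam", "ArticleSpider")
    print("git clone", urls.https)   # clone over HTTPS
    print("git clone", urls.ssh)     # clone over SSH (needs an SSH key on GitHub)
    print("download ", urls.zip_archive)
```

The HTTPS form works without any account setup, while the SSH form needs an SSH key registered with GitHub; the zip archive gives you the files without the Git history.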

If you have problems with ArticleSpider

You may open an issue on the ArticleSpider issue tracker: https://github.com/hackfengJam/ArticleSpider/issues

Repositories similar to ArticleSpider

Alternatives and analogs to ArticleSpider:

scrapy, grafana, etcd, nsq, Qix, elasticsearch, dubbo, incubator-mxnet, Sasila, Price-monitor, hraftd, diplomat, js, elasticell, olric, translations, FOSElasticaBundle, webmagic, colly, headless-chrome-crawler, Lulu, newcrawler, scrapple, goose-parser, arachnid, gopa, scrapy-zyte-smartproxy, bookbrainz-site, elastic4s, scalecube-services