java-spider

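A hands-on Java crawler framework built on top of the webmagic framework. It can already crawl news content from Tencent, Sohu, Toutiao (integrated as a separate feature), and other sources. Combined with elasticsearch, it implements automated crawling and has been deployed in production.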

How to download and set up java-spider

Open a terminal and run:
git clone https://github.com/hemin1003/java-spider.git
git clone creates a local copy of the java-spider repository from the repository URL you pass it.
It supports several network protocols and their corresponding URL formats, including HTTPS and SSH.

You can also download java-spider as a zip file: https://github.com/hemin1003/java-spider/archive/master.zip
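
For readers who prefer the command line, the zip download can be sketched as follows. This is a minimal sketch, assuming curl and unzip are installed; the function name download_java_spider and the local file name java-spider-master.zip are illustrative choices, not part of the project.

```shell
# URL of the zip archive, taken from the download link above.
REPO_ZIP_URL="https://github.com/hemin1003/java-spider/archive/master.zip"
# Local file name for the downloaded archive (an arbitrary choice).
ARCHIVE="java-spider-master.zip"

# Hypothetical helper: download the archive and unpack it in the current directory.
download_java_spider() {
  # -L follows GitHub's redirect to the archive host; --fail aborts on HTTP errors.
  curl -L --fail -o "$ARCHIVE" "$REPO_ZIP_URL" && unzip -q "$ARCHIVE"
}
```

Calling download_java_spider from the directory where you want the sources should leave the unpacked project next to the archive. GitHub branch archives normally unpack into a folder named after the repository and branch, so expect something like java-spider-master.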

Or clone java-spider over SSH:
git clone git@github.com:hemin1003/java-spider.git

If you have problems with java-spider

You can open an issue on the java-spider issue tracker: https://github.com/hemin1003/java-spider/issues