java-spider
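A hands-on Java crawler framework built as a secondary development on top of the webmagic framework. It can already crawl news content from Tencent, Sohu, and Toutiao (the latter integrated as a separate feature). Combined with Elasticsearch, it implements automated crawling, and it is already in production use.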
How to download and set up java-spider
Open a terminal and run the following command:
git clone https://github.com/hemin1003/java-spider.git
git clone creates a local copy of the java-spider repository.
You pass git clone a repository URL. It supports several network protocols (such as HTTPS and SSH), each with its own URL format.
Alternatively, you can download java-spider as a zip file: https://github.com/hemin1003/java-spider/archive/master.zip
Or clone java-spider over SSH:
git clone git@github.com:hemin1003/java-spider.git
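The download options above differ only in the URL format. As a minimal sketch, the following Java class shows how the HTTPS, SSH, and zip URLs are derived from an owner and a repository name. `RepoUrls` and its methods are hypothetical names used for illustration and are not part of java-spider.

```java
// Hypothetical helper, not part of java-spider: builds the three GitHub URL
// forms used in this section from an owner and a repository name.
final class RepoUrls {
    private static final String HOST = "github.com";

    // HTTPS clone URL, suitable for `git clone https://...`
    static String https(String owner, String repo) {
        return "https://" + HOST + "/" + owner + "/" + repo + ".git";
    }

    // SSH clone URL in scp-like syntax; needs an SSH key registered with GitHub
    static String ssh(String owner, String repo) {
        return "git@" + HOST + ":" + owner + "/" + repo + ".git";
    }

    // Zip archive of a single branch; no git installation required
    static String zip(String owner, String repo, String branch) {
        return "https://" + HOST + "/" + owner + "/" + repo
                + "/archive/" + branch + ".zip";
    }

    public static void main(String[] args) {
        System.out.println(https("hemin1003", "java-spider"));
        System.out.println(ssh("hemin1003", "java-spider"));
        System.out.println(zip("hemin1003", "java-spider", "master"));
    }
}
```

The HTTPS form works without any account setup, while the SSH form assumes a key has already been added to your GitHub account.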
If you have problems with java-spider
You can open an issue on the java-spider issue tracker: https://github.com/hemin1003/java-spider/issues
Repositories similar to java-spider
Here you can find alternatives and projects similar to java-spider:
grafana elasticsearch FOSElasticaBundle gopa bookbrainz-site elastic4s elk-docker dev-setup Opserver elasticsearch-HQ pipeline sentinl awesome-aws yii2-elasticsearch great-big-example-application gardening dejavu mirage kibana NewsBlur elasticsearch-analysis-ik docker-elk elasticsearch-sql Linux-Tutorial searchkit elasticsearch-dump peek elastic vue-storefront elasticsearch-rails