Most popular crawler repositories and open source projects

js_block

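Research and study of various blocking techniques: anti-crawler measures, ad blocking, ad-injection prevention, fighting ticket scalpers, etc.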

16   62   62  

tieba-zhuaqu

A distributed crawler for Baidu Tieba, used for Tieba data mining. Analyzes data along both the forum dimension and the user dimension

26   62   62  

Java-Carwler-Technology

Web Data Collection Technology: Java Web Crawlers (complete code for the book manuscript, covering various web crawler techniques and...

20   62   62  

Instagram-downloader

Instagram user's photos and videos downloader. Download all media file...

16   62   62  

slime

🍰 A visual crawler management platform

28   62   62  

crawdad

Cross-platform persistent and distributed web crawler 🦀

9   61   61  

wget-lua

Wget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC...

10   61   61  

feaplat

A crawler management system with cluster support and elastic scaling. Supports running feapder, scrapy, selenium, p...

13   61   61  

pomp

Screen scraping and web crawling framework

10   60   60  

Chemrtron

A document viewer; fuzzy match incremental search.

14   60   60  

WebCrawler

Just a simple web crawler which returns crawled links as IObservable us...

33   60   60  

zhihu-crawler

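A from-scratch implementation that crawls Zhihu on a schedule, mines it for valuable information, and visualizes the crawled data as a web page...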

9   60   60  

Web-Iota

Iota is a web scraper which can find all of the images and links/subur...

5   60   60  

metacritic_api

PHP Metacritic API - Mirror from my GitLab

13   60   60  

WebSpider

An online web crawler project based on Node.js, superagent, and cheerio, with support for generating APIs

20   59   59  

crawler-project

Crawler project from an in-depth Go language course taught by a senior Google engineer.

29   59   59  

rewe-discounts

Grabs current REWE discounts and saves them in a markdown file || Holt...

5   59   59  

webspot

An intelligent web service to automatically detect web content and ext...

9   59   59  

damai-tickets

Example ticket-grabbing script for Damai

8   59   59  

phpcrawl

Copy of http://phpcrawl.cuab.de/ for using with composer

33   58   58  

lyrics-crawler

Get the lyrics for the song currently playing on Spotify

19   58   58  

ipfs-crawler

A crawler for the IPFS network, code for our paper (https://arxiv.org/...

14   58   58  

Daily-code

Everyday code: crawlers, small GUI tools, etc.

5   57   57  

TumblTwo

TumblTwo, an Improved Fork of TumblOne, a Tumblr Downloader.

16   57   57  

slideshare-downloader

Python script to download SlideShare PDFs. This script is able to download...

24   57   57  

proxycrawl-python

ProxyCrawl Python library for scraping and crawling

21   57   57  

Tor_Spider

Python project to crawl and scrape the lesser known deep web or one can...

16   57   57  

SoFIFA

A SoFIFA webcrawler and Machine Learning prediction

13   57   57  

Bilibili_manga_download

A Bilibili Manga download tool with a graphical interface

1   57   57  

WebScrapper

Telegram Bot to scrape webpages using Requests, html5lib and Beautifuls...

43   57   57  

PicCrawler

An image crawler built with RxJava2 and Java 8 features

14   56   56  

m3u8Downloader

meijuba.net, Python crawler, M3U8 video download, desktop application

22   56   56  

devsearch

A web search engine built with Python which uses TF-IDF and PageRank t...

13   56   56  

crawler-chrome-extensions

Chrome extensions commonly used by crawler engineers

10   56   56  

actor-facebook-scraper

Scrape public Facebook pages, posts, reviews and comments

32   56   56  

skweez

Fast website scraper and wordlist generator

5   56   56  

librengine

Privacy Web Search Engine (not meta, own crawler)

4   56   56  

talospider

talospider - A simple,lightweight scraping micro-framework

4   55   55  

MyCrawler

My collection of crawlers

3   55   55  

All-IT-eBooks-Spider

[Updated] A simple python crawler for my tutorial blog at http://www.j...

34   55   55  

bolsa

A library written in Python that aims to make it easier to access data...

16   55   55  

simple_bank_korea

simple crawler for Korean banks with Transactions

9   55   55  

custom-crawler

🌌 High productivity semi-automatic crawler generator 🛠️🧰

2   55   55  

nest-crawler

The easiest crawling and scraping module for NestJS

8   55   55  

telegram-groups-crawler

A Telegram crawler made in Python to automatically search groups and c...

23   55   55  

JMComic-Crawler-Python

Python API For JMComic (禁漫天堂)

141   55   55  

kalel

Kal El Network Stress Test and Penetration Testing Toolkit

16   54   54  

tool-gin

A project built on the go-gin framework to cut down on repetitive tasks, such as downloading certain tools

20   54   54  

instagram-hashtag-crawler

Crawl Instagram hashtags

19   54   54  

local-api-client-typescript

Official JavaScript/TypeScript library for interacting with Kameleo Cl...

1   54   54