Most popular crawler repositories and open source projects

TumblThree

A Tumblr Blog Backup Application

139   927   927  

zhihu-crawler

zhihu-crawler is Java-based and high-performance, with support for a free HTTP proxy pool and horizontal scaling...

382   907   907  

magnet-dht

✌️ Python3 BitTorrent DHT crawler

284   907   907  

bilili

🍻 bilibili video (including bangumi) and danmaku downloader | B...

74   894   894  

crawler

A high performance web crawler / scraper in Elixir.

86   877   877  

fess

Fess is very powerful and easily deployable Enterprise Search Server.

167   871   871  

XSRFProbe

The Prime Cross Site Request Forgery (CSRF) Audit and Exploitation Too...

175   862   862  

go-dork

The fastest dork scanner written in Go.

102   816   816  

Lulu

[Unmaintained] A simple and clean video/music/image downloader 👾

140   813   813  

storm-crawler

A scalable, mature and versatile web crawler based on Apache Storm

246   808   808  

fakebrowser

🤖 Fake fingerprints to bypass anti-bot systems. Simulate mouse and key...

182   804   804  

till

DataHen Till is a companion tool to your existing web scraper that ins...

24   803   803  

pic-gather

🛑 image collector, which supports custom acquisition source configurat...

212   801   801  

sperm

A collection of excellent reverse-engineering articles I have read, worth a look

226   791   791  

goclone

Website Cloner - Utilizes powerful Go routines to clone websites to y...

185   783   783  

BaiduImageSpider

A super lightweight Baidu image crawler

390   781   781  

creeper

🐾 Creeper - The Next Generation Crawler Framework (Go)

61   775   775  

scrapyrt

HTTP API for Scrapy spiders

157   775   775  

fetchbot

A simple and flexible web crawler that follows the robots.txt policies...

99   774   774  

BaiduSpider

BaiduSpider, a crawler for Baidu search results, currently supporting Baidu web search, Baidu image...

186   765   765  

spidr

A versatile Ruby web spidering library that can spider a site, multipl...

109   760   760  

icrawler

A multi-thread crawler framework with many builtin image crawlers prov...

163   759   759  

crawly

Crawly, a high-level web crawling & scraping framework for Elixir.

91   736   736  

ComputerStudent

Systematic study materials for computer science majors (Python, C, C++, computer organization, computer networks, compiler...

292   728   728  

easy-scraping-tutorial

Simple but useful Python web scraping tutorial code.

551   723   723  

SecCrawler

A crawler and push-notification program that helps security researchers get daily security reports; the current crawl scope includes...

132   722   722  

bookcorpus

Crawl BookCorpus

94   694   694  

xeHentai

Doujinshi (adult manga) downloader

84   692   692  

PyPtt

A PTT library that logs in over a direct connection; supports PTT and PTT2

91   678   678  

course-crawler

🎓 China University MOOC, XuetangX, NetEase Cloud Classroom, CNMOOC, and iCourse MOOC course download...

199   677   677  

xxl-crawler

A distributed web crawler framework (XXL-CRAWLER).

293   664   664  

crawler

Crawler code shared by K哥 (Brother K): JS reverse engineering and advanced crawling. Follow the WeChat official account: K哥爬虫

238   662   662  

chatWeb

ChatWeb can crawl web pages, read PDF, DOCX, TXT, and extract the main...

100   661   661  

SCrawler

🏳️‍🌈 Media downloader from any sites, including Twitter, Red...

35   654   654  

skrape.it

A Kotlin-based testing/scraping/parsing library providing the ability...

52   653   653  

spider_collection

Python crawlers; current collection: NetEase Cloud Music song crawling, Bilibili video crawling, Zhihu Q&A crawling,...

164   647   647  

Weibo-Analyst

Social media (Weibo) comment analysis toolbox for Chinese...

171   640   640  

FileMasta

A search application to explore, discover and share online files

79   630   630  

NetDiscovery

NetDiscovery, built on frameworks such as Vert.x and RxJava 2, is a general-purpose crawler framework/middle...

156   628   628  

fbcrawl

A Facebook crawler

231   624   624  

DouYin

API of DouYin for Humans used to Crawl Popular Videos and Music

260   621   621  

dotcommon

What do people have in their dotfiles?

31   620   620  

go_jobs

A look at the Golang job market

123   612   612  

google-play-scraper

Google play scraper for Python inspired by <facundoolano/google-play-s...

155   582   582  

newcrawler

Free Web Scraping Tool with Java

115   581   581  

runoob-PDF-

Crawl the Runoob (菜鸟教程) tutorial site and convert it to PDF__python_crawer_by_chrome

365   569   569  

x-crawl

x-crawl is a flexible Node.js multifunctional crawler library. Flexibl...

35   567   567  

scrapedin

LinkedIn Scraper (currently working 2020)

173   566   566  

jvppeteer

Headless Chrome For Java (Java crawler)

135   557   557  

XHS-Spider

Xiaohongshu data collection and batch download tool for site images and video resources; a great-looking data collection tool...

68   556   556