Most popular crawler repositories and open source projects

ast-hook-for-js-RE

Browser memory roaming solution (work in progress...)

309   967   967  

Pxer

A tool for pixiv.net. A Pixiv crawler that anyone can use

111   959   959  

crawler

A high performance web crawler / scraper in Elixir.

90   948   948  

NewPipeExtractor

NewPipe's core library for extracting data from streaming sites

347   930   930  

BT-btt

Introduction to the magnet-link site U3C3, with domain updates

84   929   929  

TumblThree

A Tumblr Blog Backup Application

139   927   927  

zhihu-crawler

zhihu-crawler is Java-based, high-performance, supports a free HTTP proxy pool, supports horizontal scaling...

382   907   907  

magnet-dht

✌️ Python3 BitTorrent DHT crawler

284   907   907  

SecCrawler

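A crawler and push-notification program that makes it easy for security researchers to get daily security reports; its crawl scope currently includes...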

140   905   905  

bilili

🍻 bilibili video (including bangumi) and danmaku downloader | B...

74   894   894  

fess

Fess is very powerful and easily deployable Enterprise Search Server.

167   871   871  

XSRFProbe

The Prime Cross Site Request Forgery (CSRF) Audit and Exploitation Too...

175   862   862  

go-dork

The fastest dork scanner written in Go.

102   816   816  

spidr

A versatile Ruby web spidering library that can spider a site, multipl...

108   815   815  

Lulu

[Unmaintained] A simple and clean video/music/image downloader 👾

140   813   813  

storm-crawler

A scalable, mature and versatile web crawler based on Apache Storm

246   808   808  

till

DataHen Till is a companion tool to your existing web scraper that ins...

24   803   803  

pic-gather

🛑 image collector, which supports custom acquisition source configura...

212   801   801  

sperm

A collection of excellent reverse-engineering articles I have read; worth a look

226   791   791  

BaiduImageSpider

A super lightweight Baidu image crawler

390   781   781  

creeper

🐾 Creeper - The Next Generation Crawler Framework (Go)

61   775   775  

scrapyrt

HTTP API for Scrapy spiders

157   775   775  

fetchbot

A simple and flexible web crawler that follows the robots.txt policies...

99   774   774  

BaiduSpider

BaiduSpider, a crawler for Baidu search results; currently supports Baidu web search, Baidu image...

186   765   765  

icrawler

A multi-thread crawler framework with many builtin image crawlers prov...

163   759   759  

Bili23-Downloader

A cross-platform Bilibili video download tool supporting Windows, Linux, and macOS; downloads B...

54   756   756  

ComputerStudent

Systematic study materials for computer science majors (Python, C, C++, computer organization, computer networks, compiler...

292   728   728  

easy-scraping-tutorial

Simple but useful Python web scraping tutorial code.

551   723   723  

xxl-crawler

A lightweight web crawler framework. (Java crawler framework)

309   715   715  

bookcorpus

Crawl BookCorpus

94   694   694  

xeHentai

Doujinshi downloader (adult manga download)

84   692   692  

PyPtt

A PTT library that logs in over a direct connection; supports PTT and PTT2

91   678   678  

course-crawler

🎓 China University MOOC, XuetangX, NetEase Cloud Classroom, CNMOOC, iCourse MOOC course down...

199   677   677  

ArrowDL

ArrowDL (Arrow Downloader) is a download manager for Windows, MacOS an...

38   670   670  

crawler

Crawler code shared by K Ge (Brother K): JS reverse engineering, advanced crawling. Follow the WeChat official account: K哥爬虫

238   662   662  

chatWeb

ChatWeb can crawl web pages, read PDF, DOCX, TXT, and extract the main...

100   661   661  

FileMasta

A search application to explore, discover and share online files

73   658   658  

SCrawler

🏳️‍🌈 Media downloader from any sites, including Twitter, Red...

35   654   654  

skrape.it

A Kotlin-based testing/scraping/parsing library providing the ability...

52   653   653  

Scavenger

Crawler (Bot) searching for credential leaks on paste sites.

122   649   649  

spider_collection

Python crawlers; currently in stock: NetEase Cloud Music song crawling, Bilibili video crawling, Zhihu Q&A crawling,...

164   647   647  

gOSINT

OSINT Swiss Army Knife

79   643   643  

Weibo-Analyst

Social media (Weibo) comment analysis toolbox in Chinese. Weibo comment...

171   640   640  

NetDiscovery

NetDiscovery, built on Vert.x, RxJava 2, and other frameworks, is a general-purpose crawler framework/middle...

156   628   628  

fbcrawl

A Facebook crawler

231   624   624  

DouYin

API of DouYin for Humans used to Crawl Popular Videos and Musics

260   621   621  

dotcommon

What do people have in their dotfiles?

31   620   620  

go_jobs

A look at the Golang job market

123   612   612  

google-play-scraper

Google play scraper for Python inspired by <facundoolano/google-play-s...

155   582   582  

newcrawler

Free Web Scraping Tool with Java

115   581   581