Most popular crawler repositories and open source projects

indonesian-NLP-resources kirralabs

Data resources for Indonesian-language NLP

231 49 9

crawlab-lite crawlab-team Vue

Lite version of Crawlab, a lightweight crawler management platform.

230 75 10

goose-parser redco JavaScript

Universal scraping tool, which allows you to extract data using multiple environments

229 12 12

facebook-data-extraction 18520339 Python

Experience for effectively fetching Facebook data by Querying Graph API with Account-based Token and Operating undetectable scraping Bots to extract C...

229 61 9

darc JarryShaw Python

Darkweb Crawler Project

227 38 10

KoreaNewsCrawler lumyjuwon Python

A Korean news crawler built to ingest large amounts of news data.

225 105 8

black-widow offensive-hub Python

GUI based offensive penetration testing tool (Open Source)

225 46 14

google-group-crawler icy Shell

[Deprecated] Get (almost) original messages from google group archives. Your data is yours.

224 38 11

WebVideoBot tim232385 Java

Web crawler.

223 43 32

FooProxy 01ly Python

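A robust, efficient, score-based, targeted IP proxy pool plus API service. You can plug in your own collectors to crawl proxy IPs, and it builds a separate database of working proxy IPs for each of your crawler's target websites. Supports Mongo...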

222 59 222

N2H4 forkonlp R

A tool for collecting Naver news

222 76 16

weibo_wordcloud gaussic Python

Scrapes Weibo data by keyword, then generates a word cloud

221 70 4

scrapy-zhihu-github zhijunio Python

Scrapy examples for crawling Zhihu and GitHub

221 102 26

MetaFinder Josue87 Python

Search for documents in a domain through search engines (Google, Bing and Baidu). The objective is to extract metadata.

221 36 2

91porn-crawler blue-troy Java

91 porn crawler. Automatically download your "favorite" 91porn hot movies.

220 39 220

proxyhub ForceFledgling Python

An advanced [Finder | Checker | Server] tool for proxy servers, supporting both HTTP(S) and SOCKS protocols. 🎭

216 17 5

scrapedin-linkedin-crawler linkedtales JavaScript

Crawler for LinkedIn full profiles 2019

215 71 12

JavPy TheodoreKrypton JavaScript

Enjoy driving on a Javascriptive (originally Pythonic) way to Japanese AV!

214 34 8

goribot zhshch2002 Go

[Crawler/Scraper for Golang] 🕷 A lightweight, distributed-friendly Golang crawler framework.

211 30 211

WebScrapper nuhmanpk Python

Powerful Telegram bot for web scraping and crawling. Fast, easy, and loved by thousands!

208 114 6

scrapemate gosom Go

Golang Crawling and scraping framework

205 27 1

crawler Norconex Java

Norconex Crawlers (or spiders) are flexible web and filesystem crawlers for collecting, parsing, and manipulating data from the web or filesystem to v...

204 71 30

ChainWalker 0xsha Go

Rapid Smart Contract Crawler

204 22 5

awesome-python-primer zkqiang Python

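An index of high-quality Chinese-language resources for self-taught Python beginners, including books / docs / videos, suited to crawling / Web / data analysis / machine learning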

203 26 3

laosj songtianyi Go

golang light-weight image crawler

202 38 18

instagram-crawler mgleon08 Ruby

Crawl Instagram photos, posts, and videos for download.

202 16 6

NoSmoke macacajs JavaScript

A cross-platform UI crawler that scans view trees, then generates and executes UI test cases.

202 58 4

authority-data yiliyassh Python

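Official authoritative data (Chinese): statistical yearbooks, statistical bulletins, internet industry reports, Ministry of Industry and Information Technology data, ICT reports, and more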

201 31 8

packagist-mirror webysther PHP

📦✂️📋📦 Create a mirror of packagist.org metadata for use locally with composer

200 68 2

VideoServer GF-Allen JavaScript

A video resource backend built with Node.js, based on Express and a crawler

199 6 1

slrp nfx Go

rotating open proxy multiplexer

199 29 4

search subins2000 PHP

An Open Source Search Engine

198 116 30

pkulaw_spider yinhao0214 Python

Crawls the 北大法宝 (pkulaw) site http://www.pkulaw.cn/Case/

198 59 14

ghs seart-group Java

GitHub Search: Platform used to crawl, store and present projects from GitHub, as well as any statistics related to them

197 24 0

NewsCrawler Jacen789 Python

News crawler that scrapes real-time financial news from Sina, Sohu, and Xinhuanet.

196 34 10

gflare-tk beb7 Python

Open-Source Python Based SEO Web Crawler

195 21 4

cocrawler cocrawler Python

CoCrawler is a versatile web crawler built using modern tools and concurrency.

194 24 20

web-bee codesofun Java

🐝 Web vertical crawler framework for fun

193 37 22

zhihu-crawler-people elliotxx Python

A simple distributed crawler for Zhihu, plus data analysis

193 87 11

ir-search djfksjd Python

🇰🇷 Agent skill for a complete survey of Korean government support programs [supports Claude Code·Codex·agy(Antigravity)·Cursor·Gemini CLI·Grok Build] K-Startup·기업마당·NIPA·KOCCA·SMTE...

193 46 0

digger hetianyi Go

Digger is a powerful and flexible web crawler implemented in pure Go

191 70 9

zhihu_fun AnyISalIn JavaScript

A Selenium-based Zhihu keyword crawler

186 34 10

leetcode-spider Ma63d JavaScript

Use Node.js to crawl the source code of your own LeetCode solutions

186 48 3

crawler-for-github-trending poozhu JavaScript

🕷️ A node crawler for github trending.

185 19 4

kuaishou-crawler oGsLP Python

As you can see, a kuaishou crawler

185 64 11

nCov2019_data_crawler LiuTianyong Python

Epidemic data crawler: a 2019 novel coronavirus data repository with movement-track data, shared-travel data, and news reports

184 32 4

gogetcrawl karust Go

Extract web archive data using Wayback Machine and Common Crawl

184 17 2

telegram-groups-crawler edogab33 Python

A Telegram crawler made in Python to automatically search groups and channels and collect any type of data from them.

184 54 7

sensitivefilescan aipengjie Python

183 69 11

datmusic-api alashow PHP

182 50 17
