Most popular crawler repositories and open source projects

javbus-api ovnrain TypeScript

A self-hosted JavBus API service

386 69 3

InstagramCrawler tzuhsial Python

A non-API Python program to crawl public photos, posts, or followers

385 101 1

webpalm XORbit01 Go

🕸️ Crawl in the web network

382 40 2

supercrawler brendonboshell JavaScript

A web crawler. Supercrawler automatically crawls websites. Define custom handlers to parse content. Obeys robots.txt, rate limits and concurrency limi...

381 63 9

crawler-js-hook-framework-public JSREI

A JS reverse-engineering hook toolkit; some of the tools are open-sourced here

381 105 10

nebula dennis-tra Go

🌌 An agnostic network crawler exposing comprehensive peer information and network topology information.

381 51 8

TTBot 01ly Python

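Toutiao (今日头条) bot: supports user login, follow, unfollow, fetching followings and followers, publishing posts, posting Wukong Q&A, likes, comments, and collecting various types of news; implemented with the Toutiao web API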

377 145 377

tsec Asoul

Crawler for Taiwan listed and OTC stocks (Taiwan Stock Exchange Crawler)

377 169 64

google-news-scraper lewisdonovan JavaScript

Lightweight scraper for Google News

376 72 9

nudecrawler yaroslaff Python

Crawl telegra.ph searching for nudes!

375 31 9

news-crawl commoncrawl Java

News crawling with StormCrawler - stores content as WARC

375 41 29

copymanga-downloader misaka10843 Python

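Downloads manga from copymanga (拷贝漫画) using Python and the copymanga API (no rate limit); supports batch and per-chapter downloads, fetching and downloading your favorited manga, and semi-automatic subscription downloads! (Supported on all platforms (pypi...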

372 24 0

JSSoup chishui JavaScript

JavaScript + BeautifulSoup = JSSoup

371 33 11

crawler crwlrsoft PHP

Library for Rapid (Web) Crawler and Scraper Development

369 13 3

weixin-spider xzkzdx Python

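WeChat official account crawler: account article history, article comments, article read and "在看" counts, a visual web page, deployable on Windows servers. Based on Python 3 with flask/mysql/redis/mitmproxy/pywin32, etc...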

368 90 368

CrawlerTutorial leVirve Python

A minimal crawler tutorial (fetch, parse, search, multiprocessing, API), using PTT as the example

368 97 18

QQMusicSpider yangjianxin1 Python

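A Scrapy-based QQ Music crawler (QQ Music Spider) that scrapes song info, lyrics, featured comments, and more, and shares a music corpus of 490,000+ items from the top 6,400 mainland China and Hong Kong/Taiwan artists on QQ Music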

367 72 7

chinese-fund-crawler jackluson Python

Crawling and aggregate analysis of Chinese off-exchange fund data

365 156 1

scrapy-zyte-smartproxy scrapy-plugins Python

Zyte Smart Proxy Manager (formerly Crawlera) middleware for Scrapy

363 92 53

Rcrawler salimk R

An R web crawler and scraper

362 91 39

vyntr outpoot TypeScript

Independent search engine. Includes web crawling, search indexing, dictionary API, and more. https://vyntr.com

362 33 1

hQuery.php duzun PHP

An extremely fast web scraper that parses megabytes of invalid HTML in a blink of an eye. PHP5.3+, no dependencies.

360 71 21

zhihu-login zkqiang Python

Simulated Zhihu login; supports extracting CAPTCHAs and saving cookies

355 141 0

telegram-crawler MarshalX Python

🕷 Automatically detects changes to the official Telegram sites, beta clients, MTProto servers and mini apps

355 48 15

spidy rivermont Python

The simple, easy to use command line web crawler.

354 68 20

Antibot-Detector scrapfly JavaScript

Real-time detection of anti-bot systems, CAPTCHAs & fingerprinting techniques. Identifies Cloudflare, Akamai, DataDome, reCAPTCHA, hCaptcha, Shape Se...

350 37 5

91porn-api colikno JavaScript

🌭💦 91porn crawler: unrestricted online API (permanently valid, passcode updated daily) and online web preview

345 28 14

lightnovel_epub JeffersonQin Python

🍭 epub generator for (light) novels; supported sites: 轻之国度, 轻小说文库

343 22 4

xcrawler yan68 PHP

A fast, concise, and powerful PHP crawler framework

341 48 14

sitemap-generator-cli lgraubner JavaScript

Creates an XML-Sitemap by crawling a given site.

341 44 5

crawley s0rg Go

The unix-way web crawler

340 18 1

ppspider xiyuan-fengyu TypeScript

Web spider built on puppeteer; supports task queues and task scheduling via decorators, supports nedb / mongodb, supports data visualization; based on puppetee...

338 73 10

tiktok-downloader krypton-byte Python

Tiktok Downloader/Scraper using requests & bs4

336 89 7

polite dmi3kno R

Be nice on the web

334 12 6

RustySEO mascanho TypeScript

SEO/GEO toolkit to analyse, crawl, parse and optimise websites & logs (Nginx & Apache)

330 60 11

WeChat-Channels-Video-File-Decryption Evil0ctal WebAssembly

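A tool and API service, runnable online, for decrypting encrypted WeChat Channels videos, implemented through reverse-engineering analysis. This project uses WeChat's official WebAssembly (WASM) module to generate the Isaac64 PRNG keystream,...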

327 102 2

wencai GraySilver JavaScript

This is a wencai crawler. (A Pythonic toolkit for the i问财 strategy backtesting interface)

325 122 19

awesome-java-crawler rockswang

This repository collects and organizes crawler-related resources, with Java as the main development language

324 75 23

scrapper amerkurev Python

Web scraper with a simple REST API living in Docker and using a Headless browser and Readability.js for parsing.

324 52 3

Laravel-Crawler-Detect JayBizzle PHP

A Laravel wrapper for CrawlerDetect - the web crawler detection library

323 28 12

oddish puppylpg Python

Crawl csgo skin info from `buff.163.com` and steam, then find the most suitable one to buy from the former and to sell to the latter.

322 77 2

extractor lightfeed TypeScript

Use LLMs to robustly extract web data

319 9 0

4chan-downloader Exceen Python

Python3 script to continuously download all images/videos of multiple 4chan threads simultaneously - without installation

314 45 15

crawler infinilabs Go

🕷️ An easy-to-use spider written in Golang. (previously named GOPA.)

312 80 22

crawler_shopee_public hsuanchi Python

Shopee asynchronous crawler + competitor seller analysis

312 100 6

aliexpress-product-scraper sudheer-ranga JavaScript

Get AliExpress product details as a JSON response, including feedback, variants, shipping info, description, images, etc.

309 107 12

Python-Web-Scraping-Tutorial oxylabs Python

In this Python Web Scraping Tutorial, we will outline everything needed to get started with web scraping. We will begin with simple examples and move...

306 35 2

js-reverse freedom-wy HTML

JS reverse-engineering research

304 84 10

gplay-scraper Mohammedcha Python

GPlay Scraper is a powerful Python Google Play scraper library for extracting comprehensive app data from the Google Play Store. Scrape Google Play St...

304 30 11

Fast-LianJia-Crawler CaoZ Python

An ultra-fast crawler that fetches data directly through the Lianjia (链家) API, the fastest in the universe~~ 🚀

301 100 16

crawler

Repositories (1456)