Most popular crawler repositories and open source projects

scrapy

Scrapy, a fast high-level web crawling & scraping framework for Python...

10050   47723   47723  

EasySpider

A visual no-code/code-free web crawler/spider. EasySpider: a visual browser...

4681   38030   38030  

lux

👾 Fast and simple video download library and CLI tool written in Go

3078   28736   28736  

colly

Elegant Scraper and Crawler Framework for Golang

1780   23786   23786  

proxy_pool

Proxy IP pool for Python crawlers (proxy pool)

4702   18255   18255  

pyspider

A powerful spider (web crawler) system in Python.

3681   15945   15945  

newspaper

News, full-text, and article metadata extraction in Python 3. Advanced...

2044   12913   12913  

examples-of-web-crawlers

Some very interesting Python crawler examples, fairly beginner-friendly, mainly crawling Taobao, Tmall, WeChat, ...

3645   12168   12168  

crawlab

Distributed web crawler admin platform for spiders management regardle...

1813   11576   11576  

webmagic

A scalable web crawler framework for Java.

4175   11507   11507  

Photon

Incredibly fast crawler designed for OSINT.

1409   9785   9785  

avbook

AV movie management system; crawlers for avmoo, javbus, javlibrary; online AV film library...

2024   8965   8965  

crawlee

Crawlee—A web scraping and browser automation library for Node.js that...

374   8610   8610  

Python

Python scripts: simulated Zhihu login, crawlers, Excel operations, WeChat official accounts, remote power-on

4176   8463   8463  

spider-flow

A next-generation crawler platform that defines crawler workflows graphically, so crawlers can be built without writing code.

1565   8149   8149  

katana

A next-generation crawling and spidering framework.

346   6770   6770  

autoscraper

A Smart, Automatic, Fast and Lightweight Web Scraper for Python

696   6656   6656  

awesome-crawler

A collection of awesome web crawlers and spiders in different languages

711   6646   6646  

node-crawler

Web Crawler/Spider for NodeJS + server-side jQuery ;-)

891   6462   6462  

pholcus

[Crawler for Golang] Pholcus is a distributed, high concurrency and po...

1502   6197   6197  

awesome-web-scraping

List of libraries, tools and APIs for web scraping and data processing...

761   5870   5870  

ferret

Declarative web scraping

303   5783   5783  

WechatSogou

Crawler interface for WeChat official accounts, based on Sogou WeChat search

1630   5541   5541  

headless-chrome-crawler

Distributed crawler powered by Headless Chrome

433   5384   5384  

scrapy-redis

Redis-based components for Scrapy.

1581   5307   5307  

haipproxy

💖 Highly available distributed IP proxy pool, powered by...

945   5286   5286  

browser-fingerprinting

Analysis of Bot Protection systems with available countermeasures 🚿....

225   4247   4247  

myGPTReader

A community-driven way to read and chat with AI bots - powered by chat...

421   4095   4095  

ECommerceCrawlers

Hands-on 🐍 crawlers 🕷 for various websites and e-commerce data. Includes 🕸: Taobao products, WeChat official accounts, Dianping, ...

1219   3969   3969  

dom-crawler

Eases DOM navigation for HTML and XML documents

123   3804   3804  

scylla

Intelligent proxy pool for Humans™

465   3746   3746  

DotnetSpider

DotnetSpider, a .NET standard web crawling library. It is lightweight,...

1002   3673   3673  

toapi

Every web site provides APIs.

235   3523   3523  

proxypool

Automatically scrapes ss, ssr, vmess, and trojan node information from Telegram channels, subscription URLs, and the public internet...

2558   3454   3454  

ProxyBroker

Proxy [Finder | Checker | Server]. HTTP(S) & SOCKS 🎭

951   3408   3408  

arachni

Web Application Security Scanner Framework

732   3405   3405  

Douyin_TikTok_Download_API

🚀 "Douyin_TikTok_Download_API" is an out-of-the-box, high-performance, asynchronous Douyin|TikTok...

612   3161   3161  

TorBot

Dark Web OSINT Tool

554   3135   3135  

Crawler_Illegal_Cases_In_China

A collection of illegal web-crawler cases in China. This project is used to compile all...

250   3101   3101  

NGCBot

A WeChat bot based on the ✨HOOK mechanism, supporting 🌱 scheduled security news pushes [FreeBuf, Xianzhi...

392   2933   2933  

geziyor

Geziyor, blazing fast web crawling & scraping framework for Go. Suppor...

150   2670   2670  

DecryptLogin

DecryptLogin: APIs for logging in to some websites by using requests.

737   2669   2669  

gospider

Gospider - Fast web spider written in Go

322   2655   2655  

QueryList

🕷 The progressive PHP crawler framework! An elegant, progressive PHP scraping...

435   2573   2573  

RED_HAWK

All in one tool for Information Gathering, Vulnerability Scanning and...

823   2532   2532  

gecco

Easy-to-use lightweight web crawler

890   2510   2510  

GoogleScraper

A Python module to scrape several search engines (like Google, Yandex,...

761   2504   2504  

crawlergo

A powerful browser crawler for web vulnerability scanners

446   2499   2499  

instagram-scraper

Scrapes media, likes, followers, tags and all metadata. Inspired by i...

398   2495   2495  

Python3-Spider

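Python crawlers in practice - simulated login for major websites, including but not limited to: slider CAPTCHA verification, Pinduoduo, Meituan...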

972   2491   2491