Topic

crawler

Repositories (1431)

puppeteer-sharp
puppeteer-sharp hardkoded C#

Headless Chrome .NET API

3.9k
feapder
feapder Boris-code Python

🚀🚀🚀 feapder is an easy-to-use, powerful Python crawler framework. Built-in AirSpider, Spider, TaskSpider, Batc...

3.7k
RED_HAWK
RED_HAWK Tuhinshubhra PHP

All-in-one tool for information gathering, vulnerability scanning, and crawling. A must-have tool for all penetration testers

3.6k
mdcx
mdcx sqzw-x Python

Movie metadata scraper

3.6k
toapi
toapi elliotgao2 Python

Every web site provides APIs.

3.6k
cariddi
cariddi edoardottt Go

Take a list of domains, crawl urls and scan for endpoints, secrets, api keys, file extensions, tokens and more

3.4k
Python3-Spider
Python3-Spider wkunzhi Python

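Python crawlers in practice: simulated logins for major websites, including but not limited to slider CAPTCHA verification, Pinduoduo, Meituan, Baidu, bilibili, Dianping, and Taobao. If you like it, please star ❤️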

3.4k
NGCBot
NGCBot ngc660sec

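A WeChat bot based on a ✨HOOK mechanism, supporting 🌱scheduled security news pushes [FreeBuf, Xianzhi, Anquanke, Qi An Xin Attack and Defense Community], 👯KFC copywriting, ⚡vulnerability lookup, ⚡phone number location lookup, ⚡knowledge base...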

3.3k
crawlergo
crawlergo Qianlitp Go

A powerful browser crawler for web vulnerability scanners

3k
gospider
gospider jaeles-project Go

Gospider - Fast web spider written in Go

2.9k
DecryptLogin
DecryptLogin CharlesPikachu Python

DecryptLogin: APIs for logging in to some websites using requests.

2.9k
owllook
owllook howie6879 Python

owllook - novel search engine

2.8k
google-play-scraper
google-play-scraper facundoolano JavaScript

Node.js scraper to get data from Google Play

2.8k
crawler
crawler spatie PHP

https://spatie.be/docs/crawler

2.8k
GoogleScraper
GoogleScraper NikolaiT HTML

A Python module to scrape several search engines (like Google, Yandex, Bing, Duckduckgo, ...). Including asynchronous networking support.

2.8k
geziyor
geziyor geziyor Go

Geziyor, blazing fast web crawling & scraping framework for Go. Supports JS rendering.

2.8k
FinalRecon
FinalRecon thewhiteh4t Python

All In One Web Recon

2.7k
QueryList
QueryList jae-jae PHP

🕷 The progressive PHP crawler framework! An elegant, progressive PHP scraping framework.

2.7k
gecco
gecco xtuhcy Java

Easy-to-use lightweight web crawler

2.5k
instagram-scraper
instagram-scraper realsirjoe Python

Scrapes media, likes, followers, tags, and all metadata. Inspired by instagram-php-scraper, bot

2.5k
xalpha
xalpha refraction-ray Python

Fund investment management and backtesting engine

2.5k
lianjia-beike-spider
lianjia-beike-spider jumper2014 Python

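Housing price crawler for Lianjia and Beike. Collects housing price data (residential communities, second-hand homes, rentals, new homes) for 21 major Chinese cities, including Beijing, Shanghai, Guangzhou, and Shenzhen. Stable, reliable, and fast! Supports CSV, MySQL, MongoDB, Excel, js...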

2.5k
grab
grab lorien Python

Web Scraping Framework

2.5k
news-please
news-please fhamborg Python

news-please - an integrated web crawler and information extractor for news that just works

2.4k
spider
spider spider-rs Rust

Web crawler and scraper for Rust

2.4k
Leaked-GPTs
Leaked-GPTs friuns2 Python

Leaked GPTs prompts. Bypass the 25-message limit or try out GPTs without a Plus subscription.

2.4k
Crawler-Detect
Crawler-Detect JayBizzle PHP

🕷 CrawlerDetect is a PHP class for detecting bots/crawlers/spiders via the user agent

2.3k
abot
abot sjdirect C#

Cross-platform C# web crawler framework built for speed and flexibility. Please star this project! +1.

2.3k
vulnx
vulnx anouarbensaad Python

vulnx 🕷️ is an intelligent bot shell that can perform automatic injection and help researchers detect security vulnerabilities in CMS systems. It can perform a...

2.1k
goclone
goclone goclone-dev Go

Website Cloner - Uses goroutines to clone websites to your computer within seconds.

2.1k
skycaiji
skycaiji zorlan PHP

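SkyCaiji (蓝天采集器) is a free, open-source crawler system. Data can be collected simply by pointing and clicking to edit rules. It runs locally, on virtual hosts, or on cloud servers, can scrape almost every type of web page, and integrates seamlessly with various CMS site-building...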

2.1k
gocrawl
gocrawl PuerkitoBio Go

Polite, slim and concurrent web crawler.

2.1k
gain
gain elliotgao2 Python

Web crawling framework based on asyncio.

2k
SCrawler
SCrawler AAndyProgram Visual Basic .NET

🏳️‍🌈 Media downloader from any sites, including Twitter, Reddit, Instagram, BlueSky, TikTok, Threads, Facebook, OnlyFans, YouTube, Pinterest, PornHub...

2k
rendora
rendora rendora Go

Dynamic server-side rendering using headless Chrome

2k
dirhunt
dirhunt Nekmo Python

Find web directories without bruteforce

2k
DXY-COVID-19-Crawler
DXY-COVID-19-Crawler BlankerL Python

COVID-19/2019-nCoV real-time infection crawler and API

2k
lxSpider
lxSpider lixi5338619 Python

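A collection of crawler examples, including but not limited to Taobao, JD.com, Tmall, Douban, Douyin, Kuaishou, Weibo, WeChat, Alibaba, Toutiao, pdd, Youku, iQIYI, Ctrip, 12306, 58, Sohu, various indexes, VIP (Weipu), Wanfang, ...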

1.9k
ast-hook-for-js-RE
ast-hook-for-js-RE JSREI JavaScript

Browser memory roaming solution (exploration in progress...)

1.9k
article-extractor
article-extractor extractus JavaScript

Extracts the main article from a given URL with Node.js

1.9k
PSpider
PSpider xianhu Python

A simple, easy-to-use Python crawler framework. QQ discussion group: 597510560

1.8k
BT-btt
BT-btt u3c3

Introduction to the magnet-link site U3C3, plus domain name updates

1.8k
go_spider
go_spider hu17889 Go

[Crawler framework (golang)] An awesome Go concurrent crawler (spider) framework. The crawler is flexible and modular. It can be expanded to an Individualized c...

1.8k
x-crawl
x-crawl coder-hxl TypeScript

Flexible Node.js AI-assisted crawler library

1.8k
WaterCrawl
WaterCrawl watercrawl TypeScript

Transform Web Content into LLM-Ready Data

1.8k
NewPipeExtractor
NewPipeExtractor TeamNewPipe Java

NewPipe's core library for extracting data from streaming sites

1.8k
diskover-community
diskover-community diskoverdata PHP

Diskover Community Edition - Open source file indexer, file search engine and data management and analytics powered by Elasticsearch

1.8k
bilix
bilix HFrost0 Python

⚡️Lightning-fast async download tool for bilibili and more

1.8k
ruia
ruia howie6879 Python

Async Python 3.6+ web scraping micro-framework based on asyncio

1.7k
AutoCrawler
AutoCrawler YoongiKim Python

Google, Naver multiprocess image web crawler (Selenium)

1.7k