Topic

crawler

Repositories (1232)

RED_HAWK
RED_HAWK Tuhinshubhra PHP

All-in-one tool for information gathering, vulnerability scanning, and crawling. A must-have tool for all penetration testers.

2.5k
gecco
gecco xtuhcy Java

Easy-to-use lightweight web crawler

2.5k
instagram-scraper
instagram-scraper realsirjoe Python

Scrapes media, likes, followers, tags, and all metadata. Inspired by instagram-php-scraper, bot

2.5k
Python3-Spider
Python3-Spider wkunzhi Python

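Python crawlers in practice: simulated login to major websites, including but not limited to slider CAPTCHA verification, Pinduoduo, Meituan, Baidu, bilibili, Dianping, and Taobao. If you like it, please star ❤️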

2.5k
weibo-crawler
weibo-crawler dataabc Python

Sina Weibo crawler: uses Python to crawl Sina Weibo data and download Weibo images and videos

2.5k
lianjia-beike-spider
lianjia-beike-spider jumper2014 Python

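Housing price crawler for Lianjia and Beike. Collects housing price data (residential communities, second-hand homes, rentals, new homes) for 21 major Chinese cities, including Beijing, Shanghai, Guangzhou, and Shenzhen. Stable, reliable, and fast! Supports CSV, MySQL, MongoDB, Excel, js...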

2.5k
work_crawler
work_crawler kanasimi JavaScript

Download tool for comics and novels: 腾讯漫画, 大角虫漫画, 有妖气, 咪咕, SF漫画, 哦漫画, 看漫画, 漫画柜, 汗汗酷漫, 動漫...

2.5k
grab
grab lorien Python

Web Scraping Framework

2.4k
crawler
crawler spatie PHP

An easy-to-use, powerful crawler implemented in PHP. Can execute JavaScript.

2.4k
news-please
news-please fhamborg Python

news-please - an integrated web crawler and information extractor for news that just works

2.3k
abot
abot sjdirect C#

Cross Platform C# web crawler framework built for speed and flexibility. Please star this project! +1.

2.3k
gain
gain gaojiuli Python

Web crawling framework based on asyncio.

2k
skycaiji
skycaiji zorlan PHP

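蓝天采集器 (skycaiji) is a free, open-source crawler system. Data can be collected simply by pointing and clicking to edit rules. It runs locally, on virtual hosts, or on cloud servers, can collect almost any type of web page, and integrates seamlessly with all kinds of CMS site-building...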

2k
gocrawl
gocrawl PuerkitoBio Go

Polite, slim and concurrent web crawler.

2k
DXY-COVID-19-Crawler
DXY-COVID-19-Crawler BlankerL Python

COVID-19/2019-nCoV Realtime Infection Crawler and API

2k
rendora
rendora rendora Go

Dynamic server-side rendering using headless Chrome

2k
vulnx
vulnx anouarbensaad Python

vulnx 🕷️ is an intelligent bot and shell that can perform automatic injection and help researchers detect security vulnerabilities in CMS systems. It can perform a...

1.9k
dirhunt
dirhunt Nekmo Python

Find web directories without bruteforce

1.9k
feapder
feapder Boris-code Python

🚀🚀🚀 feapder is an easy-to-use, powerful Python crawler framework with built-in AirSpider, Spider, TaskSpider, Batc...

1.9k
spider
spider spider-rs Rust

Web crawler and scraper for Rust

1.9k
lxSpider
lxSpider lixi5338619 Python

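A collection of crawler examples, including but not limited to Taobao, JD.com, Tmall, Douban, Douyin, Kuaishou, Weibo, WeChat, Alibaba, Toutiao, pdd, Youku, iQIYI, Ctrip, 12306, 58, Sohu, various indexes, 维普, 万方, ...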

1.8k
go_spider
go_spider hu17889 Go

[Crawler framework (golang)] An awesome Go concurrent crawler (spider) framework. The crawler is flexible and modular. It can be expanded to an individualized c...

1.8k
FinalRecon
FinalRecon thewhiteh4t Python

The Last Web Recon Tool You'll Need

1.8k
Crawler-Detect
Crawler-Detect JayBizzle PHP

🕷 CrawlerDetect is a PHP class for detecting bots/crawlers/spiders via the user agent

1.8k
PSpider
PSpider xianhu Python

A simple, easy-to-use Python crawler framework. QQ discussion group: 597510560

1.8k
bilix
bilix HFrost0 Python

⚡️Lightning-fast async download tool for bilibili and more

1.8k
xalpha
xalpha refraction-ray Python

Fund investment management and backtesting engine

1.8k
SCrawler
SCrawler AAndyProgram Visual Basic .NET

🏳️‍🌈 Media downloader for any site, including Twitter, Reddit, Instagram, BlueSky, TikTok, Threads, Facebook, OnlyFans, YouTube, Pinterest, PornHub...

1.7k
x-crawl
x-crawl coder-hxl TypeScript

Flexible Node.js AI-assisted crawler library

1.7k
ruia
ruia howie6879 Python

Async Python 3.6+ web scraping micro-framework based on asyncio

1.7k
AutoCrawler
AutoCrawler YoongiKim Python

Google, Naver multiprocess image web crawler (Selenium)

1.7k
diskover-community
diskover-community diskoverdata PHP

Diskover Community Edition - Open source file indexer, file search engine and data management and analytics powered by Elasticsearch

1.6k
CatVodTVSpider
CatVodTVSpider CatVodTVOfficial Java
1.6k
NewPipeExtractor
NewPipeExtractor TeamNewPipe Java

NewPipe's core library for extracting data from streaming sites

1.6k
scrapoxy
scrapoxy fabienvauchelles JavaScript

Scrapoxy hides your scraper behind a cloud. It starts a pool of proxies to send your requests. Now, you can crawl without thinking about blacklisting!

1.6k
lightcrawler
lightcrawler github JavaScript

Crawl a website and run it through Google Lighthouse

1.5k
goclone
goclone imthaghost Go

Website Cloner - Utilizes powerful goroutines to clone websites to your computer within seconds.

1.5k
fscrawler
fscrawler dadoonet Java

Elasticsearch File System Crawler (FS Crawler)

1.4k
SwiftLinkPreview
SwiftLinkPreview LeonardoCardoso Swift

It makes a preview from a URL, grabbing all the information such as title, relevant texts, and images.

1.4k
mlscraper
mlscraper lorey Python

🤖 Scrape data from HTML websites automatically by just providing examples

1.4k
wombat
wombat felipecsl Ruby

Lightweight Ruby web crawler/scraper with an elegant DSL which extracts structured data from pages.

1.3k
OpenWPM
OpenWPM openwpm Python

A web privacy measurement framework

1.3k
jd-autobuy
jd-autobuy adyzng Python

Python crawler for JD.com: automatic login and online flash purchasing of products

1.3k
go-dork
go-dork dwisiswant0 Go

The fastest dork scanner written in Go.

1.2k
fakebrowser
fakebrowser kkoooqq JavaScript

🤖 Fake fingerprints to bypass anti-bot systems. Simulates mouse and keyboard operations to behave like a real person.

1.2k
AppCrawler
AppCrawler seveniruby Scala

An Appium-based tool for automatic app traversal

1.2k
Beanbun
Beanbun kiddyuchina PHP

Beanbun is a multi-process web crawler framework written in PHP and based on Workerman, with good openness and high extensibility.

1.2k
bilili
bilili yutto-dev Python

🍻 bilibili video (including bangumi) and danmaku downloader

1.2k
tumblr-crawler
tumblr-crawler dixudx Python

Easily download all the photos/videos from specified Tumblr blogs.

1.1k
fess
fess codelibs Java

Fess is a very powerful and easily deployable enterprise search server.

1.1k