A tool for scraping and analyzing X (Twitter) user data and tweets.
A super lightweight Baidu Images crawler
A Tumblr Blog Backup Application
ChatWeb can crawl web pages, read PDF, DOCX, TXT, and extract the main content, then answer your questions based on the content, or summarize the key...
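Golang short-video watermark remover: Douyin (抖音), Pipixia (皮皮虾), Huoshan (火山), Weishi (微视), Zuiyou (最右), Kuaishou (快手), 全民小视频, 皮皮搞笑, Xigua Video (西瓜视频), Huya (虎牙), Pear Video (梨视频), AcFun, Haokan Video (好看视频)...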
HTTP API for Scrapy spiders
A Kotlin-based testing/scraping/parsing library providing the ability to analyze and extract data from HTML (server & client-side rendered). It places...
Crawl BookCorpus
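Crawler reverse-engineering case studies. Completed: TLS fingerprinting | Ruishu (瑞数) | ZKH (震坤行) | NetEase Yidun (网易易盾) | WeChat mini-program decompilation and reverse engineering (百达星系) | Tonghuashun (同花顺) | RPC decryption | Jiasule (加速乐) | GeeTest slider CAPTCHA | 巨量算数 | Boss...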
ArrowDL (Arrow Downloader) is a download manager for Windows, macOS, and Linux
A toolbox for analyzing Chinese social media (Weibo) comments. Features: 1. Weibo comment data crawling; 2. word segmentation and keyword extraction; 3. word clouds and word-frequency statistics; 4. 情...
A versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to...
Simple but useful Python web scraping tutorial code.
DataHen Till is a companion tool to your existing web scraper that instantly makes it scalable, maintainable, and more unblockable, with minimal code...
🎓 MOOC course downloader for 中国大学MOOC (icourse163), 学堂在线 (XuetangX), 网易云课堂 (NetEase Cloud Classroom), 好大学在线 (CNMOOC), and 爱课程 (iCourse).
Java API for Chrome and Firefox
[Unmaintained] A simple and clean video/music/image downloader 👾
🛑 An image collector that supports custom acquisition source configuration and runs on macOS and Windows.
A simple and flexible web crawler that follows the robots.txt policies and crawl delays.
Doujinshi (绅士漫画) downloader
SEO & Security Audit for Websites. Lighthouse & Security Headers crawler, Sitemap/Keywords/Images Extractor, Summarizer, etc ...
:paw_prints: Creeper - The Next Generation Crawler Framework (Go)
An ergonomic Rust HTTP Client with TLS fingerprint
Crawler (Bot) searching for credential leaks on paste sites.
A JS cookie reverse-engineering tool: visual monitoring of JS cookie changes, plus conditional breakpoints via JS cookie hooks
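BaiduSpider, a crawler for Baidu search results. Currently supports Baidu web search, Baidu image search, Baidu Zhidao search, Baidu video search, Baidu news search, Baidu Wenku search, Baidu Jingyan search, and 百...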
Code repository for the book 《爬虫逆向进阶实战》 (Advanced Practical Crawler Reverse Engineering)
A lightweight web crawler framework for Java.
🕵️‍♂️ LinkedIn profile scraper returning structured profile data in JSON.
:newspaper: Let ChatGPT Summarize Hacker News for You
SiteOne Crawler is a cross-platform website crawler and analyzer for SEO, security, accessibility, and performance optimization—ideal for developers,...
The best PTT library
A Tumblr and Twitter Blog Backup Application
Wscan is a web security scanner that focuses on web security, dedicated to making web security accessible to everyone.
Open source SEO audit tool.
A Facebook crawler
A search application to explore, discover and share online files
Search Google, Bing, Yahoo, and other search engines with Python
OSINT Swiss Army Knife
A collection of websites offering free SOCKS/HTTPS/HTTP proxies
Official repository for "Craw4LLM: Efficient Web Crawling for LLM Pretraining"
A DouYin API for humans, used to crawl popular videos and music
NetDiscovery is a general-purpose crawler framework/middleware built on Vert.x, RxJava 2, and other frameworks.
Basic Python practice code and assorted crawler code
High-performance asynchronous Douyin(抖音) TikTok Xiaohongshu(小红书) Kuaishou(快手) Weibo(微博) Instagram YouTube(油管) Twitter(X) Captcha Solver(验...
Locally saves webpages to your hard disk with images, css, js & links as is.
Crawls the 菜鸟教程 (Runoob) tutorial site and converts it to PDF (python_crawer_by_chrome)
A look at the current market for Golang
What do people have in their dotfiles?
Jie stands out as a comprehensive security assessment and exploitation tool meticulously crafted for web applications. Its robust suite of features en...