🔥 The Web Data API for AI - Power AI agents with clean web data
Scrapy, a fast high-level web crawling & scraping framework for Python.
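A visual no-code/code-free web crawler/spider. 易采集: a visual browser automation testing, data collection, and crawler tool that lets you design and run crawling tasks graphically without writing code. Alias: Service...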
🕷️ An adaptive Web Scraping framework that handles everything from a single request to a full-scale crawl!
👾 Fast and simple video download library and CLI tool written in Go
Elegant Scraper and Crawler Framework for Golang
Python scraper based on AI
Python ProxyPool for web spider
Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs,...
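🚀 Douyin_TikTok_Download_API is an out-of-the-box, high-performance asynchronous data scraping tool for Douyin, Kuaishou, TikTok, and Bilibili, supporting API calls and online batch parsing and downloading.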
A Powerful Spider (Web Crawler) System in Python.
A next-generation crawling and spidering framework.
🔥 The open-source no-code platform for web scraping, crawling, search and AI data extraction • Turn websites into structured APIs in minutes 🔥
newspaper3k is a news, full-text, and article metadata extraction library for Python 3. Advanced docs:
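Some very interesting Python crawler examples, fairly beginner-friendly, mainly crawling sites such as Taobao, Tmall, WeChat, WeChat Read, Douban, and QQ. (Some interesting examples of python crawlers that are...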
Incredibly fast crawler designed for OSINT.
Distributed web crawler admin platform for managing spiders, regardless of language or framework.
A scalable web crawler framework for Java.
A next-generation crawler platform that defines crawling workflows graphically, so crawlers can be built without writing code.
Python scripts: simulated Zhihu login, crawlers, Excel operations, WeChat official accounts, remote power-on.
AV movie management system; avmoo, javbus, javlibrary crawlers; online AV video library; AV magnet link database; Japanese Adult Video Library, Adult Video Magnet Links - Jap...
Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, P...
A team of “cloud workhorses” that makes money for you online 24/7
List of libraries, tools and APIs for web scraping and data processing.
A collection of awesome web crawlers and spiders in different languages
A Smart, Automatic, Fast and Lightweight Web Scraper for Python
Web Crawler/Spider for NodeJS + server-side jQuery ;-)
Pydoll is a library for automating chromium-based browsers without a WebDriver, offering realistic interactions.
A WeChat official account crawler interface based on Sogou WeChat search
[Crawler for Golang] Pholcus is a distributed, high-concurrency, and powerful web crawler.
Declarative web scraping
Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML
Distributed crawler powered by Headless Chrome
Redis-based components for Scrapy.
Python API for JMComic | Provides a Python API for accessing JMComic (禁漫天堂), supporting both the web and mobile clients | JMComic GitHub Actions downloader 🚀
💖 Highly available distributed IP proxy pool, powered by Scrapy and Redis
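Hands-on 🐍 crawlers 🕷 for a variety of websites and e-commerce data. Includes 🕸: Taobao products, WeChat official accounts, Dianping, Qichacha, job sites, Xianyu, 阿里任务, cnblogs, Weibo, Baidu Tieba, Douban Movies, 包图网, 全景...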
Analysis of Bot Protection systems with available countermeasures 🚿. How to defeat anti-bot systems 👻 and get around browser fingerprinting scripts 🕵...
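A collection of legal cases in China involving web crawlers. This project compiles news, materials, and laws and regulations related to lawsuits and violations involving crawler developers in mainland China. It aims to help those working in mainland China...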
Sina Weibo crawler: uses Python to crawl Sina Weibo data and download Weibo images and videos
A community-driven way to read and chat with AI bots - powered by ChatGPT.
DotnetSpider, a .NET Standard web crawling library. It is a lightweight, efficient, and fast high-level web crawling & scraping framework
Proxy [Finder | Checker | Server]. HTTP(S) & SOCKS 🎭
Eases DOM navigation for HTML and XML documents
Intelligent proxy pool for Humans™ to extract content from the internet and build your own Large Language Models in this new AI era
Web Application Security Scanner Framework
Automatically crawls proxy nodes on the public internet, de-duplicates and tests for usability and then provides a list of nodes
Download comics and novels (novel and comic download tool): 腾讯漫画, 大角虫漫画, 有妖气, 咪咕, SF漫画, 哦漫画, 看漫画, 漫画柜, 汗汗酷漫, 動漫...
Dark Web OSINT Tool
Headless Chrome .NET API