A simple web crawler that returns crawled links as an IObservable, using Reactive Extensions and async/await.
An online web crawler project based on Node.js, superagent, and cheerio, with support for generating APIs
A search engine for Open Data
🌌 High productivity semi-automatic crawler generator 🛠️🧰
A series of distributed components for Scrapy. Including RabbitMQ-based components, Kafka-based components, and RedisBloom-based components for Scrapy...
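A crawler built on the Tampermonkey extension platform. Its main goal is to simulate the user environment as closely as possible to avoid detection by anti-crawler systems.
Downloads news articles from Google News and uses pre-trained NLP models to perform sentiment analysis
Afdian (afdian.com) crawler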
Screen scraping and web crawling framework
Everyday code: crawlers, small GUI tools, and so on
Get the lyrics for the song currently playing on Spotify
Crawl a website and take screenshots
Unfx Proxy Parser - Next-gen proxy parser with a deep link crawler. Follows internal and third-party links. Sorts results by country.
Iota is a web scraper that can find all of the images and links/suburls on a webpage
A simple, open-source, easy to use, and free download manager for malware samples.
Spider ported to Node.js
Copy of http://phpcrawl.cuab.de/ for use with Composer
A project built on the go-gin framework to reduce redundant actions, such as downloading some tools
ProxyCrawl Python library for scraping and crawling
Read an Amazon wishlist programmatically with Python
Fetch rolling news
Crawl all your citations from Google Scholar
Libraries and scripts for crawling the TYPO3 page tree. Used for re-caching, re-indexing, publishing applications etc.
WeChat Official Account crawler that provides article retrieval through an API, including read counts, likes, and more
Python script that searches GitHub, F-Droid and IzzySoft's F-Droid repo for apps with Shizuku support. Updated daily.
Fast, lightweight Firecrawl alternative in Rust. Web scraper, crawler & search API with MCP server for AI agents. Drop-in Firecrawl-compatible API (/v...
talospider - A simple, lightweight scraping micro-framework
[Updated] A simple Python crawler for my tutorial blog at http://www.jianshu.com/p/8fb5bc33c78e
Scrape data from Google.com, Bing.com, Baidu.com, Ask.com, Yahoo.com, Yandex.com
🎧 Get the Billboard Hot 100 chart as JSON
Repo Python
Free Facebook pages MetaData Scraping Library - Unlimited Calls
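Tapestry - A lightweight bookmark knowledge base built on Agent Skill Bundle
Visual crawler (supports Bilibili | Douyin | Xiaohongshu | Tieba | Weibo | Zhihu | Kuaishou): an integrated front-end and back-end project for collecting media data from mainstream Chinese platforms asynchronously, efficiently, and intuitively (Based on "Medi...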
Crawl Instagram hashtags
Node.js price monitoring library, leveraging the power of x-ray and nightmare.
Scrape public Facebook pages, posts, reviews and comments
gRPC web crawler turbocharged for performance
ClaudeChrome - Native browser context awareness for agents.
An image crawler developed using RxJava2 and Java 8 features
My crawler collection
A web search engine built with Python which uses TF-IDF and PageRank to sort search results.
Simple crawler for Korean banks, including transactions
NewsCrap is a Command Line Interface (CLI) based Google News scraping tool designed for research, investigation, and OSINT data collection....
Kal El Network Stress Test and Penetration Testing Toolkit
Shopee Coin Getter is a script that collects daily Shopee coins.
CLI to download all images/webms in a 4chan thread
RARBG command line interface for scraping the rarbg.to torrent search engine
Kabegame — An anime image crawler client with pluggable crawlers (from a GitHub plugin repo), wallpaper rotation by custom rules, and Wallpaper Engine...
Crawl a website to generate a knowledge file for RAG