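Crawler for original-resolution images from the desktop version of the 动漫之家 manga site
A configurable tool for downloading novels and generating e-books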
hproxy - Asynchronous IP proxy pool that aims to make getting proxies as convenient as possible (an asynchronous proxy pool for crawlers).
News extraction and scraping. Article Parsing
A PasteBin scraper that doesn't rely on the PasteBin scrape API
Douban movie crawler: crawls movie details and short comments, saves them to a MySQL database, and also includes sentiment analysis based on...
A crawler for scraping posts from medium.com
Automatically download all PDF files of search results and their patent families found on Google Patents.
A command line tool based on the crypto-crawler library.
A DHT crawler based on goroutines
Python project to crawl and scrape the lesser-known deep web, also called the dark web. Just provide the onion link and get started.
An infinite Pinterest crawler/scraper. Crawl images with infinite scroll!
Google Maps crawler using Selenium. All extracted data is forwarded to an SQS queue.
All In One, Fast, Easy Recon Tool
Shadowsocks. For bypassing internet restrictions, for learning purposes only. The servers are free, so the connection may be unstable.
Crawl QR codes from search engines and look for Bitcoin private keys
Vietnamese text data crawler scripts for various sites (including YouTube, Facebook, forums, news, ...)
Python requests + Django + Node.js Koa + MySQL to crawl Eastmoney fund and stock data for data analysis and visualization.
🍰 A visual crawler management platform
Use Zhihu elegantly
(deprecated) :cat: koshort is a Python package for Korean internet spoken language crawling and processing... or maybe Korean domestic cat.
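Distributed crawler for Baidu Tieba, used for Tieba data mining. Analyzes the data along both the forum dimension and the user dimension
sciBASIC# is a dialect derived from the native VB.NET language, written for data scientists.
Web Data Collection Techniques: Java Web Crawlers (complete code for the book manuscript, covering a wide range of web crawler techniques and concepts)
Research and study of various blocking techniques: anti-crawler measures, ad blocking, ad-injection prevention, fighting ticket scalpers, and more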
Cross-platform persistent and distributed web crawler :crab:
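Crawler management system with cluster support and elastic scaling. Supports running various frameworks and scripts such as feapder, scrapy, selenium, and playwright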
PHP Metacritic API - Mirror from my GitLab
Iota is a web scraper which can find all of the images and links/suburls on a webpage
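Just a simple web crawler that returns crawled links as an IObservable using Reactive Extensions and async/await.
A from-scratch implementation of scheduled Zhihu crawling that mines valuable information from the results and visualizes the crawled data on a web page.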
🌌 High productivity semi-automatic crawler generator 🛠️🧰
A document viewer; fuzzy match incremental search.
Example ticket-grabbing script for Damai (大麦网)
An intelligent web service to automatically detect web content and extract information from it.
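Everyday code: crawlers, small GUI tools, and more
An online web crawler project based on Node.js, superagent, and cheerio, with support for generating APIs
Crawler project from "An In-Depth Explanation of the Go Language by a Senior Google Engineer".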
ProxyCrawl Python library for scraping and crawling
Screen scraping and web crawling framework
Grabs current REWE discounts and saves them as a list in a Markdown file
A series of distributed components for Scrapy. Including RabbitMQ-based components, Kafka-based components, and RedisBloom-based components for Scrapy...
Copy of http://phpcrawl.cuab.de/ for use with Composer
Easy-to-follow examples in Python, Node.js, and C# for web automation & multi-accounting with Kameleo anti-detect browser.
Get the lyrics for the song currently playing on Spotify
A crawler for the IPFS network, code for our paper (https://arxiv.org/abs/2002.07747). Also holds scripts to evaluate the obtained data and make simil...
A SoFIFA webcrawler and Machine Learning prediction
TumblTwo, an Improved Fork of TumblOne, a Tumblr Downloader.
Python script to download SlideShare slides as PDF. The script downloads the slides and converts them into a PDF automatically.
Scrape public Facebook pages, posts, reviews and comments