Python binding to Modest and Lexbor engines. Fast HTML5 parser with CSS selectors for Python.
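Python crawlers. Current collection: NetEase Cloud Music song scraping, Bilibili video scraping, Zhihu Q&A scraping, wallpaper scraping, xvideos video scraping, audiobook scraping, Weibo crawler, Anjuke listing scraping plus data visualization, Bili...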
Scrapoxy hides your scraper behind a cloud. It starts a pool of proxies to send your requests. Now, you can crawl without thinking about blacklisting!
The archivist's web crawler: WARC output, dashboard for all crawls, dynamic ignore patterns
Crawl a website and run it through Google Lighthouse
Videodl: A lightweight video downloader written in pure Python. (Prefers HD, watermark-free sources; supports Douyin, Kuaishou, Xiaohongshu, Bilibili, TikTok, YouTube, FIFA+...
ScopeSentry: cyberspace mapping, subdomain enumeration, port scanning, sensitive information discovery, vulnerability scanning, distributed nodes
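Douyin crawler: collects public data such as account home pages, likes, favorites, original sounds, topics, search results, collections, posts, followings, and followers.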
Elasticsearch File System Crawler (FS Crawler)
A web privacy measurement framework
A roundup of excellent reverse-engineering articles I have read, worth a look
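Xiaohongshu data collection and batch download tool for website images and video resources; a great-looking data collection tool (batch download, video extraction, images). Telegram: https://t.me/+ZtLSwuIKTo44MDY1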
🤖 Scrape data from HTML websites automatically by just providing examples
It makes a preview from a URL, grabbing information such as the title, relevant text, and images.
K哥爬虫 code sharing: JS reverse engineering and advanced crawling. Follow the WeChat official account: K哥爬虫
Lightweight Ruby web crawler/scraper with an elegant DSL which extracts structured data from pages.
CLI tool for saving a faithful copy of a complete web page in a single HTML file (based on SingleFile)
Collection of patches for puppeteer and playwright to avoid automation detection and leaks. Helps to avoid Cloudflare and DataDome CAPTCHA pages. Easy...
An ergonomic Python HTTP Client with TLS fingerprint
Python crawler for JD.com: automatic login and online flash purchasing of products
The Prime Cross Site Request Forgery (CSRF) Audit and Exploitation Toolkit.
The fastest dork scanner written in Go.
Beanbun is a multi-process web crawler framework written in PHP, open and highly extensible, built on Workerman.
📝 Quickly crawl the information (e.g. followers, tags, etc.) of an Instagram profile.
A collection of free Python crawler projects
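An easy-to-use Bilibili Manga downloader with a graphical interface; supports keyword search for manga and QR-code login, a workaround for downloading locked chapters, multi-threaded downloads, multiple save formats, local manga management, one-click update checks...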
An automated app traversal tool based on Appium
Massive SQL injection vulnerability scanner
🤖 Fake fingerprints to bypass anti-bot systems. Simulate mouse and keyboard operations so that behavior resembles a real person's.
👧 Crawler for model photo sets (part 2)
:beers: Bilibili video (including bangumi) and danmaku downloader
Easily download all the photos and videos from specified Tumblr blogs.
A simple Mihomo GUI: a lightweight Mihomo desktop client
Fess is a very powerful and easily deployable Enterprise Search Server.
Write web scrapers in Ruby using a clean, AI-assisted DSL. Kimurai uses AI to figure out where the data lives, then caches the selectors and scrapes w...
📰 Newspaper4k, a fork of the beloved Newspaper3k. Extraction of articles, titles, and metadata from news websites.
Crawly, a high-level web crawling & scraping framework for Elixir.
Run a high-fidelity browser-based web archiving crawler in a single Docker container
A tool for pixiv.net. A Pixiv crawler that anyone can use
A scalable, mature and versatile web crawler based on Apache Storm
✌️ Python3 BitTorrent DHT crawler
Google Play scraper for Python, inspired by <facundoolano/google-play-scraper>
A high performance web crawler / scraper in Elixir.
SpiderSuite releases, wiki and roadmap
Scalable Python web scraping scripts for 40+ popular domains
Novel download | novel scraping | Qidian | Biquge | export to Markdown | export to txt | convert to epub | ad filtering | automatic proofreading
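A crawler and push-notification program that helps security researchers get daily security digests. Sources currently crawled include Xianzhi Community (先知社区), Anquanke (安全客), Seebug Paper, Tiaotiaotang (跳跳糖), the Qi-Anxin Attack and Defense Community (奇安信攻防社区), Lengjiao Community (棱角社区), as well as NSFOCUS, Tencent Xuan...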
zhihu-crawler is a high-performance, horizontally scalable, distributed crawler project written in Java, with support for free HTTP proxy pools
A multi-threaded crawler framework with many built-in image crawlers.