Topic

crawler

Repositories (1431)

weibo_wordcloud
weibo_wordcloud gaussic Python

Scrape Weibo data by keyword, then generate a word cloud

221
google-group-crawler
google-group-crawler icy Shell

[Deprecated] Get (almost) original messages from google group archives. Your data is yours.

220
91porn-crawler
91porn-crawler blue-troy Java

91 porn crawler. Automatically crawl and download your "favorite" 91porn hot videos.

220
MetaFinder
MetaFinder Josue87 Python

Search for documents in a domain through search engines (Google, Bing and Baidu). The objective is to extract metadata.

219
darc
darc JarryShaw Python

Darkweb Crawler Project

216
scrapedin-linkedin-crawler
scrapedin-linkedin-crawler linkedtales JavaScript

Crawler for LinkedIn full profiles 2019

214
proxyhub
proxyhub ForceFledgling Python

An advanced [Finder | Checker | Server] tool for proxy servers, supporting both HTTP(S) and SOCKS protocols. 🎭

212
goribot
goribot zhshch2002 Go

[Crawler/Scraper for Golang] 🕷 A lightweight, distributed-friendly Golang crawler framework.

211
JavPy
JavPy TheodoreKrypton JavaScript

Enjoy driving on a Javascriptive (originally Pythonic) way to Japanese AV!

210
RustySEO
RustySEO mascanho TypeScript

SEO/GEO toolkit to analyse, crawl, parse and optimise websites & logs (Nginx & Apache)

209
FindJobs-Agent
FindJobs-Agent he-yufeng Python

LLM-powered toolkit for skill analysis, AI interviews, resume scoring, and job structuring. Automates professional skill taxonomy and interview proces...

204
NoSmoke
NoSmoke macacajs JavaScript

A cross-platform UI crawler that scans view trees, then generates and executes UI test cases.

203
WeChat-Channels-Video-File-Decryption
WeChat-Channels-Video-File-Decryption Evil0ctal WebAssembly

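An online decryption tool and API service for encrypted WeChat Channels videos, built on reverse-engineering analysis. This project uses WeChat's official WebAssembly (WASM) module to generate the Isaac64 PRNG keystream,...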

203
ChainWalker
ChainWalker 0xsha Go

Rapid Smart Contract Crawler

203
laosj
laosj songtianyi Go

Lightweight Golang image crawler

202
instagram-crawler
instagram-crawler mgleon08 Ruby

Crawl instagram photos, posts and videos for download.

202
VideoServer
VideoServer GF-Allen JavaScript

A video resource backend built with Node.js, based on Express and a crawler

200
crawlers
crawlers Norconex Java

Norconex Crawlers (or spiders) are flexible web and filesystem crawlers for collecting, parsing, and manipulating data from the web or filesystem to v...

200
packagist-mirror
packagist-mirror webysther PHP

📦✂️📋📦 Create a mirror of packagist.org metadata for use locally with composer

200
search
search subins2000 PHP

An Open Source Search Engine

198
authority-data
authority-data yiliyassh Python

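Official authoritative data (Chinese): statistical yearbooks, statistical bulletins, internet industry reports, Ministry of Industry and Information Technology (MIIT) data, ICT reports, and more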

198
scrapemate
scrapemate gosom Go

Golang Crawling and scraping framework

197
WebScrapper
WebScrapper nuhmanpk Python

Powerful Telegram bot for web scraping and crawling. Fast, easy, and loved by thousands!

195
gflare-tk
gflare-tk beb7 Python

Open-Source Python Based SEO Web Crawler

195
zhihu-crawler-people
zhihu-crawler-people elliotxx Python

A simple distributed crawler for Zhihu, plus data analysis

194
NewsCrawler
NewsCrawler Jacen789 Python

News crawler that scrapes real-time financial news from Sina, Sohu, and Xinhuanet.

194
slrp
slrp nfx Go

rotating open proxy multiplexer

194
web-bee
web-bee codesofun Java

🐝 Web vertical crawler framework for fun

193
cocrawler
cocrawler cocrawler Python

CoCrawler is a versatile web crawler built using modern tools and concurrency.

193
SpideyX
SpideyX RevoltSecurities Python

SpideyX is a multipurpose web penetration testing tool with asynchronous, concurrent performance and multiple modes and configurations.

192
awesome-python-primer
awesome-python-primer zkqiang Python

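An index of high-quality Chinese-language resources for self-taught Python beginners, including books / docs / videos, suited to crawling / web / data analysis / machine learning tracks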

192
digger
digger hetianyi Go

Digger is a powerful and flexible web crawler implemented in pure Golang

191
pkulaw_spider
pkulaw_spider yinhao0214 Python

Crawls the PKULaw (北大法宝) site http://www.pkulaw.cn/Case/

190
ghs
ghs seart-group Java

GitHub Search: Platform used to crawl, store and present projects from GitHub, as well as any statistics related to them

189
zhihu_fun
zhihu_fun AnyISalIn JavaScript

A Selenium-based Zhihu keyword crawler

186
kuaishou-crawler
kuaishou-crawler oGsLP Python

As you can see, a kuaishou crawler

186
leetcode-spider
leetcode-spider Ma63d JavaScript

Crawl your own LeetCode solution source code with Node.js

185
crawler-for-github-trending
crawler-for-github-trending poozhu JavaScript

🕷️ A node crawler for github trending.

185
nCov2019_data_crawler
nCov2019_data_crawler LiuTianyong Python

Epidemic data crawler: a data repository for the 2019 novel coronavirus, with trajectory data, shared-travel data, and news reports

184
sensitivefilescan
sensitivefilescan aipengjie Python
183
datmusic-api
datmusic-api alashow PHP
182
DotnetCrawler
DotnetCrawler mehmetozkaya C#

DotnetCrawler is a straightforward, lightweight web crawling/scraping library for Entity Framework Core output based on dotnet core. This library des...

181
Web-Data-Scraper
Web-Data-Scraper umbrellaDocumentation JavaScript

Web Data Scraper - no-code internet scraping. Extract and export to CSV, Excel, JSON, Google Sheets, and Webhook.

180
evine
evine saeeddhqan Go

Interactive CLI Web Crawler

179
awesome-ai-reverse
awesome-ai-reverse darbra

AI reverse engineering, all in one go

178
ungoliant
ungoliant oscar-project Rust

🕷 The pipeline for the OSCAR corpus

177
telegram-groups-crawler
telegram-groups-crawler edogab33 Python

A Telegram crawler made in Python to automatically search groups and channels and collect any type of data from them.

176
ScrapingOutsourcing
ScrapingOutsourcing bytebuff Julia

ScrapingOutsourcing focuses on sharing crawler code, aiming to post a new one each week

175
Squidwarc
Squidwarc N0taN3rd JavaScript

Squidwarc is a high fidelity, user scriptable, archival crawler that uses Chrome or Chromium with or without a head

175
gogetcrawl
gogetcrawl karust Go

Extract web archive data using Wayback Machine and Common Crawl

175