Topic

crawler

Repositories (1232)

leetcode-spider
leetcode-spider Ma63d JavaScript

Use Node.js to crawl the source code of your own LeetCode solutions

185
zhihu-crawler-people
zhihu-crawler-people elliotxx Python

A simple distributed crawler for zhihu && data analysis

184
sensitivefilescan
sensitivefilescan aipengjie Python
184
datmusic-api
datmusic-api alashow PHP
182
Crawler-for-Github-Trending
Crawler-for-Github-Trending poozhu JavaScript

🕷️ A node crawler for github trending.

180
xvideos
xvideos rodrigogs JavaScript

xvideos API library

180
packagist-mirror
packagist-mirror webysther PHP

📦✂️📋📦 Create a mirror of packagist.org metadata for use locally with composer

180
rotating-tor-http-proxy
rotating-tor-http-proxy zhaow-de Shell

A multi-arch image provides one HTTP proxy endpoint with many concurrent tunnels to the Tor network.

178
nCov2019_data_crawler
nCov2019_data_crawler LiuTianyong Python

Epidemic data crawler: a 2019 novel coronavirus data repository with trajectory data, shared-travel data, and news reports

177
DotnetCrawler
DotnetCrawler mehmetozkaya C#

DotnetCrawler is a straightforward, lightweight web crawling/scraping library for Entity Framework Core output based on dotnet core. This library des...

176
ir
ir guilhermecgs Python

Project for automatically calculating income tax (Imposto de Renda) on Bovespa trades. Tags: canal eletronico do investidor, CEI, selenium, bovespa, IRPF, IR,...

175
spoon
spoon Jiramew Python

🥄 A package for building specific Proxy Pool for different Sites.

174
authority-data
authority-data yiliyassh Python

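Official authoritative data (Chinese): statistical yearbooks, statistical bulletins, internet industry reports, Ministry of Industry and Information Technology (MIIT) data, ICT reports, and more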

173
kuaishou-crawler
kuaishou-crawler oGsLP Python

As you can see, a kuaishou crawler

172
search
search subins2000 PHP

An Open Source Search Engine

172
ChainWalker
ChainWalker 0xsha Go

Rapid Smart Contract Crawler

171
Squidwarc
Squidwarc N0taN3rd JavaScript

Squidwarc is a high fidelity, user scriptable, archival crawler that uses Chrome or Chromium with or without a head

170
cocrawler
cocrawler cocrawler Python

CoCrawler is a versatile web crawler built using modern tools and concurrency.

169
ScrapingOutsourcing
ScrapingOutsourcing ScrapingBoot Julia

ScrapingOutsourcing focuses on sharing crawler code, aiming to add one new crawler each week

168
ghs
ghs seart-group Java

GitHub Search: Platform used to crawl, store and present projects from GitHub, as well as any statistics related to them

168
python-dcdownloader
python-dcdownloader dev-techmoe Python

A fully asynchronous batch comic downloader (crawler) for dmzj (动漫之家), written in Python

165
yispider
yispider 2young2simple Go

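A distributed crawler platform that helps you manage and develop crawlers. It ships with a set of crawler definition rules (templates): use a template to define a crawler quickly, or treat the platform as a framework and develop crawlers by hand. (A hobby project, ...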

163
mm131
mm131 qwertyuiop6 Python

Image crawler for the MM131 website :rotating_light:

163
crypto-crawler-rs
crypto-crawler-rs crypto-crawler Rust

A rock-solid cryptocurrency crawler library.

163
fun_crawler
fun_crawler ZhangBohan Python

Crawl some pictures for fun

162
HttpCode.Core
HttpCode.Core stulzq C#

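Simple, easy to use, and efficient: an opinionated open-source .NET HTTP request framework! Useful for building crawlers, making API requests, and more.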

159
crawler-china-mainland-universities
crawler-china-mainland-universities codeudan JavaScript

Crawler for the list of universities in mainland China

159
TorCrawl.py
TorCrawl.py MikeMeliz Python

Crawl and extract (regular or onion) webpages through the Tor network

159
soksaccounts
soksaccounts chenjiandongx Python

🔥 Shadowsocks account crawler

157
WebScrapper
WebScrapper nuhmanpk Python

Powerful Telegram bot for web scraping and crawling. Fast, easy, and loved by thousands!

156
DouyuBarrage-Pro
DouyuBarrage-Pro Crawler995 TypeScript

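(Latest for 2020) Second version of the Douyu danmaku (bullet comment) capture and visual management platform, offering danmaku capture, real-time visualization of danmaku send rate, capture history lookup, danmaku download, custom keyword statistics, loyal-fan statistics, automatic highlight-moment...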

156
evine
evine saeeddhqan Go

Interactive CLI Web Crawler

154
NLP-Twitter
NLP-Twitter h4m5t Python

Twitter crawler

154
ngMeta
ngMeta vinaygopinath JavaScript

Dynamic meta tags in your AngularJS single page application

153
tir
tir pouriya Python

Have time.ir in shell!

153
onecomic
onecomic hardwarecode Python

A comic book

152
urlbuster
urlbuster cytopia Python

Powerful mutable web directory fuzzer to bruteforce existing and/or hidden files or directories.

151
crawler
crawler trandoshan-io Go

Go process used to crawl websites

150
pkulaw_spider
pkulaw_spider FanhuaandLuomu Python

Crawls the PKULaw site (北大法宝) at http://www.pkulaw.cn/Case/

149
bilibili_member_crawler
bilibili_member_crawler cwjokaka Python

Bilibili user crawler. Yay~ it's a crawler

147
pachong
pachong jin10086 Jupyter Notebook

Some crawler code

146
jlitespider
jlitespider luohaha Java

A lite distributed Java spider framework :-)

145
pylinkvalidator
pylinkvalidator bartdag Python

pylinkvalidator is a standalone and pure python link validator and crawler that traverses a web site and reports errors (e.g., 500 and 404 errors) enc...

145
NewsCrawler
NewsCrawler Jacen789 Python

News crawler that scrapes real-time financial news from Sina, Sohu, and Xinhuanet.

144
KTSpeechCrawler
KTSpeechCrawler EgorLakomkin Python

Automatically constructing corpus for automatic speech recognition from YouTube videos

143
crawley
crawley s0rg Go

The unix-way web crawler

143
courlan
courlan adbar Python

Clean, filter and sample URLs to optimize data collection – Python & command-line – Deduplication, spam, content and language filters

142
npm-search
npm-search algolia TypeScript

🗿 npm ↔️ Algolia replication tool :skier: :snail: :artificial_satellite:

142
pixiv_func_mobile
pixiv_func_mobile git-xiaocao Dart

A full-featured third-party Pixiv client. No proxy required; supports viewing animated images and reading novels.

142
fontObfuscator
fontObfuscator solarhell Python

Font obfuscation service

141