Most popular crawler repositories and open source projects

Github-spider

Crawler for analyzing GitHub repositories and users

88   241   241  

ComicCrawler

An image crawler written in Python.

46   238   238  

4chan-downloader

Python3 script to continuously download all images/webms of multiple 4...

32   237   237  

wscan

An open-source security assessment tool that supports scanning for common web security issues and custom POCs. In addition,...

27   236   236  

woid

Simple news aggregator displaying top stories in real time

122   235   235  

RuiJi.Net

crawler framework, distributed crawler extractor

45   234   234  

QQMusicSpider

Scrapy-based QQ Music crawler (QQ Music Spider) that scrapes song information, lyrics, featured comments...

61   229   229  

js-reverse

JS reverse engineering research

83   227   227  

goose-parser

Universal scraping tool, which allows you to extract data using multip...

13   227   227  

EmailFinder

Search emails from a domain through search engines

53   226   226  

weibo-topic-spider

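Weibo super topic crawler with Weibo word-frequency statistics, sentiment analysis, and simple classification; newly added data crawled from the pneumonia super topic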

60   224   224  

FooProxy

A robust, efficient, score-based, targeted IP proxy pool + API service; you can plug in your own collectors to...

59   222   222  

WebVideoBot

Web crawler.

44   221   221  

ZhihuSpider

Multithreaded Zhihu user crawler, based on Python 3

81   221   221  

91porn-crawler

91 porn crawler. Automatically crawls and downloads the popular 91porn videos you want. Automatically d...

39   220   220  

Sitemap-Generator-Crawler

PHP script to recursively crawl websites and generate a sitemap. Zero...

89   219   219  

black-widow

GUI based offensive penetration testing tool (Open Source)

45   215   215  

google-group-crawler

[Deprecated] Get (almost) original messages from google group archives...

37   214   214  

go-movies

Golang spider/crawler for movies

66   214   214  

InfinityCrawler

A simple but powerful web crawler library for .NET

33   213   213  

gf-secrets

Secret and/or credential patterns used for gf.

47   213   213  

goribot

[Crawler/Scraper for Golang]🕷A lightweight distributed friendly Golang...

30   211   211  

ptt-alertor

:loudspeaker: Ptt article notification bot! Notify Ptt articles in real time

24   210   210  

indonesian-NLP-resources

Data resources for Indonesian-language NLP

53   209   209  

AllNewsSpider

The Paper (Pengpai News), Sina News, Tencent News, Sohu News, Xinwen Lianbo, The Times, The New York Times,...

45   209   209  

news-crawl

News crawling with StormCrawler - stores content as WARC

26   207   207  

scrapedin-linkedin-crawler

Crawler for LinkedIn full profiles 2019

70   207   207  

crawlab-lite

Lite version of Crawlab, a crawler management platform

69   206   206  

N2H4

A tool for collecting Naver news

74   205   205  

laosj

golang light-weight image crawler

39   205   205  

KoreaNewsCrawler

A news crawler built to collect large volumes of news data.

102   202   202  

VideoServer

Video resource backend implemented in Node.js with Express and a crawler

6   201   201  

dorkscout

DorkScout - Golang tool to automate Google dork scans against the entie...

18   201   201  

weibo_wordcloud

Scrapes Weibo data by keyword, then generates a word cloud

68   197   197  

galer

A fast tool to fetch URLs from HTML attributes by crawl-in.

33   197   197  

robots-txt

Determine if a page may be crawled from robots.txt, robots meta tags a...

29   196   196  

NoSmoke

A cross-platform UI crawler which scans view trees, then generates and e...

61   195   195  

Pinkerton

🕵️ Pinkerton is a JavaScript file crawler and secret finder developed...

30   194   194  

facebook-data-extraction

Experience for effectively fetching Facebook data by Querying Graph AP...

60   192   192  

JavPy

Enjoy driving on a Javascriptive (originally Pythonic) way to Japanese...

35   190   190  

crawler-js-hook-framework-public

JS reverse-engineering hook toolkit; some of the tools are open-sourced here

82   190   190  

crawler_shopee_public

Shopee asynchronous crawler + competitor seller analysis

64   189   189  

instagram-crawler

Crawl instagram photos, posts and videos for download.

15   188   188  

digger

Digger is a powerful and flexible web crawler implemented by pure gola...

72   187   187  

web-bee

🐝 Web vertical crawler framework for fun

35   187   187  

CSharpCrawler

C# crawler sample programs for anyone who wants to learn the basics of crawling. Going forward, we will gradually add more crawler...

49   186   186  

leetcode-spider

Use Node.js to crawl your own LeetCode solution source code

50   185   185  

zhihu_fun

Selenium-based Zhihu keyword crawler

39   185   185  

zhihu-crawler-people

A simple distributed crawler for zhihu && data analysis

91   184   184  

sensitivefilescan

75   183   183