Most popular crawler repositories and open source projects

linkcrawler

Cross-platform persistent and distributed web crawler 🔗

9   112   112  

ThesaurusSpider

Python crawler that downloads lexicon files for the Sogou, Baidu, and QQ input methods; can be used to build, for different industries, ...

44   112   112  

APSoft-Web-Scanner-v2

Powerful dork searcher and vulnerability scanner for the Windows platform

35   112   112  

bee-university

Project that collects university admission cutoff scores for 2014-2018 and analyzes the data

24   111   111  

scrapy-puppeteer

Scrapy + Puppeteer

29   111   111  

starfish-ql

✴️ An experimental graph database

4   110   110  

proxy-pool

Proxy IP pool service for crawlers; other crawler programs can fetch proxies through a REST API

55   110   110  

gflare-tk

Open-Source Python Based SEO Web Crawler

14   110   110  

WeiboCrawler

Cookie-free Weibo crawler that can continuously crawl the profile information and posts of one or more Sina Weibo users, and...

19   110   110  

crawler

Crawler, HTTP proxy, simulated login!

46   108   108  

zyte-smartproxy-headless-proxy

A complementary proxy that helps you use SPM with headless browsers

37   108   108  

tracker-radar-collector

🕸 Modular, multithreaded, puppeteer-based crawler

40   108   108  

bose

✨ BOSE IS SWISS ARMY KNIFE 🔪 FOR BOT DEVELOPMENT. THE ULTIMATE BOT D...

1   107   107  

collector

Collect XSS vulnerable parameters from entire domain.

29   106   106  

images-web-crawler

This package is a complete tool for creating a large dataset of images...

24   105   105  

CrawlerPack

Java web data crawler package

69   104   104  

antispider

56   104   104  

crawler_detect

Ruby gem to detect bots and crawlers via the user agent

10   104   104  

WeiboSpider

Weibo crawler: a lightweight Weibo crawler based on the Scrapy framework (Sina Weibo Spider)

22   104   104  

4scanner

Continuously search imageboards threads for images/webms and download...

18   103   103  

webb

Python: An all-in-one Web Crawler, Web Parser and Web Scraping librar...

41   102   102  

PHPCreeper

A new generation of multi-process asynchronous event-driven spider eng...

14   102   102  

Scrapy_IPProxyPool

Free IP proxy pool. A plugin for the Scrapy crawler framework

40   101   101  

goscraper

Golang pkg to quickly return a preview of a webpage (title/description...

40   99   99  

pappet

A command-line tool to crawl websites using puppeteer.

8   98   98  

LinkedIn-Scraper

A LinkedIn Scraper to scrape up to 10k LinkedIn profiles from company...

38   98   98  

google-maps-scraper

👋 HOLA! ENJOY OUR GOOGLE MAPS SCRAPER 🚀 TO EFFORTLESSLY EXTRACT DATA...

14   98   98  

asyncpy

Lightweight asynchronous coroutine-based web crawler framework built with asyncio and aiohttp

23   97   97  

Weibo-Album-Crawler

Multithreaded crawler for full-size images from Sina Weibo albums.

38   97   97  

Terpene-Profile-Parser-for-Cannabis-Strains

Parser and database to index the terpene profile of different strains...

20   97   97  

copymanga-downloader

Download manga from copymanga (拷贝漫画) using a Python-compiled exe, bash, or command-line arguments; supports...

9   96   96  

Taiwan-news-crawlers

Scrapy-based Crawlers for news of Taiwan

17   95   95  

google-arts-crawler

Google Arts & Culture high quality image downloader

16   95   95  

scaleable-crawler-with-docker-cluster

A scalable and efficient crawler with a Docker cluster, crawl million...

27   94   94  

gopa-abandoned

GOPA, a spider written in Go. (NOTE: this project moved to https://git...

30   94   94  

MetaFinder

Search for documents in a domain through Search Engines (Google, Bing...

20   94   94  

dcard-spider

A spider on Dcard. Strong and speedy.

20   93   93  

aliexscrape

Get Aliexpress product details in JSON

30   93   93  

bathyscaphe

Fast, highly configurable, cloud native dark web crawler.

21   93   93  

SpotifyScraper

Spotify Scraper to extract all the information from spotify, download...

8   91   91  

slrp

rotating open proxy multiplexer

9   91   91  

crawlie

A simple Elixir library for writing decently-performing crawlers with...

11   89   89  

news-crawler

A news crawler for BBC News, Reuters and New York Times.

38   89   89  

Price-monitor

Price monitor for JD.com (某东) products: set custom target prices and get email/WeChat alerts on price drops. Tech: Python crawler/...

43   89   89  

shopify-spy

Extract structured data from Shopify websites.

47   88   88  

LFITester

LFITester is a Python3 program that automates the detection and exploi...

18   86   86  

Bilibili_manga_download

Bilibili manga download tool with a graphical interface

2   86   86  

es6-crawler-detect

🕷 This is an ES6 adaptation of the original PHP library Crawler...

28   86   86  

html-table-extractor

Extract data from HTML tables

22   86   86  

instastories-backup

Backup your friends' Instagram Stories forever and get to keep them ev...

20   85   85