Most popular crawler repositories and open source projects

APSoft-Web-Scanner-v2

Powerful dork searcher and vulnerability scanner for the Windows platform

35   112   112  

bee-university

Project that collects university admission cutoff scores for 2014 - 2018 and analyzes the data

24   111   111  

scrapy-puppeteer

Scrapy + Puppeteer

29   111   111  

starfish-ql

✴️ An experimental graph database

4   110   110  

proxy-pool

Proxy IP pool service for crawlers; other crawler programs can fetch proxies through a REST API

55   110   110  

gflare-tk

Open-Source Python Based SEO Web Crawler

14   110   110  

WeiboCrawler

Cookie-free Weibo crawler that can continuously crawl the profile information and posts of one or more Sina Weibo users, and...

19   110   110  

crawler

Crawler, HTTP proxy, simulated login!

46   108   108  

zyte-smartproxy-headless-proxy

A complementary proxy to help use SPM with headless browsers

37   108   108  

tracker-radar-collector

🕸 Modular, multithreaded, puppeteer-based crawler

40   108   108  

bose

✨ BOSE IS A SWISS ARMY KNIFE 🔪 FOR BOT DEVELOPMENT. THE ULTIMATE BOT D...

1   107   107  

collector

Collect XSS-vulnerable parameters from an entire domain.

29   106   106  

images-web-crawler

This package is a complete tool for creating a large dataset of images...

24   105   105  

CrawlerPack

Java web data crawler package

69   104   104  

antispider

56   104   104  

crawler_detect

Ruby gem to detect bots and crawlers via the user agent

10   104   104  

WeiboSpider

Weibo crawler: a lightweight Weibo crawler based on the Scrapy framework (Sina Weibo Spider)

22   104   104  

4scanner

Continuously search imageboard threads for images/webms and download...

18   103   103  

pdf-crawler

SimFin's open source PDF crawler

38   103   103  

webb

Python: An all-in-one Web Crawler, Web Parser and Web Scraping librar...

41   102   102  

PHPCreeper

A new generation of multi-process asynchronous event-driven spider eng...

14   102   102  

Scrapy_IPProxyPool

Free IP proxy pool. A plugin for the Scrapy crawler framework

40   101   101  

goscraper

Golang pkg to quickly return a preview of a webpage (title/description...

40   99   99  

pappet

A command-line tool to crawl websites using puppeteer.

8   98   98  

LinkedIn-Scraper

A LinkedIn Scraper to scrape up to 10k LinkedIn profiles from company...

38   98   98  

google-maps-scraper

👋 HOLA! ENJOY OUR GOOGLE MAPS SCRAPER 🚀 TO EFFORTLESSLY EXTRACT DATA...

14   98   98  

asyncpy

Lightweight asynchronous coroutine-based web crawler framework built with asyncio and aiohttp

23   97   97  

Weibo-Album-Crawler

Multithreaded crawler for full-size images from Sina Weibo albums.

38   97   97  

Terpene-Profile-Parser-for-Cannabis-Strains

Parser and database to index the terpene profile of different strains...

20   97   97  

copymanga-downloader

Download manga from copymanga (拷贝漫画) using a Python-compiled exe, bash, or command-line arguments; supports...

9   96   96  

Taiwan-news-crawlers

Scrapy-based crawlers for Taiwanese news

17   95   95  

google-arts-crawler

Google Arts & Culture high-quality image downloader

16   95   95  

scaleable-crawler-with-docker-cluster

A scalable and efficient crawler with a Docker cluster, crawl million...

27   94   94  

gopa-abandoned

GOPA, a spider written in Go. (NOTE: this project moved to https://git...

30   94   94  

MetaFinder

Search for documents in a domain through Search Engines (Google, Bing...

20   94   94  

dcard-spider

A spider on Dcard. Strong and speedy.

20   93   93  

aliexscrape

Get AliExpress product details in JSON

30   93   93  

SpotifyScraper

Spotify Scraper to extract all the information from Spotify, download...

8   91   91  

slrp

rotating open proxy multiplexer

9   91   91  

crawlie

A simple Elixir library for writing decently-performing crawlers with...

11   89   89  

news-crawler

A news crawler for BBC News, Reuters and New York Times.

38   89   89  

Price-monitor

Price monitoring for JD.com products: set custom target prices and get price-drop alerts by email or WeChat. Tech: Python crawler/...

43   89   89  

shopify-spy

Extract structured data from Shopify websites.

47   88   88  

LFITester

LFITester is a Python3 program that automates the detection and exploi...

18   86   86  

Bilibili_manga_download

Bilibili Manga download tool with a graphical interface

2   86   86  

es6-crawler-detect

🕷 This is an ES6 adaptation of the original PHP library Crawler...

28   86   86  

html-table-extractor

Extract data from an HTML table

22   86   86  

instastories-backup

Backup your friends' Instagram Stories forever and get to keep them ev...

20   85   85  

scrapy_helper

Dynamically configurable crawler

34   85   85  

movie-elasticsearch

Open-source movie search engine built with SpringBoot2.0 + ElasticSearch

32   85   85