Most popular crawler repositories and open source projects

4scanner

Continuously search imageboards threads for images/webms and download...

18   103   103  

zyte-smartproxy-headless-proxy

A complimentary proxy to help use SPM with headless browsers

36   103   103  

pdf-crawler

SimFin's open source PDF crawler

38   103   103  

webb

Python: An all-in-one Web Crawler, Web Parser and Web Scraping librar...

41   102   102  

PHPCreeper

A new generation of multi-process asynchronous event-driven spider eng...

14   102   102  

Scrapy_IPProxyPool

Free IP proxy pool. A plugin for the Scrapy crawler framework

40   101   101  

pricetrack

Price tracker that monitors products and alerts you when prices drop. Su...

43   101   101  

goscraper

Golang pkg to quickly return a preview of a webpage (title/description...

40   99   99  

pappet

A command-line tool to crawl websites using puppeteer.

8   98   98  

LinkedIn-Scraper

A LinkedIn Scraper to scrape up to 10k LinkedIn profiles from company...

38   98   98  

google-maps-scraper

👋 HOLA! ENJOY OUR GOOGLE MAPS SCRAPER 🚀 TO EFFORTLESSLY EXTRACT DATA S...

14   98   98  

Weibo-Album-Crawler

Multithreaded crawler for full-size images from Sina Weibo albums.

38   97   97  

Terpene-Profile-Parser-for-Cannabis-Strains

Parser and database to index the terpene profile of different strains...

20   97   97  

asyncpy

A lightweight asynchronous coroutine web crawler framework built with asyncio and aiohttp

23   97   97  

copymanga-downloader

Download manga from copymanga (拷贝漫画) using a Python-built exe, bash, or command-line arguments...

9   96   96  

Taiwan-news-crawlers

Scrapy-based Crawlers for news of Taiwan

17   95   95  

google-arts-crawler

Google Arts & Culture high quality image downloader

16   95   95  

proxifier

A fast, modern and intelligent proxy rotator perfect for crawling and...

15   95   95  

scaleable-crawler-with-docker-cluster

A scalable and efficient crawler with a Docker cluster, crawl million...

27   94   94  

gopa-abandoned

GOPA, a spider written in Go. (NOTE: this project moved to https://git...

30   94   94  

MetaFinder

Search for documents in a domain through Search Engines (Google, Bing...

20   94   94  

dcard-spider

A spider on Dcard. Strong and speedy.

20   93   93  

aliexscrape

Get Aliexpress product details in JSON

30   93   93  

awesome-python-primer

An index of high-quality Chinese-language resources for self-taught Python beginners, including books / docs / videos, suitable for crawlers...

18   92   92  

SpotifyScraper

Spotify Scraper to extract all the information from spotify, download...

8   91   91  

GoodreadsScraper

Scrape data from Goodreads using Scrapy and Selenium :books:

21   91   91  

slrp

rotating open proxy multiplexer

9   91   91  

images-web-crawler

This package is a complete tool for creating a large dataset of images...

24   90   90  

Bili23-Downloader

Download Bilibili videos, anime series, movies, documentaries, and other resources

4   90   90  

news-crawler

A news crawler for BBC News, Reuters and New York Times.

38   89   89  

Price-monitor

Price monitor for products on JD.com (某东): set a custom target price and get email/WeChat alerts on price drops. Tech: Python crawler/...

43   89   89  

crawlie

A simple Elixir library for writing decently-performing crawlers with...

10   87   87  

es6-crawler-detect

:spider: This is an ES6 adaptation of the original PHP library Crawler...

28   86   86  

LFITester

LFITester is a Python3 program that automates the detection and exploi...

18   86   86  

instastories-backup

Backup your friends' Instagram Stories forever and get to keep them ev...

20   85   85  

scrapy_helper

Dynamically configurable crawler

34   85   85  

movie-elasticsearch

An open source movie search engine built with SpringBoot2.0 + ElasticSearch

32   85   85  

SeleniumDemo

Selenium automation test framework

93   84   84  

Amazon-Price-Alert

Price tracker of Amazon

28   84   84  

pagser

Pagser is a simple, extensible, configurable parse and deserialize htm...

7   84   84  

bathyscaphe

Fast, highly configurable, cloud native dark web crawler.

24   83   83  

is-google

Verify that a request is from Google crawlers using Google's DNS verif...

7   82   82  

Hands-on-WebScraping

This repo is a part of blog series on several web scraping projects wh...

73   82   82  

weibo-scraper

Simple Weibo Scraper

18   82   82  

Proxy-List-Scrapper

Proxy List Scrapper

18   82   82  

XVideos-PornHub-RedTube-API

This script scrapes the HTML from different web pages to get the infor...

31   81   81  

bots-zoo

22   80   80  

ceiba-dl

NTU CEIBA data download tool

11   80   80  

random_user_agent

A package to get a list of user agents based on filters such as operatin...

12   80   80  

puppeteer-walker

a puppeteer walker 🕷 🕸

11   79   79