Most popular crawler repositories and open source projects

sponge

sponge is a command-line tool for crawling websites and downloading links

2   40   40  

MahjongKit

Riichi Mahjong Kit: (1) Game log crawler (sqlite3, json, bs4); (2) Gam...

9   40   40  

lezhin-comics-downloader

📥 Downloader for lezhin comics

3   40   40  

scrapingant-client-python

ScrapingAnt API client for Python.

5   40   40  

rotating-tor-http-proxy

A multi-arch image provides one HTTP proxy endpoint with many concurre...

15   40   40  

ICLR2023-OpenReviewData

Crawl & Visualize ICLR 2023 Data from OpenReview

3   40   40  

cewler

CeWLeR - Custom Word List generator Redefined. CeWL alternative in Pyt...

2   40   40  

extension

web scraping extension

4   39   39  

insecres

A console tool that finds insecure resources on HTTPS sites

2   39   39  

SpiderWho

A very fast whois crawler

14   39   39  

podcastcrawler

PHP library to find podcasts

10   39   39  

WebCrawler

A lightweight, fast, multithreaded, multi-pipeline, flexibly configurable web crawler.

14   39   39  

LuoguCrawler

A Python crawler that scrapes various information from Luogu

15   39   39  

AppCrawler

Web crawler for Android app markets

19   39   39  

crawel

A rather interesting crawler platform built with Apify + Node + React

16   39   39  

Domainker

BugBounty Tool

21   38   38  

TripAdvisor_crawler

Python Crawler: Scrape Data From Tripadvisor

10   38   38  

BaiduImageCrawler

A multithreaded tool for downloading search results of Baidu image sea...

20   38   38  

leboncoin-crawler

Crawler for leboncoin.fr

17   38   38  

CrawlerSamples

A set of Puppeteer+AngleSharp crawler console app samples, using C# 7....

13   38   38  

tiktok-crawler

A TikTok crawler app.

16   38   38  

DeadPool

A crawler application built on Celery as its main framework; crawler tasks can be added flexibly,...

15   38   38  

SeleniumLogin

Log in to some websites using Selenium.

17   38   38  

logo-scrape

🕷🚀 Scrapes/Crawls the logo from a provided url(s)/website for your No...

10   38   38  

crawlerdetect

Golang module to detect bots and crawlers via the user agent

5   38   38  

papercut

Papercut is a scraping/crawling library for Node.js built on top of JS...

2   38   38  

novel-downloader

Universal novel downloader

12   38   38  

ProxyScan

🔎 scan the internet to find "private" proxies.

6   38   38  

CygnusX1

A multithreaded tool for searching and downloading images from popula...

8   38   38  

chan-downloader

CLI to download all images/webms in a 4chan thread

4   38   38  

EH-PDF

Downloads an E-Hentai gallery and converts it to PDF for easy reading on Kindle and on iPad...

2   38   38  

BiliBili-Manga-Downloader

An easy-to-use BiliBili manga downloader with a graphical interface, keyword search for manga, multithreaded...

4   38   38  

CobWeb-lnx

CobWeb is a Python library for web scraping. The library consists of t...

2   38   38  

scrapy-zyte-api

Zyte API integration for Scrapy

21   38   38  

lolcrawler

Headless web crawler for bugbounty and penetration-testing/redteaming

9   37   37  

crawlhtmltopdf

A crawler that converts runoob.com to PDF

11   37   37  

xray_pool

Proxy pool tool based on Xray-core and glider

6   37   37  

auto_crawler_ptt_beauty_image

Automatically crawls PTT Beauty images using Python Schedule

18   37   37  

Spider

web crawler

25   37   37  

UniversityRecruitment-sSurvey

Using rigorous data to answer the question "What kinds of companies recruit at what kinds of universities?"

18   37   37  

InstaBot

Simple and friendly Bot for Instagram, using Selenium and Scrapy with...

10   37   37  

medium-stat-box

Practical pinned gist which shows your latest Medium status 📌

18   37   37  

d00r

Simple directory brute-force tool written in Python.

8   37   37  

fii

API for retrieving information about FII

10   37   37  

PixivCrawlerIII

A python3 crawler for crawling Pixiv ranking top and any illustrator a...

10   36   36  

MMDownloader

New MaruMaru downloader project

8   36   36  

cetty

A crawler framework based on event dispatching

11   36   36  

golearn

🔥 Golang basics and hands-on practice (including: crawler, distributed-s...

11   36   36  

vw-crawler

:beetle: A simple, lightweight Java crawler framework; all you need is a little basic regular expression knowledge and simple CSS...

18   36   36  

NodeSpider

[DEPRECATED] Simple, flexible, delightful web crawler/spider package

3   36   36