Most popular crawler repositories and open source projects

bthello

Python 3 DHT magnet-link torrent crawler, torrent parsing, torrent search; demo address

26   45   45  

Web-crawler-engineer-for-Python

Web-crawler-engineer-for-Python

23   44   44  

crawler-jsoup-maven

This is a crawler (reptile)

24   44   44  

dbworld-search

:mag: A simple search engine built on the Django framework

18   44   44  

seenreq

Generate an object for testing if a request is sent, request is Mikeal...

9   44   44  

scrapy-distributed

A series of distributed components for Scrapy. Including RabbitMQ-base...

12   44   44  

shopify-app-store-scraper

Crawler behind the Shopify App Marketplace dataset

12   44   44  

jadwalsholatorg

Parsed data from website https://jadwalsholat.org

19   44   44  

FunUtils

Some code I wrote to help me with my daily errands ;)

51   43   43  

copyheaders

Conveniently copy request headers from the browser

9   43   43  

jason-the-miner

⛏ A versatile Web scraper for Node.js

11   43   43  

anilist-crawler

Crawl data from anilist API and store in MariaDB.

3   43   43  

douyin-crawler

Douyin crawler. Crawls a user's posts and likes through a mobile phone proxy

11   43   43  

bluebird

Unofficial Python client for Twitter

11   43   43  

gogetcrawl

Extract web archive data using Wayback Machine and Common Crawl

7   43   43  

FF14AutoSignIn

Automatic sign-in script for the official FF14 China server website

7   43   43  

Broken-Link-Crawler

:robot: Python bot that crawls your website looking for dead stuff

13   42   42  

nhentai-imgcollect

:rocket: Multithreaded Python nhentai crawler with a PyQt5 GUI

10   42   42  

feedsearch-crawler

Crawl sites for RSS, Atom, and JSON feeds.

7   42   42  

steam-discount

Steam discounted games ranking (auto-refreshing)

15   42   42  

WebReaper

Web scraper, crawler and parser in C#. Designed as simple, declarative...

9   42   42  

spider.npm

Web crawler library; custom rules can handle most websites

7   41   41  

python-facebook-bot

Get Facebook events from location with Python 3

16   41   41  

Crawler

:snake: A collection of simple Python crawlers.

15   41   41  

AntiCloudFlare

Bypasses Cloudflare's loading-page anti-crawler protection (no longer works)

21   41   41  

crawler

Crawler framework for Node.js

9   41   41  

PyTse

TseTmc Crawler

8   41   41  

wx-crawl

Crawler for WeChat official account articles

12   41   41  

stock_linebot_public

The project for Linebot

9   41   41  

python-crawler

Repository for learning web crawling; suitable for people with no prior experience and fairly beginner-friendly

14   41   41  

PageParser

Web page parser used by web crawlers to parse pages; lets you write a crawler even without knowing web page parsing

16   41   41  

aio-vextractor

Parses video information from website/app/H5 pages. Supports Douyin, Tencent Video, YouTube, Instag...

11   41   41  

ronin-web

ronin-web is a collection of useful web helper methods and commands.

6   41   41  

PSGameSpider

Automatically crawls all game covers in the PlayStation Store and automatically generates and indexes web pages # # #...

5   41   41  

js-cookie-monitor-debugger-hook

A powerful tool for reverse-engineering JS cookies: visual monitoring tool for JS cookie changes & JS cookie hook to set conditional...

10   41   41  

aristotle

highly customizable news collector

5   40   40  

php-crawler

:spider: A simple crawler (spider) written in PHP just for fun, with ze...

19   40   40  

laundry

Data laundering tools

4   40   40  

USTBCrawlers

Those years I spent crawling USTB. A targeted-crawler tutorial that progresses from basic to advanced.

7   40   40  

ncrawler

Web Crawler written in C#

14   40   40  

HttpProxy

IP proxy pool implemented in Java, supporting both HTTP and HTTPS

26   40   40  

sponge

sponge is a website crawler and links downloader command-line tool

2   40   40  

MahjongKit

Riichi Mahjong Kit: (1) Game log crawler (sqlite3, json, bs4); (2) Gam...

9   40   40  

Google-Patents-Scraper

Automatically download all PDF files of searching results & their pate...

15   40   40  

lezhin-comics-downloader

📥 Downloader for lezhin comics

3   40   40  

rotating-tor-http-proxy

A multi-arch image provides one HTTP proxy endpoint with many concurre...

15   40   40  

ICLR2023-OpenReviewData

Crawl & Visualize ICLR 2023 Data from OpenReview

3   40   40  

cewler

CeWLeR - Custom Word List generator Redefined. CeWL alternative in Pyt...

2   40   40  

insecres

A console tool that finds insecure resources on HTTPS sites

2   39   39  

SpiderWho

A very fast whois crawler

14   39   39