Most popular crawler repositories and open source projects

learnPython

Basic Python practice code and assorted crawler code

302   549   549  

FictionDown

Novel download | novel crawling | Qidian (起点) | Biquge (笔趣阁) | export Markdown | export txt | convert to epub | ad filtering |...

110   548   548  

gOSINT

OSINT Swiss Army Knife

85   544   544  

Scavenger

Crawler (Bot) searching for credential leaks on paste sites.

111   523   523  

hacker-news-digest

:newspaper: Let ChatGPT Summarize Hacker News for You

74   522   522  

nintendo-switch-eshop

Crawler for Nintendo Switch eShop

83   511   511  

Scan-T

A new Python-based crawler with more functions, including Network fin...

233   510   510  

TumblThree

A Tumblr and Twitter Blog Backup Application

61   509   509  

crawljax

Crawljax

227   493   493  

scrapple

A framework for creating semi-automatic web content extractors

41   489   489  

opensearchserver

Open-source Enterprise Grade Search Engine Software

194   488   488  

python-fxxk-spider

A collection of free Python crawler projects

123   481   481  

Html2Article

Main-text extraction from HTML web pages

181   476   476  

vault

swiss army knife for hackers

95   471   471  

python-automation-scripts

Simple yet powerful automation stuff.

158   466   466  

mmjpg

👩 Crawler for model photo sets (part 1)

246   462   462  

webster

a reliable high-level web crawling & scraping framework for Node.js.

57   457   457  

freshonions-torscraper

Fresh Onions is an open source TOR spider / hidden service onion crawl...

144   457   457  

ICLR2020-OpenReviewData

Script that crawls meta data from ICLR OpenReview webpage. Tutorials o...

42   453   453  

DownZemAll

DownZemAll! is a download manager for Windows, MacOS and Linux

22   433   433  

signature_algorithm

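Request signing or encryption algorithms for various apps, mini programs, and websites. Currently includes: 自如 (Ziroom), 小红书 (Xiaohongshu), 蛋壳...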

73   418   418  

SpiderSuite

Advanced web spider/crawler for cyber security professionals

48   414   414  

Youtube-Projects

This repository contains all the code I use in my YouTube tutorials.

236   410   410  

music-recover

:musical_note: Convert cache files to MP3 files

121   406   406  

jivesearch

A search engine that doesn't track you.

53   402   402  

Python3Webcrawler

🌈 Python 3 web crawling in practice: QQ Music songs, JD.com (京东) product information, 房天下 (Fang.com), cracking Youdao Translate (有道翻译), ...

103   402   402  

tsrtc

Real-time crawler for Taiwan stocks (Taiwan Stock Exchange Real Time Crawler)

143   400   400  

videodl

Videodl: A lightweight video downloader written in pure Python.

132   392   392  

pywebcopy

Locally saves webpages to your hard disk with images, css, js & links...

85   391   391  

ICLR2019-OpenReviewData

Script that crawls meta data from ICLR OpenReview webpage. Tutorials o...

36   389   389  

TTBot

今日头条 (Toutiao) bot supporting user login, follow, unfollow, fetching followed accounts and followers, posting articles, posting 悟...

145   377   377  

CrawlerForReader

Local web-novel crawler for Android, based on jsoup and XPath

136   374   374  

lxBook

Code repository for the book 《爬虫逆向进阶实战》 (Advanced Crawler Reverse Engineering in Practice)

114   370   370  

InstagramCrawler

A non-API Python program to crawl public photos, posts or followers

110   368   368  

weixin-spider

WeChat official account crawler: historical articles, article comments, article read counts and "在看" data, visual we...

90   368   368  

dude

dude uncomplicated data extraction: A simple framework for writing web...

22   367   367  

gospider

A crawler framework implemented in Go; users only need to care about page rules, and a web management interface is provided. Based on col...

104   363   363  

sitemap-generator

Easily create XML sitemaps for your website.

129   362   362  

seo-audits-toolkit

SEO & Security Audit for Websites. Lighthouse & Security Headers crawl...

79   356   356  

zhihu-login

Simulated login for 知乎 (Zhihu), with support for extracting CAPTCHAs and saving cookies

140   355   355  

ghcrawler

Crawl GitHub APIs and store the discovered orgs, repos, commits, ...

90   352   352  

supercrawler

A web crawler. Supercrawler automatically crawls websites. Define cust...

66   351   351  

JSSoup

JavaScript + BeautifulSoup = JSSoup

37   349   349  

91porn-api

🌭💦 91porn crawler: unrestricted online API (permanently valid, passcode updated daily) and online web pre...

34   346   346  

linkedin-profile-scraper

🕵️‍♂️ LinkedIn profile scraper returning structured profile data in J...

114   346   346  

magic_google

Google search results crawler: get the Google search results that you need

111   345   345  

tsec

Crawler for Taiwan exchange-listed and OTC stocks (Taiwan Stock Exchange Crawler)

169   344   344  

hQuery.php

An extremely fast web scraper that parses megabytes of invalid HTML in...

71   342   342  

xcrawler

A fast, concise, and powerful PHP crawler framework

51   338   338  

Moodle-DL

Moodle-DL downloads course content fast from Moodle (e.g. lecture PDFs)

56   338   338