Most popular crawler repositories and open source projects

instastories-backup ondrejsojka Python

Backup your friends' Instagram Stories forever and get to keep them even after 24 hours.

82 18 82

crawlzone crawlzone PHP

Crawlzone is a fast asynchronous internet crawling framework for PHP.

82 9 82

xSMTP aziz0x48 Python

xSMTP 🦟 Lightning fast, multithreaded smtp scanner targeting open-relay and unsecured servers in multiple network ranges.

82 33 82

bilibili-manga-download-script lanyeeee TypeScript

A download script for Bilibili Manga (B漫).

81 1 81

simpyder Jannchie Python

Ultra-fast asynchronous, coroutine-based Python crawler.

80 23 80

achoz kcubeterm Python

Search through all your personal data efficiently like web search.

80 6 80

tieba-zhuaqu ankanch Python

Distributed crawler for Baidu Tieba, built for Tieba data mining. Analyzes data along both the forum and user dimensions.

80 27 80

car-prices go-crawler Go

Golang crawler that scrapes the used-car product catalog of Autohome (汽车之家).

80 34 80

slideshare-downloader yodiaditya Python

Slideshare to PDF downloader. Using Selenium and auto scroll-down to get the entire slides completely.

79 32 79

ceiba-dl lantw44 Python

NTU CEIBA data download tool.

79 11 79

scrapy-examples feiskyer Python

Some scrapy and web.py examples

79 29 79

ctrip_spider evanleungc Python

Scrape Learning (ctrip)

79 32 79

puppeteer-walker lrlna JavaScript

a puppeteer walker 🕷 🕸

79 11 79

arachnid watzon Crystal

Powerful web scraping framework for Crystal

78 12 78

fetchman da2vin Python

fetchman is a simple, easy-to-use crawler framework.

78 21 78

tumblr_crawler 2024baibai Python

Tumblr parsing website.

78 40 78

page-redirect-code-location-hook JSREI HTML

JS reverse-engineering technique: a catch-all approach for locating the JS code that triggers a page redirect.

78 19 78

fund-crawler nullpointer TypeScript

Node.js-based fund data crawler; the crawled data is stored on GitHub at @nullpointer/fund-data.

78 39 78

tg_crawler vhdmsm Python

Just a crawler based on tg-cli for Telegram. Now deprecated; please use telegram-export.

77 25 77

eastmoney minicloudsky JavaScript

Python requests + Django + Node.js Koa + MySQL to crawl eastmoney fund and stock data, for data analysis and visualization.

77 25 77

tumblr-crawler-cli tzw0745 Python

High-speed, highly customizable Tumblr download tool.

77 13 77

open-gov-crawlers public-law Python

Parse government documents into well formed JSON

77 8 77

YourLesson Lewin671 Python

Course-grabbing (course registration) system for Shenzhen University.

77 9 77

librengine liameno C++

Privacy Web Search Engine (not meta, own crawler)

76 4 76

venom PreferredAI Java

Your preferred open source focused crawler for the deep web.

76 5 76

social-scraper behitek Python

Vietnamese text data crawler scripts for various sites (including Youtube, Facebook, 4rum, news, ...)

75 47 75

crawler_examples liuslnlp Python

Some classic web crawler projects.

75 29 75

tw-stock-telegram-bot x3388638 JavaScript

Taiwan stock bot providing real-time individual-stock and market index quotes, trends, news, after-hours data, and more. Telegram bot to query real-time TW stock quotes, charts, news, and other related informatio...

75 13 75

Pasta Kr0ff Python

A PasteBin scraper that doesn't rely on the PasteBin scrape API

75 6 75

qr-pirate mzollin Python

Crawl QR codes from search engines and look for Bitcoin private keys

75 34 75

Google-Patents-Scraper wenyalintw Python

Automatically download all PDF files of search results and their patent families found on Google Patents.

75 21 75

newspaperjs flickz HTML

News extraction and scraping. Article Parsing

74 20 74

python-testing-crawler python-testing-crawler Python

A crawler for automated functional testing of a web application

74 5 74

xueqiu_spider_LQH_LZQ 1491270550 Python

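Xueqiu crawler that efficiently scrapes recent comments on Shanghai and Shenzhen A-share stocks and automatically generates a PDF sentiment analysis report.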

74 11 74

Instagram-downloader fernandod1 Python

Instagram user's photos and videos downloader. Download all media files from any username. Working 2022!

74 16 74

Website-Crawler pc8544 Java

Extract data from websites in LLM ready JSON or CSV format. Crawl or Scrape entire website with Website Crawler

74 8 74

JobApplicationBot drkostas HTML

A bot that automatically sends emails to new ads posted in any desired xe.gr search url.

74 6 74

light-crawler zhang2333 JavaScript

a simplified directed customizable website crawler

74 21 74

ComicSpider QuantumLiu Python

Crawler for original-resolution images from the desktop version of the 动漫之家 comics site.

74 15 74

site-mirror-py generals-space Python

[Gitee](https://gitee.com/generals-space/site-mirror-py) General-purpose crawler, site-cloning tool, whole-site download.

73 27 73

ipgw-py-manager Neboer Python

NEU new ipgw python manager

73 11 73

midnight_sea RicYaben Python

Midnight Sea: navigating in the waters of dark web markets

73 13 73

TikScraperPHP pablouser1 JavaScript

Wrapper for TikTok API

73 21 73

cache-warmup eliashaeussler PHP

🔥 PHP library to warm up caches of URLs located in XML sitemaps

73 13 73

steam-discount EXP-Tools Python

Steam discounted games ranking (auto-refreshing).

73 35 73

Wedge LZ0211 JavaScript

Configurable novel downloader and e-book generator.

72 22 72

carbonbot crypto-crawler Rust

A command line tool based on the crypto-crawler library.

72 8 72

aio-scrapy ConlinH Python

Implement scrapy with asyncio

71 10 71

Pinterest-Crawler SajjadAemmi Python

Download HD images from pinterest by your favorite keywords

71 8 71

meta-spy DEENUU1 Python

👾 CLI MetaSpy (Facebook, Instagram) scraper and crawler - instagram account, facebook accounts, pages and search

70 18 70

crawler

Repositories (1431)