Topic

crawler

Repositories (1431)

instastories-backup
instastories-backup ondrejsojka Python

Backup your friends' Instagram Stories forever and get to keep them even after 24 hours.

82
crawlzone
crawlzone crawlzone PHP

Crawlzone is a fast asynchronous internet crawling framework for PHP.

82
xSMTP
xSMTP aziz0x48 Python

xSMTP 🦟 Lightning fast, multithreaded smtp scanner targeting open-relay and unsecured servers in multiple network ranges.

82
bilibili-manga-download-script
bilibili-manga-download-script lanyeeee TypeScript

A download script for Bilibili Manga (哔哩哔哩漫画, B漫)

81
simpyder
simpyder Jannchie Python

Ultra-fast asynchronous, coroutine-based Python crawler

80
achoz
achoz kcubeterm Python

Search through all your personal data efficiently like web search.

80
tieba-zhuaqu
tieba-zhuaqu ankanch Python

Distributed crawler for Baidu Tieba, used for Tieba data mining. Analyzes data along both the forum dimension and the user dimension

80
car-prices
car-prices go-crawler Go

Golang crawler that scrapes the used-car product catalog of Autohome (汽车之家)

80
slideshare-downloader
slideshare-downloader yodiaditya Python

Slideshare to PDF downloader. Using Selenium and auto scroll-down to get the entire slides completely.

79
ceiba-dl
ceiba-dl lantw44 Python

NTU CEIBA data download tool

79
scrapy-examples
scrapy-examples feiskyer Python

Some scrapy and web.py examples

79
ctrip_spider
ctrip_spider evanleungc Python

Scrape Learning (ctrip)

79
puppeteer-walker
puppeteer-walker lrlna JavaScript

a puppeteer walker 🕷 🕸

79
arachnid
arachnid watzon Crystal

Powerful web scraping framework for Crystal

78
fetchman
fetchman da2vin Python

fetchman is a simple, easy-to-use crawler framework

78
tumblr_crawler
tumblr_crawler 2024baibai Python

Tumblr parsing website

78
page-redirect-code-location-hook
page-redirect-code-location-hook JSREI HTML

JS reverse-engineering technique: a universal approach for locating the JS code that triggers a page redirect

78
fund-crawler
fund-crawler nullpointer TypeScript

Node.js-based fund data crawler; the crawled data is stored on GitHub at @nullpointer/fund-data.

78
tg_crawler
tg_crawler vhdmsm Python

Just a crawler based on tg-cli for Telegram. Deprecated by now, please use telegram-export.

77
eastmoney
eastmoney minicloudsky JavaScript

Python requests + Django + Node.js Koa + MySQL to crawl eastmoney fund and stock data, for data analysis and visualization.

77
tumblr-crawler-cli
tumblr-crawler-cli tzw0745 Python

Tumblr download tool with high speed and extensive customization.

77
open-gov-crawlers
open-gov-crawlers public-law Python

Parse government documents into well formed JSON

77
YourLesson
YourLesson Lewin671 Python

Course-grabbing system for Shenzhen University course registration

77
librengine
librengine liameno C++

Privacy Web Search Engine (not meta, own crawler)

76
venom
venom PreferredAI Java

Your preferred open source focused crawler for the deep web.

76
social-scraper
social-scraper behitek Python

Vietnamese text data crawler scripts for various sites (including Youtube, Facebook, 4rum, news, ...)

75
crawler_examples
crawler_examples liuslnlp Python

Some classic web crawler projects.

75
tw-stock-telegram-bot
tw-stock-telegram-bot x3388638 JavaScript

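Taiwan stock bot providing real-time quotes for individual stocks and the overall market, trends, news, after-hours data, and more. Telegram bot to query real-time TW stock quotes, charts, news, and other related informatio...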

75
Pasta
Pasta Kr0ff Python

A PasteBin scraper that doesn't rely on the PasteBin scrape API

75
qr-pirate
qr-pirate mzollin Python

crawl QR-codes from search engines and look for bitcoin private keys

75
Google-Patents-Scraper
Google-Patents-Scraper wenyalintw Python

Automatically download all PDF files of search results and their patent families found on Google Patents.

75
newspaperjs
newspaperjs flickz HTML

News extraction and scraping. Article Parsing

74
python-testing-crawler
python-testing-crawler python-testing-crawler Python

A crawler for automated functional testing of a web application

74
xueqiu_spider_LQH_LZQ
xueqiu_spider_LQH_LZQ 1491270550 Python

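Xueqiu (雪球) crawler that efficiently scrapes recent comments on Shanghai and Shenzhen A-share stocks and automatically generates a PDF sentiment analysis report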

74
Instagram-downloader
Instagram-downloader fernandod1 Python

Instagram user's photos and videos downloader. Download all media files from any username. Working 2022!

74
Website-Crawler
Website-Crawler pc8544 Java

Extract data from websites in LLM ready JSON or CSV format. Crawl or Scrape entire website with Website Crawler

74
JobApplicationBot
JobApplicationBot drkostas HTML

A bot that automatically sends emails to new ads posted in any desired xe.gr search url.

74
light-crawler
light-crawler zhang2333 JavaScript

a simplified directed customizable website crawler

74
ComicSpider
ComicSpider QuantumLiu Python

Crawler for original-resolution images from the desktop version of the 动漫之家 manga site

74
site-mirror-py
site-mirror-py generals-space Python

[Gitee](https://gitee.com/generals-space/site-mirror-py) General-purpose crawler, site-cloning tool, whole-site downloader

73
ipgw-py-manager
ipgw-py-manager Neboer Python

NEU new ipgw python manager

73
midnight_sea
midnight_sea RicYaben Python

Midnight Sea: navigating in the waters of dark web markets

73
TikScraperPHP
TikScraperPHP pablouser1 JavaScript

Wrapper for TikTok API

73
cache-warmup
cache-warmup eliashaeussler PHP

🔥 PHP library to warm up caches of URLs located in XML sitemaps

73
steam-discount
steam-discount EXP-Tools Python

Steam discounted games ranking (auto-refreshing)

73
Wedge
Wedge LZ0211 JavaScript

Configurable novel downloader and e-book generator

72
carbonbot
carbonbot crypto-crawler Rust

A command line tool based on the crypto-crawler library.

72
aio-scrapy
aio-scrapy ConlinH Python

Implement scrapy with asyncio

71
Pinterest-Crawler
Pinterest-Crawler SajjadAemmi Python

Download HD images from pinterest by your favorite keywords

71
meta-spy
meta-spy DEENUU1 Python

👾 CLI MetaSpy (Facebook, Instagram) scraper and crawler - instagram account, facebook accounts, pages and search

70