Most popular crawler repositories and open source projects

peeling-onions ntddk Perl

A repository of Deep Web (onion domain) crawler, scraper, and NLP tools for the Tor network.

20 7 20

ppspider_example xiyuan-fengyu TypeScript

ppspider crawler examples: scraping Bilibili video info and comments, QQ Music info and comments, and Twitter topic comments and user info

20 13 20

googleart_scraper asanakoy Python

Scrape images from googleart

20 3 20

sse-option-crawler casprwang Python

SSE 50 index options data crawler

20 8 20

goApp kmood Go

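A collection of open source Go projects: a junk-file cleanup utility, a purchase bot for Huawei's official site, a real estate crawler, and a registration monitor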

20 3 20

torrent-crawler rajat19 Python

Crawls and stores a list of torrent links

20 7 20

WebArchiver ArchiveTeam Python

Decentralized web archiving

20 3 20

Fast-KTSpeechCrawler Prem-kumar27 Python

Parallelized automatic corpus collection for ASR. Forked from https://github.com/EgorLakomkin/KTSpeechCrawler

20 2 20

crawl_xuexi jianboy Python

Batch downloader for machine learning courses and MOOC videos on the Xuexi Qiangguo app

20 18 20

anime-tracker AXeL-dev TypeScript

All in one place to track your favorite anime

20 1 20

k-webtoon-crawler sh-cho Python

Korean webtoon crawler written in Python 3.

20 2 20

Instagram_Crawler SOMJANG Jupyter Notebook

Instagram crawler (Python, Selenium)

20 19 20

web-crawler writepython Python

Python Web Crawler with Selenium and PhantomJS

19 14 19

2017_PyConTW_Talk chairco JavaScript

19 4 19

scrapher Laurentvw PHP

A web scraper for PHP to easily extract data from web pages

19 13 19

magento2-module-primer 8WireDigital PHP

Full Page Cache Priming tool for Magento 2

19 7 19

crawler tower1229 JavaScript

Nodejs crawler for cnbeta.com

19 10 19

baiduyun_spider yangruihan Python

A Baidu Cloud (baiduyun) resource crawler built with Python and MongoDB

19 15 19

crawl benjaminestes Jupyter Notebook

A concurrent crawler that minimizes memory use. Output suitable for use with BigQuery.

19 7 19

google-scholar-crawler linhung0319 Python

A crawler for Google Scholar search pages

19 13 19

sentinel-cendertron wx-chevalier TypeScript

Cendertron = Crawler + cendertron, Crawl AJAX-heavy client-side Single Page Applications (SPAs), deploying with docker, focusing on scraping requests(...

19 5 19

NEEA-TOEFL-Testseat-Crawler jianqiaomo Python

NEEA TOEFL test seat crawler

19 0 19

gdpr-scanner mammuth Go

A tool to check a list of domains for GDPR violations

19 2 19

mvcrawler yddeng Go

A small anime aggregation site

19 3 19

hepcrawl inspirehep Python

Scrapy project for feeds into INSPIRE-HEP

19 31 19

AliCouponHunter Tadelsucht Python

Aliexpress coupon search | Find cheapest item and show possible coupon freebies

19 7 19

spiderman bkeepers Ruby

your friendly neighborhood web crawler

19 4 19

playwright-webcrawler LeMoussel Python

Parallel crawler powered by Playwright-Python

19 7 19

plusfish google C++

Plusfish is a classic web application vulnerability scanner/fuzzer aimed at security professionals

19 9 19

Broken-Links-Crawler-Action ScholliYT Python

GitHub Action to check a website for broken links

19 2 19

mbfc_crawler JeffreyATW Ruby

Crawls Media Bias/Fact Check and saves output to JSON.

18 6 18

MercadoLivreProductsCrawler lucassmacedo PHP

PHP Console Crawler to Download Products from a Store on MercadoLivre.com.br

18 6 18

onion-crawler LoomisLoud Python

Tor website crawler (specific for Alphabay at the time)

18 14 18

node-dcard-scraper wahengchang JavaScript

An example cheerio scraper that extracts images from Dcard

18 5 18

crawler alinebastos JavaScript

Web Crawler created with Node.js and Puppeteer

18 1 18

webhunger jerrycshen Java

WebHunger is an extensible, full-scale crawler framework that supports distributed crawling, aiming at getting users focused on web page parsing witho...

18 4 18

json-web-crawler Knovour JavaScript

Use JSON to list all elements (with css 3 and jquery selector) that you want to crawl.

18 2 18

grapy Lupino Python

Grapy, a fast high-level web crawling framework for Python 3.3 or later, based on asyncio.

18 8 18

youtube-trends-spider twtrubiks Python

Crawls YouTube trends using Selenium with Python

18 11 18

Email-Extractor Ashwin-op Python

A spider to crawl webpages

18 3 18

websight paambaati TypeScript

🕷A simple but *really* fast crawler built with Node.js & TypeScript

18 13 18

google-play-crawler ranjeet867 Python

Crawler for Google Play that collects all app-related data

18 17 18

magnet-crawler Cyrus97 Python

A magnet link crawler.

18 13 18

Sharingan s045pd Python

We will try to find as much of your visible basic footprint on social media as possible - 😤 more sites are coming soon

18 6 18

XML-Parser ElyaConrad JavaScript

A Node.js XML DOM, Parser & Stringifier.

18 7 18

my-favourite-appliances josecelano PHP

Laravel CRUD sample

18 5 18

newspaper-crawler rafatbiin Python

Scrapy-based crawler for newspaper sites.

18 3 18

Google-Clone-Script HiddenPirates TSQL

A search engine like Google made using PHP, MySQL, and JavaScript

18 17 18

SearchGar roshanlam Jupyter Notebook

SearchGar - An actual Search Engine made using Python

18 2 18

WMIRROR wuseman Shell

wmirror allows you to download any website from the Internet to a local directory, building recursively all directories, getting HTML, images, and oth...

18 2 18
