Most popular crawler repositories and open source projects

2017_PyConTW_Talk

4   19   19  

botanalyse

botsonar analyse open api

3   19   19  

magento2-module-primer

Full Page Cache Priming tool for Magento 2

7   19   19  

crawler

Node.js crawler for cnbeta.com

10   19   19  

baiduyun_spider

A Baidu Cloud (Baiduyun) resource crawler built with Python + MongoDB

15   19   19  

crawl

A concurrent crawler that minimizes memory use. Output suitable for us...

7   19   19  

google-scholar-crawler

A crawler for the Google Scholar search page

13   19   19  

sentinel-cendertron

Cendertron = Crawler + cendertron, Crawl AJAX-heavy client-side Single...

5   19   19  

NEEA-TOEFL-Testseat-Crawler

NEEA TOEFL Testseat Crawler (a crawler for TOEFL test seats)

0   19   19  

scrapher

A web scraper for PHP to easily extract data from web pages

13   18   18  

mbfc_crawler

Crawls Media Bias/Fact Check and saves output to JSON.

6   18   18  

MercadoLivreProductsCrawler

PHP Console Crawler to Download Products from a Store on MercadoLivre....

6   18   18  

onion-crawler

Tor website crawler (specific for Alphabay at the time)

14   18   18  

node-dcard-scraper

An example of implementing a cheerio scraper for extracting images...

5   18   18  

go-scrapy

Web crawling and scraping framework for Golang

4   18   18  

crawler

Web Crawler created with Node.js and Puppeteer

1   18   18  

json-web-crawler

Use JSON to list all elements (with CSS 3 and jQuery selectors) that yo...

2   18   18  

grapy

Grapy, a fast high-level web crawling framework for Python 3.3 or late...

8   18   18  

youtube-trends-spider

Crawls YouTube trends using Selenium with Python

11   18   18  

Email-Extractor

A spider to crawl webpages

3   18   18  

websight

🕷A simple but *really* fast crawler built with Node.js & TypeScript

14   18   18  

google-play-crawler

Crawler for Google Play that collects all app-related data

17   18   18  

Academic-Paper-Title-Recommendation

Supervised text summarization (title generation/recommendation) based...

1   18   18  

magnet-crawler

A crawler for magnet links.

13   18   18  

Sharingan

We will try to find your visible basic footprint from social media as...

6   18   18  

my-favourite-appliances

Laravel CRUD sample

5   18   18  

newspaper-crawler

Scrapy-based crawler which crawls newspapers.

3   18   18  

master-to-pythonista

A list of awesome beginner-friendly projects.

19   18   18  

Google-Clone-Script

A search engine like Google, made using PHP, MySQL, and JavaScript

17   18   18  

crowlet

Tiny sitemap crawler for cache warming, and website status monitoring

1   18   18  

froxy

Hide your IP with free proxies using Froxy 🔄

1   18   18  

udemyscraper

A Udemy Course Scraper built with bs4 and selenium, that fetches udemy...

10   18   18  

WMIRROR

wmirror allows you to download any website from the Internet to a loca...

2   18   18  

ActoCrawler

🕸️ Swift Concurrency-powered crawler engine on top of Actomaton.

1   18   18  

crawl-original-google-images

Python scripts for crawling original images from Google Images

2   18   18  

billboard-json

🎧 Get the Billboard Hot 100 chart as JSON

2   18   18  

wind-bell

Wind-bell is a lightweight crawler tool, as sensitive as a wind chime and as agile as a spider, able to sense any...

7   18   18  

doogle

Doogle is a search engine and web crawler which can search indexed web...

5   18   18  

crawler-set

A collection of crawlers for various websites, continuously updated.

16   17   17  

photo-spider-scrapy

10 photo website spiders: Scrapy crawler code for 10 international stock-photo sites

10   17   17  

Douban-Crawler

Scrapes Douban group information (groups, users, posts).

8   17   17  

images-downloader

A Node.js module for downloading a single image or multiple images to...

10   17   17  

XML-Parser

A Node.js XML DOM, Parser & Stringifier.

8   17   17  

news-sentiment-analysis

The spider crawls moneycontrol.com and economictimes.com to fetch news...

5   17   17  

app-crawler

Crawling apps with uiautomator2 & mitmproxy

8   17   17  

papercut

Papercut is a scraping/crawling library for Node.js built on top of JS...

1   17   17  

Deep_miner

Web crawler written in Python. This crawler digs in till the 3 leve...

5   17   17  

SearchGar

SearchGar - An actual Search Engine made using Python

2   17   17  

konect-extr

Network dataset extraction library – part of the KONECT project by Jér...

10   17   17  

AnimeDl

⚡️An API for downloading or streaming your favorite anime.

1   17   17