Most popular scraping repositories and open source projects

Instagram-downloader

Instagram user's photos and videos downloader. Download all media file...

16   62   62  

conformist

Bend CSVs to your will with declarative schemas.

6   61   61  

wget-lua

Wget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC...

10   61   61  

pomp

Screen scraping and web crawling framework

10   60   60  

ksoup

Kotlin Wrapper for Jsoup

6   60   60  

mangalivre-api

Unofficial Mangá Livre API built with Node.js and Express.js.

10   58   58  

Pahe.ph-Scraper

Pahe.ph [Pahe.in] Movies Website Scraper

16   58   58  

WebScrapper

Telegram Bot to scrape webpages using Requests, html5lib and Beautifuls...

43   57   57  

PythonScrapyBasicSetup

Basic setup with random user agents and IP addresses for Python Scrapy...

14   57   57  

angel.co-companies-list-scraping

34   57   57  

proxycrawl-python

ProxyCrawl Python library for scraping and crawling

21   57   57  

Tor_Spider

Python project to crawl and scrape the lesser-known deep web or one can...

16   57   57  

crawler-chrome-extensions

Chrome extensions commonly used by crawler engineers | Chrome extensions used by crawler dev...

10   56   56  

actor-facebook-scraper

Scrape public Facebook pages, posts, reviews and comments

32   56   56  

pycaching

A Python 3 interface for working with Geocaching.com website.

42   55   55  

learn.scrapinghub.com

Scrapinghub Learning Center. Report issues in Jira: Report issues in J...

24   55   55  

selectorlib

A library to read a YML file with Xpath or CSS Selectors and extract d...

11   55   55  

Euro2016_TerminalApp

⚽ Instantly find 🏆 EURO 2016 live-streams & highlights, n...

10   54   54  

local-api-client-typescript

Official JavaScript/TypeScript library for interacting with Kameleo Cl...

1   54   54  

Miyou

An anime discovery, streaming site made with React.js. It uses AniList...

20   53   53  

mtnt

Code for the collection and analysis of the MTNT dataset

4   53   53  

diffbot-php-client

[Deprecated - Maintenance mode - use APIs directly please!] The offici...

20   53   53  

amazon_scraper

Amazon product scraper using rotating proxies and headless Ch...

16   53   53  

SearchEngineScrapy

Scrape data from Google.com, Bing.com, Baidu.com, Ask.com, Yahoo.com,...

18   53   53  

scraper-fourone-jobs

This is an anti-scraping cracker for extracting application information of on...

11   52   52  

local-api-client-python

Official Python library for interacting with Kameleo Client

3   52   52  

media-search-engine

Search geolocations for (social) media posts in databases like Belling...

8   52   52  

whatsapp-tracking

Scraping the status of WhatsApp contacts

13   51   51  

api-client

API client to develop tools for competitive programming

16   51   51  

puppeteer-botcheck

🕵‍♂ Bot detection tests for Puppeteer. Hide and seek!

6   51   51  

playlist2links

This Bash script extracts video links from a YouTube playlist

10   51   51  

sample-web-scraping-with-electron

Sample project for web scraping with Electron

14   50   50  

pge-outages

Tracking PG&E outages

7   50   50  

foundation

🧱 A uniform template to use as a foundation for Puppeteer bot constru...

7   50   50  

chegg-scraper

Download Chegg homework-help questions to self-sufficient HTML files

17   50   50  

dart-scraper

Corporate financial statements using the DART system operated by Korea's Financial Supervisory Service...

21   49   49  

pypatent

Search for and retrieve US Patent and Trademark Office Patent Data

15   47   47  

linkedin-scrapper

LinkedIn scrapper is an advanced search-result scraper script built with...

21   47   47  

scrapers

scrapers for building your own image databases

6   46   46  

ogpParser

Open Graph Protocol Parser for Node.js

9   46   46  

datasette-scraper

Add website scraping abilities to Datasette

1   46   46  

torrent-tracker-scraper

A UDP torrent tracker scraper library written in Python 3

14   45   45  

scrapy-distributed

A series of distributed components for Scrapy. Including RabbitMQ-base...

12   44   44  

xdsl-exporter

xDSL Prometheus Exporter

2   44   44  

jason-the-miner

⛏ A versatile Web scraper for Node.js

11   43   43  

oversmash

Overwatch API library for player details and career stats

6   43   43  

bluebird

Unofficial Python client for Twitter

11   43   43  

go-ps4

Search your favorite PS4 games from PlayStation Store using the Comman...

6   42   42  

image-collector

Download images from Google Image Search

22   42   42  

hext

Domain-specific language for extracting structured data from HTML docu...

3   42   42