Most popular scraping repositories and open source projects

scrape-github-trending

Tutorial for web scraping / crawling with Node.js.

7   42   42  

feedsearch-crawler

Crawl sites for RSS, Atom, and JSON feeds.

7   42   42  

AngleParse

HTML parsing and processing tool for PowerShell.

6   42   42  

map-email-scraper

An open source tool for collating publicly available contact informat...

12   42   42  

WebReaper

Web scraper, crawler and parser in C#. Designed as simple, declarative...

9   42   42  

trex

YouTube & TikTok analysis + youchoose recommendation customizer. backen...

13   41   41  

movie-posters-convnet

Unsupervised clustering of movie posters with features extracted from...

2   41   41  

activesoup

A headless pure-python browser for the web

5   41   41  

socials

👨‍👩‍👦 Social account detection and extraction in Python, e.g. for c...

8   41   41  

myanimelist-data-set-creator

Collection of some simple python scripts to create https://myanimelist...

6   40   40  

RARBG-scraper

With Selenium headless browsing and CAPTCHA solving

11   40   40  

Captcha-Tools

All-in-one Python (And now Go!) module to help solve captchas with Cap...

4   40   40  

Google-Patents-Scraper

Automatically download all PDF files of search results & their pate...

15   40   40  

info-bot

🤖 A Versatile Telegram Bot

15   40   40  

outscraper-python

The library provides convenient access to the Outscraper API from appl...

12   40   40  

instagram-without-api

Simple PHP code to get unlimited public Instagram pictures by every...

7   40   40  

TikDown

Fast TikTok NO Watermark Video Downloader (username or url)

10   40   40  

Deals-Scraper

Deals Scraper is a Canadian tool to find good deals on websites like F...

8   39   39  

local-api-client-csharp

This .NET Standard package provides convenient access to the Local API...

1   39   39  

shopify-spy

Extract structured data from Shopify websites.

18   39   39  

configs

Public, free to use, repository with diggers configs for scraping / ex...

12   39   39  

webmagician-ui

An admin UI project for a configurable web crawler platform

15   39   39  

html-table-to-json

Generate JSON representations of HTML tables

7   39   39  

Architeuthis

MITM HTTP(S) proxy with integrated load-balancing, rate-limiting and...

2   38   38  

CC_Scrapper

Telegram CC Scrapper - Debit/Credit Card [channel public or private /...

21   38   38  

freenom-auto-renew-domains

A scraper built with Puppeteer that auto-renews free domains on Freenom...

23   37   37  

linkeBot

🔎 A web scraping bot that shows job openings from LinkedIn

6   37   37  

Pasta

A Pastebin scraper that doesn't rely on the Pastebin scraping API

4   37   37  

tvseries

TV Series is a tool that scrapes episode synopses of popular TV Serie...

25   37   37  

beautifulsoup-tutorial

✨ 🍜 Scrape webpage metadata using BeautifulSoup.

16   37   37  

chirps

Twitter bot powering @arichduvet

9   36   36  

freesoccer

⚽ Free API with results from national soccer competitions

8   36   36  

fulldom-server

Proxy-like server that will show you the DOM of a page after JS runs

6   36   36  

News_Summary

Dataset and scripts for scraping the news articles from popular source...

26   36   36  

worldometer

Worldometer Scraping & API 🌎 Get world metrics from worldometers.info

4   36   36  

lc-webscraping

Introduction to web scraping

27   36   36  

torchestrator

Spin up Tor containers and then proxy HTTP requests via these Tor inst...

4   36   36  

Thar

Mining all surnames used in Nepal.

7   35   35  

proxi

Proxy pool. Finds and checks proxies, with a REST API for querying result...

4   34   34  

google-scraper

This class can retrieve search results from Google.

27   33   33  

webradio-metadata

Collection of scraping recipes to get metadata about what is being str...

15   33   33  

poketo

Node library for scraping manga sites

6   33   33  

extract-social-media

Extract social media links and account names from websites.

15   33   33  

jmd_imagescraper

Image scraping library for creating deep learning datasets

15   33   33  

ioweb

Web Scraping Framework

11   33   33  

strigil

Strigil is an OSINT tool for collecting and aggregating social media d...

5   32   32  

facebook-discussion-tk

A collection of tools to (semi-)automatically collect and analyze data...

6   32   32  

pythonista-chromeless

Serverless Selenium that dynamically executes any given code.

10   32   32  

scrapy-scrapingbee

JavaScript support and proxy rotation for Scrapy with ScrapingBee.

3   32   32  

InstaBot

Simple and friendly Bot for Instagram, using Selenium and Scrapy with...

12   31   31