Most popular scraping repositories and open source projects

scrape-github-trending

Tutorial for web scraping / crawling with Node.js.

7   42   42  

AngleParse

HTML parsing and processing tool for PowerShell.

6   42   42  

feedsearch-crawler

Crawl sites for RSS, Atom, and JSON feeds.

7   42   42  

WebReaper

Web scraper, crawler and parser in C#. Designed as simple, declarative...

9   42   42  

map-email-scraper

An open source tool for collating publicly available contact informat...

12   42   42  

movie-posters-convnet

Unsupervised clustering of movie posters with features extracted from...

2   41   41  

activesoup

A headless pure-python browser for the web

5   41   41  

socials

👨‍👩‍👦 Social account detection and extraction in Python, e.g. for craw...

8   41   41  

trex

YouTube & TikTok analysis + youchoose recommendation customizer. backen...

13   41   41  

myanimelist-data-set-creator

Collection of some simple Python scripts to create https://myanimelist...

6   40   40  

RARBG-scraper

With Selenium headless browsing and CAPTCHA solving

11   40   40  

info-bot

🤖 A Versatile Telegram Bot

15   40   40  

Google-Patents-Scraper

Automatically download all PDF files of search results & their pate...

15   40   40  

Captcha-Tools

All-in-one Python (And now Go!) module to help solve captchas with Cap...

4   40   40  

outscraper-python

The library provides convenient access to the Outscraper API from appl...

12   40   40  

TikDown

Fast TikTok NO Watermark Video Downloader (username or url)

10   40   40  

instagram-without-api

Simple PHP code to get unlimited Instagram public pictures by every...

7   40   40  

configs

Public, free to use, repository with diggers configs for scraping / ex...

12   39   39  

webmagician-ui

An admin UI project for a configurable web crawler platform

15   39   39  

html-table-to-json

Generate JSON representations of HTML tables

7   39   39  

local-api-client-csharp

This .NET Standard package provides convenient access to the Local API...

1   39   39  

shopify-spy

Extract structured data from Shopify websites.

18   39   39  

Deals-Scraper

Deals Scraper is a Canadian tool to find good deals on websites like F...

8   39   39  

Architeuthis

MITM HTTP(S) proxy with integrated load-balancing, rate-limiting and...

2   38   38  

CC_Scrapper

Telegram CC Scrapper - Debit/Credit Card [channel public or private /...

21   38   38  

tvseries

TV Series is a tool that scrapes episode synopses of popular TV Serie...

25   37   37  

beautifulsoup-tutorial

✨ 🍜 Scrape webpage metadata using BeautifulSoup.

16   37   37  

Pasta

A PasteBin scraper that doesn't rely on the PasteBin scrape API

4   37   37  

linkeBot

🔎 A web scraping bot that shows LinkedIn job openings

6   37   37  

freenom-auto-renew-domains

A scraper built with Puppeteer that auto-renews free domains on Freenom...

23   37   37  

chirps

Twitter bot powering @arichduvet

9   36   36  

freesoccer

⚽ Free API with results from national soccer competitions

8   36   36  

fulldom-server

Proxy-like server that will show you the DOM of a page after JS runs

6   36   36  

News_Summary

Dataset and scripts for scraping the news articles from popular source...

26   36   36  

lc-webscraping

Introduction to web scraping

27   36   36  

torchestrator

Spin up Tor containers and then proxy HTTP requests via these Tor inst...

4   36   36  

worldometer

Worldometer Scraping & API 🌎 Get world metrics from worldometers.info

4   36   36  

Thar

Mining all surnames used in Nepal.

7   35   35  

proxi

Proxy pool. Finds and checks proxies, with a REST API for querying result...

4   34   34  

google-scraper

This class can retrieve search results from Google.

27   33   33  

webradio-metadata

Collection of scraping recipes to get metadata about what is being str...

15   33   33  

poketo

Node library for scraping manga sites

6   33   33  

extract-social-media

Extract social media links and account names from websites.

15   33   33  

ioweb

Web Scraping Framework

11   33   33  

strigil

Strigil is an OSINT tool for collecting and aggregating social media d...

5   32   32  

facebook-discussion-tk

A collection of tools to (semi-)automatically collect and analyze data...

6   32   32  

scrapy-scrapingbee

JavaScript support and proxy rotation for Scrapy with ScrapingBee.

3   32   32  

pythonista-chromeless

Serverless Selenium that dynamically executes any given code.

10   32   32  

InstaBot

Simple and friendly Bot for Instagram, using Selenium and Scrapy with...

12   31   31  

node-red-contrib-nbrowser

Provides a virtual web browser (a.k.a. "headless browser") appearing a...

11   31   31