Most popular scraping repositories and open source projects

web-scraper-chrome-extension

Web data extraction tool implemented as a Chrome extension

72   242   242  

facebook-group-members-scraper

Facebook Group Members Extractor. Download Facebook group members in C...

61   241   241  

bbb-face-recognizer

Face recognition system using MTCNN, FaceNet, SVM and FastAPI to trac...

33   239   239  

linkedIn-scraper

A Playwright bot that scrapes LinkedIn and stores adv...

24   238   238  

jsoup-annotations

Jsoup Annotations POJO

18   237   237  

comcrawl

A Python utility for downloading Common Crawl data

42   237   237  

anime-dl

Anime-dl is a command-line program to download anime from Crunchyroll...

38   231   231  

goose-parser

Universal scraping tool, which allows you to extract data using multip...

13   228   228  

idt

Image Dataset Tool (idt) is a CLI tool designed to make the otherwise...

27   228   228  

educative.io-downloader

Free Palestine. 📖 This tool downloads courses from educative.io f...

128   222   222  

Humanoid

Node.js package to bypass Cloudflare's anti-bot JavaScript challenges

25   222   222  

transistor

Transistor, a Python web scraping framework for intelligent use cases.

20   214   214  

SouqScraper

Simple scripts to level up your scraping skills, and source code for...

168   213   213  

Grawler

Grawler is a tool written in PHP which comes with a web interface that...

55   213   213  

instatools

🧰 A collection of automation tools for Instagram 📱| Written in Pytho...

25   206   206  

facebook-data-extraction

Lessons learned on effectively fetching Facebook data by querying the Graph AP...

61   205   205  

Dorkify

Perform Google Dork search with Dorkify

37   201   201  

linkedin-learning-downloader

LinkedIn Learning video downloader

103   200   200  

blinkist-scraper

📚 Python tool to download book summaries and audio from Blinkist.com,...

36   200   200  

jsonframe-cheerio

Simple multi-level scraper with JSON input/output for Cheerio

24   199   199  

lazy-json-pages

📜 Framework-agnostic API scraper to load items from any paginated JSO...

2   194   194  

SpotiFile

Spotify scraper

21   187   187  

estela

estela, an elastic web scraping cluster 🕸

15   185   185  

Email-extractor

The main functionality is to extract all the emails from one or severa...

76   183   183  

UdemyCourseGrabber

Want to enroll in a Udemy course, but the money isn't there? Sear...

28   180   180  

DotnetCrawler

DotnetCrawler is a straightforward, lightweight web crawling/scraping...

66   176   176  

JsonGenius

Get structured JSON data from any page.

10   175   175  

Python-Selenium-Action

Run Selenium with Python via GitHub Actions using Headless or Non-Head...

45   175   175  

Gmail-Creation-Automation-Python

This script allows you to automate the creation of Gmail accounts usin...

52   174   174  

fantasy-basketball

Scraping statistics, predicting NBA player performance with neural ne...

49   172   172  

search-engine-google

🕷 Google client for SERPs

60   170   170  

awesome-python-primer

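An index of high-quality Chinese-language resources for self-taught Python beginners, including books / docs / videos, suitable for web scraping...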

25   169   169  

shadow-useragent

Pick the most common user-agents on the Internet 👻

12   168   168  

languagepod101-scraper

Python scraper for Language Pods such as Japanesepod101.com :japanese_...

26   167   167  

scrapers

Code relating to scraping public police data.

37   164   164  

xquery

Extract data or evaluate value from HTML/XML documents using XPath

27   158   158  

Leetcode-Questions-Scraper

Scrape algorithm questions from LeetCode and generate HTML and EPUB fi...

44   156   156  

tweetdrop

Generate dispersable airdrops from Twitter threads.

24   156   156  

WebScrapper

Powerful Telegram bot for web scraping and crawling. Fast, easy, and l...

94   156   156  

xword-dl

⬛⬜⬛ Command line tool to scrape crosswords from online solvers and...

33   156   156  

llm-reader

Turn Webpage to LLM friendly input text. Similar to Firecrawl and Jina...

14   152   152  

jimutmap

API to get an enormous number of high-resolution satellite images from sa...

18   152   152  

jazz

The Scripting Engine that Combines Speed, Safety, and Simplicity

10   147   147  

FredsRoadtripStoryteller

Hear local historical markers as you travel on your road-trip. 100% Sh...

14   146   146  

sasori

Sasori is a dynamic web crawler powered by Puppeteer, designed for lig...

16   144   144  

sqrape

Simple Query Scraping with CSS and Go Reflection (MOVED to GitLab)

7   143   143  

curl-post-requests

Learn how to send POST requests with cURL.

0   142   142  

Kemono-and-Coomer-Downloader

The Kemono and Coomer Downloader simplifies downloading posts from Kem...

23   141   141  

od-database

Distributed crawler, database and web frontend for public directories...

23   141   141  

double-agent

A test suite of common scraper detection techniques. See how detectabl...

10   140   140