Most popular scraping repositories and open source projects

garlic velocitatem JavaScript

🧄🧛 protect your website from being scraped by bots.

52 0 2

beautifulsoup-tutorial hackersandslackers Python

✨ 🍜 Scrape webpage metadata using BeautifulSoup.

51 17 3

react-node-web-scraper codegratia JavaScript

Final-year project that scrapes data from e-commerce stores and displays it in a ReactJS app.

51 24 1

Scraping-Dynamic-JavaScript-Ajax-Websites-With-BeautifulSoup oxylabs Python

A guide on how to scrape JavaScript-rendered websites with Python and BeautifulSoup.

51 8 0

CaseHarvester dismantl Python

AWS-based application for scraping the Maryland Judiciary Case Search

51 12 1

tiktok-trending-data-api ogohogo JavaScript

Scrapes the TikTok Discovery Data API every hour using GitHub Actions to track changes

51 7 1

configs Diggernaut

Public, free-to-use repository of digger configs for scraping and extracting data from various e-commerce websites and online stores

50 16 8

consentcrawl dumkydewilde Python

Automatically check for GDPR/CCPA and cookie consent by running a Playwright headless browser to check for marketing and analytics scripts firing befo...

50 9 3

dilbert-viewer rharish101 Rust

A simple comic viewer for Dilbert by Scott Adams

50 5 2

instagram-without-api orsifrancesco PHP

Simple PHP code to fetch unlimited public Instagram pictures from any user, without the API and without credentials.

50 13 3

News_Summary sunnysai12345 Jupyter Notebook

Dataset and scripts for scraping news articles from popular sources, along with a summary of each article.

49 28 2

AngleParse kamome283 C#

HTML parsing and processing tool for PowerShell.

49 6 1

clearcote-browser clearcotelabs Python

Open-source stealth Chromium 149 with engine-level fingerprint spoofing - de-Googled, drop-in Playwright, fully buildable and verifiable from source.

48 4 0

puppeteer-humanize force-adverse TypeScript

🕺 Humanizer functions for Puppeteer

48 9 3

local-api-client-python kameleo-io Python

Official Python library for interacting with Kameleo Client

48 6 48

freenom-auto-renew-domains Sorok-Dva TypeScript

A scraper built with Puppeteer that auto-renews free domains on Freenom and sends a Discord message via a bot

48 17 3

flutter_notification_listener jiusanzhou Kotlin

Listen for and interact with Android notifications from Flutter.

48 64 2

DeepSearchJobs wakil69 Python

DeepSearchJobs is a job-discovery engine that uncovers hidden, niche, and low-competition opportunities not found on major platforms. It uses smart sc...

47 2 0

python-vistopia chazeon Python

Python client / downloader for 看理想 (Vistopia); downloads its audio and transcripts

47 14 2

youtube-comment-scraper ahmedshahriar Jupyter Notebook

This script dumps YouTube video comments to a CSV from YouTube video links. Video links can be supplied in a variable, a list, or a CSV.

47 16 1

dom-content-extraction oiwn Rust

DOM Based Content Extraction via Text Density

46 2 1

scaling-to-distributed-crawling ZenRows HTML

Repository with the final code for the blog post "Mastering Web Scraping in Python: Scaling to Distributed Crawling".

46 8 4

jimov_api koikiss-dev TypeScript

This project is an open-source API for retrieving multimedia content such as anime, movies and series, news, and manga in both Spanish and English.

46 24 3

github-trending-cli psalias2006 Python

A simple CLI tool to browse GitHub's trending repositories from your terminal.

46 1 1

oversmash filp TypeScript

Overwatch API library for player details and career stats

45 7 5

jason-the-miner mawrkus JavaScript

⛏ A versatile Web scraper for Node.js

45 11 6

go-ps4 lucasepe Go

Search for your favorite PS4 games on the PlayStation Store from the command line

45 6 1

image-collector x-sk217 Python

Download images from Google Image Search

45 23 1

nothing-browser BunElysiaReact C++

Does nothing... except everything that matters.

45 4 0

torchestrator lspahija Kotlin

Spin up Tor containers and then proxy HTTP requests via these Tor instances

45 8 4

info-bot irevenko Python

🤖 A Versatile Telegram Bot

45 13 4

permaculture jwnigel Python

Permaculture design app built on scraped plant databases. Drag-n-drop GUI with detailed design plan generator.

45 6 6

async-pubmed-scraper IliaZenkov Python

PubMed scraper for async search on a list of keywords and concurrent extraction of all found URLs, returning a DataFrame/CSV containing all article da...

45 17 2