Topic

scraping

Repositories (1766)

Python-Selenium-Action
Python-Selenium-Action MarketingPipeline Python

Run Selenium with Python via GitHub Actions using headless or non-headless browsers!

187
Threat-Actor-Usernames-Scrape
Threat-Actor-Usernames-Scrape spmedia

A collection of intel and usernames scraped from various cybercrime sources & forums. DarkForums, HackForums, Patched, Cracked, BreachForums, OGUser,...

187
UdemyCourseGrabber
UdemyCourseGrabber keethesh Python

Your will to enroll in a Udemy course is here, but the money isn't? Search no more! This Python program searches for your desired course in more than [i...

186
DotnetCrawler
DotnetCrawler mehmetozkaya C#

DotnetCrawler is a straightforward, lightweight web crawling/scraping library for Entity Framework Core output based on dotnet core. This library des...

181
Web-Data-Scraper
Web-Data-Scraper umbrellaDocumentation JavaScript

Web Data Scraper - no-code internet scraping. Extract and export to CSV, Excel, JSON, Google Sheets, and Webhook.

180
local-deepsearch-academic
local-deepsearch-academic iblameandrew Python

An implementation of Google Deep Search 🕵️ with support for 1000+ references, local inference, chatting with your scraping session using RAPTOR, and...

179
JsonGenius
JsonGenius semanser Go

Get structured JSON data from any page.

176
instagram-media-scraper
instagram-media-scraper ahmedrangel JavaScript

Simple Node.js code to get public information and media from any Instagram post or reel URL without an API. Working 2025

176
shadow-useragent
shadow-useragent lobstrio Python

Pick the most common user-agents on the Internet 👻

173
fantasy-basketball
fantasy-basketball KengoA Jupyter Notebook

Scraping statistics, predicting NBA player performance with neural networks and boosting algorithms, and optimising lineups for Draft Kings with gene...

172
languagepod101-scraper
languagepod101-scraper nedlir Python

Python scraper for Language Pods such as Japanesepod101.com 👹 🗾 🍣 Compatible with Japanese, Chinese, French, German, Italian...

170
scrapers
scrapers Police-Data-Accessibility-Project Python

Code relating to scraping public police data.

170
search-engine-google
search-engine-google serp-spider PHP

🕷️ Google client for SERPs

168
tiktok-trending-data
tiktok-trending-data antiops

Scraping the TikTok discovery web API every 15 minutes using GitHub Actions to view changes

167
apify-sdk-python
apify-sdk-python apify Python

The Apify SDK for Python is the official library for creating Apify Actors in Python. It provides useful features like actor lifecycle management, loc...

166
FredsRoadtripStoryteller
FredsRoadtripStoryteller realityexpander Kotlin

Hear local historical markers as you travel on your road-trip. 100% Shared Compose UI, Kotlin native cross-platform codebase. Includes Cocoapods, Goog...

166
Leetcode-Questions-Scraper
Leetcode-Questions-Scraper Bishalsarang Python

Scrape Algorithm Questions from leetcode and generate html and epub file

163
agentql-mcp
agentql-mcp tinyfish-io Shell

Model Context Protocol server that integrates AgentQL's data extraction capabilities.

161
xquery
xquery antchfx Go

Extract data or evaluate value from HTML/XML documents using XPath

156
tweetdrop
tweetdrop Anish-Agnihotri TypeScript

Generate dispersable airdrops from Twitter threads.

156
jimutmap
jimutmap Jimut123 Python

API to get an enormous number of high-resolution satellite images from satellites.pro quickly through multi-threading! Create your own map dataset. B...

151
go-crawler
go-crawler lizongying Go

A web crawling framework implemented in Golang that is simple to write and delivers powerful performance. It comes with a wide range of practical middl...

150
jazz
jazz jazzdotdev Rust

The Scripting Engine that Combines Speed, Safety, and Simplicity

148
decipher-research-agent
decipher-research-agent mtwn105 TypeScript

Turn topics, links, and files into AI-generated research notebooks — summarize, explore, and ask anything.

148
rebrowser-bot-detector
rebrowser-bot-detector rebrowser JavaScript

Modern tests to detect automated browser behavior. Covers the most important leaks from Puppeteer and Playwright.

148
sasori
sasori karthikuj JavaScript

Sasori is a dynamic web crawler powered by Puppeteer, designed for lightning-fast endpoint discovery.

146
GoodreadsScraper
GoodreadsScraper havanagrawal Python

Scrape data from Goodreads using Scrapy and Selenium 📚

145
rayobrowse
rayobrowse rayobyte-data Python

Stealth Chromium browser for large-scale web scraping.

144
sqrape
sqrape cathalgarvey Go

Simple Query Scraping with CSS and Go Reflection (MOVED to Gitlab)

143
od-database
od-database simon987 Python

Distributed crawler, database and web frontend for public directories indexing

143
proxifier
proxifier rookmoot Go

A fast, modern and intelligent proxy rotator perfect for crawling and scraping public data.

143
Movies-and-Series-Scraper
Movies-and-Series-Scraper yousefkotp Python

A console application to scrape valid watch links for any movie or series by exact season and episode number; you can also download a whole sea...

143
arxiv-miner
arxiv-miner valayDave Python

arxiv_miner is a toolkit for mining research papers on CS ArXiv.

141
html2rss
html2rss html2rss Ruby

📰 Build RSS 2.0 feeds from websites (and JSON APIs) automatically or with a few CSS selectors.

141
double-agent
double-agent unblocked-web TypeScript

A test suite of common scraper detection techniques. See how detectable your scraper stack is.

139
nimquery
nimquery GULPF Nim

Nim library for querying HTML using CSS selectors (like JavaScript's document.querySelector)

138
lambda-scraper
lambda-scraper teticio JavaScript

Use AWS Lambda functions as a proxy pool to scrape web pages.

138
Upwork-AI-jobs-applier
Upwork-AI-jobs-applier kaymen99 Python

AI tool for automating Upwork job applications using AI agents to find and qualify jobs, write personalized cover letters, and prepare for interviews...

136
instagram-users-scraper
instagram-users-scraper floriandiud TypeScript

Instagram Scraper. Scrape Instagram followers, following list, and post authors. Download CSV files with Instagram users from followers, following, ta...

136
curl-post-requests
curl-post-requests oxylabs

Learn how to send POST requests with cURL.

136
wget-lua
wget-lua ArchiveTeam C

Wget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC compression and URL-agnostic deduplication.

135
WebReaper
WebReaper pavlovtech C#

Web scraper, crawler and parser in C#. Designed as simple, declarative and scalable web scraping solution.

135
ctenopharyngodon-idella
ctenopharyngodon-idella touero Java

Use MapReduce's Java interface to crawl data on Chinese universities in a distributed fashion and learn the basics of HDFS.

134
scrapy-scrapingbee
scrapy-scrapingbee ScrapingBee Python

JavaScript support and proxy rotation for Scrapy with ScrapingBee.

131
nintendeals
nintendeals fedecalendino Python

Library with a set of tools for scraping information about Nintendo games and its prices across all regions (NA, EU and JP).

129
pastepwn
pastepwn d-Rickyy-b Python

Python framework to scrape Pastebin pastes and analyze them

129
MachineLearning
MachineLearning yug95 Jupyter Notebook

Machine learning for beginners (data science enthusiasts)

129
seleniumcrawler
seleniumcrawler voliveirajr Python

An example using Selenium WebDriver for Python and the Scrapy framework to create a web scraper that crawls an ASP site

128
Instagram-to-discord
Instagram-to-discord fernandod1 Python

Monitor an Instagram user account and automatically post new images to a Discord channel via a webhook. Working 2022!

128
robox
robox danclaudiupop Python

Simple library for exploring/scraping the web or testing a website you’re developing

128