Topic

scraping

Repositories (1626)

web-scraper-chrome-extension
web-scraper-chrome-extension ispras JavaScript

Web data extraction tool implemented as a Chrome extension

242
facebook-group-members-scraper
facebook-group-members-scraper floriandiud TypeScript

Facebook Group Members Extractor. Download Facebook group members in CSV.

241
instatools
instatools new92 Python

🧰 A collection of automation tools for Instagram 📱| Written in Python 🐍 | Don't forget to ⭐ the repo !

240
bbb-face-recognizer
bbb-face-recognizer rrazvd Python

Face recognition system using MTCNN, FaceNet, SVM, and FastAPI to track participants of Big Brother Brasil in real time.

239
linkedIn-scraper
linkedIn-scraper ManiMozaffar Python

A Playwright bot that scrapes LinkedIn and stores advertisement data in a database and a Telegram channel

238
jsoup-annotations
jsoup-annotations fcannizzaro Java

Jsoup Annotations POJO

237
comcrawl
comcrawl michaelharms Python

A Python utility for downloading Common Crawl data

237
anime-dl
anime-dl Xonshiz Python

Anime-dl is a command-line program to download anime from CrunchyRoll and Funimation.

231
goose-parser
goose-parser redco JavaScript

Universal scraping tool, which allows you to extract data using multiple environments

228
idt
idt deliton Python

Image Dataset Tool (idt) is a CLI tool designed to make the otherwise repetitive and slow task of creating image datasets into a fast and intuitive pr...

228
educative.io-downloader
educative.io-downloader shihabmridha TypeScript

Free Palestine. 📖 This tool downloads courses from educative.io for offline use. It uses your login credentials to download the course.

222
Humanoid
Humanoid evyatarmeged JavaScript

Node.js package to bypass Cloudflare's anti-bot JavaScript challenges

222
transistor
transistor bomquote Python

Transistor, a Python web scraping framework for intelligent use cases.

214
SouqScraper
SouqScraper enghamzasalem Python

Simple scripts to level up your scraping skills, plus source code for the Level UP playlist on YouTube

213
Grawler
Grawler A3h1nt PHP

Grawler is a tool written in PHP with a web interface that automates the task of using Google dorks, scrapes the results, and stores them...

213
facebook-data-extraction
facebook-data-extraction 18520339 Python

Experience in effectively fetching Facebook data by querying the Graph API with an account-based token and operating undetectable scraping bots to extract C...

205
Dorkify
Dorkify hhhrrrttt222111 Python

Perform Google dork searches with Dorkify

201
linkedin-learning-downloader
linkedin-learning-downloader liranbg Python

LinkedIn Learning video downloader

200
blinkist-scraper
blinkist-scraper leoncvlt Python

📚 Python tool to download book summaries and audio from Blinkist.com, and generate some pretty output

200
jsonframe-cheerio
jsonframe-cheerio gahabeen JavaScript

Simple multi-level scraper with JSON input/output for Cheerio

199
lazy-json-pages
lazy-json-pages cerbero90 PHP

📜 Framework-agnostic API scraper to load items from any paginated JSON API into a Laravel lazy collection via async HTTP requests.

194
SpotiFile
SpotiFile Michael-K-Stein Python

Spotify scraper

187
estela
estela bitmakerla TypeScript

estela, an elastic web scraping cluster 🕸

185
Email-extractor
Email-extractor DiegoCaraballo Python

The main functionality is to extract all the emails from one or several URLs.

183
UdemyCourseGrabber
UdemyCourseGrabber keethesh Python

The will to enroll in a Udemy course is there, but the money isn't? Search no more! This Python program searches for your desired course in more than [i...

180
Python-Selenium-Action
Python-Selenium-Action MarketingPipeline Python

Run Selenium with Python via GitHub Actions using headless or non-headless browsers!

178
DotnetCrawler
DotnetCrawler mehmetozkaya C#

DotnetCrawler is a straightforward, lightweight web crawling/scraping library for Entity Framework Core output based on .NET Core. This library des...

176
JsonGenius
JsonGenius semanser Go

Get structured JSON data from any page.

175
Gmail-Creation-Automation-Python
Gmail-Creation-Automation-Python khaouitiabdelhakim Python

This script allows you to automate the creation of Gmail accounts using the Selenium automation framework with the Chrome WebDriver. It navigates thro...

174
fantasy-basketball
fantasy-basketball KengoA Jupyter Notebook

Scraping statistics, predicting NBA player performance with neural networks and boosting algorithms, and optimising lineups for DraftKings with gene...

172
search-engine-google
search-engine-google serp-spider PHP

🕷 Google client for SERPs

170
awesome-python-primer
awesome-python-primer zkqiang Python

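An index of high-quality Chinese-language resources for self-taught Python beginners, including books, documentation, and videos, covering web crawling, web development, data analysis, and machine learning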

169
shadow-useragent
shadow-useragent lobstrio Python

Pick the most common user-agents on the Internet 👻

168
languagepod101-scraper
languagepod101-scraper nedlir Python

Python scraper for Language Pods such as Japanesepod101.com 👹 🗾 🍣 Compatible with Japanese, Chinese, French, German, Italian...

167
scrapers
scrapers Police-Data-Accessibility-Project Python

Code relating to scraping public police data.

164
xquery
xquery antchfx Go

Extract data or evaluate value from HTML/XML documents using XPath

158
tweetdrop
tweetdrop Anish-Agnihotri TypeScript

Generate dispersable airdrops from Twitter threads.

157
WebScrapper
WebScrapper nuhmanpk Python

Powerful Telegram bot for web scraping and crawling. Fast, easy, and loved by thousands!

156
Leetcode-Questions-Scraper
Leetcode-Questions-Scraper Bishalsarang Python

Scrape algorithm questions from LeetCode and generate HTML and EPUB files

156
xword-dl
xword-dl thisisparker Python

⬛⬜⬛ Command line tool to scrape crosswords from online solvers and save them as .puz files ⬛⬜⬛

156
jimutmap
jimutmap Jimut123 Python

API to quickly get an enormous amount of high-resolution satellite images from satellites.pro through multi-threading! Create your own map dataset. B...

152
llm-reader
llm-reader m92vyas Python

Turn a webpage into LLM-friendly input text. Similar to Firecrawl and Jina Reader API. Makes RAG, AI web scraping, and image and webpage link extraction easy.

152
jazz
jazz jazzdotdev Rust

The Scripting Engine that Combines Speed, Safety, and Simplicity

147
FredsRoadtripStoryteller
FredsRoadtripStoryteller realityexpander Kotlin

Hear local historical markers as you travel on your road trip. 100% shared Compose UI, Kotlin native cross-platform codebase. Includes CocoaPods, Goog...

146
sasori
sasori karthikuj JavaScript

Sasori is a dynamic web crawler powered by Puppeteer, designed for lightning-fast endpoint discovery.

144
sqrape
sqrape cathalgarvey Go

Simple Query Scraping with CSS and Go Reflection (MOVED to GitLab)

143
curl-post-requests
curl-post-requests oxylabs

Learn how to send POST requests with cURL.

142
od-database
od-database simon987 Python

Distributed crawler, database, and web frontend for indexing public directories

141
Kemono-and-Coomer-Downloader
Kemono-and-Coomer-Downloader e43b Python

The Kemono and Coomer Downloader simplifies downloading posts from Kemono and Coomer websites, allowing users to download individual or multiple posts...

141
ctenopharyngodon-idella
ctenopharyngodon-idella touero Java

Use MapReduce's Java interface to crawl data on Chinese universities in a distributed fashion and learn the basics of HDFS.

140