Topic

scraping

Repositories (1766)

scrapper amerkurev Python

Web scraper with a simple REST API that runs in Docker and uses a headless browser and Readability.js for parsing.

314
Gmail-Creation-Automation-Python khaouitiabdelhakim Python

This script allows you to automate the creation of Gmail accounts using the Selenium automation framework with the Chrome WebDriver. It navigates thro...

313
dendrite-python-sdk dendrite-systems Python

Tools to build web AI agents that can authenticate, interact with and extract data from any website.

309
rota alpkeskin Go

A high-performance proxy rotation engine with automated IP management and real-time health monitoring

308
facebook-group-members-scraper floriandiud TypeScript

Facebook Group Members Extractor. Download Facebook group members in CSV.

306
Python-Web-Scraping-Tutorial oxylabs Python

In this Python Web Scraping Tutorial, we will outline everything needed to get started with web scraping. We will begin with simple examples and move...

303
Sasila da2vin Python

A flexible, friendly crawler framework

297
instatools new92 Python

🧰 A collection of automation tools for Instagram 📱 | Written in Python 🐍 | Don't forget to ⭐ the repo!

289
Instagram-Follower-Scraper superryeti Python

Scrapes all follower data for any Instagram account

288
llm-reader m92vyas Python

Turn webpages into LLM-friendly input text. Similar to Firecrawl and Jina Reader API. Makes RAG, AI web scraping, and image & webpage link extraction easy.

288
PulsarRPA platonai Kotlin

Automate webpages at scale, scrape web data completely and accurately with high performance, distributed RPA.

287
web-scraper-chrome-extension ispras JavaScript

Web data extraction tool implemented as a Chrome extension

278
Antibot-Detector scrapfly JavaScript

Real-time detection of anti-bot systems, CAPTCHAs & fingerprinting techniques. Identifies Cloudflare, Akamai, DataDome, reCAPTCHA, hCaptcha, Shape Se...

277
SerpScrap ecoron Python

SEO Python scraper to extract data from major search engine result pages. Extract data like url, title, snippet, richsnippet and the type from searchre...

270
D4N155 OWASP Shell

OWASP D4N155 - Intelligent and dynamic wordlist using OSINT

269
scan-for-webcams JettChenT Python

Scan for webcams on the internet

268
linkedIn-scraper ManiMozaffar Python

A Playwright bot that scrapes LinkedIn and stores advertisement data in a database and a Telegram channel

267
antch antchfx Go

Antch, a fast, powerful and extensible web crawling & scraping framework for Go

265
ninjemail david96182 Python

Python library for automated email account creation. Create multiple accounts easily with support for major email providers.

263
kimi alechilczenko Python

Attack Surface Discovery tool built on a microservice approach, utilizing multi-threading for fast, internet-scale asset indexing

258
arachnid zrashwani PHP

Crawl all unique internal links found on a given website and extract SEO-related information. Supports JavaScript-based sites.

254
spatula jamesturk Python

A modern Python library for writing maintainable web scrapers.

250
Kemono-and-Coomer-Downloader e43b Python

The Kemono and Coomer Downloader simplifies downloading posts from Kemono and Coomer websites, allowing users to download individual or multiple posts...

246
bbb-face-recognizer rrazvd Python

Face recognition system using MTCNN, FaceNet, SVM and FastAPI to track participants of Big Brother Brasil in real time.

243
MinerU-HTML opendatalab Python

MinerU-HTML: An SLM-powered HTML main content extractor that outputs clean HTML bodies. Perfect for Deep Research Agents, RAG applications, and traini...

237
comcrawl michaelharms Python

A Python utility for downloading Common Crawl data

237
jsoup-annotations fcannizzaro Java

Jsoup Annotations POJO

236
Humanoid evyatarmeged JavaScript

Node.js package to bypass Cloudflare's anti-bot JavaScript challenges

236
anime-dl Xonshiz Python

Anime-dl is a command-line program to download anime from Crunchyroll and Funimation.

232
idt deliton Python

Image Dataset Tool (idt) is a CLI tool designed to turn the otherwise repetitive and slow task of creating image datasets into a fast and intuitive pr...

231
goose-parser redco JavaScript

Universal scraping tool, which allows you to extract data using multiple environments

229
educative.io-downloader shihabmridha TypeScript

Free Palestine. 📖 This tool downloads courses from educative.io for offline use. It uses your login credentials to download the course.

226
facebook-data-extraction 18520339 Python

Experience in effectively fetching Facebook data by querying the Graph API with an account-based token and operating undetectable scraping bots to extract C...

226
spidercreator carlosplanchon Python

Automated web scraping spider generation using Browser Use and LLMs. Streamline the creation of Playwright-based spiders with minimal manual coding. I...

217
SouqScraper enghamzasalem Python

Simple scripts to level up your scraping skills, plus source code for the Level UP playlist on YouTube

216
transistor bomquote Python

Transistor, a Python web scraping framework for intelligent use cases.

211
Grawler A3h1nt PHP

Grawler is a tool written in PHP with a web interface that automates the task of using Google dorks, scrapes the results, and stores them...

211
Dorkify hhhrrrttt222111 Python

Perform Google Dork search with Dorkify

208
blinkist-scraper leoncvlt Python

📚 Python tool to download book summaries and audio from Blinkist.com, and generate some pretty output

205
linkedin-learning-downloader liranbg Python

LinkedIn Learning video downloader

201
jsonframe-cheerio gahabeen JavaScript

Simple multi-level scraper with JSON input/output for Cheerio

198
vlrggapi axsddlr Python

An Unofficial REST API for vlr.gg, a site for Valorant Pro Esports match results and news.

197
Email-extractor DiegoCaraballo Python

The main functionality is to extract all the emails from one or several URLs.

196
xword-dl thisisparker Python

▞ Command line tool to scrape crosswords from online solvers and save them as .puz files ▚

196
estela bitmakerla TypeScript

estela, an elastic web scraping cluster 🕸

196
lazy-json-pages cerbero90 PHP

📜 Framework-agnostic API scraper to load items from any paginated JSON API into a Laravel lazy collection via async HTTP requests.

196
WebScrapper nuhmanpk Python

Powerful Telegram bot for web scraping and crawling. Fast, easy, and loved by thousands!

195
awesome-python-primer zkqiang Python

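An index of high-quality Chinese-language resources for teaching yourself Python from scratch, including books / documentation / videos, suited to web crawling / Web / data analysis / machine learning tracks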

192
examples hyperbrowserai Jupyter Notebook

Examples for using Hyperbrowser

191
SpotiFile Michael-K-Stein Python

Spotify scraper

190