Topic

scraping

Repositories (1766)

mtnt
mtnt pmichel31415 Python

Code for the collection and analysis of the MTNT dataset

56
sample-web-scraping-with-electron
sample-web-scraping-with-electron Tazeg JavaScript

Sample project for web scraping with Electron

56
actor-facebook-scraper
actor-facebook-scraper pocesar TypeScript

Scrape public Facebook pages, posts, reviews and comments

56
silkworm
silkworm BitingSnakes Python

Async web scraping framework on top of Rust. Works with Free-threaded Python (`PYTHON_GIL=0`).

56
Euro2016_TerminalApp
Euro2016_TerminalApp jctissier HTML

:soccer: Instantly find :trophy:EURO 2016 live-streams & highlights, now a Web App!

55
scraper-fourone-jobs
scraper-fourone-jobs kokokuo Python

This is an anti-scraping cracker for extracting application information from one of Taiwan's job recruiting websites.

55
ogpParser
ogpParser ukyoda TypeScript

Open Graph Protocol Parser for Node.js

55
hext
hext html-extract C++

Domain-specific language for extracting structured data from HTML documents

55
Junior_Zone
Junior_Zone Moscarde Python

Junior job openings updated daily. Telegram and online spreadsheet

55
aniyoi-api
aniyoi-api miukyo TypeScript

REST API for anime with Indonesian subtitles | Streaming anime with Indonesian subs

55
onlyfans-scraper
onlyfans-scraper kr4ude Python

A tool that allows you to scrape media from any OnlyFans account and more

54
UltraStar-CLI
UltraStar-CLI martiinii TypeScript

Download any song from the biggest database of UltraStar songs for your karaoke party!

54
firecrawl-quickstarts
firecrawl-quickstarts alexfazio Jupyter Notebook

A collection of cookbooks to help developers get started quickly with the Firecrawl API.

54
diffbot-php-client
diffbot-php-client Swader PHP

[Deprecated - Maintenance mode - use APIs directly please!] The official Diffbot client library

53
learn.scrapinghub.com
learn.scrapinghub.com scrapinghub CSS

Scrapinghub Learning Center. Report issues in Jira: https://scrapinghub.atlassian.net/projects/WEB

53
dart-scraper
dart-scraper josw123 Vue

A program for extracting corporate financial statements using the DART system operated by South Korea's Financial Supervisory Service

53
AI-Cursor-Scraping-Assistant
AI-Cursor-Scraping-Assistant TheWebScrapingClub Python

A powerful tool that leverages Cursor AI and MCP (Model Context Protocol) to easily generate web scrapers for various types of websites.

53
tiktok-comment-scrapper
tiktok-comment-scrapper RomySaputraSihananda Python

Get all comments from a TikTok video URL or ID

53
socials
socials lorey Python

👨‍👩‍👦 Python library and CLI to turn URLs into structured social media profiles.

53
trex
trex tracking-exposed HTML

YouTube & TikTok analysis + youchoose recommendation customizer. Backend, extensions, and tooling

53
getter
getter kastaid Python

A powerful and customizable Telegram userbot built with Telethon

53
python-scrapfly
python-scrapfly scrapfly Python

Scrapfly Python SDK for headless browsers and proxy rotation

53
CraigslistScraper
CraigslistScraper ryanirl Python

Simple webscraper for Craigslist.

53
ScarperApi
ScarperApi Anshu78780 TypeScript

It's a Scarper API that gives you direct movie data on your local machine without needing to watch any ads. It now supports netmirror.

52
scrapers
scrapers montoyamoraga Python

scrapers for building your own image databases

52
Filmweb2Letterboxd
Filmweb2Letterboxd JSerwatka JavaScript

Export ratings from Filmweb to a CSV file in the format accepted by the Letterboxd importer

52
torrent-tracker-scraper
torrent-tracker-scraper project-mk-ultra Python

A UDP torrent tracker scraper library written in Python 3

52
thecrowler
thecrowler pzaino Go

A Content Discovery and Development Platform. Empowering Cybersecurity, AI, Marketing, and Finance professionals and researchers to discover, analyze,...

52
react-node-web-scraper
react-node-web-scraper codegratia JavaScript

Final-year project: scrapes data from e-commerce stores and displays it in a ReactJS app.

52
garlic
garlic velocitatem JavaScript

🧄🧛 protect your website from being scraped by bots.

52
beautifulsoup-tutorial
beautifulsoup-tutorial hackersandslackers Python

:sparkles: :ramen: Scrape webpage metadata using BeautifulSoup.

51
hyper-sdk-js
hyper-sdk-js Hyper-Solutions TypeScript

JavaScript / TypeScript SDK for Bot Protection Bypass - Automate Akamai, Incapsula, Kasada, and DataDome. No browsers required. Solve challenges and g...

51
tiktok-trending-data-api
tiktok-trending-data-api ogohogo JavaScript

Scraping the TikTok Discovery Data API every hour using GitHub Actions to view changes

51
CaseHarvester
CaseHarvester dismantl Python

AWS-based application for scraping the Maryland Judiciary Case Search

51
Scraping-Dynamic-JavaScript-Ajax-Websites-With-BeautifulSoup
Scraping-Dynamic-JavaScript-Ajax-Websites-With-BeautifulSoup oxylabs Python

A guide on how to scrape JavaScript rendered websites with Python and BeautifulSoup.

51
rebrowser-playwright
rebrowser-playwright rebrowser

A drop-in replacement for playwright patched with rebrowser-patches. It allows you to pass modern automation detection tests.

51
hyper-sdk-go
hyper-sdk-go Hyper-Solutions Go

Go SDK for Bot Protection Bypass - Automate Akamai, Incapsula, Kasada, and DataDome. No browsers required. Solve challenges and generate valid sensors...

50
configs
configs Diggernaut

Public, free-to-use repository of digger configs for scraping / extracting data from various e-commerce websites and online stores

50
ai_papers_scrapper
ai_papers_scrapper george-gca Python

Download paper PDFs and other info from the main AI conferences

50
News_Summary
News_Summary sunnysai12345 Jupyter Notebook

Dataset and scripts for scraping the news articles from popular sources along with the summary of the article.

50
dilbert-viewer
dilbert-viewer rharish101 Rust

A simple comic viewer for Dilbert by Scott Adams

49
freenom-auto-renew-domains
freenom-auto-renew-domains Sorok-Dva TypeScript

A scraper built with puppeteer that auto-renews free domains on Freenom and sends a Discord message using a bot

48
local-api-client-python
local-api-client-python kameleo-io Python

Official Python library for interacting with Kameleo Client

48
AI_Manga_Reader
AI_Manga_Reader AI-Manga-Readers JavaScript

AI Manga Reader is a next-gen manga app powered by the MangaDex API, offering vast multi-language content and flexible reading modes. It uses AI-power...

48
instagram-without-api
instagram-without-api orsifrancesco PHP

Simple PHP code to get unlimited public Instagram pictures from any user without the API and without credentials.

48
youtube-comment-scraper
youtube-comment-scraper ahmedshahriar Jupyter Notebook

This script dumps YouTube video comments to a CSV from YouTube video links. Video links can be placed inside a variable, a list, or a CSV file.

48
UpworkScraper
UpworkScraper roperi Python

UpworkScraper allows you to scrape your best-match job postings from Upwork.

47
DeepSearchJobs
DeepSearchJobs wakil69 Python

DeepSearchJobs is a job-discovery engine that uncovers hidden, niche, and low-competition opportunities not found on major platforms. It uses smart sc...

47
flutter_notification_listener
flutter_notification_listener jiusanzhou Kotlin

Flutter plugin to listen for and interact with all incoming notifications for Android. A plugin that listens for phone notifications.

47
puppeteer-humanize
puppeteer-humanize force-adverse TypeScript

🕺 Humanizer functions for Puppeteer

47