Topic

scraping

Repositories (1766)

social-media-profiles-regexs
social-media-profiles-regexs lorey Python

📇 Extract social media profiles and more with regular expressions

645
socialreaper
socialreaper ScriptSmith Python

Social media scraping / data collection library for Facebook, Twitter, Reddit, YouTube, Pinterest, and Tumblr APIs

642
reverse-api-engineer
reverse-api-engineer kalil0321 Python

Claude engineer that captures traffic, writes documentation and automatically generates API clients. Reverse engineer APIs!

641
pricewise
pricewise adrianhajdin TypeScript

Dive into web scraping and build a Next.js 13 eCommerce price tracker within a single video that teaches you data scraping, cron jobs, sending emails,...

641
docker-selenium-lambda
docker-selenium-lambda umihico Dockerfile

The simplest demo of Chrome automation with Python and Selenium in AWS Lambda

621
newcrawler
newcrawler speed JavaScript

Free Web Scraping Tool with Java

587
PHPScraper
PHPScraper spekulatius PHP

A universal web-util for PHP.

586
LinkedInDumper
LinkedInDumper l4rm4nd Python

Python 3 script to dump/scrape/extract company employees from the LinkedIn API

584
n8n-nodes-puppeteer
n8n-nodes-puppeteer drudge TypeScript

n8n node for browser automation using Puppeteer

569
juriscraper
juriscraper freelawproject HTML

An API to scrape American court websites for metadata.

568
Ominis-OSINT
Ominis-OSINT AnonCatalyst Python

This Python application is an OSINT (Open Source Intelligence) tool called "Ominis OSINT - Web Hunter." It performs online information gathering by qu...

560
spidermon
spidermon scrapinghub Python

Scrapy extension for monitoring spider execution.

554
jekyll
jekyll programminghistorian HTML

Jekyll-based static site for The Programming Historian

544
facebook_data_analyzer
facebook_data_analyzer Lackoftactics Ruby

Analyze your copy of your Facebook data with Ruby. Download the zip file from Facebook and get info about friends ranking by message, vocabulary, con...

542
scrape-linkedin-selenium
scrape-linkedin-selenium austinoboyle HTML

`scrape_linkedin` is a python package that allows you to scrape personal LinkedIn profiles & company pages - turning the data into structured json.

526
jikan-rest
jikan-rest jikan-me PHP

The REST API for Jikan

521
quick-start-guide
quick-start-guide oxylabs

Python quick start guides to get the most out of Oxylabs' Web Scraper API free trial.

516
Musoq
Musoq Puchaczov C#

SQL Runtime without any database

503
scrapple
scrapple AlexMathew Python

A framework for creating semi-automatic web content extractors

502
gogoanime-api
gogoanime-api riimuru JavaScript

Anime Streaming, Discovery API made with Cheerio and Express. Uses data from Gogoanime

498
nickjs
nickjs phantombuster JavaScript

Web scraping library made by the Phantombuster team. Modern, simple & works on all websites. (Deprecated)

497
quetre
quetre zyachel JavaScript

A libre front-end for Quora

496
search-engine-parser
search-engine-parser bisohns Python

Lightweight package to query popular search engines and scrape for result titles, links and descriptions

487
reddit-universal-scraper
reddit-universal-scraper ksanjeev284 Python

Universal Reddit Scraper - Works on any Subreddit or User

479
Kemono-Downloader
Kemono-Downloader Yuvi9587 Python

Kemono Downloader is a fast, powerful PyQt5 app for archiving content from a wide array of sites, including Kemono, Coomer, Bunkr, Erome, Saint2.su, n...

461
List-of-user-agents
List-of-user-agents tamimibrahim17 Python

List of major web + mobile browser user agent strings. +1 Bonus script to scrape :)

461
MetaDetective
MetaDetective franckferman Python

Unleash Metadata Intelligence with MetaDetective. Your Assistant Beyond Metagoofil.

457
ha-multiscrape
ha-multiscrape danieldotnl Python

Home Assistant custom component for scraping (html, xml or json) multiple values (from a single HTTP request) with a separate sensor/attribute for eac...

428
tinking
tinking baptisteArno TypeScript

🧶 Extract data from any website without code, just clicks.

426
dude
dude roniemartinez Python

dude uncomplicated data extraction: A simple framework for writing web scrapers using Python decorators

425
libremdb
libremdb zyachel TypeScript

A free & open source IMDb front-end.

425
GoogleBard
GoogleBard PawanOsman TypeScript

GoogleBard - A reverse engineered API for Google Bard chatbot for NodeJS

419
SpotAPI
SpotAPI Aran404 Python

A python wrapper for the public & private Spotify API

415
lambdasoup
lambdasoup aantron OCaml

Functional HTML scraping and rewriting with CSS in OCaml

406
Torrent-Api-py
Torrent-Api-py Ryuk-me Python

An Unofficial API for 1337x, Piratebay, Nyaasi, Torlock, Torrent Galaxy, Zooqle, Kickass, Bitsearch, MagnetDL,Libgen, YTS, Limetorrent, TorrentFunk, G...

401
reaper
reaper ScriptSmith Python

Social media scraping / data collection tool for the Facebook, Twitter, Reddit, YouTube, Pinterest, and Tumblr APIs

393
post-tuto-deployment
post-tuto-deployment MarwanDebbiche Python

Build and deploy a machine learning app from scratch 🚀

392
scraperai
scraperai scraperai HTML

ScraperAI is an open-source, AI-powered tool designed to simplify web scraping for users of all skill levels.

390
web-scraping
web-scraping lkuffo Python

More than 50 web scraping examples using: Requests | Scrapy | Selenium | LXML | BeautifulSoup

390
4cat
4cat digitalmethodsinitiative Python

The 4CAT Capture and Analysis Toolkit provides modular data capture & analysis for a variety of social media platforms.

382
crawler
crawler crwlrsoft PHP

Library for Rapid (Web) Crawler and Scraper Development

369
scrapy-zyte-smartproxy
scrapy-zyte-smartproxy scrapy-plugins Python

Zyte Smart Proxy Manager (formerly Crawlera) middleware for Scrapy

365
coronadatascraper
coronadatascraper covidatlas HTML

COVID-19 Coronavirus data scraped from government and curated data sources.

364
nudecrawler
nudecrawler yaroslaff Python

Crawl telegra.ph searching for nudes!

360
ScrapySharp
ScrapySharp rflechner C#

Revival of https://bitbucket.org/rflechner/scrapysharp

354
elixir-scrape
elixir-scrape Anonyfox Elixir

Scrape any website, article or RSS/Atom Feed with ease!

336
TheScrapper
TheScrapper champmq Python

Scrape emails, phone numbers and social media accounts from a website.

332
geeksforgeeks.pdf
geeksforgeeks.pdf dufferzafar Python

Topic wise PDFs of Geeks for Geeks articles. (Last updated in October 2018)

315
memorious
memorious alephdata Python

Lightweight web scraping toolkit for documents and structured data.

315
scrapper
scrapper amerkurev Python

Web scraper with a simple REST API living in Docker and using a Headless browser and Readability.js for parsing.

314