Topic

scraping

Repositories (1626)

gogoanime-api
gogoanime-api riimuru JavaScript

Anime Streaming, Discovery API made with Cheerio and Express. Uses data from Gogoanime

498
Musoq
Musoq Puchaczov C#

SQL Syntax without any database

488
scrape-linkedin-selenium
scrape-linkedin-selenium austinoboyle HTML

`scrape_linkedin` is a Python package that lets you scrape personal LinkedIn profiles and company pages, turning the data into structured JSON.

483
jikan-rest
jikan-rest jikan-me PHP

The REST API for Jikan

482
search-engine-parser
search-engine-parser bisohns Python

Lightweight package to query popular search engines and scrape for result titles, links and descriptions

477
LinkedInDumper
LinkedInDumper l4rm4nd Python

Python 3 script to dump/scrape/extract company employees from LinkedIn API

477
quetre
quetre zyachel JavaScript

A libre front-end for Quora

457
List-of-user-agents
List-of-user-agents tamimibrahim17 Python

List of major web + mobile browser user agent strings. +1 Bonus script to scrape :)

441
dude
dude roniemartinez Python

dude uncomplicated data extraction: A simple framework for writing web scrapers using Python decorators

428
rnet
rnet 0x676e67 Rust

A blazing-fast Python HTTP Client with TLS fingerprint

427
tinking
tinking baptisteArno TypeScript

🧶 Extract data from any website without code, just clicks.

421
HomeHarvest
HomeHarvest ZacharyHampton Python

Python package for scraping real estate property data

421
GoogleBard
GoogleBard PawanOsman TypeScript

GoogleBard - A reverse engineered API for Google Bard chatbot for NodeJS

421
juriscraper
juriscraper freelawproject HTML

An API to scrape American court websites for metadata.

410
Ominis-OSINT
Ominis-OSINT AnonCatalyst Python

This Python application is an OSINT (Open Source Intelligence) tool called "Ominis OSINT - Web Hunter." It performs online information gathering by qu...

396
n8n-nodes-puppeteer
n8n-nodes-puppeteer drudge TypeScript

n8n node for browser automation using Puppeteer

395
lambdasoup
lambdasoup aantron OCaml

Functional HTML scraping and rewriting with CSS in OCaml

394
post-tuto-deployment
post-tuto-deployment MarwanDebbiche Python

Build and deploy a machine learning app from scratch 🚀

394
reaper
reaper ScriptSmith Python

Social media scraping / data collection tool for the Facebook, Twitter, Reddit, YouTube, Pinterest, and Tumblr APIs

389
ha-multiscrape
ha-multiscrape danieldotnl Python

Home Assistant custom component for scraping (html, xml or json) multiple values (from a single HTTP request) with a separate sensor/attribute for eac...

372
crawler
crawler crwlrsoft PHP

Library for Rapid (Web) Crawler and Scraper Development

366
scrapy-zyte-smartproxy
scrapy-zyte-smartproxy scrapy-plugins Python

Zyte Smart Proxy Manager (formerly Crawlera) middleware for Scrapy

364
coronadatascraper
coronadatascraper covidatlas HTML

COVID-19 Coronavirus data scraped from government and curated data sources.

364
MetaDetective
MetaDetective franckferman Python

🕵️ Unleash Metadata Intelligence with MetaDetective. Your Assistant Beyond Metagoofil.

358
web-scraping
web-scraping lkuffo Python

More than 50 web scraping examples using: Requests | Scrapy | Selenium | LXML | BeautifulSoup

353
ScrapySharp
ScrapySharp rflechner C#

A rebirth of https://bitbucket.org/rflechner/scrapysharp

352
Torrent-Api-py
Torrent-Api-py Ryuk-me Python

An unofficial API for 1337x, Piratebay, Nyaasi, Torlock, Torrent Galaxy, Zooqle, Kickass, Bitsearch, MagnetDL, Libgen, YTS, Limetorrent, TorrentFunk, G...

350
libremdb
libremdb zyachel TypeScript

A free & open source IMDb front-end.

336
elixir-scrape
elixir-scrape Anonyfox Elixir

Scrape any website, article or RSS/Atom Feed with ease!

330
geeksforgeeks.pdf
geeksforgeeks.pdf dufferzafar Python

Topic wise PDFs of Geeks for Geeks articles. (Last updated in October 2018)

315
memorious
memorious alephdata Python

Lightweight web scraping toolkit for documents and structured data.

313
crawler
crawler infinilabs Go

🕷️ An easy-to-use spider written in Golang. (Previously named GOPA.)

309
BotBrowser
BotBrowser MiddleSchoolStudent TypeScript

Stealth browser with a modified Chromium core, bypassing Cloudflare, Shape, PerimeterX, Datadome, Akamai, Kasada, hCaptcha, and reCAPTCHA reliably

308
nudecrawler
nudecrawler yaroslaff Python

Crawl telegra.ph searching for nudes!

305
4cat
4cat digitalmethodsinitiative Python

The 4CAT Capture and Analysis Toolkit provides modular data capture & analysis for a variety of social media platforms.

298
Sasila
Sasila da2vin Python

A flexible, friendly crawler framework

296
dendrite-python-sdk
dendrite-python-sdk dendrite-systems Python

Tools to build web AI agents that can authenticate, interact with and extract data from any website.

294
PulsarRPA
PulsarRPA platonai Kotlin

Automate webpages at scale, scrape web data completely and accurately with high-performance, distributed RPA.

287
Python-Web-Scraping-Tutorial
Python-Web-Scraping-Tutorial oxylabs Python

In this Python Web Scraping Tutorial, we will outline everything needed to get started with web scraping. We will begin with simple examples and move...

280
TheScrapper
TheScrapper champmq Python

Scrape emails, phone numbers and social media accounts from a website.

271
Instagram-Follower-Scraper
Instagram-Follower-Scraper superryeti Python

Scrapes all the data of followers of any Instagram account

266
scrapper
scrapper amerkurev Python

Web scraper with a simple REST API living in Docker and using a Headless browser and Readability.js for parsing.

266
antch
antch antchfx Go

Antch, a fast, powerful and extensible web crawling & scraping framework for Go

263
SerpScrap
SerpScrap ecoron Python

SEO Python scraper to extract data from major search engine result pages. Extract data like url, title, snippet, richsnippet and the type from searchre...

261
arachnid
arachnid zrashwani PHP

Crawl all unique internal links found on a given website and extract SEO-related information. Supports JavaScript-based sites.

255
scan-for-webcams
scan-for-webcams JettChenT Python

Scan for webcams on the internet

254
spidex
spidex alechilczenko Python

Continuous reconnaissance network scanner designed for large-scale scans, collecting information on all Internet assets.

254
SpotAPI
SpotAPI Aran404 Python

A Python wrapper for the public & private Spotify API

251
spatula
spatula jamesturk Python

A modern Python library for writing maintainable web scrapers.

248
D4N155
D4N155 OWASP Shell

OWASP D4N155 - Intelligent and dynamic wordlist using OSINT

247