Most popular scraping repositories and open-source projects

api.consumet.org

A Modern Search Engine API for Anime, Movies/TVShows, Books, Light Nov...

130   473

search-engine-parser

Lightweight package to query popular search engines and scrape for res...

87   472

scrapfly-scrapers

Scalable Python web scraping scripts for 40+ popular domains

119   469

quetre

A libre front-end for Quora

32   457

jikan-rest

The REST API for Jikan

264   454

List-of-user-agents

List of major web + mobile browser user agent strings. +1 Bonus script...

222   441

dude

dude uncomplicated data extraction: A simple framework for writing web...

19   429

rnet

A blazing-fast Python HTTP Client with TLS fingerprint

36   427

HomeHarvest

Python package for scraping real estate property data

103   421

tinking

🧶 Extract data from any website without code, just clicks.

27   421

LinkedInDumper

Python 3 script to dump/scrape/extract company employees from LinkedIn...

45   420

GoogleBard

GoogleBard - A reverse engineered API for Google Bard chatbot for Node...

56   418

juriscraper

An API to scrape American court websites for metadata.

119   410

Ominis-OSINT

This Python application is an OSINT (Open Source Intelligence) tool ca...

31   396

lambdasoup

Functional HTML scraping and rewriting with CSS in OCaml

32   394

post-tuto-deployment

Build and deploy a machine learning app from scratch 🚀

102   391

reaper

Social media scraping / data collection tool for the Facebook, Twitter...

67   382

scrapy-zyte-smartproxy

Zyte Smart Proxy Manager (formerly Crawlera) middleware for Scrapy

89   364

coronadatascraper

COVID-19 Coronavirus data scraped from government and curated data sou...

177   364

crawler

Library for Rapid (Web) Crawler and Scraper Development

13   360

MetaDetective

🕵️ Unleash Metadata Intelligence with MetaDetective. Your Assistant Be...

35   358

web-scraping

More than 50 web scraping examples using: Requests | Scrapy | Sel...

206   353

ScrapySharp

Revival of https://bitbucket.org/rflechner/scrapysharp

76   352

Torrent-Api-py

An Unofficial API for 1337x, Piratebay, Nyaasi, Torlock, Torrent Galax...

238   350

ha-multiscrape

Home Assistant custom component for scraping (html, xml or json) multi...

16   340

libremdb

A free & open source IMDb front-end.

32   336

elixir-scrape

Scrape any website, article or RSS/Atom Feed with ease!

43   330

geeksforgeeks.pdf

Topic wise PDFs of Geeks for Geeks articles. (Last updated in October...

125   315

memorious

Lightweight web scraping toolkit for documents and structured data.

62   311

crawler

🕷️ An easy-to-use spider written in Golang. (Previously named GOPA.)

82   308

BotBrowser

Stealth browser with a modified Chromium core, bypassing Cloudflare, S...

45   308

nudecrawler

Crawl telegra.ph searching for nudes!

28   305

4cat

The 4CAT Capture and Analysis Toolkit provides modular data capture &...

62   298

Sasila

A flexible, friendly crawler framework

69   296

dendrite-python-sdk

Tools to build web AI agents that can authenticate, interact with and...

18   294

PulsarRPA

Automate webpages at scale, scrape web data completely and accurately...

59   287

Python-Web-Scraping-Tutorial

In this Python Web Scraping Tutorial, we will outline everything neede...

30   280

TheScrapper

Scrape emails, phone numbers and social media accounts from a website.

54   271

Instagram-Follower-Scraper

Scrapes all follower data for any Instagram account

55   266

antch

Antch, a fast, powerful and extensible web crawling & scraping framewo...

41   262

SerpScrap

SEO Python scraper to extract data from major search engine result page...

61   261

Musoq

Use SQL on various data sources

14   259

arachnid

Crawl all unique internal links found on a given website, and extract...

59   255

scan-for-webcams

scan for webcams on the internet

49   254

SpotAPI

A Python wrapper for the public & private Spotify API

10   251

spidex

Continuous reconnaissance network scanner designed for large-scale sca...

40   250

spatula

A modern Python library for writing maintainable web scrapers.

11   248

D4N155

OWASP D4N155 - Intelligent and dynamic wordlist using OSINT

48   247

scrapper

Web scraper with a simple REST API living in Docker and using a Headle...

37   243

web-scraper-chrome-extension

Web data extraction tool implemented as a Chrome extension

72   242