Most popular scraping repositories and open source projects

SerpScrap

SEO Python scraper to extract data from major search engine result page...

59   219   219  

spidex

A little internet crawler

41   216   216  

spatula

A modern Python library for writing maintainable web scrapers.

6   213   213  

transistor

Transistor, a Python web scraping framework for intelligent use cases.

23   212   212  

mov-cli

A cli tool to browse and watch Movies/Shows/TV/Sports.

24   210   210  

crawler

Library for Rapid (Web) Crawler and Scraper Development

6   207   207  

jsonframe-cheerio

Simple multi-level scraper with JSON input/output for Cheerio

25   199   199  

linkedin-learning-downloader

LinkedIn Learning video downloader

110   199   199  

SouqScraper

Simple scripts to level up your scraping skills, and source code for...

166   190   190  

scan-for-webcams

Scan for webcams on the internet

39   189   189  

Grawler

Grawler is a tool written in PHP which comes with a web interface that...

57   188   188  

web-scraping

More than 50 web scraping examples using: Requests | Scrapy | Selenium...

129   185   185  

educative.io-downloader

📖 This tool downloads courses from educative.io for offline usage....

126   184   184  

Dorkify

Perform Google Dork search with Dorkify

34   176   176  

fantasy-basketball

Scraping statistics, predicting NBA player performance with neural ne...

49   172   172  

UdemyCourseGrabber

Your will to enroll in a Udemy course is here, but the money isn't? Sear...

28   171   171  

Instagram-Follower-Scraper

Scrapes all the follower data of any Instagram account

43   171   171  

libremdb

A free & open source IMDb front-end.

22   171   171  

shadow-useragent

Pick the most common user-agents on the Internet 👻

11   165   165  

blinkist-scraper

📚 Python tool to download book summaries and audio from Blinkist.com,...

30   159   159  

SpotiFile

Spotify scraper

16   159   159  

search-engine-google

🕷️ Google client for SERPs

61   157   157  

4cat

The 4CAT Capture and Analysis Toolkit provides modular data capture &...

43   157   157  

xquery

Extract data or evaluate value from HTML/XML documents using XPath

28   156   156  

Humanoid

Node.js package to bypass CloudFlare's anti-bot JavaScript challenges

20   155   155  

DotnetCrawler

DotnetCrawler is a straightforward, lightweight web crawling/scraping...

54   153   153  

comcrawl

A Python utility for downloading Common Crawl data

29   151   151  

Torrent-Api-py

An Unofficial API for 1337x, Piratebay, Nyaasi, Torlock, Torrent Galax...

211   151   151  

tweetdrop

Generate dispersable airdrops from Twitter threads.

21   150   150  

D4N155

OWASP D4N155 - Intelligent and dynamic wordlist using OSINT

44   148   148  

sqrape

Simple Query Scraping with CSS and Go Reflection (MOVED to GitLab)

7   143   143  

PDAP-Scrapers

Code relating to scraping public police data.

30   143   143  

Email-extractor

The main functionality is to extract all the emails from one or severa...

67   139   139  

ha-multiscrape

Home Assistant custom component for scraping (html, xml or json) multi...

11   139   139  

jazz

The Scripting Engine that Combines Speed, Safety, and Simplicity

12   138   138  

languagepod101-scraper

Python scraper for Language Pods such as Japanesepod101.com :japanese_...

26   138   138  

Python-Selenium-Action

Run Selenium with Python via GitHub Actions using Headless or Non-Head...

30   136   136  

od-database

Distributed crawler, database and web frontend for public directories...

24   134   134  

double-agent

A test suite of common scraper detection techniques. See how detectabl...

9   127   127  

htmlSQL

htmlSQL is an experimental PHP library that allows you to access HTML...

45   126   126  

seleniumcrawler

An example using Selenium webdrivers for Python and the Scrapy framework t...

46   125   125  

web-scraper-chrome-extension

Web data extraction tool implemented as a Chrome extension

46   124   124  

TheScrapper

Scrape emails, phone numbers and social media accounts from a website.

30   123   123  

Instagram-to-discord

Monitor an Instagram user account and automatically post new images to di...

58   121   121  

nimquery

Nim library for querying HTML using CSS selectors (like JavaScript's do...

8   119   119  

nintendeals

Library with a set of tools for scraping information about Nintendo ga...

14   118   118  

pastepwn

Python framework to scrape Pastebin pastes and analyze them

61   110   110  

estela

estela, an elastic web scraping cluster 🕸

5   110   110  

scrapy-puppeteer

Scrapy + Puppeteer

28   109   109  

rs-bed-covid-indo-api

API for the availability of hospitals and hospital beds for patients...

22   104   104