Most popular scraping repositories and open source projects

goGetJS

A tool for extracting, searching, and saving JavaScript files (with op...

8   42   42  

webmagician-ui

An admin UI project for a configurable web crawler platform

14   42   42  

myanimelist-data-set-creator

Collection of some simple python scripts to create https://myanimelist...

6   42   42  

Upwork-AI-jobs-applier

AI tool for automating Upwork job applications using AI agents to find...

18   42   42  

webtranspose

Web scraping API for building AI applications.

2   41   41  

python-scrapfly

Scrapfly Python SDK for headless browsers and proxy rotation

11   41   41  

TikDown

Fast TikTok NO Watermark Video Downloader (username or url)

12   41   41  

noscrape

This repository is deprecated

8   41   41  

movie-posters-convnet

Unsupervised clustering of movie posters with features extracted from...

3   41   41  

Architeuthis

MITM HTTP(S) proxy with integrated load-balancing, rate-limiting and...

1   41   41  

html-table-to-json

Generate JSON representations of HTML tables

9   41   41  

flutter_notification_listener

Flutter plugin to listen for and interact with all incoming notificat...

52   41   41  

lc-webscraping

Introduction to web scraping

28   40   40  

shup

A POSIX shell script to parse HTML

4   40   40  

TorScrapper

A Scraper made 100% in Python using BeautifulSoup and Tor. It can be u...

11   40   40  

scrapingant-client-python

ScrapingAnt API client for Python.

5   40   40  

Extracty

Extract structured data from any unstructured web page

4   40   40  

linkeBot

🔎 A web scraping bot that shows LinkedIn job openings

7   40   40  

linkedin-scraper

Enhanced LinkedIn Job Search Chrome Extension

7   39   39  

pyplexity

Cleaning tool for web scraped text

3   39   39  

CobWeb-lnx

CobWeb is a Python library for web scraping. The library consists of t...

2   38   38  

fulldom-server

Proxy-like server that will show you the DOM of a page after JS runs

6   38   38  

extract-social-media

Extract social media links and account names from websites.

16   38   38  

Whatsapp-Scraper

Scrapes all the open chats and their last n messages, and saves them i...

10   38   38  

scrapy-scrapingbee

JavaScript support and proxy rotation for Scrapy with ScrapingBee.

6   38   38  

etf4u

📊 Python tool to scrape real-time information about ETFs from the web...

5   38   38  

Rotating-Proxies-With-Python

Learn about how to rotate proxies by using Python.

4   38   38  

papercut

Papercut is a scraping/crawling library for Node.js built on top of JS...

2   38   38  

fake-http-header

A Python package to generate random request fields for an HTTP header.

1   38   38  

scrapy-zyte-api

Zyte API integration for Scrapy

21   38   38  

async-pubmed-scraper

PubMed scraper for async search on a list of keywords and concurrent e...

16   38   38  

sneakpeek

Sneakpeek is a framework that helps to quickly and conveniently develo...

0   37   37  

tvseries

TV Series is a tool that scrapes episode synopses of popular TV Serie...

25   37   37  

InstaBot

Simple and friendly Bot for Instagram, using Selenium and Scrapy with...

10   37   37  

gopher-parse-sitemap

A highly effective Golang library for parsing large sitemaps and avo...

19   37   37  

chirps

Twitter bot powering @arichduvet

9   36   36  

freesoccer

⚽ Free API with results from national soccer competitions

9   36   36  

google-scraper

This class can retrieve search results from Google.

22   36   36  

geetest-captcha-solver

Solve the Geetest slider captcha with Puppeteer

9   36   36  

puppeteer-humanize

🕺 Humanizer functions for Puppeteer

9   36   36  

scrapeops-scrapy-sdk

Scrapy extension that gives you all the scraping monitoring, alerting...

10   36   36  

raiplay-dl

The most advanced raiplay.it downloader

7   36   36  

mangahook-api

Free open source manga API, including fetch all manga, single manga...

19   36   36  

dilbert-viewer

A simple comic viewer for Dilbert by Scott Adams

1   35   35  

webradio-metadata

Collection of scraping recipes to get metadata about what is being str...

13   35   35  

poketo

Node library for scraping manga sites

4   35   35  

node-red-contrib-nbrowser

Provides a virtual web browser (a.k.a. "headless browser") appearing a...

13   35   35  

tripadvisor-scraper

Scrape the hotel reviews of a whole city on TripAdvisor

15   35   35  

policy-data-analyzer

Building a model to recognize incentives for landscape restoration in...

8   35   35  

SneakerBot

Buy limited edition sneakers

9   35   35