Most popular scraping repositories and open source projects

public-roadmap

Public Roadmap for SerpApi, LLC (https://serpapi.com)

13   67   67  

moneyman

Automatically save transactions from all major Israeli banks and credi...

47   67   67  

Pasta

A PasteBin scraper that doesn't rely on the PasteBin scrape API

6   66   66  

foundation

🧱 A uniform template to use as a foundation for Puppeteer bot constru...

7   66   66  

medium-crawler

A crawler for scraping posts from medium.com

15   65   65  

selectorlib

A library to read a YML file with Xpath or CSS Selectors and extract d...

11   65   65  

Google-Patents-Scraper

Automatically download all PDF files of search results & their pate...

22   65   65  

maps-to-lead

This project aims to obtain leads in JSON format and send...

13   65   65  

dom_query

A Flexible Rust Crate for DOM Querying and Manipulation

6   64   64  

DexScreener-Scraping

When a specific token pair from DEX Screener is given, this script wil...

32   64   64  

Pinterest-infinite-crawler

An infinite Pinterest crawler/scraper. Crawl images with infinite-scrol...

11   64   64  

rubium

Rubium is a lightweight alternative to Selenium/Capybara/Watir if you...

0   64   64  

PyLex

Perform lexical analysis on words, one word at a time.

2   64   64  

Tor_Spider

Python project to crawl and scrape the lesser-known deep web or one can...

16   64   64  

worldometer

Get live, population, geography, projected, and historical data from a...

10   64   64  

daenerys

Scraping and Web Crawling Framework For Zhihu Live

30   63   63  

pythonista-chromeless

Serverless Selenium which dynamically executes any given code.

10   63   63  

angel.co-companies-list-scraping

35   62   62  

datacrawl

A simple and easy to use web crawler for Python

11   62   62  

rebrowser-playwright-python

A drop-in replacement for playwright-python patched with rebrowser-pat...

6   62   62  

datasette-scraper

Add website scraping abilities to Datasette

1   62   62  

Porn-Novel-Scraper

A script that can be used to capture various porn novels for machine l...

12   61   61  

ksoup

Kotlin Wrapper for Jsoup

6   61   61  

pycaching

A Python 3 interface for working with Geocaching.com website.

46   61   61  

conformist

Bend CSVs to your will with declarative schemas.

6   60   60  

justetf-scraping

Scraping the justETF

18   60   60  

apify-client-python

Apify API client for Python

12   60   60  

webforai

The best HTML to Markdown library, an ESM-native & useful utilities wit...

5   59   59  

pomp

Screen scraping and web crawling framework

10   59   59  

playlist2links

This Bash script extracts video links from a YouTube playlist

10   59   59  

proxycrawl-python

ProxyCrawl Python library for scraping and crawling

19   59   59  

scrapy-distributed

A series of distributed components for Scrapy. Including RabbitMQ-base...

11   59   59  

PythonScrapyBasicSetup

Basic setup with random user agents and IP addresses for Python Scrapy...

14   58   58  

whatsapp-tracking

Scraping the status of WhatsApp contacts

11   58   58  

coches-net-dashboard

Sample project that uses Dagster, dbt, DuckDB and Dash to visualize car...

4   58   58  

local-api-examples

Easy-to-follow examples in Python, Node.js, and C# for web automation...

17   58   58  

Pahe.ph-Scraper

Pahe.ph [Pahe.in] Movies Website Scraper

15   58   58  

web_scraping_freecodecamp

Web scraping course with Python created by Gustavo Juantorena for fr...

19   57   57  

sample-web-scraping-with-electron

Sample project for web scraping with Electron

17   57   57  

SearchEngineScrapy

Scrape data from Google.com, Bing.com, Baidu.com, Ask.com, Yahoo.com,...

16   56   56  

actor-facebook-scraper

Scrape public Facebook pages, posts, reviews and comments

32   56   56  

ogpParser

Open Graph Protocol Parser for Node.js

12   56   56  

actor-whitepaper

This whitepaper describes a new concept for building serverless microa...

1   56   56  

serpapi-javascript

Scrape and parse search engine results using SerpApi.

6   56   56  

Junior_Zone

Junior job openings updated daily. Telegram and online spreadsheet

2   55   55  

learn.scrapinghub.com

Scrapinghub Learning Center. Report issues in Jira: Report issues in J...

24   55   55  

mtnt

Code for the collection and analysis of the MTNT dataset

4   55   55  

scraper-fourone-jobs

This is an anti-scraping cracker for extracting application information of on...

12   55   55  

pge-outages-pre-2024

Tracking PG&E outages

7   55   55  

Euro2016_TerminalApp

⚽ Instantly find 🏆 EURO 2016 live-streams & highlights, n...

10   54   54