Most popular scraping repositories and open source projects

zyte-smartproxy-headless-proxy

A complementary proxy to help use SPM with headless browsers

36   103   103  

MachineLearning

Machine learning for beginners (data science enthusiasts)

130   103   103  

jimutmap

API to get an enormous amount of high-resolution satellite images from sa...

15   103   103  

arxiv-miner

arxiv_miner is a toolkit for mining research papers on CS ArXiv.

7   97   97  

job_search

An app to search startup jobs scraped from websites written in Elixir,...

17   95   95  

proxifier

A fast, modern and intelligent proxy rotator perfect for crawling and...

15   95   95  

xword-dl

⬛⬜⬛ Command line tool to scrape crosswords from online solvers and sav...

20   94   94  

awesome-python-primer

Index of high-quality Chinese-language resources for self-taught Python beginners, including books / docs / videos, suitable for web scraping...

18   92   92  

Movies-and-Series-Scraper

A console application to scrape valid watch links for any movie o...

16   92   92  

GoodreadsScraper

Scrape data from Goodreads using Scrapy and Selenium 📚

21   91   91  

apify-sdk-python

The Apify SDK for Python is the official library for creating Apify Ac...

3   91   91  

torrengo

Torrengo is a CLI (command line) program written in Go which concurren...

13   89   89  

scraper

Node.js web scraper. Contains a command line, Docker container, terrafo...

13   88   88  

ScrapeMate

Scraping assistant tool. Editing and maintaining CSS/XPath selectors a...

12   87   87  

viewstate

ASP.NET View State Decoder

13   87   87  

robox

Simple library for exploring/scraping the web or testing a website you...

1   87   87  

Detect-CMS

PHP Library for detecting CMS

48   86   86  

humanparser

Parse a human name string into salutation, first name, middle name, la...

27   85   85  

html2rss

📰 Build RSS 2.0 feeds from websites (and JSON APIs) with a few CSS sel...

9   85   85  

billy

Legacy backend for Open States

51   84   84  

Leetcode-Questions-Scraper

Scrape Algorithm Questions from leetcode and generate html and epub fi...

23   84   84  

core

🕷 The PHP SERP Spider - A search engine scraper

43   83   83  

mechaml

OCaml functional web scraping library

6   83   83  

newser

Newser is a simple utility to generate a PDF with your favorite news ar...

3   82   82  

Auto-Gmail-Creator

Open Source Bulk Auto Gmail Creator Bot with Selenium & Seleniumwire (...

52   82   82  

google-covid19-mobility-reports

Data extraction of Google's COVID-19 Mobility Reports

11   81   81  

browser-pool

A Node.js library to easily manage and rotate a pool of web browsers,...

14   81   81  

introWebScraping

Code examples for my blog posts

47   80   80  

bots-zoo

22   80   80  

ARGUS

ARGUS is an easy-to-use web scraping tool. The program is based on the...

24   79   79  

html-table-extractor

Extract data from HTML tables

24   76   76  

feedbridge

Plugin based RSS feed generator for sites that don't offer any. Serves...

6   76   76  

Whatsapp-Net

Generate a network graph of connections from your WhatsApp groups data

7   75   75  

webdext

Intelligent Web Data Extractor

16   74   74  

linkedin-scraper

Tool to scrape LinkedIn

13   74   74  

gsocanalyzer

A blazingly fast tool to analyze all the selected organizations in Goo...

38   73   73  

venom

Your preferred open source focused crawler for the deep web.

5   72   72  

requests-random-user-agent

Configures the requests library to randomly select a desktop User-Agen...

22   71   71  

copycat

A PHP Scraping Class

13   70   70  

linkpreview

Open Graph, Twitter Card, oEmbed preview. Shows visual cards that mimi...

10   70   70  

python-adv-web-apps

Updated python-beginners docs and examples

94   67   67  

rubium

Rubium is a lightweight alternative to Selenium/Capybara/Watir if you...

0   65   65  

facebook-group-members-scraper

Facebook Group Members Extractor. Download Facebook group members in C...

21   65   65  

medium-crawler

A crawler for scraping posts from medium.com

15   63   63  

local-api-examples

Useful and easy to understand examples written in Node.js and .NET Cor...

23   63   63  

daenerys

Scraping and Web Crawling Framework For Zhihu Live

30   62   62  

SourceScraper

Simple library which helps you to retrieve the source of various video...

19   62   62  

PyLex

Perform lexical analysis on words, one word at a time.

2   62   62  

Instagram-downloader

Instagram user's photos and videos downloader. Download all media file...

16   62   62  

top-github-scraper

Scrape top GitHub repositories and users based on keywords

21   62   62