Topic

scraping

Repositories (1626)

moneyman
moneyman daniel-hauser TypeScript

Automatically save transactions from all major Israeli banks and credit card companies, using GitHub Actions (or a self-hosted Docker image)

67
proxy-scraper
proxy-scraper TeaByte Python

Scraping from x75 websites asynchronously

67
foundation
foundation prescience-data TypeScript

🧱 A uniform template to use as a foundation for Puppeteer bot construction.

66
Pasta
Pasta Kr0ff Python

A PasteBin scraper that doesn't rely on the PasteBin scrape API

66
medium-crawler
medium-crawler NISH1001 Python

A crawler for scraping posts from medium.com

65
selectorlib
selectorlib scrapehero HTML

A library to read a YML file with Xpath or CSS Selectors and extract data from HTML pages using them

65
maps-to-lead
maps-to-lead jhowbhz JavaScript

This project aims to obtain leads in JSON format and send them to a webhook

65
Google-Patents-Scraper
Google-Patents-Scraper wenyalintw Python

Automatically download all PDF files of search results and their patent families found on Google Patents.

65
rubium
rubium vifreefly Ruby

Rubium is a lightweight alternative to Selenium/Capybara/Watir if you need to perform some operations (like web scraping) using Headless Chromium and...

64
PyLex
PyLex techcentaur Python

Perform lexical analysis on words, one word at a time.

64
Tor_Spider
Tor_Spider absingh31 Python

Python project to crawl and scrape the lesser-known deep web, or what one might call the dark web. Just provide the onion link and get started.

64
DexScreener-Scraping
DexScreener-Scraping greengod63 Python

Given a specific token pair from DEX Screener, this script fetches the pair address, liquidity, total supply, and so on. And then, this bot will...

64
worldometer
worldometer matheusfelipeog Python

Get live, population, geography, projected, and historical data from around the world 🌍

64
dom_query
dom_query niklak Rust

A Flexible Rust Crate for DOM Querying and Manipulation

64
Pinterest-infinite-crawler
Pinterest-infinite-crawler mirusu400 Python

An infinite Pinterest crawler/scraper. Crawl images with infinite scroll!

64
daenerys
daenerys dongweiming Python

Scraping and Web Crawling Framework For Zhihu Live

63
pythonista-chromeless
pythonista-chromeless umihico Python

Serverless Selenium that dynamically executes any given code.

63
angel.co-companies-list-scraping
angel.co-companies-list-scraping iamtodor Python
62
datasette-scraper
datasette-scraper cldellow Python

Add website scraping abilities to Datasette

62
datacrawl
datacrawl DataCrawl-AI Python

A simple and easy to use web crawler for Python

62
rebrowser-playwright-python
rebrowser-playwright-python rebrowser Python

A drop-in replacement for playwright-python patched with rebrowser-patches. It lets you pass modern automation detection tests.

62
ksoup
ksoup fcannizzaro Kotlin

Kotlin Wrapper for Jsoup

61
pycaching
pycaching tomasbedrich Python

A Python 3 interface for working with Geocaching.com website.

61
Porn-Novel-Scraper
Porn-Novel-Scraper ystemsrx Python

A script that can be used to capture various porn novels for machine learning

61
conformist
conformist tatey Ruby

Bend CSVs to your will with declarative schemas.

60
apify-client-python
apify-client-python apify Python

Apify API client for Python

60
justetf-scraping
justetf-scraping druzsan Python

Scraping the justETF

60
pomp
pomp estin Python

Screen scraping and web crawling framework

59
playlist2links
playlist2links pierlauro Shell

This Bash script extracts video links from a YouTube playlist

59
proxycrawl-python
proxycrawl-python crawlbase Python

ProxyCrawl Python library for scraping and crawling

59
scrapy-distributed
scrapy-distributed Insutanto Python

A series of distributed components for Scrapy. Including RabbitMQ-based components, Kafka-based components, and RedisBloom-based components for Scrapy...

59
webforai
webforai inaridiy TypeScript

The best HTML-to-Markdown library: ESM-native, with useful utilities, and simple, lightweight, epic quality.

59
PythonScrapyBasicSetup
PythonScrapyBasicSetup matejbasic Python

Basic setup with random user agents and IP addresses for Python Scrapy Framework.

58
whatsapp-tracking
whatsapp-tracking TomyCesaille JavaScript

Scraping the status of WhatsApp contacts

58
local-api-examples
local-api-examples kameleo-io C#

Easy-to-follow examples in Python, Node.js, and C# for web automation & multi-accounting with Kameleo anti-detect browser.

58
Pahe.ph-Scraper
Pahe.ph-Scraper roofman2008 C#

Pahe.ph [Pahe.in] Movies Website Scraper

58
coches-net-dashboard
coches-net-dashboard franloza Python

Sample project that uses Dagster, dbt, DuckDB and Dash to visualize the Spanish car and motorcycle market

58
sample-web-scraping-with-electron
sample-web-scraping-with-electron Tazeg JavaScript

Sample project for web scraping with Electron

57
web_scraping_freecodecamp
web_scraping_freecodecamp GEJ1 Jupyter Notebook

Web scraping course with Python created by Gustavo Juantorena for freeCodeCamp https://www.freecodecamp.org/espanol/news/aprende-web-scraping-con-py...

57
actor-whitepaper
actor-whitepaper apify Python

This whitepaper describes a new concept for building serverless microapps called Actors, which are easy to develop, share, integrate, and build upon....

56
SearchEngineScrapy
SearchEngineScrapy naqushab Python

Scrape data from Google.com, Bing.com, Baidu.com, Ask.com, Yahoo.com, Yandex.com

56
actor-facebook-scraper
actor-facebook-scraper pocesar TypeScript

Scrape public Facebook pages, posts, reviews and comments

56
ogpParser
ogpParser ukyoda TypeScript

Open Graph Protocol Parser for Node.js

56
serpapi-javascript
serpapi-javascript serpapi TypeScript

Scrape and parse search engine results using SerpApi.

56
Junior_Zone
Junior_Zone Moscarde Python

Junior job openings updated daily. Telegram and online spreadsheet

55
learn.scrapinghub.com
learn.scrapinghub.com scrapinghub CSS

Scrapinghub Learning Center. Report issues in Jira: https://scrapinghub.atlassian.net/projects/WEB

55
mtnt
mtnt pmichel31415 Python

Code for the collection and analysis of the MTNT dataset

55
scraper-fourone-jobs
scraper-fourone-jobs kokokuo Python

This is an anti-scraping cracker for extracting application information from one of Taiwan's job recruiting websites.

55
pge-outages-pre-2024
pge-outages-pre-2024 simonw Python

Tracking PG&E outages

55
torrent-tracker-scraper
torrent-tracker-scraper project-mk-ultra Python

A UDP torrent tracker scraper library written in Python 3

54