🔥 The API to search, scrape, and interact with the web for AI
Scrapy, a fast high-level web crawling & scraping framework for Python.
🕷️ An adaptive Web Scraping framework that handles everything from a single request to a full-scale crawl!
AIHawk aims to ease the job hunt by automating the job application process. Utilizing artificial intelligence, it enables users to apply for multi...
Elegant Scraper and Crawler Framework for Golang
Python scraper based on AI
Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs,...
🕵️‍♂️ Collect a dossier on a person by username from 3000+ sites
Pythonic HTML Parsing for Humans™
Custom Selenium Chromedriver | Zero-Config | Passes ALL bot mitigation systems (like Distil / Imperva / DataDome / Cloudflare IUAM)
A scalable web crawler framework for Java.
Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, P...
List of libraries, tools and APIs for web scraping and data processing.
🦊 Anti-detect browser
Tabula is a tool for liberating data tables trapped inside PDF files
A Smart, Automatic, Fast and Lightweight Web Scraper for Python
Pydoll is a library for automating Chromium-based browsers without a WebDriver, offering realistic interactions.
Declarative web scraping
Convert cURL commands to Python, JavaScript, Java, C#, PHP, Go, Dart, R, Ruby, Rust, MATLAB, Elixir, CFML, Ansible, Strest or JSON
Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML
Distributed crawler powered by Headless Chrome
Self-hosted web scraper.
Swiss-army tool for scraping and extracting data from online assets, made for hackers
Mechanize is a Ruby library that makes automated web interaction easy.
Twitter API Scraper | Without an API key | Twitter Internal API | Free | Twitter scraper | Twitter Bot
Collection of useful data science topics along with articles, videos, and code
Simple, up-to-date user-agent faker with a real-world database
Snoop: an intelligence-gathering tool based on open data (OSINT world)
Do you want to LEARN NEW STUFF for FREE? Don't worry, with the power of web-scraping and automation, this script will find the necessary Udemy coupons...
Scrape Facebook public pages without an API key
AnyCrawl 🚀: A Node.js/TypeScript crawler that turns websites into LLM-ready data and extracts structured SERP results from Google/Bing/Baidu/etc. Nat...
Stealth headless browser for AI agents — bypass Cloudflare, bot detection, and anti-scraping. Drop-in Puppeteer/Playwright replacement.
A browser testing and web crawling library for PHP and Symfony
A Python module to scrape several search engines (like Google, Yandex, Bing, DuckDuckGo, ...), including asynchronous networking support.
Geziyor, blazing fast web crawling & scraping framework for Go. Supports JS rendering.
A curated list of awesome Puppeteer resources.
Learn step-by-step how to scrape Google Trends data and make a result comparison using Python and Oxylabs SERP API. Extract keywords, their popularity...
Web Scraping Framework
Web crawler and scraper for Rust
An open-source fingerprint browser based on Ungoogled Chromium. Fingerprint browser, privacy browser.
Advanced Privacy Browser Core with Unified Fingerprint Defense: Cloudflare, Akamai, Kasada, Shape, DataDome, PerimeterX, hCaptcha, FunCaptcha, Imperva...
Getting started with Puppeteer and Chrome Headless for Web Scraping
A command-line utility for taking automated screenshots of websites
A powerful Model Context Protocol (MCP) server that provides an all-in-one solution for public web access.
Get info from any web service or page
Browser fingerprinting tools for anonymizing your scrapers. Developed by Apify.
OSINT cheat sheet: a list of OSINT tools, wikis, datasets, articles, books, red team OSINT for hackers, OSINT tips, and OSINT branches. This repository will g...
Internet-in-a-Box - Build your own LIBRARY OF ALEXANDRIA with a Raspberry Pi!
Hide your scraper's IP behind the cloud. Provision proxy servers across different cloud providers to improve your scraping success.
List of anti-detect and humanizing tools and browsers, including CAPTCHA solvers and SMS activation.