Most popular crawler repositories and open source projects

Spider xiantang Python

web crawler

41 26 41

CrawlerSamples VAllens C#

Sample Puppeteer + AngleSharp crawler console apps, written in C# 7.1 and built with .NET Core.

41 12 41

UniversityRecruitment-sSurvey Maicius Python

Using serious data to answer the question: what kinds of companies recruit at what kinds of universities?

41 18 41

crawel MrXujiang JavaScript

A fairly interesting crawler platform built with Apify + Node + React

41 15 41

FF14AutoSignIn renchangjiu Python

Automatic sign-in script for the official FF14 China server website

41 6 41

scrapy-diario-oficial-da-uniao sinayra Python

Python script to fetch the content of the Diário Oficial da União

41 14 41

doogle safesploitOrg PHP

Doogle is a search engine and web crawler which can search indexed websites and images

41 19 41

TikHub-API-Python-SDK-V2 TikHub Python

TikHub-API-Python-SDK-V2

41 5 41

TaiwanLotteryCrawler stu01509 Python

Taiwan Lottery Crawler (a crawler for Taiwan lottery tickets)

41 22 41

Raven Symbolexe Go

Raven is a powerful and customizable web crawler written in Go.

41 8 41

aristotle egcodes Python

highly customizable news collector

40 4 40

insecres kkomelin Go

A console tool that finds insecure resources on HTTPS sites

40 2 40

laundry endquote JavaScript

Data laundering tools

40 4 40

SpiderWho lanrat Python

A very fast whois crawler

40 14 40

podcastcrawler podcastcrawler PHP

PHP library to find podcasts

40 10 40

Domainker BitTheByte Python

BugBounty Tool

40 17 40

TripAdvisor_crawler Tang-Li-Jen Python

Python Crawler: Scrape Data From Tripadvisor

40 11 40

ArticleSpider hackfengJam Python

Crawling zhihu, jobbole, lagou by Scrapy, and using Elasticsearch+Django to build a Search Engine website --- README_zh.md (including: implementation...

40 10 40

grab_beautiful_girls_pictures cunxi1992 Python

Scrapes photo-shoot images of women from MM131 and saves them to a specified local folder.

40 21 40

sponge spypunk Kotlin

sponge is a website crawler and links downloader command-line tool

40 2 40

DeadPool Ryuchen Python

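This project is a crawler application built on Celery as its main framework. Crawler tasks can be added flexibly and crawls for multiple sites can run at the same time; all components natively support large-scale concurrency and distributed operation, plus Celery...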

40 16 40

medium-stat-box kylemocode TypeScript

Practical pinned gist which shows your latest Medium status 📌

40 18 40

tse-client m-ahmadi JavaScript

A client for fetching stock data from the Tehran Stock Exchange (TSETMC). Works in Browser, Node and as CLI.

40 15 40

Youtube_Scraper CriticalHunter Python

Scrape data about an entire Channel or just a Playlist, or get stats about your Own Watch History.

40 4 40

d00r CYB3RMX Python

Simple directory brute-force tool written in Python.

40 9 40

dijnet-bot juzraai JavaScript

All your bills in yet another place :)

40 3 40

GooglePlayWebServiceAPI BaseMax PHP

Tiny PHP-based script to crawl information about a specific application in the Google Play Store.

40 9 40

PaperWebCrawler yagol2020 Java

Crawler for paper websites such as IEEE XPLORE

40 6 40

scrapy-zyte-api scrapy-plugins Python

Zyte API integration for Scrapy

40 21 40

AutoTBOXDataSystem DolorHunter Java

Design and implementation of an automotive TBOX data collection and analysis system

40 22 40

wayurls alwalxed Go

CLI tool for fetching URLs from Wayback Machine, Common Crawl, and VirusTotal.

40 4 40

Crawler taseikyo Python

🐍 A collection of simple Python crawlers.

39 15 39

AntiCloudFlare s045pd HTML

Countermeasure against Cloudflare's loading-page anti-crawler protection (no longer works)

39 21 39

crawler crawlerclub Go

Crawler4U, a general purpose focused crawler

39 6 39

Android-Apps-Downloader harismuneer Python

📱 A utility for downloading Android apps from the Google Play Store and Xiaomi App Store (the Chinese App Store).

39 18 39

lolcrawler jonaslejon Python

Headless web crawler for bugbounty and penetration-testing/redteaming

39 10 39

papercut armand1m TypeScript

Papercut is a scraping/crawling library for Node.js built on top of JSDOM. It provides basic selector features together with features like Page Cachin...

39 2 39

ExHentaiReader AndyHsiehTA HTML

Best manga viewer on Windows for crawling/downloading/browsing exhentai.

39 2 39

CobWeb-lnx GoncaloMark Python

CobWeb is a Python library for web scraping. The library consists of two classes: Spider and Scraper.

39 2 39

python-facebook-bot tudoanh Python

Get Facebook events by location with Python 3

38 15 38

BaiduImageCrawler flexwang-zz Python

A multithreaded tool for downloading search results of Baidu image search.

38 21 38

integrada.minhabiblioteca.com.br tharyckgusmao JavaScript

Download books as PDF/EPUB - Integrada.minhabiblioteca / vitalsource

38 10 38

generic-seeder team-exor C++

Generic altcoin DNS seeder. Compatible with virtually any cryptocurrency cloned from bitcoin. Built-in lightweight DNS server ~ Cloudflare DNS support...

38 97 38