Topic

crawler

Repositories (1431)

Spider
Spider xiantang Python

web crawler

41
CrawlerSamples
CrawlerSamples VAllens C#

Sample crawler console apps using Puppeteer and AngleSharp, written in C# 7.1 and built with .NET Core.

41
UniversityRecruitment-sSurvey
UniversityRecruitment-sSurvey Maicius Python

Using serious data to answer the question: "What kinds of companies recruit at what kinds of universities?"

41
crawel
crawel MrXujiang JavaScript

A fairly interesting crawler platform built on Apify, Node, and React

41
FF14AutoSignIn
FF14AutoSignIn renchangjiu Python

Automatic sign-in script for the official website of the FF14 Chinese server

41
scrapy-diario-oficial-da-uniao
scrapy-diario-oficial-da-uniao sinayra Python

Python script for fetching the content of the Diário Oficial da União

41
doogle
doogle safesploitOrg PHP

Doogle is a search engine and web crawler which can search indexed websites and images

41
TikHub-API-Python-SDK-V2
TikHub-API-Python-SDK-V2 TikHub Python

TikHub-API-Python-SDK-V2

41
TaiwanLotteryCrawler
TaiwanLotteryCrawler stu01509 Python

Taiwan Lottery Crawler (Taiwan, lotto, lottery tickets, crawler)

41
Raven
Raven Symbolexe Go

Raven is a powerful and customizable web crawler written in Go.

41
aristotle
aristotle egcodes Python

highly customizable news collector

40
insecres
insecres kkomelin Go

A console tool that finds insecure resources on HTTPS sites

40
laundry
laundry endquote JavaScript

Data laundering tools

40
SpiderWho
SpiderWho lanrat Python

A very fast whois crawler

40
podcastcrawler
podcastcrawler podcastcrawler PHP

PHP library to find podcasts

40
Domainker
Domainker BitTheByte Python

BugBounty Tool

40
TripAdvisor_crawler
TripAdvisor_crawler Tang-Li-Jen Python

Python Crawler: Scrape Data From Tripadvisor

40
ArticleSpider
ArticleSpider hackfengJam Python

Crawls zhihu, jobbole, and lagou with Scrapy, and uses Elasticsearch+Django to build a search engine website --- README_zh.md (including: implementation...

40
grab_beautiful_girls_pictures
grab_beautiful_girls_pictures cunxi1992 Python

Scrapes photo-shoot images of women from MM131 and saves them to a specified local folder.

40
sponge
sponge spypunk Kotlin

sponge is a website crawler and links downloader command-line tool

40
DeadPool
DeadPool Ryuchen Python

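This project is a crawler application built on celery as its core framework. Crawler tasks can be added flexibly and crawls of multiple sites can run at the same time; all components natively support large-scale concurrency and distributed operation, and with celery...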

40
medium-stat-box
medium-stat-box kylemocode TypeScript

Practical pinned gist which shows your latest Medium status 📌

40
tse-client
tse-client m-ahmadi JavaScript

A client for fetching stock data from the Tehran Stock Exchange (TSETMC). Works in Browser, Node and as CLI.

40
Youtube_Scraper
Youtube_Scraper CriticalHunter Python

Scrape data about an entire Channel or just a Playlist, or get stats about your Own Watch History.

40
d00r
d00r CYB3RMX Python

Simple directory brute-force tool written in Python.

40
dijnet-bot
dijnet-bot juzraai JavaScript

All your bills in one more place :)

40
GooglePlayWebServiceAPI
GooglePlayWebServiceAPI BaseMax PHP

Tiny PHP-based script to crawl information about a specific application in the Google Play store.

40
PaperWebCrawler
PaperWebCrawler yagol2020 Java

Crawler for paper websites such as IEEE XPLORE

40
scrapy-zyte-api
scrapy-zyte-api scrapy-plugins Python

Zyte API integration for Scrapy

40
AutoTBOXDataSystem
AutoTBOXDataSystem DolorHunter Java

Design and implementation of an automotive TBOX data collection and analysis system

40
wayurls
wayurls alwalxed Go

CLI tool for fetching URLs from Wayback Machine, Common Crawl, and VirusTotal.

40
Crawler
Crawler taseikyo Python

🐍 A collection of simple Python crawlers.

39
AntiCloudFlare
AntiCloudFlare s045pd HTML

Counters the anti-crawler protection of the Cloudflare loading page (no longer works)

39
crawler
crawler crawlerclub Go

Crawler4U, a general purpose focused crawler

39
Android-Apps-Downloader
Android-Apps-Downloader harismuneer Python

📱 A utility for downloading Android apps from the Google Play Store and Xiaomi App Store (the Chinese App Store).

39
lolcrawler
lolcrawler jonaslejon Python

Headless web crawler for bugbounty and penetration-testing/redteaming

39
papercut
papercut armand1m TypeScript

Papercut is a scraping/crawling library for Node.js built on top of JSDOM. It provides basic selector features together with features like Page Cachin...

39
ExHentaiReader
ExHentaiReader AndyHsiehTA HTML

Best manga viewer on Windows for crawling/downloading/browsing exhentai.

39
CobWeb-lnx
CobWeb-lnx GoncaloMark Python

CobWeb is a Python library for web scraping. The library consists of two classes: Spider and Scraper.

39
python-facebook-bot
python-facebook-bot tudoanh Python

Get Facebook events by location with Python 3

38
BaiduImageCrawler
BaiduImageCrawler flexwang-zz Python

A multithreaded tool for downloading search results of Baidu image search.

38
integrada.minhabiblioteca.com.br
integrada.minhabiblioteca.com.br tharyckgusmao JavaScript

Download books as PDF/EPUB - Integrada.minhabiblioteca / vitalsource

38
generic-seeder
generic-seeder team-exor C++

Generic altcoin DNS seeder. Compatible with virtually any cryptocurrency cloned from bitcoin. Built-in lightweight DNS server ~ Cloudflare DNS support...

38
ProxyScan
ProxyScan Its-Vichy Go

🔎 Scan the internet to find "private" proxies.

38
novelsave_sources
novelsave_sources m-haisham Python

A collection of webnovel sources offering varying amounts of scraping capability.

38
EH-PDF
EH-PDF Galgamer-org Python

Downloads an E-Hentai gallery and converts it to PDF, for easy reading on a Kindle, or reading and note-taking on an iPad.

38
article_crawler
article_crawler tychozzz Python

✨ Article Crawler is a package that crawls Markdown-formatted articles from a specific webpage and stores them locally in HTML / Markdown formats.

38
crawlee-one
crawlee-one JuroOravec TypeScript

Production-ready web scraping in a single function call. Built on Crawlee.

38
geckordp
geckordp jpramosi Python

A client implementation of Firefox DevTools over the remote debug protocol in Python

38
Data-Collection-Process-for-the-2024-Huashu-Cup-C-Problem
Data-Collection-Process-for-the-2024-Huashu-Cup-C-Problem Diraw Python

Dataset collection process for Problem C of the 2024 Huashu Cup

38