Topic

crawler

Repositories (1431)

FunUtils
FunUtils HoussemCharf Python

Some code I wrote to help me with my daily errands ;)

46
scrapy-kafka-redis
scrapy-kafka-redis tenlee2012 Python

Distributed crawling/scraping: Kafka- and Redis-based components for Scrapy

46
dbworld-search
dbworld-search heqin-zhu HTML

:mag: A simple search engine built with the Django framework

46
anilist-crawler
anilist-crawler soruly TypeScript

Crawl data from anilist API and store as JSON file

46
AppCrawler
AppCrawler tongtzeho Python

Web crawler for Android app markets

46
botcity-framework-web-python
botcity-framework-web-python botcity-dev Python

BotCity Framework Web - Python

46
scaling-to-distributed-crawling
scaling-to-distributed-crawling ZenRows HTML

Repository with the final code for the blog post "Mastering Web Scraping in Python: Scaling to Distributed Crawling".

46
iranian-news-agencies-crawler
iranian-news-agencies-crawler hamid JavaScript

A crawler to fetch the latest news from Iranian (Persian) news agencies.

46
mcp
mcp supadata-ai TypeScript

Official Supadata MCP Server - Adds powerful video & web scraping to Cursor, Claude and any other LLM clients.

46
tors
tors murat Ruby

⏬ Yet another torrent searching application for your command line

45
scrapy-admin
scrapy-admin liangWenPeng Python

A django admin site for scrapy

45
jason-the-miner
jason-the-miner mawrkus JavaScript

⛏ A versatile Web scraper for Node.js

45
crawler-jsoup-maven
crawler-jsoup-maven bluetata Java

This is a crawler (spider)

45
PyTse
PyTse miladj Python

TseTmc Crawler

45
local-api-client-typescript
local-api-client-typescript kameleo-io TypeScript

Official JavaScript/TypeScript library for interacting with Kameleo Client

45
python-hacking-tools
python-hacking-tools cristianzsh Python

Python tools for ethical hacking

45
WebTable
WebTable AtomEcho Python

A python package that takes tables from a web page and processes them to get high quality tables

45
DarkSpider
DarkSpider PROxZIMA Python

Anatomy and visualization of the network structure of the dark web using a multi-threaded crawler

45
retro-env-can-weather-chan
retro-env-can-weather-chan Forceh91 TypeScript

Retro Environment Canada Weather Channel for your browser

45
bibliaAveMariaJSON
bibliaAveMariaJSON fidalgobr

The Catholic "Ave Maria" Bible in JSON format

45
spider.npm
spider.npm Ireoo JavaScript

A web crawler library; custom rules can handle most websites

44
maman
maman spk Rust

Rust Web Crawler saving pages on Redis

44
copyheaders
copyheaders jin10086 Python

Conveniently copy request headers from the browser

44
seenreq
seenreq mike442144 JavaScript

Generates an object for testing whether a request has already been sent; "request" here refers to Mikeal's request library.

44
WebCrawler
WebCrawler zhk0603 C#

A lightweight, fast, multithreaded, multi-pipeline web crawler with flexible configuration.

44
wx-crawl
wx-crawl xuziping Java

Crawler for WeChat official account articles

44
bluebird
bluebird labteral Python

Unofficial Python client for Twitter

44
crawlerdetect
crawlerdetect moskrc Python

🕷CrawlerDetect is a Python library designed to identify bots, crawlers, and spiders by analyzing their user agents.

44
Web-crawler-engineer-for-Python
Web-crawler-engineer-for-Python zhangslob Python

Web-crawler-engineer-for-Python

43
Broken-Link-Crawler
Broken-Link-Crawler healeycodes Python

:robot: Python bot that crawls your website looking for dead stuff

43
SeleniumLogin
SeleniumLogin CharlesPikachu Python

Log in to some websites using Selenium.

43
BilibiliCrawler
BilibiliCrawler cgDeepLearn Python

:cyclone: Crawl bilibili user info and video info for data analysis | BiliBili crawler

43
n46-crawler
n46-crawler janelin612 JavaScript

Nogizaka46 Blog Crawler - a backup tool for the blogs of graduated Nogizaka46 members

43
scrapingant-client-python
scrapingant-client-python ScrapingAnt Python

ScrapingAnt API client for Python.

43
scalpel
scalpel lewoudar Python

A fast and powerful web scraping library

43
CygnusX1
CygnusX1 datnnt1997 Python

A multithreaded tool for searching and downloading images from popular search engines. It is straightforward to set up and run!

43
webmagician-ui
webmagician-ui Jkanon TypeScript

An admin UI project for a configurable web crawler platform

42
ncrawler
ncrawler kant2002 C#

Web Crawler written in C#

42
HttpProxy
HttpProxy asche910 Java

An IP proxy pool implemented in Java, supporting both HTTP and HTTPS

42
tiktok-crawler
tiktok-crawler hackertogether Python

This is a Tiktok Crawler App.

42
LuoguCrawler
LuoguCrawler himself65 Python

A Python crawler for scraping various information from Luogu

42
spiderable-middleware
spiderable-middleware veliovgroup JavaScript

Pre-rendering for JavaScript websites that delivers SSR-level SEO, enhanced link previews, and performance via effortless middleware integration — ide...

42
Bayesian-Stock-Market-Sentiment
Bayesian-Stock-Market-Sentiment wangys96 Python

A stock market text sentiment analysis website. A-share public sentiment analysis, web-crawler, bayesian algorithm, SQL, django, data-visualization.

42
ronin-web
ronin-web ronin-rb Ruby

ronin-web is a collection of useful web helper methods and commands.

42
noscrape
noscrape schoenbergerb TypeScript

This repository is deprecated

42
hentai-daily
hentai-daily bgzo Python

Project (NSFW): hentai content aggregated daily from multiple sources

42
php-crawler
php-crawler elboletaire PHP

:spider: A simple crawler (spider) written in PHP just for fun, with zero dependencies

41
crawler
crawler axetroy TypeScript

Crawler framework for Node.js

41
ZUCC_ZhenFangHelper
ZUCC_ZhenFangHelper zhouzaihang Python

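Automatic login, course selection, and information retrieval for the student edition of the Zhengfang academic administration system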

41
leboncoin-crawler
leboncoin-crawler rfussien HTML

Crawler for leboncoin.fr

41