Topic

crawler

Repositories (1232)

SoFIFA
SoFIFA DiogoDantas Jupyter Notebook

A SoFIFA webcrawler and Machine Learning prediction

57
m3u8Downloader
m3u8Downloader mrzhangfelix Python

meijuba.net, Python crawler, M3U8 video download, desktop application

56
devsearch
devsearch nicholaskajoh Python

A web search engine built with Python which uses TF-IDF and PageRank to sort search results.

56
skweez
skweez edermi Go

Fast website scraper and wordlist generator

56
SearchEngineScrapy
SearchEngineScrapy naqushab Python

Scrape data from Google.com, Bing.com, Baidu.com, Ask.com, Yahoo.com, Yandex.com

56
actor-facebook-scraper
actor-facebook-scraper pocesar TypeScript

Scrape public Facebook pages, posts, reviews and comments

56
PicCrawler
PicCrawler fengzhizi715 Java

An image crawler developed using RxJava2 and Java 8 features

55
talospider
talospider howie6879 Python

talospider - A simple, lightweight scraping micro-framework

55
MyCrawler
MyCrawler netcan Python

My crawler collection

55
All-IT-eBooks-Spider
All-IT-eBooks-Spider Kulbear Python

[Updated] A simple Python crawler for my tutorial blog at http://www.jianshu.com/p/8fb5bc33c78e

55
telegram-groups-crawler
telegram-groups-crawler edogab33 Python

A Telegram crawler made in Python to automatically search groups and channels and collect any type of data from them (+ dataset included).

55
bolsa
bolsa gicornachini Python

A Python library that aims to make it easier to access data on your stock market investments (B3/CEI) through the Portal CEI.

55
simple_bank_korea
simple_bank_korea Beomi Python

simple crawler for Korean banks with Transactions

55
JMComic-Crawler-Python
JMComic-Crawler-Python hect0x7 Python

Python API For JMComic (禁漫天堂)

55
nest-crawler
nest-crawler saltyshiomix TypeScript

The easiest crawling and scraping module for NestJS

55
kalel
kalel noobscode Python

Kal El Network Stress Test and Penetration Testing Toolkit

54
tool-gin
tool-gin bajins Go

A project built on the go-gin framework to reduce redundant actions, e.g. downloading some tools

54
instagram-hashtag-crawler
instagram-hashtag-crawler simonseo Python

Crawl Instagram hashtags

54
browser-as-a-service
browser-as-a-service hfreire JavaScript

A web browser 🌎 hosted as a service, to render your JavaScript web pages as HTML

54
alipay-crawler
alipay-crawler he426100 PHP

Alipay bill crawler

53
go-crawler-distributed
go-crawler-distributed golang-collection Go

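A distributed crawler project. It supports secondary development of customized page parsers, uses a microservice architecture overall, and sends messages asynchronously through a message queue. Frameworks used include: redigo, gorm, goquer...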

53
OpenCrawler
OpenCrawler merwin-asm Python

Open Crawler || Open Source Crawler

53
Deepminer
Deepminer Conso1eCowb0y Python

Deep web crawler and search engine

52
flink-crawler
flink-crawler ScaleUnlimited Java

Continuous scalable web crawler built on top of Flink and crawler-commons

52
WebTable
WebTable AtomEcho Python

A Python package that takes tables from a web page and processes them to get high-quality tables

52
site-mirror-py
site-mirror-py generals-space Python

[Gitee](https://gitee.com/generals-space/site-mirror-py) General-purpose crawler, site-cloning tool, whole-site download

52
findopendata
findopendata findopendata Python

A search engine for Open Data

52
rarbgcli
rarbgcli FarisHijazi Python

RARBG command line interface for scraping the rarbg.to torrent search engine

52
baidu-chain-dog
baidu-chain-dog CoolAcsi Java

Crawler for Baidu Laici Dog (莱茨狗).

51
GPlayCrawler
GPlayCrawler KopLyf Python
51
tw-stock-telegram-bot
tw-stock-telegram-bot x3388638 JavaScript

Taiwan stock bot providing real-time individual stock and market index quotes, trends, news, after-hours data, etc. Telegram bot to query real-time TW stock quotes, charts, news, and other related informatio...

51
fb-page-chat-download
fb-page-chat-download eisenjulian Python

Python script to download messages from a Facebook page to a CSV file

51
open-gov-crawlers
open-gov-crawlers public-law Python

Parse government documents into well-formed JSON

51
crawler-userscript
crawler-userscript zjh1943 JavaScript

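A crawler built on the Tampermonkey extension platform. Its main goal is to simulate the user environment as closely as possible to avoid detection by anti-crawler systems.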

51
facebook-messenger-bot-tutorial
facebook-messenger-bot-tutorial twtrubiks Python

facebook-messenger-bot-tutorial using Python Django

51
price-monitoring
price-monitoring roccomuso JavaScript

Node.js price monitoring library, leveraging the power of x-ray and nightmare.

51
crawler_shopee
crawler_shopee charlie0227 Python

Shopee coin getter is a script to collect daily Shopee coins.

51
crawler
crawler tomasnorre PHP

Libraries and scripts for crawling the TYPO3 page tree. Used for re-caching, re-indexing, publishing applications etc.

51
billboard-json
billboard-json KoreanThinker TypeScript

🎧 Get the Billboard Hot 100 chart as JSON

50
bloodhound
bloodhound vitorfs Python
50
TwitterCrawler
TwitterCrawler casolxia Java

Twitter crawler that scrapes Twitter data, with filtering by conditions such as time, topic, and username

50
nasty
nasty lschmelzeisen Python

NASTY Advanced Search Tweet Yielder

50
Timbr_V1
Timbr_V1 lvyachao JavaScript

A web service that turns an arbitrary web page into structural JSON data and easy-to-use APIs with just a few clicks

50
kepub
kepub TerakomariGandesblood C++

Crawl novels from sfacg, ciweimao, esjzone, lightnovel and masiro; generate, append and extract epub

50
snapcrawl
snapcrawl DannyBen Ruby

Crawl a website and take screenshots

50
Crawling-CV-Conference-Papers
Crawling-CV-Conference-Papers seanywang0408 Jupyter Notebook

Crawling CV conference papers with Python.

50
html-query
html-query h12w Go

A fluent and functional approach to querying HTML

49
scrapy.dart
scrapy.dart sachaarbonel Dart

Scrapy, a fast high-level web crawling & scraping framework for Dart and Flutter

49
logo-scrape
logo-scrape fritzh321 TypeScript

🕷🚀 Scrapes/crawls the logo from the provided URL(s)/website for your Node.js applications.

49
Mini-Spider
Mini-Spider zhangyunhao116 Python

A simple, practical crawler tool: create your own crawler program in just four steps!

48