Most popular crawler repositories and open source projects

Novel-crawler

A novel crawler application written in Python

27   79   79  

deepweb-scappering

Discover hidden deepweb pages

16   79   79  

tumblr_crawler

Tumblr parsing website

43   78   78  

arachnid

Powerful web scraping framework for Crystal

11   78   78  

ghs

GitHub Search: Platform used to crawl, store and present projects from...

11   78   78  

crawler_examples

Some classic web crawler projects.

31   77   77  

scrapy-examples

Some scrapy and web.py examples

32   77   77  

ctrip_spider

Scrape Learning (ctrip)

34   76   76  

tumblr-crawler-cli

Tumblr Download Tool with High Speed and Customization. High-performance and highly customizable...

15   76   76  

fetchman

fetchman is a simple, easy-to-use crawler framework

20   76   76  

BUbiNG

The LAW next generation crawler.

24   76   76  

html-table-extractor

extract data from html table

24   76   76  

WebSecurityArticles

Crawls and curates high-quality "web security" articles from sites such as Freebuf, Anquanke (安全客), Xianzhi (先知), and Knownsec (知道创宇)

20   76   76  

light-crawler

A simplified, directed, customizable website crawler

23   75   75  

tg_crawler

Just a crawler based on tg-cli for Telegram. Deprecated by now, please...

21   75   75  

BOJ-AutoCommit

When you solve a problem on Baekjoon Online Judge, it automatically...

12   74   74  

python-tools

A collection of Python tools, scripts and utilities to make your life...

18   74   74  

simpyder

Ultra-fast asynchronous coroutine-based Python crawler

23   74   74  

fund-crawler

A NodeJS-based fund data crawler; the crawled data is stored on GitHub in @nullpointer/fund-data...

42   73   73  

spider

python crawler spider

26   72   72  

python-testing-crawler

A crawler for automated functional testing of a web application

5   72   72  

venom

Your preferred open source focused crawler for the deep web.

5   72   72  

crawlzone

Crawlzone is a fast asynchronous internet crawling framework for PHP.

9   72   72  

lrabbit_scrapy

A quick-start Python multi-threaded crawler

1   72   72  

achoz

Search through all your personal data efficiently like web search.

4   72   72  

COI

Practice project: Comment of Interest, data mining of e-commerce text reviews (crawler + opinion extraction +...

11   71   71  

car-prices

Golang crawler that scrapes the used-car product catalog of Autohome (汽车之家)

37   70   70  

IpProxyPool

An IP proxy pool implemented in Golang; technologies involved: go gorm proxy proxypool ip cr...

26   70   70  

python-crawler

Python Crawler

52   69   69  

tiktok-scraper-php

Tiktok (Musically) PHP scraper

30   69   69  

robotstxt

robots.txt file parsing and checking for R

8   69   69  

darc

Darkweb Crawler Project

13   69   69  

ComicSpider

Crawler for original-resolution images from the desktop version of the Dongman Zhijia (动漫之家) comics site

17   68   68  

Wedge

Configurable novel downloader and e-book generation tool

21   67   67  

hproxy

hproxy - Asynchronous IP proxy pool, aims to make getting proxy as con...

13   66   66  

newspaperjs

News extraction and scraping. Article Parsing

19   66   66  

JewelCrawler

Douban movie crawler: a crawler which is able to crawl movie detail and short...

57   65   65  

carbonbot

A command line tool based on the crypto-crawler library.

8   65   65  

dht-crawler

A DHT Crawler based on Goroutine

4   64   64  

GMaps-Crawler

Google Maps crawler using Selenium. All extracted data is forwarded to...

17   64   64  

Auto_Shadowsocks

Shadowsocks. Internet censorship circumvention, for learning purposes only. These are free servers, so access may be unstable...

17   63   63  

medium-crawler

A crawler for scraping posts from medium.com

15   63   63  

social-scraper

Vietnamese text data crawler scripts for various sites (including Yout...

35   63   63  

eastmoney

python requests + Django+ nodejs koa+ mysql to crawl eastmoney fund an...

23   63   63  

qr-pirate

crawl QR-codes from search engines and look for bitcoin private keys

29   63   63  

HydraRecon

All In One, Fast, Easy Recon Tool

11   63   63  

local-api-examples

Useful and easy to understand examples written in Node.js and .NET Cor...

23   63   63  

ZhihuVAPI

Use Zhihu elegantly

14   62   62  

koshort

(deprecated) :cat: koshort is a Python package for Korean internet spo...

10   62   62  

sciBASIC

sciBASIC# is a kind of dialect language which is derived from the nativ...

29   62   62