Most popular crawler repositories and open source projects

crawler Charleswyt Jupyter Notebook

Crawler with Python 3.

34 19 34

serverless-instagram-crawler kimcoder TypeScript

Serverless Instagram hashtag crawler with Lambda and DynamoDB

34 10 34

phpwebcrawler subins2000

A Web Crawler Created in PHP

34 33 34

crawlerflow invana Python

Web crawler orchestration framework that lets you create datasets from multiple web sources using YAML configurations.

34 7 34

BingGallery benheart Python

A simple crawler to get all Bing gallery pictures.

34 12 34

visual-spider code4everything Java

Welcome to try our brand-new desktop productivity tool RunFlow, https://myrest.top/myflow

34 10 34

Spydan adanvillarreal Python

A web spider for shodan.io without using the Developer API.

34 8 34

goGamer davidleitw Go

Custom API for Bahamut

34 3 34

wallstreetcnScrapy jianzhichun Python

A crawler for wallstreetcn and finance.sina built with Scrapy; crawlers for Sina Finance, Tonghuashun Finance, and Wallstreetcn

34 5 34

LOLPrediction tongtzeho Python

League of Legends win/loss prediction

34 11 34

instagram-downloader haxzie-xx JavaScript

Node.js/Express app to retrieve Instagram video/image download URLs

34 14 34

tor-ip-rotation-python-example baatout Python

An example of Tor IP rotation in Python

34 17 34

spider-mooc hy59 Python

This crawler is designed to collect comments on courses from China University MOOC

34 7 34

images-grabber Antosik TypeScript

🖼️ Get all images from pixiv/twitter/deviantart

34 3 34

ZhiHu_Spider SakuraPuare Python

Zhihu content crawler | Web scraper for Zhihu content extraction

34 7 34

scrapingai Agenty TypeScript

Build web scraping agents using AI to auto-extract the data from websites, capture screenshot, generate pdf from URL and web crawling with Agenty

34 4 34

taobao-crawler-selenium YoungZM339 Python

A web automation tool for scraping Taobao products, based on Selenium and Tkinter

34 9 34

figma-archives gridaco Python

Figma Files Scraper for Research & Studies

34 4 34

github-scanner-local arshadkazmi42 Shell

Locally scan all the repositories of a github organization

34 16 34

botasaurus-starter omkarcloud TypeScript

🚀 OFFICIAL STARTER TEMPLATE FOR BOTASAURUS SCRAPING FRAMEWORK 🤖

34 8 34

ebedke ijanos Python

crawl pages to check what is for lunch today

33 5 33

toutiaocrawler a252937166 Java

Toutiao account (Toutiaohao) crawler example

33 16 33

proxi nicksherron Go

Proxy pool. Finds and checks proxies, with a REST API for querying results. Can find over 25k proxies in under 5 minutes.

33 4 33

iranian-calendar-events mamal72 JavaScript

Fetch Iranian calendar events (Jalali, Hijri and Gregorian) from time.ir website

33 3 33

ioweb lorien Python

Web Scraping Framework

33 11 33

LeetCodeCrawler ZhaoxiZhang Java

A tool for crawling the description and accepted submitted code of problems on the LeetCode and LeetCode-Cn website.

33 4 33

serritor peterbencze Java

Serritor is an open source web crawler framework built upon Selenium and written in Java. It can be used to crawl dynamic web pages that require JavaS...

33 14 33

advanced-php-crawler juzeon PHP

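Crawler for Sina Blog articles and the wenku8 light novel library; can fetch and save images and build an e-book in one click. A great tool for Kindle readers!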

33 10 33

PySitemap Cartman720 Python

🕸️ Spider Sitemap - Simple Python 3 crawler that automatically navigates your website, discovers all pages, and generates a complete XML sitemap. Easy...

33 31 33

Crawling-Emails pH-7 Shell

Very simple bash script to crawl email addresses from a specific website.

33 17 33

xiaohongshu-spider-visualizer KaitoHH Python

A distributed web crawler for xiaohongshu.com and visualization for the crawled content.

33 9 33

colymer-acquirers touuki Python

Miscellaneous crawlers (currently supporting Instagram, Weibo, and Twitter).

33 5 33

scrapy-tor-proxy-rotation elvesmrodrigues Python

An IP rotator via Tor for Scrapy.

33 0 33

BiliBiliCommentsAnalysis Timecollector Python

Crawls Bilibili danmaku (bullet comments) and regular comments, then converts them into word vectors with a Word2Vec model for analysis

33 2 33

flixhq-core shin202 TypeScript

Node.js library that provides an API for obtaining movie information from the FlixHQ website.

33 16 33

Douban-MovieReview-Crawler king-wang123 Python

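Douban movie review crawler assistant. This project lets you scrape and analyze review data for movies you are interested in. You can see the star-rating distribution of the reviews, view the average rating weighted by number of likes, and generate intuitive...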

33 4 33

undetectable-crawler darkotodoric JavaScript

A Node.js script powered by Puppeteer for undetectable web scraping

33 2 33

telegram_bbbot maddevsio Go

Telegram Bug Bounty Bot

32 4 32

see tmaciejewski Erlang

Search Engine in Erlang

32 3 32

SINA_Spider yinhao0214 Python

Sina Weibo crawler: login, keyword-based post search, and post monitoring

32 21 32

php-google howie6879 PHP

Google search results crawler; get the Google search results you need - PHP

32 9 32

kontests AliOsm Ruby

Competitive programming contests schedule

32 13 32

crowlet Pixep Go

Tiny sitemap crawler for cache warming, and website status monitoring

32 6 32

squirm squirm-framework Crystal

This was the night of the crawling terror!

32 1 32

reddit_scraper_and_sentiment_analyzer pratikpv Python

Download reddit posts based on keywords and perform sentiment analysis on the posts.

32 8 32

ProductHunt-scraper fernandod1 Python

Scraper script for the well-known website Producthunt.com. Scrapes all offers and saves them to an Excel spreadsheet file.

32 10 32

LLM-Web-Crawler buildship-ai TypeScript

Web Scraper and Crawler for LLM Apps and AI Workflows with NoCode / LowCode. Plug and play with your own logic and customize it flexibly and scalably...

32 8 32

Google-Reverse-Image-Search ramonclaudio Python

A lightweight Python wrapper designed for leveraging Google's search by image capabilities to perform reverse image searches programmatically.

32 5 32

maw memvid TypeScript

Crawl any website into a single searchable file. Query it forever, offline.

32 6 32

instagram-scraper-tool Z786ZA

instagram scraper tool automated insights

32 0 32

crawler

Repositories (1431)