Most popular crawling repositories and open source projects

trend-monitoring

Real-time trend data analysis/monitoring system tremo

4   23   23  

Mimo-Crawler

A web crawler that uses Firefox and js injection to interact with webp...

2   23   23  

proxycrawl-node

ProxyCrawl Node library for scraping and crawling

5   23   23  

app-crawler

crawling App by uiautomator2 & mitmproxy

9   23   23  

zcrawl

An open source web crawling platform

4   22   22  

ragno

Common Lisp Web crawling library based on Psychiq.

2   22   22  

udemy-crawler

Crawls Udemy course info and saves it in JSON format.

7   22   22  

crawl-original-google-images

Python scripts for crawling original images from Google Images

4   22   22  

scraper

All In One API to easily scrape data from any website, without worryin...

3   22   22  

arxiv2text

Converting PDF files to text, mainly with a focus on arXiv papers.

2   22   22  

crawlbase-python

Fast Python library for the Crawlbase API

2   22   22  

SlackWebhooksGithubCrawler

Search for Slack webhook tokens publicly exposed on GitHub

1   21   21  

crawling-framework

Easily crawl news portals or blog sites using Storm Crawler.

4   21   21  

html-article-extractor

A web page content extractor

1   21   21  

proxycrawl-php

ProxyCrawl PHP library for scraping and crawling websites

5   21   21  

the-seinfeld-chronicles

A dataset for textual analysis on arguably the best written comedy tel...

2   21   21  

DDMKL

Bibliographic data analysis of doctoral dissertations on modern Korean literature

5   21   21  

product-integrations

Code examples and general information

10   21   21  

GlassFrog

Keyword Search & Information Gathering Tool

4   21   21  

crawler

Crawl your own website with various clients for SEO and indexing purpo...

4   20   20  

afreecatv-chat-crawler

⚡️ Real-time AfreecaTV chat crawling using WebSockets

3   20   20  

path-finder-rl

Method For Establishing Database For Global Value Chain For Parts Proc...

14   20   20  

abx-spec-behaviors

🧩 Proposal to allow user scripts like "expand comments", "hide popups...

0   19   19  

DCinsideAlarm

New-post notification program for DCinside and Arca Live

6   19   19  

mobile-de-car-data-collector

Crawl, scrape and persist Mobile.de car listings data in a smart & res...

3   19   19  

xXx___dead___xXx

b̶̡̪̬͒l̸̰̗̝̀ỏ̷̡̩g̴͇̑g̶̲̱̽͐i̵̹͗n̶̤̥͂̅̆g̴̮̾̅͜ ̷̧͎͆i̷̛͒͜͠n̸̥̺͒ ̶͚͚͊̿͜t̸̆...

1   19   19  

scrapy-fieldstats

A Scrapy extension to log items coverage when the spider shuts down

4   19   19  

PyCarGr

PyCarGr - Unofficial car.gr API

15   19   19  

scrapyteer

Web crawling & scraping framework for Node.js on top of headless Chrom...

0   19   19  

old_ver_bot

A Python Slack crawling bot. It's a Slack bot made with python+flask+bs4....

7   18   18  

mida

MIDA: A Tool for Measuring the Internet

4   18   18  

XML-Parser

A Node.js XML DOM, Parser & Stringifier.

8   18   18  

web-search-engine-UIC

CS 582 Information Retrieval at University of Illinois at Chicago. Mul...

4   18   18  

fastcrawler

Modern, fast (high-performance) asynchronous scraping framework based...

2   18   18  

scrapingai

Build web scraping agents using AI to auto-extract the data from websi...

3   18   18  

webscrape-tutorial

A basic tutorial to web scraping using python for beginners

0   17   17  

deephotel

scraping TripAdvisor, Booking.com with Scrapy

10   17   17  

go-scrapy

Web crawling and scraping framework for Golang

2   16   16  

pyReptile

web crawling & scraping framework for Python

7   16   16  

WebSearch

Python module allowing you to do various searches for links on the Web...

8   16   16  

Google-Search-URL-Crawler

Desktop app that crawls urls from Google's search engine results

2   16   16  

scrapy-scraper

Web crawler and scraper based on Scrapy and Playwright's headless brow...

4   16   16  

velog-dashboard

2023.11) velog statistics dashboard fullstack

1   16   16  

twitter-account-data-crawler

Crawl and track followers count of Twitter account

2   16   16  

re-employment-kraken

re-employment-kraken scrapes (job) sites, remembers what it saw and no...

1   15   15  

free-llmstxt-generator

converts webpage content into Markdown format, optimized for LLM train...

1   15   15  

crawlly

A simple web crawler in Go

12   15   15  

kasthack.osp

Generator of raw VK user dumps.

5   15   15  

img-cli

An interactive Command-Line Interface built in NodeJS for downloading...

3   15   15  

facebook-scraper-for-non-english-user

crawling Facebook page

4   15   15