Most popular crawling repositories and open source projects

pumba

Fetch, store and access user agent strings for different browsers

1   15   15  

FundCrawler

Tiantian Fund (天天基金) crawler that scrapes every fund on the market: fund information, net asset values, holdings, fund companies, fund...

4   14   14  

CSCI572-Information_Retrieval_And_Web_Search_Engines

Search Engine projects

17   14   14  

node-vgmusic-downloader

Node.js tool for downloading all free MIDI files on VGMusic.com

5   13   13  

nutch-solr-integration

An ultra small PoC to show how to combine Apache Nutch and Apache Solr...

10   13   13  

React-YouTube-Comment-Section-Scraper

A full stack application that scrapes & filters YouTube comments using...

6   13   13  

Cross-The-Floor

Uses Sankey Diagrams to visualize politicians that have "crossed the f...

0   13   13  

house-bob

A django application for scraping properties with scrapy.

8   13   13  

omkar-temp-mail

🚀 OMKAR TEMP MAIL HELPS YOU USE TEMPORARY EMAILS. 🤖

4   13   13  

oda

Extraction, versioning and machine-readable provisioning of public dat...

0   12   12  

Quotes-Crawling

Crawl Anne Shirley's quotes from the web | Extract Anne Shirley's quotes from...

0   12   12  

smd

Simple Manga Downloader, a tool to search and download manga

0   12   12  

insta-downloader

Download Instagram posts with this script

3   12   12  

wikipedia-externallinks-fast-extraction

Fast extraction of all external links from wikipedia

1   12   12  

crawl-tiki-products

Demonstration of crawling laptop products on the Tiki e-commerce website

10   12   12  

shopee-crawler

Simple scripts for crawling shopee's shop and product information from...

8   12   12  

SECTOOL

Search engine scraper tool (Bash)

4   12   12  

FinalProject-Datascience

Final project for the Applied Data Science course. Collects data by...

6   12   12  

darklight

Engine for collecting onion domains and crawling webpages based on...

3   12   12  

crawler-ts

Crawler written in TypeScript using ES6 generators.

1   12   12  

node-raspar

🕷️ Easily scrape the web for torrent and media files.

7   12   12  

WeiboStockAnalysisCharts

Crawls Chinese stock recommendations from Sina Weibo and creates pyecharts...

3   11   11  

data_camp_wcr

Source code for cohorts 1-2 of the "Practical Web Crawling with Python" CAMP course

6   11   11  

newscorpus

Docker🐳 setup for automated news article crawling from German news we...

2   11   11  

googlescholar-crawler

A crawler for papers from Google Scholar (http://scho...

0   11   11  

ResearchGateCrawler

Python script for crawling ResearchGate.net papers.✨⭐️📎

0   11   11  

Crizensolution_Project_CrawlingWebsite

Crawls "Naver Real Estate" (네이버부동산) with Selenium and Jsoup, and uses Spring for dynamic...

0   11   11  

Arachnida

App to scrape the web, for people without coding skills. Fully integrat...

11   11   11  

mb-checker

Python scripts: the first traverses the Chrome bookmarks file and the second remove...

0   11   11  

route-waypoints

Crawling route waypoints for HK bus routes

12   11   11  

crawling-study

Python crawling study notes

3   10   10  

book-product-data-pipeline-project

Automate ETL pipeline, build a data warehouse.

2   10   10  

crawler

🕸️ Awesome scenario-based crawler

2   10   10  

fundus-evaluation

[ACL 2024] Evaluation of the Fundus News Scraper

1   10   10  

poster-finder

Download all posters of a movie by URL

2   10   10  

big-data-ocr-ner

Applying Optical Character Recognition, Named Entity Detection, Object...

4   10   10  

Crawler-using-Scrapy

Crawling some e-commerce sites in Indonesia (blibli, bukalapak, lazada,...

11   10   10  

StackoverflowCrawler

A web crawler that crawls the Stack Overflow website.

0   10   10  

wp2static-addon-advanced-crawling

Advanced Crawling Add-on for WP2Static

10   10   10  

paytm-scraping-offers

Scraping & crawling all of the products (and their coupons, categories...

4   10   10  

ahegao

Repo for ahegao detection and style transfer

1   10   10  

playwright-task-server

A headless browser manager with multi tasking RESTful API, crawling or...

3   10   10  

isoxya-api

Isoxya Crawler API

0   10   10  

py_scripts_bots

Moderate bots for re-crawling social media.

3   10   10  

tarantula

Python web crawler tool

3   10   10  

Web-Crawling-Stock-Data-

Scrapes stock data from Eastmoney (东方财富网)

5   10   10  

ig-profile-scraper

Fetch and save real-time data anonymously from any Instagram profile w...

1   10   10  

quora-loader

A realtime read-only locator and extraction library for Quora question...

0   9   9  

StockExchangeCrawler

A crawler program to extract all of the data and the price for symbols...

8   9   9  

FarfetchCrawler

A web crawler for Farfetch (https://www.farfetch.com)

0   9   9