Most popular crawling repositories and open source projects

vba-crawler

VBA web crawler using HTTP GET/POST

3   6   6  

Formosan-languages

Dataset of Formosan-Mandarin sentence pairs

6   6   6  

crawl-agoda

3   6   6  

SLR-Tools

Python scripts to perform a systematic literature review for Google Sc...

2   6   6  

order-metrics-data-automation

OrderMetrics.io Automation for data from there to Google Sheets (sprea...

2   5   5  

GooglePlayDatabaseMirror

Repository for designing a crawler script to update a mirror database f...

0   5   5  

Shopee-Crawler

Crawl data from shopee.vn

12   5   5  

crwlr

🕷 A minimal Puppeteer crawler API

0   5   5  

CountriesSearchEngine

A search engine built to retrieve geographical information of any coun...

4   5   5  

PlatformsCrawler

Multi-platform crawler with modular management, used to collect data and publish it via Redis pub/sub

2   5   5  

webArchive

Crawls websites and saves found URLs to a file.

1   5   5  

Puppeteer

Puppeteer proxy authentication example for Decodo

3   5   5  

everytime-timetable-crawling

Crawling course timetables from Everytime

2   5   5  

Fundamentus_scraping

Crawler for the Fundamentus.com website using the Scrapy framework, both ...

0   5   5  

proxycrawl-java

ProxyCrawl Java library for scraping and crawling

0   5   5  

Scrapy

Scrapy proxy authentication example for Decodo

0   5   5  

scrap-superloto

A web scraping project to fetch all lottery winning numbers, date, pr...

1   5   5  

crawler-webpage

Crawling data from vnexpress.net for my school subjects

7   5   5  

bm25-ranking-php

Ranks Reuters documents using the BM25 ranking algorithm.

3   5   5  

Text_mining

Consumer analysis using text mining: Naver Shopping review crawling

2   5   5  

coronaflight-hkg

😷 Crawler and history manager for dangerous, coronavirus-infected fli...

0   5   5  

naver_webtoon

A deep learning and machine learning model that predicts the success of debut webtoons based on reader reactions

0   5   5  

gumbo-parser-cpp

C++ Library to Extract Information from the Google Gumbo HTML Parse Tr...

2   5   5  

FALL

An automated penetration testing tool

1   5   5  

Learning-By-Crawling

Riot Games API crawler and a machine learning project. Created in Hask...

2   5   5  

estela-entrypoint

estela entrypoint for job runner 🕸

2   5   5  

namu-soup

숲Soup - a crawler for NamuWiki trending search terms

0   5   5  

woocommerce-scraper

The best scraping solution for WooCommerce

1   5   5  

amazon_luwak_coffee_scraper

This repo contains a Python-based web crawler that scrapes data on Luw...

0   5   5  

Migale

Migale was born out of a need to extract data quickly and with a very...

0   5   5  

Advanced-proxy-Scraper

Advanced Proxy Scraper Crawler fetcher

1   5   5  

sitemapr

sitemapr is a library that generates sitemaps for SPA websites by read...

1   5   5  

craw-BadanPusatStatistik

craw-BadanPusatStatistik is a program for retrieving data from the webs...

0   5   5  

Awesome-Web-Scraping

A list of libraries, tools, and APIs for web scraping and data process...

1   5   5  

Scraping-IMDB

This Python script extracts comprehensive movie data from IMDB, focusi...

2   5   5  

Instagram-image-downloader

💟 Instagram Image Downloader

1   5   5  

craw-Pinterest

Performs web scraping and retrieves images based on search keyword...

1   5   5  

ScrapySub

ScrapySub is a Python library designed to recursively scrape website c...

0   5   5  

bot-safe-agents

A library for fetching a list of bot-safe user agents.

0   5   5  

Spider

This asynchronous web crawler is designed for reconnaissance tasks. It...

0   5   5  

EPhoto360

Create text effects online, effects online for free, photo frames, ma...

5   4   4  

estela-cli

estela Command Line Client 🕸

3   4   4  

buscando-meu-carro

buscando-meu-carro is a repository containing a Python project that...

0   4   4  

Playwright

Playwright proxy authentication & scraping example for Decodo

0   4   4  

rag-backend

Retrieval-Augmented Generation server with Pinecone and OpenAI

0   4   4  

Naver-cafe-crawling-ver240115

Naver cafe crawling using search keywords / keyword-search-focused Naver ca...

1   4   4  

Firecrawl

Generated C# SDK based on official Firecrawl OpenAPI specification

1   4   4  

ArtStyle-Detector

A project aiming to detect artstyles from images. It queries Wikimedia...

2   4   4  

BiLSTM-StockPrediction-Algorithm

Research code for a paper on a bidirectional LSTM-based stock price prediction algorithm.

0   4   4  

craw-kompas

Crawling and scraping data from the Kompas news website

0   4   4