Most popular crawling repositories and open-source projects
the-seinfeld-chronicles
A dataset for textual analysis on arguably the best-written comedy tel...
2 21 21
mobile-de-car-data-collector
Crawl, scrape and persist Mobile.de car listings data in a smart & res...
3 19 19
web-search-engine-UIC
CS 582 Information Retrieval at University of Illinois at Chicago. Mul...
4 18 18
re-employment-kraken
re-employment-kraken scrapes (job) sites, remembers what it saw and no...
1 15 15
free-llmstxt-generator
Converts webpage content into Markdown format, optimized for LLM train...
1 15 15