Craw4LLM

Official repository for "Craw4LLM: Efficient Web Crawling for LLM Pretraining"

How to download and set up Craw4LLM

Open a terminal and run the following command:
git clone https://github.com/cxcscmu/Craw4LLM.git
git clone creates a local copy (clone) of the Craw4LLM repository. You pass git clone a repository URL;
it supports several network protocols and their corresponding URL formats.

You may also download Craw4LLM as a zip archive: https://github.com/cxcscmu/Craw4LLM/archive/master.zip

Or clone Craw4LLM over SSH:
git clone git@github.com:cxcscmu/Craw4LLM.git
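
The download options above all follow a regular URL pattern built from the repository owner and name. As a rough sketch only, the hypothetical helper below (repo_urls is not part of Craw4LLM or git) prints the HTTPS, SSH, and zip forms. It assumes GitHub's usual git@github.com:owner/repo.git SSH format and the master branch named in the zip link above.

```shell
#!/bin/sh
# Hypothetical helper: print the HTTPS, SSH, and zip-archive URLs
# for a GitHub repository, given its owner and name.
repo_urls() {
    owner="$1"
    repo="$2"
    # HTTPS clone URL (no credentials needed for public repositories)
    echo "https://github.com/${owner}/${repo}.git"
    # SSH clone URL (assumes an SSH key is registered with GitHub)
    echo "git@github.com:${owner}/${repo}.git"
    # Zip archive of the master branch (branch name taken from the link above)
    echo "https://github.com/${owner}/${repo}/archive/master.zip"
}

repo_urls cxcscmu Craw4LLM
```

Either of the first two printed URLs can be passed to git clone. The third is a plain file download and does not include the git history.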

If you have problems with Craw4LLM

You may open an issue on the Craw4LLM issue tracker: https://github.com/cxcscmu/Craw4LLM/issues