
web-languages

Crowd-sourced lists of URLs to help Common Crawl crawl under-resourced languages. See https://github.com/commoncrawl/web-languages-code/ for the code.

How to download and set up web-languages

Open a terminal and run:
git clone https://github.com/commoncrawl/web-languages.git
git clone creates a local copy of the web-languages repository from the repository URL you pass it.
It supports several network protocols, each with its own URL format.

You can also download web-languages as a zip file: https://github.com/commoncrawl/web-languages/archive/master.zip

Or clone web-languages over SSH:
git clone git@github.com:commoncrawl/web-languages.git
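
The download options above can be collected into one small shell sketch. It only builds the URLs and prints the matching commands, so nothing is fetched until you copy one of them. The variable names are illustrative and not part of the project, the SSH address assumes GitHub's usual git@github.com:owner/repo.git form, and the curl and unzip step is one possible way to fetch the zip, not something the project prescribes.

```shell
#!/bin/sh
# Illustrative names only; none of these variables come from the project.
owner_repo="commoncrawl/web-languages"

# HTTPS clone URL (works without any GitHub account setup).
https_url="https://github.com/${owner_repo}.git"
# SSH clone address (assumes an SSH key is registered with GitHub).
ssh_url="git@github.com:${owner_repo}.git"
# Zip archive of the master branch, for machines without git.
zip_url="https://github.com/${owner_repo}/archive/master.zip"

# Print the command for each option; run whichever one suits you.
echo "git clone ${https_url}"
echo "git clone ${ssh_url}"
echo "curl -L -o web-languages.zip ${zip_url} && unzip web-languages.zip"
```

HTTPS is the simplest choice for a read-only copy. SSH is mainly useful if you plan to push changes from a fork, and the zip download gives you the files without any git history.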

If you have problems with web-languages

You can open an issue on the web-languages issue tracker: https://github.com/commoncrawl/web-languages/issues