Tokenizer

Fast and customizable text tokenization library with BPE and SentencePiece support

How to download and set up Tokenizer

Open a terminal and run the command:
git clone https://github.com/OpenNMT/Tokenizer.git
git clone creates a local copy of the Tokenizer repository from the URL you pass to it.
It supports several network protocols, each with its own URL format, including HTTPS and SSH.

Alternatively, download Tokenizer as a zip archive: https://github.com/OpenNMT/Tokenizer/archive/master.zip

Or clone Tokenizer over SSH:
git clone git@github.com:OpenNMT/Tokenizer.git
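
The download options above can be sketched as a small shell script. This is only an illustration: the variable names are hypothetical, the SSH form assumes GitHub's usual git@github.com:OWNER/REPO.git layout, and the commands that need network access are left commented out.

```shell
#!/bin/sh
# Illustrative variable names; only the URLs come from this page.
repo="OpenNMT/Tokenizer"

# HTTPS form: works without any account setup.
https_url="https://github.com/${repo}.git"

# SSH form (assumed standard GitHub layout): needs an SSH key
# registered with your GitHub account.
ssh_url="git@github.com:${repo}.git"

# Zip archive of the master branch, for use without git.
zip_url="https://github.com/${repo}/archive/master.zip"

# By default git clone names the checkout directory after the last
# path component of the URL, with the ".git" suffix removed.
target_dir="$(basename "$https_url" .git)"

echo "HTTPS URL:       $https_url"
echo "SSH URL:         $ssh_url"
echo "Zip archive:     $zip_url"
echo "Clone directory: $target_dir"

# Uncomment ONE of the following to actually download (requires network):
# git clone "$https_url"
# git clone "$ssh_url"
# curl -L -o Tokenizer-master.zip "$zip_url" && unzip Tokenizer-master.zip
```

Running the script as written only prints the three URLs and the directory name that a clone would create, so it is safe to try before choosing a download method.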

If you have problems with Tokenizer

You can open an issue on the Tokenizer issue tracker: https://github.com/OpenNMT/Tokenizer/issues

Repositories similar to Tokenizer

Alternatives and analogs to Tokenizer:

aws-doc-sdk-examples, awesome-cpp, infer, natural-language-processing, openage, ChakraCore, OpenRCT2, lectures, spaCy, openpose, x64dbg, HanLP, gensim, cpr, tinyrenderer, MatchZoo, tensorflow-nlp, finalcut, Awesome-pytorch-list, MuseScore, appleseed, openauto, hotspot, awesome-quant, arl, vectiler, spacy-models, KaHIP, Catch2, yuzu