
fastLLaMa

An experimental high-performance framework for running Decoder-only LLMs with 4-bit quantization in Python, using a C/C++ backend.

How to download and set up fastLLaMa

Open a terminal and run:
git clone https://github.com/PotatoSpudowski/fastLLaMa.git
git clone creates a local copy of the fastLLaMa repository. You pass git clone a repository URL; it supports several network protocols and corresponding URL formats.
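The clone step above can be sketched as a short shell session. This is a minimal sketch assuming git is installed and network access is available; the target directory name fastLLaMa is derived from the repository name.

```shell
# Clone the repository over HTTPS and enter the source tree.
git clone https://github.com/PotatoSpudowski/fastLLaMa.git
cd fastLLaMa
```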

Alternatively, you can download fastLLaMa as a zip file: https://github.com/PotatoSpudowski/fastLLaMa/archive/master.zip
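Fetching the zip archive from the command line can be sketched as follows, assuming curl and unzip are installed. GitHub archives typically unpack into a directory named after the repository and branch (here fastLLaMa-master).

```shell
# Download a zip snapshot of the default branch and unpack it.
curl -L -o fastLLaMa.zip https://github.com/PotatoSpudowski/fastLLaMa/archive/master.zip
unzip fastLLaMa.zip        # typically unpacks into fastLLaMa-master/
cd fastLLaMa-master
```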

Or simply clone fastLLaMa with SSH:
git clone git@github.com:PotatoSpudowski/fastLLaMa.git
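The SSH URL is passed to git clone just like the HTTPS one; it requires an SSH key registered with your GitHub account. A minimal sketch, including GitHub's standard authentication check (ssh -T prints a greeting but exits non-zero because GitHub does not provide shell access):

```shell
# Optional: verify SSH authentication to GitHub.
# A non-zero exit here is expected even on success, hence the || true.
ssh -T git@github.com || true

# Clone over SSH.
git clone git@github.com:PotatoSpudowski/fastLLaMa.git
```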

If you have problems with fastLLaMa

You can open an issue on the fastLLaMa issue tracker: https://github.com/PotatoSpudowski/fastLLaMa/issues

Repositories similar to fastLLaMa

Here are some fastLLaMa alternatives and analogs:

aws-doc-sdk-examples, awesome-cpp, infer, openage, ChakraCore, OpenRCT2, openpose, x64dbg, cpr, tinyrenderer, finalcut, MuseScore, appleseed, openauto, hotspot, awesome-quant, arl, vectiler, KaHIP, Catch2, yuzu, compiler-explorer, scylladb, rxterm, vcpkg, project-based-learning, BackgroundMusic, cpp_redis, ogre, CS-Notes