
KVQuant

[NeurIPS 2024] KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization

How to download and set up KVQuant

Open a terminal and run the following command:
git clone https://github.com/SqueezeAILab/KVQuant.git
git clone creates a local copy (a clone) of the KVQuant repository. You pass git clone a repository URL;
it supports several network protocols, each with its own URL format.

You can also download KVQuant as a zip file: https://github.com/SqueezeAILab/KVQuant/archive/master.zip

Or clone KVQuant over SSH:
git clone git@github.com:SqueezeAILab/KVQuant.git
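The HTTPS and SSH clone URLs point at the same repository and differ only in form. As a minimal sketch, the helper below prints the clone URL for either protocol. The function name repo_url and its arguments are hypothetical, introduced here for illustration only; they are not part of KVQuant or git.

```shell
#!/bin/sh
# Hypothetical helper (not part of KVQuant or git): print the GitHub
# clone URL for a repository in the requested protocol.
repo_url() {
    protocol="$1"   # "https" or "ssh"
    repo="$2"       # "owner/name", e.g. SqueezeAILab/KVQuant
    case "$protocol" in
        https) printf 'https://github.com/%s.git\n' "$repo" ;;
        ssh)   printf 'git@github.com:%s.git\n' "$repo" ;;
        *)     echo "unknown protocol: $protocol" >&2; return 1 ;;
    esac
}

# Usage (needs network access, and SSH needs a key registered with GitHub):
# git clone "$(repo_url https SqueezeAILab/KVQuant)"
# git clone "$(repo_url ssh SqueezeAILab/KVQuant)"
```

HTTPS works without any account setup for a public repository, while SSH requires a key added to your GitHub account, so HTTPS is usually the simpler choice for a read-only copy.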

If you have problems with KVQuant

You can open an issue on the KVQuant issue tracker: https://github.com/SqueezeAILab/KVQuant/issues