
KVSplit

Run larger LLMs with longer contexts on Apple Silicon by quantizing the keys and values of the KV cache at different precisions. KVSplit uses 8-bit keys and 4-bit values, reducing KV cache memory by 59% with <1% quality loss. It includes benchmarking, visualization, and one-command setup, and is optimized for M1/M2/M3 Macs with Metal support.
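As a rough check on the 59% figure, the sketch below estimates KV cache size from the effective bits stored per element. It assumes llama.cpp-style block formats (q8_0 for keys, q4_0 for values, each block of 32 values carrying one 16-bit scale); the description above does not say which formats KVSplit uses, and the model shape is a hypothetical example.

```python
# Effective bits per cached element.
# ASSUMPTION: llama.cpp-style block quantization; not confirmed by the text above.
BITS_PER_ELEMENT = {
    "f16": 16.0,
    "q8_0": 34 * 8 / 32,  # 32 int8 values + one fp16 scale = 34 bytes per block
    "q4_0": 18 * 8 / 32,  # 32 4-bit values + one fp16 scale = 18 bytes per block
}


def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_tokens, key_type, value_type):
    """Estimate KV cache size in bytes for the given key and value formats."""
    # One key element and one value element are stored per
    # (layer, KV head, head dimension, token) position.
    elements = n_layers * n_kv_heads * head_dim * n_tokens
    bits = elements * (BITS_PER_ELEMENT[key_type] + BITS_PER_ELEMENT[value_type])
    return bits / 8


# Hypothetical model shape, chosen only for illustration.
shape = dict(n_layers=32, n_kv_heads=8, head_dim=128, n_tokens=8192)

baseline = kv_cache_bytes(**shape, key_type="f16", value_type="f16")
split = kv_cache_bytes(**shape, key_type="q8_0", value_type="q4_0")
saving = 1 - split / baseline

print(f"FP16: {baseline / 2**20:.0f} MiB")
print(f"K8V4: {split / 2**20:.0f} MiB")
print(f"Saving: {saving:.1%}")
```

Under these assumptions the ratio is 13 bits against 32 bits per key/value pair, a saving of about 59%, whatever the model shape.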

How to download and set up KVSplit

Open a terminal and run:
git clone https://github.com/dipampaul17/KVSplit.git
git clone creates a local copy of the KVSplit repository from the URL you pass it.
It supports several network protocols, each with its own URL format.

Alternatively, download KVSplit as a zip archive: https://github.com/dipampaul17/KVSplit/archive/master.zip

Or clone KVSplit over SSH:
git clone git@github.com:dipampaul17/KVSplit.git
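The three ways of fetching the repository differ only in URL format. The sketch below derives each from the owner and repository name; `repo_urls` is a hypothetical helper written for illustration, not part of KVSplit, and the `master` branch name is taken from the zip link above.

```python
def repo_urls(owner, repo, branch="master"):
    """Return the HTTPS clone, SSH clone, and zip download URLs for a GitHub repo.

    Hypothetical helper for illustration; it only formats strings.
    """
    return {
        "https": f"https://github.com/{owner}/{repo}.git",
        "ssh": f"git@github.com:{owner}/{repo}.git",
        "zip": f"https://github.com/{owner}/{repo}/archive/{branch}.zip",
    }


urls = repo_urls("dipampaul17", "KVSplit")
for kind, url in urls.items():
    print(f"{kind:5} {url}")
```

The HTTPS form needs no account setup, while the SSH form requires an SSH key registered with GitHub.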

If you have problems with KVSplit

Open an issue on the KVSplit issue tracker: https://github.com/dipampaul17/KVSplit/issues