KVSplit

dipampaul17

Run larger LLMs with longer contexts on Apple Silicon by using differentiated precision for KV cache quantization. KVSplit enables 8-bit keys & 4-bit values, reducing memory by 59% with <1% quality loss. Includes benchmarking, visualization, and one-command setup. Optimized for M1/M2/M3 Macs with Metal support.
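The 59% figure can be sanity-checked with a little arithmetic. The sketch below is a back-of-envelope estimate, not code from the repository. It assumes the 8-bit and 4-bit formats are llama.cpp-style `q8_0` and `q4_0` blocks (32 values plus one fp16 scale per block), which the description above does not confirm. The function names and the model shape are hypothetical, chosen only for illustration.

```python
# Back-of-envelope KV cache sizing.
# ASSUMPTION: llama.cpp-style block formats; the description does not name them.
BITS_PER_VALUE = {
    "f16": 16.0,
    "q8_0": 34 * 8 / 32,  # 32 int8 values + one fp16 scale = 34 bytes -> 8.5 bits
    "q4_0": 18 * 8 / 32,  # 32 4-bit values + one fp16 scale = 18 bytes -> 4.5 bits
}


def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_tokens, key_type, value_type):
    """Total bytes needed to store keys and values across all layers."""
    values_per_tensor = n_layers * n_kv_heads * head_dim * n_tokens
    bits = values_per_tensor * (BITS_PER_VALUE[key_type] + BITS_PER_VALUE[value_type])
    return bits / 8


def savings_vs_f16(key_type, value_type, **shape):
    """Fraction of memory saved relative to an all-fp16 cache."""
    baseline = kv_cache_bytes(key_type="f16", value_type="f16", **shape)
    quantized = kv_cache_bytes(key_type=key_type, value_type=value_type, **shape)
    return 1.0 - quantized / baseline


# Hypothetical model shape, for illustration only.
shape = dict(n_layers=22, n_kv_heads=4, head_dim=64, n_tokens=8192)
print(f"{savings_vs_f16('q8_0', 'q4_0', **shape):.1%}")  # prints 59.4%
```

Under these assumptions the saving is 1 - (8.5 + 4.5) / 32, independent of the model shape, which is consistent with the 59% quoted in the description.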

Stars: 359
Forks: 13
Watchers: 359
Language: Python
License: Other
Cost to build: $23.5K
Market value: $69.7K

How to clone KVSplit

Clone via HTTPS

git clone https://github.com/dipampaul17/KVSplit.git

Clone via SSH

git clone git@github.com:dipampaul17/KVSplit.git

Download ZIP

Download master.zip

Found an issue?

Report bugs or request features on the KVSplit issue tracker:

Open GitHub Issues