KVSplit

dipampaul17

Run larger LLMs with longer contexts on Apple Silicon by using differentiated precision for KV cache quantization. KVSplit enables 8-bit keys & 4-bit values, reducing memory by 59% with <1% quality loss. Includes benchmarking, visualization, and one-command setup. Optimized for M1/M2/M3 Macs with Metal support.
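The 59% figure can be sanity-checked with a little arithmetic. The sketch below is not taken from the KVSplit code. It assumes the 8-bit and 4-bit formats are block-quantized in the style of llama.cpp's q8_0 and q4_0 (32 values per block plus one 16-bit scale, so 8.5 and 4.5 effective bits per value), and the model shape in the usage lines is hypothetical.

```python
# Assumption: block formats store 32 values plus one 16-bit scale per block,
# as llama.cpp's q8_0 / q4_0 do. This is an illustration, not KVSplit's code.
FP16_BITS = 16.0
Q8_BITS = (32 * 8 + 16) / 32  # 8.5 effective bits per value
Q4_BITS = (32 * 4 + 16) / 32  # 4.5 effective bits per value


def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_tokens, key_bits, value_bits):
    """Size of a KV cache in bytes for the given per-element bit widths."""
    # One key element and one value element exist per (layer, head, dim, token).
    elems = n_layers * n_kv_heads * head_dim * n_tokens
    return elems * (key_bits + value_bits) / 8


def reduction_vs_fp16(key_bits, value_bits):
    """Fraction of memory saved relative to FP16 keys and FP16 values."""
    return 1 - (key_bits + value_bits) / (2 * FP16_BITS)


# Hypothetical model shape: 32 layers, 8 KV heads, head_dim 128, 8192 tokens.
shape = (32, 8, 128, 8192)
fp16 = kv_cache_bytes(*shape, FP16_BITS, FP16_BITS)
k8v4 = kv_cache_bytes(*shape, Q8_BITS, Q4_BITS)
print(f"FP16: {fp16 / 2**20:.0f} MiB, K8V4: {k8v4 / 2**20:.0f} MiB")
print(f"Reduction: {reduction_vs_fp16(Q8_BITS, Q4_BITS):.1%}")
```

Under these assumptions the reduction works out to 1 - 13/32, about 59.4%, which is consistent with the 59% quoted in the description. With exactly 8 and 4 bits and no per-block scale it would be 62.5%.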

Stars: 361
Forks: 13
Watchers: 361
Language: Python
License: other
SrcLog Score: 100
Cost to Build: $23.8K
Market Value: $70.7K

Growth over time (chart of stars, forks, and watchers; 3 data points, 2025-09-25 → 2026-04-21)

How to clone KVSplit

Clone via HTTPS

git clone https://github.com/dipampaul17/KVSplit.git

Clone via SSH

git clone git@github.com:dipampaul17/KVSplit.git

Download ZIP

Download master.zip

Found an issue?

Report bugs or request features on the KVSplit issue tracker:

Open GitHub Issues