KVSplit

Run larger LLMs with longer contexts on Apple Silicon by using differentiated precision for KV cache quantization. KVSplit enables 8-bit keys & 4-bit values, reducing memory by 59% with <1% quality loss. Includes benchmarking, visualization, and one-command setup. Optimized for M1/M2/M3 Macs with Metal support.
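The 59% figure can be sanity-checked with simple arithmetic. The sketch below is an illustration, not KVSplit's own code. It assumes the baseline is an FP16 cache and that the 8-bit and 4-bit formats store one 16-bit scale per block of 32 elements, as in the q8_0 and q4_0 layouts of llama.cpp. The model shape and the helper name `kv_cache_bytes` are hypothetical.

```python
# Bits stored per cached element, including per-block scale overhead.
# Assumption: blocks of 32 elements with one 16-bit scale each
# (the q8_0 / q4_0 layouts of llama.cpp); FP16 has no overhead.
BITS_PER_ELEMENT = {
    "f16": 16.0,
    "q8_0": (32 * 8 + 16) / 32,  # 8.5 bits per element
    "q4_0": (32 * 4 + 16) / 32,  # 4.5 bits per element
}


def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_ctx, key_type, value_type):
    """Approximate KV cache size in bytes for separate key and value formats."""
    # Keys and values each hold one element per layer, head, dimension, and token.
    elements = n_layers * n_kv_heads * head_dim * n_ctx
    bits = elements * (BITS_PER_ELEMENT[key_type] + BITS_PER_ELEMENT[value_type])
    return bits / 8


# Hypothetical model shape, chosen only to make the numbers concrete.
shape = dict(n_layers=32, n_kv_heads=8, head_dim=128, n_ctx=8192)

baseline = kv_cache_bytes(**shape, key_type="f16", value_type="f16")
split = kv_cache_bytes(**shape, key_type="q8_0", value_type="q4_0")
reduction = 1 - split / baseline

print(f"FP16: {baseline / 2**20:.0f} MiB, K8V4: {split / 2**20:.0f} MiB, "
      f"saved {reduction:.1%}")
# prints: FP16: 1024 MiB, K8V4: 416 MiB, saved 59.4%
```

Under these assumptions the saving is 1 - 13/32, which is independent of the model shape and is consistent with the 59% quoted in the description.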

optimization

Stars: 361
Forks: 13
Watchers: 361
Language: Python
License: other
SrcLog Score: 100
Cost to Build: $24.2K
Market Value: $59.8K

Growth over time (stars, forks, watchers): 3 data points, 2025-09-25 to 2026-04-21

How to clone KVSplit

Clone via HTTPS

git clone https://github.com/dipampaul17/KVSplit.git

Clone via SSH

git clone git@github.com:dipampaul17/KVSplit.git

Download ZIP

Download master.zip

Found an issue?

Report bugs or request features on the KVSplit issue tracker:

Open GitHub Issues

Similar to KVSplit

prepack, svgo, closure-compiler, llvm, clean-css, simplify, imagemin, game-programming-patterns, webpackmonitor, reactopt, BayesianOptimization, nnvm, MTuner, webdnn, easyengine, gosl, soot, scikit-optimize, DietPi, faster, incubator-kie-optaplanner, search-engine-optimization, react-ssr-optimization, opticss, wheels, owl, meshoptimizer, MLBox, JuMP.jl, eaopt