bong-water-water-bong

Developer

2 repositories on SrcLog

2 Repos · 2 Stars · 0 Forks · 2 Watchers

Repositories (2)

Local, ternary-weight LLM inference on AMD Strix Halo. Rust above the kernels, HIP below, zero Python at runtime. Discord: https://discord.gg/EhQgmNePg

Native ROCm C++ kernels for Strix Halo (gfx1151): ternary BitNet GEMV, RMSNorm, RoPE, split-KV Flash-Decoding attention. Zero hipBLAS, zero Python.