ROCm-compatible fork of Bittensor — full PyTorch 2.4 ROCm support; Wallet, Metagraph, and Dendrite fully working.
AMD/NVIDIA GPU cluster infrastructure — ~300-GPU deployment, ROCm, kernel tuning, multi-node benchmarking
Production-grade local LLM deployment stack — llama.cpp, Ollama, GGUF/GGML, ROCm on AMD GPUs, 14B to 80B models
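As a rough sizing aid for the 14B–80B range above, a minimal footprint estimator for GGUF-quantized models. The bits-per-weight figures are approximate community rules of thumb for common llama.cpp quant types (my assumption, not values from this stack; real files add small per-tensor overhead):

```python
# Rough disk/VRAM footprint estimator for GGUF-quantized models.
# Bits-per-weight values are approximate rules of thumb for common
# llama.cpp quant types, not exact format constants.
BITS_PER_WEIGHT = {
    "Q4_K_M": 4.8,
    "Q5_K_M": 5.7,
    "Q8_0": 8.5,
    "F16": 16.0,
}

def gguf_size_gib(n_params: float, quant: str) -> float:
    """Approximate model size in GiB for a parameter count and quant type."""
    bits = BITS_PER_WEIGHT[quant]
    return n_params * bits / 8 / 2**30  # bits -> bytes -> GiB

for params, label in [(14e9, "14B"), (80e9, "80B")]:
    for quant in ("Q4_K_M", "Q8_0"):
        print(f"{label} @ {quant}: ~{gguf_size_gib(params, quant):.1f} GiB")
```

Useful as a first-pass check of whether a given quant of a model fits a card's VRAM before downloading it.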
Production infrastructure scripts — ROCm setup, multi-GPU config, server hardening, LLM deployment automation
FLUX.1-dev on AMD Radeon consumer GPUs — fast, low-VRAM, and shippable. Backport patches + benchmarks for torchao + diffusers group_offload on ROCm.
Multi-GPU tensor/context parallel diffusion on AMD ROCm — with the patch that makes it actually work.
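The tensor-parallel half of the line above comes down to sharding weight matrices across GPUs. A dependency-free sketch of column-parallel matmul, with devices simulated as plain Python lists — the actual repo does this with torch.distributed on ROCm, which this illustration does not reproduce:

```python
# Column-parallel linear layer, the core idea behind tensor parallelism:
# each "device" holds a column shard of W, computes its partial output
# independently, and the results are concatenated (an all-gather).

def matmul(x, w):
    """x: vector of length k; w: k x n matrix (list of rows) -> length-n vector."""
    n = len(w[0])
    return [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(n)]

def shard_columns(w, n_devices):
    """Split a k x n weight matrix into n_devices equal column shards."""
    step = len(w[0]) // n_devices
    return [[row[d * step:(d + 1) * step] for row in w] for d in range(n_devices)]

def column_parallel_matmul(x, w, n_devices):
    """Each device multiplies against its shard; concat stands in for all-gather."""
    partials = [matmul(x, shard) for shard in shard_columns(w, n_devices)]
    return [v for part in partials for v in part]

x = [1.0, 2.0]
w = [[1.0, 2.0, 3.0, 4.0],
     [5.0, 6.0, 7.0, 8.0]]
assert column_parallel_matmul(x, w, 2) == matmul(x, w)
```

Context (sequence) parallelism is the complementary split — partitioning the token/latent dimension rather than the weights — and is not shown here.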