High-performance lightweight proxy and load balancer for LLM infrastructure. Intelligent routing, automatic failover and unified model discovery across local and remote inference backends.
What is the thushan/olla GitHub project? Description: "High-performance lightweight proxy and load balancer for LLM infrastructure. Intelligent routing, automatic failover and unified model discovery across local and remote inference backends.". Written in Go. Explain what it does, its main use cases, key features, and who would benefit from using it.
Question is copied to clipboard — paste it after the AI opens.
Clone via HTTPS
Clone via SSH
Download ZIP
Download master.zipReport bugs or request features on the olla issue tracker:
Open GitHub Issues