FlexLLMGen

FMInference

Running large language models on a single GPU for throughput-oriented scenarios.

Stars: 9.4k
Forks: 592
Watchers: 9.4k
Language: Python
License: Apache-2.0
SrcLog Score: 100
Cost to Build: $2.09M
Market Value: $8.65M

Growth over time (chart of stars, forks, and watchers; 4 data points, 2023-03-01 → 2026-04-01)

How to clone FlexLLMGen

Clone via HTTPS

git clone https://github.com/FMInference/FlexLLMGen.git

Clone via SSH

git clone git@github.com:FMInference/FlexLLMGen.git

Download ZIP

Download master.zip
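
The three download options above differ only in the URL they use. As a minimal sketch, the Python snippet below derives all three from the owner and repository name. The `GitHubRepo` class and its method names are hypothetical helpers written for illustration, and the ZIP path assumes GitHub's usual `/archive/refs/heads/<branch>.zip` layout with `master` as the branch, as the "master.zip" label suggests.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GitHubRepo:
    """Hypothetical helper that builds download URLs for a GitHub repository."""

    owner: str
    name: str

    def https_url(self) -> str:
        # URL used by "Clone via HTTPS"
        return f"https://github.com/{self.owner}/{self.name}.git"

    def ssh_url(self) -> str:
        # URL used by "Clone via SSH" (requires an SSH key registered with GitHub)
        return f"git@github.com:{self.owner}/{self.name}.git"

    def zip_url(self, branch: str = "master") -> str:
        # Assumption: the archive follows GitHub's usual branch-archive layout
        return (
            f"https://github.com/{self.owner}/{self.name}"
            f"/archive/refs/heads/{branch}.zip"
        )


repo = GitHubRepo(owner="FMInference", name="FlexLLMGen")
print("git clone", repo.https_url())
print("git clone", repo.ssh_url())
print("ZIP:", repo.zip_url())
```

HTTPS cloning needs no account setup for a public repository, while the SSH form is typically preferred by contributors who already have a key configured.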

Found an issue?

Report bugs or request features on the FlexLLMGen issue tracker:

Open GitHub Issues