vllm-mlx

by waybarrios

OpenAI and Anthropic compatible server for Apple Silicon. Run LLMs and vision-language models (Llama, Qwen-VL, LLaVA) with continuous batching, MCP tool calling, and multimodal support. Native MLX backend, 400+ tok/s. Works with Claude Code.
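Because the server advertises OpenAI API compatibility, a standard chat-completions request should work against it. A minimal sketch using only the Python standard library; the local port (8000), the `/v1/chat/completions` route, and the model identifier are assumptions, not confirmed defaults of vllm-mlx:

```python
import json
from urllib import request

# Hypothetical local endpoint; vllm-mlx advertises OpenAI API compatibility,
# so the conventional /v1/chat/completions route is assumed here.
BASE_URL = "http://localhost:8000/v1"


def build_chat_request(model, messages, max_tokens=128):
    """Build an OpenAI-style chat-completions payload (plain dict)."""
    return {
        "model": model,
        "messages": messages,
        "max_tokens": max_tokens,
    }


def send_chat_request(payload, base_url=BASE_URL):
    """POST the payload to the (assumed) chat-completions endpoint."""
    req = request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)


if __name__ == "__main__":
    payload = build_chat_request(
        model="mlx-community/Llama-3.2-3B-Instruct-4bit",  # hypothetical model id
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(json.dumps(payload, indent=2))
    # send_chat_request(payload) would dispatch it once the server is running.
```

The same request shape is what clients like Claude Code or the official `openai` SDK would send under the hood, which is why drop-in compatibility works.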

Stars: 978
Forks: 146
Watchers: 978
Language: Python
License: Apache-2.0
SrcLog Score: 100
Cost to Build (estimate): $402.2K
Market Value (estimate): $2.20M

Growth over time: 1 data point (2026-04-25).


How to clone vllm-mlx

Clone via HTTPS

git clone https://github.com/waybarrios/vllm-mlx.git

Clone via SSH

git clone git@github.com:waybarrios/vllm-mlx.git

Download ZIP

Download master.zip

Found an issue?

Report bugs or request features on the vllm-mlx issue tracker:

Open GitHub Issues