Pocket-Sized Multimodal AI for content understanding and generation across multilingual texts, images, and 🔜 video, up to 5x faster than OpenAI CLIP and LLaVA 🖼️ & 🖋️
Repository: unum-cloud/uform on GitHub. Written in Python.
Report bugs or request features on the uform issue tracker (GitHub Issues).