Multi-Modal-Transformer

junchen14

The repository collects a wide range of multi-modal transformer architectures, including image transformers, video transformers, image-language transformers, video-language transformers, and self-supervised learning models. It also collects useful tutorials and tools in these related domains.

229 Stars
30 Forks
229 Watchers
Cost to Build: $17.5K
Market Value: $51.3K

Growth over time

[Chart: stars, forks, and watchers over time; 6 data points, 2021-11-01 → 2025-08-01]

How to clone Multi-Modal-Transformer

Clone via HTTPS

git clone https://github.com/junchen14/Multi-Modal-Transformer.git

Clone via SSH

git clone git@github.com:junchen14/Multi-Modal-Transformer.git
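
For scripted setups, the two clone options above can be wrapped in a small helper. The following is a minimal sketch, not part of the repository: the function names `clone_url`, `clone_command`, and `clone` are hypothetical, the SSH form assumes GitHub's standard `git@github.com:owner/repo.git` address, and the optional `--depth 1` shallow clone is an added convenience that the page itself does not mention.

```python
import subprocess

# Owner and repository name as given on this page.
OWNER = "junchen14"
REPO = "Multi-Modal-Transformer"


def clone_url(owner: str, repo: str, protocol: str = "https") -> str:
    """Build the clone URL for a GitHub repository (hypothetical helper)."""
    if protocol == "https":
        return f"https://github.com/{owner}/{repo}.git"
    if protocol == "ssh":
        # Assumes GitHub's standard SSH address format.
        return f"git@github.com:{owner}/{repo}.git"
    raise ValueError(f"unsupported protocol: {protocol!r}")


def clone_command(owner: str, repo: str, protocol: str = "https",
                  shallow: bool = False) -> list[str]:
    """Return the git command as an argument list, without running it."""
    cmd = ["git", "clone"]
    if shallow:
        # --depth 1 fetches only the latest commit (an optional addition).
        cmd += ["--depth", "1"]
    cmd.append(clone_url(owner, repo, protocol))
    return cmd


def clone(owner: str, repo: str, protocol: str = "https",
          shallow: bool = False) -> None:
    """Run the clone; needs git on PATH and network access."""
    subprocess.run(clone_command(owner, repo, protocol, shallow), check=True)
```

Calling `clone(OWNER, REPO)` would run the HTTPS command shown above; passing `protocol="ssh"` switches to the SSH form, which requires an SSH key registered with GitHub.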

Download ZIP

Download master.zip

Found an issue?

Report bugs or request features on the Multi-Modal-Transformer issue tracker:

Open GitHub Issues