An implementation of local windowed attention for language modeling
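To make the idea concrete, here is a minimal sketch of windowed attention using a banded causal mask; the repository itself uses a more efficient chunked implementation, and all names below are illustrative:

```python
import torch
import torch.nn.functional as F

def local_attention(q, k, v, window_size):
    # q, k, v: (batch, heads, seq_len, dim_head)
    n, scale = q.shape[-2], q.shape[-1] ** -0.5
    sim = q @ k.transpose(-1, -2) * scale                 # (b, h, n, n)
    i = torch.arange(n, device=q.device)
    rel = i[:, None] - i[None, :]
    # each position attends to itself and the window_size - 1 positions before it
    mask = (rel >= 0) & (rel < window_size)
    sim = sim.masked_fill(~mask, float('-inf'))
    return F.softmax(sim, dim=-1) @ v

q = k = v = torch.randn(1, 8, 256, 64)
out = local_attention(q, k, v, window_size=64)            # (1, 8, 256, 64)
```

Note this sketch still materializes the full n×n score matrix; the point of a real local-attention implementation is to avoid that by only computing scores within each window.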
Implementation of Slot Attention from Google AI
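A minimal sketch of the core iteration from the paper (Locatello et al. 2020), with the residual MLP after the GRU omitted for brevity; hyperparameters are illustrative:

```python
import torch
import torch.nn as nn

class SlotAttention(nn.Module):
    def __init__(self, num_slots, dim, iters=3, eps=1e-8):
        super().__init__()
        self.num_slots, self.iters, self.eps = num_slots, iters, eps
        self.scale = dim ** -0.5
        # learned Gaussian from which the initial slots are sampled
        self.slots_mu = nn.Parameter(torch.randn(1, 1, dim))
        self.slots_logsigma = nn.Parameter(torch.zeros(1, 1, dim))
        self.to_q, self.to_k, self.to_v = (nn.Linear(dim, dim) for _ in range(3))
        self.gru = nn.GRUCell(dim, dim)
        self.norm_inputs, self.norm_slots = nn.LayerNorm(dim), nn.LayerNorm(dim)

    def forward(self, inputs):
        # inputs: (batch, num_inputs, dim)
        b, device = inputs.shape[0], inputs.device
        inputs = self.norm_inputs(inputs)
        k, v = self.to_k(inputs), self.to_v(inputs)
        slots = self.slots_mu + self.slots_logsigma.exp() * torch.randn(
            b, self.num_slots, inputs.shape[-1], device=device)
        for _ in range(self.iters):
            q = self.to_q(self.norm_slots(slots))
            dots = q @ k.transpose(-1, -2) * self.scale   # (b, slots, inputs)
            attn = dots.softmax(dim=1)                    # slots compete for inputs
            attn = attn / (attn.sum(dim=-1, keepdim=True) + self.eps)
            updates = attn @ v                            # weighted mean over inputs
            slots = self.gru(updates.flatten(0, 1), slots.flatten(0, 1)).view(slots.shape)
        return slots
```

The distinguishing detail is the softmax over the slot dimension rather than the input dimension, which forces slots to compete for the inputs they explain.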
Implementation of Classifier Free Guidance in Pytorch, with emphasis on text conditioning, and flexibility to include multiple text embedding models
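The guidance step itself is a two-line extrapolation. A minimal sketch at sampling time, where `model` and its `text_embed` keyword are hypothetical stand-ins (at training time, the text conditioning is randomly replaced by a null embedding some fraction of the time, so the model also learns the unconditional mode):

```python
import torch

@torch.no_grad()
def guided_prediction(model, x, t, text_embed, null_embed, guidance_scale=5.0):
    cond = model(x, t, text_embed=text_embed)    # conditioned forward pass
    uncond = model(x, t, text_embed=null_embed)  # unconditioned forward pass
    # extrapolate away from the unconditional prediction toward the conditional one
    return uncond + guidance_scale * (cond - uncond)
```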
Implementation of Recurrent Memory Transformer, NeurIPS 2022 paper, in Pytorch
Implementation of RT-1 (Robotics Transformer) in Pytorch
Implementation of the convolutional module from the Conformer paper, for use in Transformers
Implementation of Axial attention - attending to multi-dimensional data efficiently
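A minimal sketch of the idea on a 2D feature map, assuming `einops` is available: full attention runs along one axis at a time, with the other axis folded into the batch, so cost drops from O((hw)²) to O(hw·(h+w)). Names are illustrative:

```python
import torch
import torch.nn as nn
from einops import rearrange

class AxialAttention2D(nn.Module):
    def __init__(self, dim, heads=8):
        super().__init__()
        self.row_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.col_attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):
        b, h, w, d = x.shape
        # attend along the width axis; rows become extra batch entries
        rows = rearrange(x, 'b h w d -> (b h) w d')
        rows, _ = self.row_attn(rows, rows, rows)
        x = rearrange(rows, '(b h) w d -> b h w d', h=h)
        # attend along the height axis; columns become extra batch entries
        cols = rearrange(x, 'b h w d -> (b w) h d')
        cols, _ = self.col_attn(cols, cols, cols)
        return rearrange(cols, '(b w) h d -> b h w d', w=w)

x = torch.randn(2, 16, 16, 64)
out = AxialAttention2D(64)(x)   # (2, 16, 16, 64)
```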
Implementation of a memory efficient multi-head attention as proposed in the paper, "Self-attention Does Not Need O(n²) Memory"
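A minimal sketch of the paper's trick: keys and values are processed in chunks with a running max and running sums (an online softmax), so the full n×n score matrix is never materialized. Chunk sizes here are illustrative:

```python
import torch

def chunked_attention(q, k, v, q_chunk=64, kv_chunk=64):
    # q, k, v: (batch, seq_len, dim)
    scale = q.shape[-1] ** -0.5
    outs = []
    for qc in q.split(q_chunk, dim=1):
        acc = weight_sum = running_max = None
        for kc, vc in zip(k.split(kv_chunk, dim=1), v.split(kv_chunk, dim=1)):
            s = qc @ kc.transpose(-1, -2) * scale         # (b, q_chunk, kv_chunk)
            chunk_max = s.amax(dim=-1, keepdim=True)
            new_max = chunk_max if running_max is None else torch.maximum(running_max, chunk_max)
            p = (s - new_max).exp()
            # rescale previous accumulators when the running max changes
            corr = 1. if running_max is None else (running_max - new_max).exp()
            acc = p @ vc if acc is None else acc * corr + p @ vc
            weight_sum = p.sum(-1, keepdim=True) if weight_sum is None else weight_sum * corr + p.sum(-1, keepdim=True)
            running_max = new_max
        outs.append(acc / weight_sum)
    return torch.cat(outs, dim=1)

q, k, v = (torch.randn(2, 256, 64) for _ in range(3))
ref = torch.softmax(q @ k.transpose(-1, -2) * 64 ** -0.5, dim=-1) @ v
assert torch.allclose(chunked_attention(q, k, v), ref, atol=1e-5)
```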
Implementation of Q-Transformer, Scalable Offline Reinforcement Learning via Autoregressive Q-Functions, out of Google DeepMind
Implementation of the Transformer variant proposed in "Transformer Quality in Linear Time"
Implementation of Autoregressive Diffusion in Pytorch
Implementation of SegFormer, Attention + MLP neural network for segmentation, in Pytorch
Implementation of π₀, the robotic foundation model architecture proposed by Physical Intelligence
Implementation of Bit Diffusion, Hinton's group's attempt at discrete denoising diffusion, in Pytorch
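A minimal sketch of the bit representation at the core of the method: discrete values become ±1 bit vectors, are diffused as ordinary continuous data, and are thresholded back at the end. The diffusion loop itself is omitted:

```python
import torch

BITS = 8  # 256 discrete values per channel

def int_to_bits(x):
    # x: integer tensor in [0, 255] -> float tensor of {-1., 1.} bits
    mask = 2 ** torch.arange(BITS - 1, -1, -1, device=x.device)
    bits = (x.unsqueeze(-1) & mask).ne(0).float()
    return bits * 2 - 1

def bits_to_int(bits):
    # threshold at zero, then reassemble the integer
    mask = 2 ** torch.arange(BITS - 1, -1, -1, device=bits.device)
    return ((bits > 0).long() * mask).sum(-1)

x = torch.randint(0, 256, (4,))
assert torch.equal(bits_to_int(int_to_bits(x)), x)
```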
Implementation of Deformable Attention in Pytorch from the paper "Vision Transformer with Deformable Attention"
Implementation of ST-MoE, the latest incarnation of MoE after years of research at Google Brain, in Pytorch
Implementation of a single layer of the MMDiT, proposed in Stable Diffusion 3, in Pytorch
Implementation of ChatGPT, but tailored towards primary care medicine, where the reward is collecting patient histories thoroughly and efficiently and arriving at a reasonable differential diagnosis
Implementation of Transformer in Transformer, pixel level attention paired with patch level attention for image classification, in Pytorch
Fully featured implementation of Routing Transformer
Implementation of the proposed minGRU in Pytorch
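A minimal sequential sketch of minGRU from the paper ("Were RNNs All We Needed?"): because the gate and candidate depend only on the current input, the recurrence admits a parallel scan, which is omitted here in favor of a plain loop:

```python
import torch
import torch.nn as nn

class MinGRU(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.to_z = nn.Linear(dim, dim)        # gate
        self.to_h_tilde = nn.Linear(dim, dim)  # candidate hidden state

    def forward(self, x, h=None):
        # x: (batch, seq_len, dim)
        b, n, d = x.shape
        h = torch.zeros(b, d, device=x.device) if h is None else h
        z = self.to_z(x).sigmoid()             # gates from the input alone
        h_tilde = self.to_h_tilde(x)           # no dependence on h_{t-1}
        outs = []
        for t in range(n):
            h = (1 - z[:, t]) * h + z[:, t] * h_tilde[:, t]
            outs.append(h)
        return torch.stack(outs, dim=1)

out = MinGRU(64)(torch.randn(2, 128, 64))      # (2, 128, 64)
```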
Implementation of SE3-Transformers for Equivariant Self-Attention, in Pytorch. This specific repository is geared towards integration with an eventual AlphaFold2 replication.
Implementation of a U-net complete with efficient attention as well as the latest research findings
Quick implementation of nGPT, learning entirely on the hypersphere, from Nvidia
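A minimal sketch of the normalized residual update, assuming the wrapped block is an ordinary attention or MLP module; nGPT additionally keeps its weight matrices normalized along the embedding dimension, which is omitted here:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NormalizedResidual(nn.Module):
    def __init__(self, dim, block, init_alpha=0.05):
        super().__init__()
        self.block = block                                  # attention or MLP sub-block
        self.alpha = nn.Parameter(torch.full((dim,), init_alpha))

    def forward(self, h):
        h = F.normalize(h, dim=-1)                          # stay on the unit hypersphere
        h_new = F.normalize(self.block(h), dim=-1)
        # interpolate toward the block's suggestion, then project back onto the sphere
        return F.normalize(h + self.alpha * (h_new - h), dim=-1)

layer = NormalizedResidual(64, nn.Linear(64, 64))
out = layer(torch.randn(2, 128, 64))                        # rows have unit norm
```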
Implementation of Linformer for Pytorch
Implementation of Lumiere, SOTA text-to-video generation from Google DeepMind, in Pytorch
Implementation of Spear-TTS - multi-speaker text-to-speech attention network, in Pytorch
Implementation of Soft MoE, proposed by the Google Brain Vision team, in Pytorch
Implementation of the Equiformer, an SE(3)/E(3)-equivariant attention network that reached new SOTA, adopted by EquiFold for protein folding
Implementation of a Transformer, but completely in Triton