flash-attention-with-sink

kelvindelrosario

πŸ™ Implements Flash Attention with sink for gpt-oss-20b; includes test.py. WIP backward pass, varlen support, and community sync to return softmax_lse only.

Stars: 2
Forks: 0
Watchers: 2
Language: Python
SrcLog Score: 53.9
Cost to Build: $17.3K
Market Value: $11.5K

Growth over time

2 data points, 2026-04-12 to 2026-04-19 (Stars, Forks, Watchers)


How to clone flash-attention-with-sink

Clone via HTTPS

git clone https://github.com/kelvindelrosario/flash-attention-with-sink.git

Clone via SSH

git clone git@github.com:kelvindelrosario/flash-attention-with-sink.git

Download ZIP

Download master.zip

Found an issue?

Report bugs or request features on the flash-attention-with-sink issue tracker:

Open GitHub Issues