Efficient Self-Attention for Transformers
- Published: Oct 1, 2024
- The memory and computational demands of the original attention mechanism increase quadratically as sequence length grows, rendering it impractical for longer sequences.
However, various methods have been developed to streamline the attention mechanism's complexity. In this video, we'll explore some of the most prominent models that address this challenge.
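To make the quadratic cost concrete, here is a minimal NumPy sketch of standard scaled dot-product attention (not code from the video). The score matrix `S` has shape (n, n), so both memory and compute grow quadratically with sequence length n; the function name and shapes are illustrative assumptions.

```python
import numpy as np

def attention(Q, K, V):
    """Standard scaled dot-product attention.

    The score matrix S has shape (n, n), so memory and compute
    grow quadratically with the sequence length n.
    """
    d = Q.shape[-1]
    S = Q @ K.T / np.sqrt(d)            # (n, n) -- the quadratic bottleneck
    P = np.exp(S - S.max(axis=-1, keepdims=True))
    P /= P.sum(axis=-1, keepdims=True)  # row-wise softmax
    return P @ V                        # (n, d)

# Doubling n quadruples the size of the (n, n) score matrix.
n, d = 512, 64
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
out = attention(Q, K, V)
print(out.shape)  # (512, 64)
```

The efficient-attention methods covered in the video aim to avoid materializing that full (n, n) matrix, typically by approximating it.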
#transformers
Link to the activation function video:
A Review of 10 Most Popular Activation Functions in Neural Networks
Thank you so much
What would be the advantage of these methods vs. FlashAttention? FlashAttention speeds up the computation and is an exact computation, while most of these methods are approximations. If possible, I would like to see a video explaining other attention types such as PagedAttention and FlashAttention. Great content :)
Thank you for the suggestion! You're absolutely right. In this video, I focused on purely algorithmic approaches, not hardware-based solutions like FlashAttention. FlashAttention is an IO-aware exact attention algorithm that uses tiling to reduce memory reads/writes between GPU memory levels, which results in significant speedup without sacrificing model quality.
I appreciate your input and will definitely consider making a video to explain FlashAttention!
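The tiling idea described above can be sketched in a few lines of NumPy. This is an illustrative sketch of the online-softmax trick that makes tiled attention exact, not the actual FlashAttention kernel (the real algorithm is an IO-aware CUDA kernel; the function name and block size here are assumptions):

```python
import numpy as np

def flash_attention(Q, K, V, block=64):
    """Tiled attention with an online softmax (FlashAttention-style sketch).

    K and V are processed in blocks, so only an (n, block) score tile is
    ever materialized; the running max m and normalizer l keep the
    softmax mathematically exact.
    """
    n, d = Q.shape
    scale = 1.0 / np.sqrt(d)
    O = np.zeros((n, d))
    m = np.full((n, 1), -np.inf)   # running row-wise max
    l = np.zeros((n, 1))           # running softmax denominator
    for j in range(0, K.shape[0], block):
        S = Q @ K[j:j + block].T * scale          # (n, block) tile only
        m_new = np.maximum(m, S.max(axis=-1, keepdims=True))
        P = np.exp(S - m_new)
        correction = np.exp(m - m_new)            # rescale old accumulators
        l = l * correction + P.sum(axis=-1, keepdims=True)
        O = O * correction + P @ V[j:j + block]
        m = m_new
    return O / l
```

Up to floating-point error, this returns the same result as dense softmax attention, which is why FlashAttention gives a speedup without sacrificing model quality.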
Thanks for the suggestion, I made a new video on Flash Attention:
FlashAttention: Accelerate LLM training
ruclips.net/video/LKwyHWYEIMQ/видео.html
I would love to hear your comments, and please share any other suggestions you may have.
Very informative. Thank you!
Glad it was helpful!
good explanation, very clear
Thank you for the nice comment! Glad you find the videos useful!
You should include axial attention and axial position embedding; it's simple yet works great on images and video.
Thanks for the suggestion, yes I agree. I have briefly described axial attention in the vision transformer series
ruclips.net/video/bavfa_Rr2f4/видео.htmlsi=0SB9Yc_0SasafhJN
@PyMLstudio That's awesome, thank you!