RoPE: Rotary Position Embedding to 100K context length
- Published: 22 May 2024
- RoPE (Rotary Position Embedding) explained in simple terms: computing self-attention in Transformers with a relative position encoding, for extended context lengths of LLMs.
All rights w/ authors:
ROFORMER: ENHANCED TRANSFORMER WITH ROTARY POSITION EMBEDDING (RoPE)
arxiv.org/pdf/2104.09864
#airesearch
#aiexplained
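As background for the discussion below, the core RoPE operation described in the video can be sketched in a few lines of NumPy. This is a minimal illustration, not the video's or the paper's reference code: each consecutive pair of dimensions is rotated by a position-dependent angle, and the function name `rope_rotate` is my own.

```python
import numpy as np

def rope_rotate(x, positions, base=10000.0):
    """Apply rotary position embedding to x of shape (seq_len, dim).

    Each pair of dimensions (2i, 2i+1) is rotated by the angle
    m * theta_i, where theta_i = base**(-2i/dim) and m is the position.
    """
    seq_len, dim = x.shape
    assert dim % 2 == 0
    # per-pair angular frequencies: theta_i = base^(-2i/dim)
    theta = base ** (-np.arange(0, dim, 2) / dim)        # (dim/2,)
    angles = positions[:, None] * theta[None, :]          # (seq_len, dim/2)
    cos, sin = np.cos(angles), np.sin(angles)
    x_even, x_odd = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x)
    out[:, 0::2] = x_even * cos - x_odd * sin            # 2D rotation of each pair
    out[:, 1::2] = x_even * sin + x_odd * cos
    return out
```

The defining property, which makes the encoding relative, is that the dot product of a rotated query at position m and a rotated key at position n depends only on the offset m - n, so shifting both positions by the same amount leaves the attention score unchanged.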
Amazing content! The explanations are SO clear! Thank you!
Thank you SO MUCH for providing such high quality content! Very much enjoying all your many videos! If you have a chance, I'd love to see you discuss the recent work in giving AI spatial reasoning, i.e. artificial "imagination". (In its natural form, very much a core feature of human thought.) Perhaps one might think about the creation of a "right brain" to go along with the "left brain" language models we have now? (Please forgive the over-simplification of human neuroscience.) Thanks again! All the best to you sincerely!
Hi Sir! Really love your videos! How can we access your presentation slides?
If one rotation is good, how about going into three dimensional rotations and using quaternions? Is there any work using that?
Maybe it's obvious to YOU that the solution is that complex exponential, but I wish you hadn't assumed that WE would all see that as self-evident as you do.
I see what you mean. You know, I spent some days finding simple explanations for the not-so-self-explanatory RoPE algorithm. I will build on this in my second video, where we examine more complex, more recent ideas about RoPE. I decided on an approach that will enable my audience to understand the main ideas and methods, and go from there. I recorded 90 min for the second part, and I am currently cutting it to at most 60 min, striking a balance of providing insights for all my viewers. I'll try harder ....
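For readers who, like the commenter above, would prefer the complex exponential to be motivated rather than assumed, here is a short sketch of the reasoning, following the general line of the RoFormer paper:

```latex
% Treat each pair of embedding dimensions as one complex number,
% z = x_1 + i x_2. RoPE asks for a position-dependent map f such that
% the attention score depends only on the relative offset m - n:
%   <f(q, m), f(k, n)> = g(q, k, m - n).
% The multiplicative ansatz f(x, m) = x e^{i m \theta} satisfies this:
\langle f(q,m),\, f(k,n) \rangle
  = \mathrm{Re}\!\left[\, q\,\overline{k}\, e^{i(m-n)\theta} \right],
% a function of (m - n) alone. In real coordinates, multiplying by
% e^{i m \theta} is exactly the 2x2 rotation matrix
\begin{pmatrix} \cos m\theta & -\sin m\theta \\ \sin m\theta & \cos m\theta \end{pmatrix}.
```

So the complex exponential is not pulled out of thin air: it is the natural family of maps whose inner products depend only on position differences, and in real coordinates it is just a rotation of each dimension pair.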
There is a mistake: smaller dimensions change more quickly, and larger dimensions change more slowly.
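This comment matches the standard RoPE frequency schedule: with theta_i = base^(-2i/dim), the lowest dimension pair rotates once every ~6.28 positions, while the highest pair takes tens of thousands of positions per full turn. A quick check, assuming the common base of 10000 and a head dimension of 64 (both illustrative choices, not taken from the video):

```python
import numpy as np

dim, base = 64, 10000.0
i = np.arange(0, dim, 2)            # pair indices 0, 2, ..., 62
theta = base ** (-i / dim)          # angular frequency per pair
wavelength = 2 * np.pi / theta      # positions per full rotation

# first pair: theta = 1, wavelength ~6.28 -> rotates quickly
print(theta[0], wavelength[0])
# last pair: tiny theta, wavelength in the tens of thousands -> rotates slowly
print(theta[-1], wavelength[-1])
```

The slowly rotating high dimensions are what carry long-range position information, which is why the frequency schedule matters for extending context length.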