In my opinion, best explanation so far of positional encoding! Super clear and concise! Thank you very much sir!
3:21 - 3:59 are super intuitive, great job!
Great job, keep making videos like this. This one cleared up my confusion about positional encoding.
I like the very concise graphical explanation and the similarity to binary coding and basic linear algebra!
The best explanation of transformer positional encoding on the internet. Awesome video. Thanks!
I couldn't find anywhere why the creators of the Transformer decided to encode positions this way, and the last minute of your video was exactly what I was looking for. Thanks for the good explanation.
Incredible! I have surfed through various resources online and no one got this so accurately. Absolutely spot on explanation.
Great explanation. Short enough. Detailed enough. Enough talking. Enough showing. Loved the examples.
I'm eternally grateful for this concise explanation, other sources made the positional encoding concept sound so counter-intuitive to grasp
Fantastic. This was amazing! Best explanation.
Just when I was about to pull the last hair on top of my head, I came across this video. Beautifully Explained. Thank You !
more content pleaaaase, this is amazing!
It seems the addendum is really a fifth requirement. I can't word this precisely, but the positional encoding has to be easy to learn, in the sense that the encoding is only a linear transformation of position; it cannot be an encryption of the token.
Keep these coming!
Please explain the linear relation between the two encodings in detail. Your mathematical proofs sound excellent.
Please also recommend a good book to understand these concepts in detail.
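For anyone curious about the linear relation mentioned above, here is a minimal numerical sketch, assuming the standard sinusoidal encoding from the Transformer paper (the dimension, position, and offset below are arbitrary illustrative choices). For each sin/cos pair at a given frequency, the encoding of position t + k is a fixed rotation of the encoding of position t, and that rotation depends only on the offset k, not on t.

```python
import numpy as np

# Sketch of the linear relation for the sinusoidal encoding:
# PE(t + k) = M(k) @ PE(t), where M(k) is a block-diagonal rotation
# matrix that depends only on the offset k. Values are illustrative.
d_model = 8
freqs = 1.0 / 10000 ** (np.arange(0, d_model, 2) / d_model)

def pe(t):
    """Sinusoidal positional encoding, interleaving sin and cos."""
    out = np.empty(d_model)
    out[0::2] = np.sin(t * freqs)
    out[1::2] = np.cos(t * freqs)
    return out

def offset_matrix(k):
    """Rotation matrix mapping PE(t) to PE(t + k) for any position t."""
    m = np.zeros((d_model, d_model))
    for i, w in enumerate(freqs):
        c, s = np.cos(k * w), np.sin(k * w)
        m[2 * i:2 * i + 2, 2 * i:2 * i + 2] = [[c, s], [-s, c]]
    return m

t, k = 7, 5
print(np.allclose(offset_matrix(k) @ pe(t), pe(t + k)))  # True for any t, k
```

Because the same matrix works for every starting position, a relative offset is easy for a linear layer to represent, which is the usual argument for this choice of encoding.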
This is an excellent video
Thank you, love this ^^
1. Why is the positional encoding added to the word embedding? Won't that change the semantic value?
2. Why does the positional encoding use seemingly random numbers produced by sine and cosine? I think it would be simpler to add one extra dimension to the word embedding that stores the position as an integer.
Why use such a hard, random-looking, and unpredictable algorithm to encode positions?
The position set t(position) = {1, 2, ..., L} at 2:44 in your video starts from 1, but in the diagram to the left it starts from 0. I can see the sine wave starting first. Can you clarify this?
Good catch. I suppose I was just trying to note that there are L positions overall.
Really good video
Why do we need to alternate sine and cosine? It seems like either one on its own should do the job. The only reason I can see for the alternation is that this way we can solve the positional encoding problem with a wavelength twice as short as with sine or cosine alone. Is that right? Are there other reasons?
You just get more unique values before a repeat. Say you're only using sin and you have two dimensions: at position 0 the entire vector is [0, 0], and at position 180 you have the same vector [0, 0]. If you alternate sin and cos, you get a larger range of positions before a repeat. You could even have alternated between sin, cos, and tan, which would add more uniqueness still (except for tan(90)), so sin and cos is just a good balance. This is just a heuristic.
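A quick numerical sketch of that collision argument (the degree-based positions and single shared frequency are illustrative choices, not the values from the paper):

```python
import numpy as np

# Toy check: with sine alone at one frequency, positions 0 and 180 degrees
# map to (nearly) the same value, while a sin/cos pair at the same frequency
# still tells them apart; a repeat then only comes after a full period.
positions_deg = np.array([0, 180, 360])
radians = np.deg2rad(positions_deg)

sin_only = np.sin(radians)                                       # ~[0, 0, 0]: 0 and 180 collide
sin_cos = np.stack([np.sin(radians), np.cos(radians)], axis=1)   # ~[[0, 1], [0, -1], [0, 1]]

print(np.round(sin_only, 6))   # positions 0 and 180 are indistinguishable
print(np.round(sin_cos, 6))    # 0 and 180 now differ; only 0 and 360 still coincide
```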
bravo
❤
How can adding the positional encoding to the word embedding not change the word's semantic meaning?
Example:
The word embedding of "Cat" is [1, 2, 3],
and the word embedding of "Money" is [2, 3, 4].
If the positional encoding for "Cat" is [2, 1, 0]
and the positional encoding for "Money" is [1, 0, -1],
then the positionally encoded vector of both words is [3, 3, 3].
How can "Cat" equal "Money"?
Because the positional part is a constant. The token part is stochastic, it changes depending on the current token, but the positional part remains the same. Imagine you recorded all the embeddings of the 0th token across the whole dataset; you'd get a map, a distribution. If you add some constant to it, the map remains the same, just shifted to another location. And yes, it will not work for only two examples; you need a sufficient amount of data to prevent confusion.
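A toy numerical sketch of this reply, with made-up embedding values: adding the same constant positional vector to every embedding seen at a given position shifts the whole cloud of points without changing the geometry between tokens.

```python
import numpy as np

# Collect embeddings of many different tokens appearing at position 0,
# then add one constant positional vector to all of them. The distribution
# only shifts; distances between tokens are preserved, so distinct tokens
# are not mapped onto each other by the positional offset.
rng = np.random.default_rng(0)
tokens_at_pos0 = rng.normal(size=(1000, 3))       # embeddings of tokens seen at position 0
pos0_vector = np.array([2.0, 1.0, 0.0])           # one constant encoding for position 0

shifted = tokens_at_pos0 + pos0_vector

d_before = np.linalg.norm(tokens_at_pos0[0] - tokens_at_pos0[1])
d_after = np.linalg.norm(shifted[0] - shifted[1])
print(np.isclose(d_before, d_after))              # True: relative geometry is unchanged
```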
@BrainDrainAgain I see... 🔥 Thank you ✅
This video is already very outdated lol
How?
Can you project your voice? Your ASMR tone is disturbing.