@10:05 - Excellent explanation of Byte-Pair Encoding, thanks.
The presenter says these models "do not use RNNs" (correct) but that instead "they use CNNs" (incorrect: there are no convolution kernels). They use simple linear transformations of the form XWᵀ + b.
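In plain NumPy terms, that position-wise transformation looks roughly like the sketch below (toy sizes, purely illustrative, not taken from the talk):

import numpy as np

# Position-wise linear transformation Y = X @ W.T + b: the same weight
# matrix is applied independently at every time step; no recurrence and
# no convolution kernel wider than one position.
seq_len, d_in, d_out = 8, 16, 10          # toy sizes, just for illustration
X = np.random.rand(seq_len, d_in)         # one sequence of 8 token vectors
W = np.random.rand(d_out, d_in)
b = np.random.rand(d_out)

Y = X @ W.T + b
print(Y.shape)                            # (8, 10)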
You can model convolution operations with transformers.
IMHO, that's debatable. Think about what happens when you apply the same dense layer to each input in a sequence: you're effectively running a 1D convolutional layer with kernel size 1. If you're familiar with Keras, try building a model with:
TimeDistributed(Dense(10, activation="relu"))
then replace it with this:
Conv1D(10, kernel_size=1, activation="relu")
You'll see that it gives precisely the same result (assuming you use the same random seeds).
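For what it's worth, here is a minimal sketch of that comparison (assuming TensorFlow 2.x / Keras; instead of relying on matching random seeds, it simply copies the dense layer's weights into the conv layer):

import numpy as np
from tensorflow.keras.layers import Conv1D, Dense, TimeDistributed

# Toy batch: 4 sequences, 8 time steps, 16 features per step.
x = np.random.rand(4, 8, 16).astype("float32")

dense = TimeDistributed(Dense(10, activation="relu"))
conv = Conv1D(10, kernel_size=1, activation="relu")

# Build both layers, then copy the dense kernel/bias into the conv layer
# (the conv kernel only has an extra leading axis of length 1).
dense(x), conv(x)
kernel, bias = dense.get_weights()
conv.set_weights([kernel[np.newaxis, ...], bias])

# With identical weights, the two layers produce identical outputs.
print(np.allclose(dense(x), conv(x)))   # True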
Since the Transformer architecture applies the same dense layers across all time steps, you can think of the whole architecture as a stack of 1D-Convolutional layers with kernel size 1 (then of course there's the important Multihead attention part, which is a different beast altogether).
Granted, it's not the most typical CNN architecture (typical CNNs use fairly few convolutional layers with kernel size 1), but still, it's not really an error to say the Transformer is based on convolutions. I think Martin's goal was mostly to highlight the fact that, unlike in RNNs, every time step gets processed in parallel.
Just my $.02! :))
Thank you. This summary/introduction is very, very helpful.
Can we have a link to the slides, please?
Wow, engineers sg sure has come a long way, ha! Great talk.
Where can I get this PPT? Please share the link.
Could I extract word embeddings from BERT and use them for unsupervised learning, e.g. topic modeling? :)
Very good updates for NLP enthusiasts.
Very clear and helpful talk
Does BPE also work well for non-English languages like Chinese and French?
BERT uses WordPiece; ALBERT uses SentencePiece.
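If anyone wants to see what those subword pieces look like, here is a quick sketch using the Hugging Face transformers library (not from the talk, just for illustration):

from transformers import AutoTokenizer

# Load BERT's WordPiece tokenizer (downloads the vocab on first use).
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# Words missing from the vocabulary are split into subword pieces;
# WordPiece marks continuation pieces with a "##" prefix.
print(tokenizer.tokenize("tokenization"))   # e.g. ['token', '##ization']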
How is it CNN-based?
Very Informative.
very nice speech!
thank you!
Nice !
What in God's name are you talking about?? What is an LSTM chain?? I came here because I need to know I'm writing the correct content for my website, and I haven't a fucking clue what the hell you are on about.