Adding Self-Attention to a Convolutional Neural Network! PyTorch Deep Learning Tutorial

  • Published: 11 Sep 2024

Comments • 12

  • @profmoek7813
    @profmoek7813 3 months ago +1

    Masterpiece. Thank you so much 💗

  • @aldonin21
    @aldonin21 1 day ago +1

    Hello. I was trying to introduce a self_attention layer between a fully connected layer (with 32 neurons) and an output layer to recreate the "Patt-lite" CNN model. I used the Attention function from the maximal library. The thing is, I get mixed results for the same parameters, even with the same seed. Sometimes I quickly reach 95% accuracy, and other times it doesn't learn at all and stays at 15-30%. Without the attention added, I get a constant ~75%. Do you know why this could be happening?

    • @LukeDitria
      @LukeDitria  5 hours ago +1

      Do you use any type of regularisation? That could be it.

    • @aldonin21
      @aldonin21 4 hours ago +1

      @@LukeDitria I eventually figured out that the issue was that the self_attention layer is very sensitive to weight initialization, so I used a constant seed=42 for the kernel initialization inside the 3 dense layers for the q, k and v weights (I modified by hand the Attention layer from the maximal library, which is posted on GitHub). After this modification I ran the 5-fold CV and got stable results of around 95% for each fold :) I repeated it a few times and it always learned perfectly, and I am very happy about it. (A PyTorch sketch of this fix follows below.)
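
      A minimal PyTorch sketch of the fix described above (illustrative only, not the maximal library's actual Attention layer): the q, k and v projections are initialised from a generator with a fixed seed, so every run starts from identical attention weights.

      import torch
      import torch.nn as nn

      class SeededSelfAttention(nn.Module):
          """Single-head self-attention with deterministically seeded q/k/v projections."""
          def __init__(self, d_model, seed=42):
              super().__init__()
              self.q = nn.Linear(d_model, d_model)
              self.k = nn.Linear(d_model, d_model)
              self.v = nn.Linear(d_model, d_model)
              gen = torch.Generator().manual_seed(seed)
              std = (2.0 / (d_model + d_model)) ** 0.5  # Xavier-style scale
              for proj in (self.q, self.k, self.v):
                  with torch.no_grad():
                      # Draw the initial weights from the seeded generator
                      proj.weight.copy_(torch.randn(proj.weight.shape, generator=gen) * std)
                      proj.bias.zero_()

          def forward(self, x):  # x: (batch, tokens, d_model)
              q, k, v = self.q(x), self.k(x), self.v(x)
              scores = q @ k.transpose(-2, -1) / (x.shape[-1] ** 0.5)
              return torch.softmax(scores, dim=-1) @ v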

  • @thouys9069
    @thouys9069 3 months ago

    Very cool stuff. Any idea how this compares to Feature Pyramid Networks, which are typically used to enrich the high-res early convolutional layers?
    I would imagine that an FPN works well if the thing of interest is "compact", i.e. it can be captured well by a square crop, whereas attention would work even for non-compact things. Examples would be donuts with large holes and little dough, or long sticks, etc.

    • @LukeDitria
      @LukeDitria  3 months ago

      I believe Feature Pyramid Networks were primarily designed for object detection; they are a way of bringing fine-grained information from earlier layers deeper into the network via big residual connections, but they still rely on multiple conv layers to combine spatial information. What we're trying to do here is mix spatial information early in the network, and with attention the model can also choose exactly how to do that. (A sketch of this idea follows below.)
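
      A minimal sketch of the idea discussed in the video, assuming nn.MultiheadAttention as the attention layer: each spatial position of an early feature map is treated as a token, so positions can exchange information in one step instead of waiting for the receptive field to grow over many conv layers.

      import torch
      import torch.nn as nn

      class SpatialSelfAttention(nn.Module):
          """Self-attention over the spatial positions of a conv feature map."""
          def __init__(self, channels, heads=4):
              super().__init__()
              self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)
              self.norm = nn.LayerNorm(channels)

          def forward(self, x):                      # x: (B, C, H, W)
              b, c, h, w = x.shape
              tokens = x.flatten(2).transpose(1, 2)  # (B, H*W, C): one token per position
              attended, _ = self.attn(tokens, tokens, tokens)
              tokens = self.norm(tokens + attended)  # residual connection + norm
              return tokens.transpose(1, 2).reshape(b, c, h, w)

      # Example: spatial mixing placed right after the first conv block.
      backbone = nn.Sequential(
          nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
          SpatialSelfAttention(64),                  # early attention
          nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
      )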

  • @yadavadvait
    @yadavadvait 3 months ago

    Good video! Do you think this experiment of adding the attention head so early on can extrapolate well to graph neural networks?

    • @LukeDitria
      @LukeDitria  3 months ago

      Hi, thanks for your comment! Yes, Graph Attention Networks do what you are describing! (A minimal sketch follows below.)
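
      For reference, a minimal Graph Attention Network sketch, assuming PyTorch Geometric's GATConv is available: each node attends over its neighbours, the graph analogue of pixels attending over other spatial positions.

      import torch
      from torch_geometric.nn import GATConv  # assumes torch_geometric is installed

      class TinyGAT(torch.nn.Module):
          """Two graph-attention layers: neighbours are weighted by learned attention."""
          def __init__(self, in_dim, hidden, num_classes, heads=4):
              super().__init__()
              self.gat1 = GATConv(in_dim, hidden, heads=heads)         # concatenates heads
              self.gat2 = GATConv(hidden * heads, num_classes, heads=1)

          def forward(self, x, edge_index):  # x: (nodes, in_dim), edge_index: (2, edges)
              x = torch.relu(self.gat1(x, edge_index))
              return self.gat2(x, edge_index)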

  • @esramuab1021
    @esramuab1021 3 months ago

    Thank you

  • @unknown-otter
    @unknown-otter 3 months ago

    I'm guessing that adding self-attention in deeper layers would have less of an impact because each value already has a greater receptive field?
    If not, then why not add it at the end, where it would be less expensive? That is, setting aside the fact that we could incorporate it into every conv block if we had infinite compute.

    • @LukeDitria
      @LukeDitria  3 months ago

      Thanks for your comment! Yes, you are correct: in terms of combining features spatially, it won't have as much of an impact if the features already have a large receptive field. The idea is to add it as early as possible, and yes, you could add it multiple times throughout your network, though you would probably stop once your feature map is around 4x4 or so. (A rough cost illustration follows below.)
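
      A rough back-of-the-envelope illustration of why placement matters (feature-map sizes are assumed for illustration): the number of attention scores grows with the square of the number of spatial positions, and by 4x4 there are only 16 positions left to mix.

      # One token per spatial position; attention compares every pair of tokens.
      for side in (64, 32, 16, 8, 4):
          tokens = side * side
          pairs = tokens ** 2
          print(f"{side:>2}x{side:<2} feature map -> {tokens:>4} tokens, {pairs:>10,} attention scores per head")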

    • @unknown-otter
      @unknown-otter 3 months ago

      Thanks for the clarification! Great video