Axial-DeepLab: Stand-Alone Axial-Attention for Panoptic Segmentation (Paper Explained)

  • Published: 27 Oct 2024

Comments • 33

  • @Kram1032
    @Kram1032 4 years ago +15

    It's gonna take a lot of doing to make that feasible but I'm really curious what could happen with this attentional type of processing for multimodal data.
    Like, imagine you could scrape the web like they did for GPT-3, but include not just text but also images. Entire illustrated books. Embedded videos with spoken language.
    Language is fundamentally dependent on the real world. It's crazy how far we can get with *just* text but I'd imagine a lot of things could be easily disambiguated if words aren't just typed but also heard or in the context of other stuff.
    So making attention more efficient for images is a solid step towards something like this and I'm really looking forward to what'll come of it.

    • @felipemello1151
      @felipemello1151 4 years ago +2

      Google actually has a trained NN that accepts all sorts of inputs (images, text, etc.). The idea was to have a single model for everything. I can't remember the name of it, though.

    • @Kram1032
      @Kram1032 4 years ago +2

      @@felipemello1151 Google Brain, I think, but I'd imagine there has been quite some progress since.

    • @jasdeepsinghgrover2470
      @jasdeepsinghgrover2470 4 years ago +2

      I think we are very close to something like this, but positional embeddings would then need to become something more general, like context embeddings. Something like an image caption should be associated with both the image and the text referring to the image. Maybe after that, this will be possible.

  • @herp_derpingson
    @herp_derpingson 4 years ago +6

    27:00 I think a better interpretation would be: "When I am at this position, I am more important, or less important."
    Also, attention-based models are inherently more interpretable than convolution-based models, so I think these will win out in the long run. Perhaps we can have a hybrid of CNN and attention.

    • @socratic-programmer
      @socratic-programmer 4 years ago +4

      To an extent, convolutional models can also be analysed to see which parts were the most excited (and contributed to the final prediction). The other main advantage - and the reason I think we will at least have some hybrid of conv + attention - is that convolutions are much more parameter-efficient than FC or self-attention layers.
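      A minimal sketch of that parameter-count argument (plain Python; layer sizes are made up purely for illustration):

      ```python
      # Parameter counts for one layer mapping 64 -> 64 channels
      # on a 32x32 feature map.
      c_in, c_out, h, w = 64, 64, 32, 32

      conv3x3 = c_in * c_out * 3 * 3                      # 36,864 weights
      fully_connected = (c_in * h * w) * (c_out * h * w)  # ~4.3 billion weights

      # Note: a plain self-attention layer's q/k/v/output projections are
      # also small (4 * c_in * c_out = 16,384 here); what grows quadratically
      # with the number of pixels is its compute and memory, not its parameters.
      print(f"3x3 conv:        {conv3x3:,}")
      print(f"fully connected: {fully_connected:,}")
      ```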

    • @redjammie8342
      @redjammie8342 4 years ago +2

      @@socratic-programmer Also, local connectivity for low-level visual features makes perfect sense.

  • @whatdl6002
    @whatdl6002 4 years ago +12

    Are we a couple of million dollars of Neural Architecture Search away from the end of convolutions???

  • @binjianxin7830
    @binjianxin7830 4 years ago +1

    When convolutions go deep, they seem not only to be more efficient but also to condense information in various abstract and profound ways. Attention layers certainly need to become more efficient.

  • @alceubissoto
    @alceubissoto 4 years ago +1

    Thanks for the video, Yannic. Amazing explanation!

  • @jahcane3711
    @jahcane3711 4 years ago +1

    Beautiful. Thank you, Yannic.

  • @marcussky
    @marcussky 4 years ago +4

    Check out TabNet... Attention is coming for tabular data as well...

  • @PaganPegasus
    @PaganPegasus 2 years ago

    7:55 Yannic just predicted the Perceiver architecture. Madman.

  • @TechVizTheDataScienceGuy
    @TechVizTheDataScienceGuy 4 years ago +1

    Nicely explained! 👍

  • @blizzard072
    @blizzard072 3 years ago

    As the subscript implies, there seems to be a positional embedding r_p for every output position o. I'm not sure that would be memory-friendly... having relative positional embeddings for every pixel seems intense.
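    For what it's worth, in this line of work (following Shaw et al.'s relative position embeddings, which stand-alone self-attention builds on) the embeddings are typically learned per relative offset and shared across all output positions, so the table stays small. A hedged sketch with illustrative names and sizes:

    ```python
    import torch

    # One learned vector per *relative offset* within the axial span,
    # shared by every output position o (sizes here are illustrative).
    span, d_k = 64, 16  # axial span and per-head depth

    # one row per relative distance in [-(span-1), ..., span-1]
    rel_table = torch.nn.Parameter(torch.randn(2 * span - 1, d_k))

    # gather r_{p-o} for all (output o, context p) pairs along one axis
    offsets = torch.arange(span)[None, :] - torch.arange(span)[:, None]
    r = rel_table[offsets + span - 1]  # (span, span, d_k), no per-pixel params

    print(rel_table.numel())  # 2,032 parameters, independent of image size
    ```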

  • @shrutishrestha8296
    @shrutishrestha8296 4 years ago +3

    Is there any code using this for segmentation?

  • @sahilriders
    @sahilriders 3 years ago +1

    Did you check out the MaX-DeepLab paper? It would be nice if you could make a video on that.

  • @jackeown
    @jackeown 4 years ago +1

    You should do a video on TabNet for tabular data using neural nets. I feel like there's a lot there and the explanations online kind of suck.

  • @seyeeet8063
    @seyeeet8063 3 years ago

    Can someone explain to me what "axial" means? :) I have a hard time getting it.
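    Roughly: "axial" means attending along one axis of the image at a time - all pixels in the same column, then all pixels in the same row - instead of over all H×W pixels at once. A minimal sketch of the idea (the 1-D attention modules are left abstract; names are illustrative, not the paper's code):

    ```python
    import torch

    def axial_attention(x, attn_h, attn_w):
        # x: (batch, channels, H, W); attn_h / attn_w: any self-attention
        # modules mapping sequences of shape (N, L, C) -> (N, L, C)
        b, c, h, w = x.shape
        # height axis: every column becomes an independent sequence of length H
        x = x.permute(0, 3, 2, 1).reshape(b * w, h, c)
        x = attn_h(x)
        # width axis: every row becomes an independent sequence of length W
        x = x.reshape(b, w, h, c).permute(0, 2, 1, 3).reshape(b * h, w, c)
        x = attn_w(x)
        return x.reshape(b, h, w, c).permute(0, 3, 1, 2)

    # usage sketch, e.g. with PyTorch's built-in attention:
    # mha = torch.nn.MultiheadAttention(embed_dim=c, num_heads=4, batch_first=True)
    # out = axial_attention(x, lambda s: mha(s, s, s)[0], lambda s: mha(s, s, s)[0])
    ```

    Stacking a height-axis layer and a width-axis layer gives every pixel a path to every other pixel, while each layer only ever attends over H or W positions.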

  • @freddiekalaitzis5708
    @freddiekalaitzis5708 3 years ago

    In times when SOTA is unfortunately king to young reviewers, I can appreciate the authors' need to perform well at least within some class of models. Imagine the frustration when all you offer the community is a competitive alternative, only for a reviewer to retort that it's not the best tool by some arbitrary margin.
    Great video.

  • @GyuHobbyRC
    @GyuHobbyRC 4 years ago +1

    I enjoyed a great video, let's be friends 😊😊😊😊
    Let's be friends!!!~~^^

  • @trevormartin1944
    @trevormartin1944 4 years ago

    Does anyone know what Yannic uses to be able to draw and edit over the PDFs?

  • @az8134
    @az8134 3 years ago

    Attention is the new MLP when you are rich

  • @monstrimmat
    @monstrimmat 3 years ago

    "What's a good number?"

  • @Lee-vs5ez
    @Lee-vs5ez 4 years ago +1

    So many tricks for reducing computational cost lately. Intuitive, but also questionable.

    • @autonomous2010
      @autonomous2010 4 years ago +1

      Yep. A lot of approaches scale very poorly, requiring exponentially more resources the more data you have. So there's a lot of experimenting to try to get around that major limitation.
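      (Strictly, self-attention over images is quadratic rather than exponential, but the point stands. A quick worked example of what the axial trick saves, with an illustrative feature-map size:)

      ```python
      h = w = 128
      n = h * w                       # 16,384 positions
      full_attention = n * n          # every pixel attends to every pixel
      axial_attention = n * (h + w)   # each pixel attends along its row + column

      print(f"full:  {full_attention:,} pairs")   # 268,435,456
      print(f"axial: {axial_attention:,} pairs")  # 4,194,304 (64x fewer)
      ```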

  • @mariomariovitiviti
    @mariomariovitiviti 4 years ago +4

    These names are getting out of hand

  • @qimingzhong1044
    @qimingzhong1044 4 years ago

    With transformers dominating the leaderboards, lightweight neural networks might be a thing of the past.

    • @redjammie8342
      @redjammie8342 4 years ago

      What do you mean by lightweight neural network?