NeurIPS 2020 Tutorial: Deep Implicit Layers

Zico Kolter

49K views

Published: Jan 30, 2025

Comments • 27

@liduque 28 days ago
Hi Zico, I got to know you at the NeurIPS 2024 MLNCP workshop and wasn't familiar with DEQ until then. I'm very glad to have found this material. It's got everything: theory, simple examples, code implementation, supplementary material. Besides, I'm a big fan of the way the three of you convey your points; it's always very clear and adequate. Thank you!
@keraeduardo 10 months ago ⁺¹
I am a graduate student in Physics. This video is clear, easy to follow and highly informative. Many thanks for making this video public! This is very helpful for me
@alicsir 1 year ago ⁺¹
Thanks for making this video public. The explanations are very intuitive and clear.
@alexeychernyavskiy4193 4 years ago ⁺¹¹
Thank you guys! Very solid video, and good tempo. You present the material with a smile in a very user-friendly manner, that's a rare delicacy :) I wish new successes for your trio in the coming year. Separate thank you for the website and the code! I think I will try to apply DEQ to image denoising.
@shashanks.k855 4 months ago ⁺¹
Thank you for the presentation, it was really useful.
@gewang9770 2 years ago ⁺¹
I like this tutorial very much!
@Kram1032 7 months ago
I wonder how much can be done here with stochastic continuous evaluations in the spirit of MCMC or recent "Walk on Stars" style evaluations, where you don't have any discretization error at all, but trade that off with some noise...
@kimchi_taco 4 years ago ⁺³
I learned a lot. Thank you very much. I have 2 questions about DEQ.
1. Why does the equilibrium point z* matter? How is z* a better representation than any intermediate representation z_t?
2. ALBERT is BERT with the weights shared across all transformer layers. The way DEQ saves memory sounds like ALBERT computing the gradient of only the last layer and updating the "shared" weight. ALBERT actually computes the gradients of all layers and updates the "shared" weight with the average of those gradients. Why does DEQ work even though it doesn't use the gradients of the intermediate layers?
@zicokolter9110 4 years ago ⁺³
Thanks for the questions. For 1) this is mainly just an empirical issue, but in practice we do see that "deeper" networks (even in the weight-tied setting) do appear to work better, and thus the equilibrium point works best as the final representation (plus allowing efficient differentiation). 2) Yes, ALBERT would store all the intermediate activations, and compute gradients through the whole unrolled network. The idea of the DEQ model is that this is actually unnecessary, though, precisely via the implicit differentiation method we discuss in the tutorial.
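A minimal sketch of the implicit differentiation idea mentioned in the reply above. Everything specific here is an assumption for illustration and is not taken from the tutorial: a scalar layer f(z, x, w) = tanh(w*z + x), plain forward iteration as the solver, and the hypothetical function names. It compares the gradient dz*/dw computed from the equilibrium point alone with the gradient obtained by differentiating through a long unrolled weight-tied network (done here in forward mode for brevity, rather than by backpropagation).

```python
import math


def f(z, x, w):
    """One weight-tied layer (illustrative choice): z_next = tanh(w * z + x)."""
    return math.tanh(w * z + x)


def solve_fixed_point(x, w, tol=1e-12, max_iter=10_000):
    """Find z* with z* = f(z*, x, w) by plain forward iteration."""
    z = 0.0
    for _ in range(max_iter):
        z_next = f(z, x, w)
        if abs(z_next - z) < tol:
            return z_next
        z = z_next
    return z


def implicit_grad_w(z_star, x, w):
    """dz*/dw using only the equilibrium point.

    Differentiating z* = f(z*, x, w) with respect to w gives
        dz*/dw = df/dz * dz*/dw + df/dw,
    so dz*/dw = (df/dw) / (1 - df/dz), evaluated at z*.
    """
    s = 1.0 - z_star ** 2      # tanh'(w*z* + x), because tanh(w*z* + x) = z*
    df_dz = w * s
    df_dw = z_star * s
    return df_dw / (1.0 - df_dz)


def unrolled_grad_w(x, w, n_layers=200):
    """dz_n/dw obtained by differentiating through every unrolled layer.

    The derivative is propagated alongside the activations (forward mode),
    so each layer's activation is needed, unlike in implicit_grad_w.
    """
    z, dz = 0.0, 0.0
    for _ in range(n_layers):
        z_new = math.tanh(w * z + x)
        s = 1.0 - z_new ** 2
        dz = s * (z + w * dz)  # chain rule through one layer
        z = z_new
    return dz


if __name__ == "__main__":
    x, w = 1.0, 0.5
    z_star = solve_fixed_point(x, w)
    print("z*           :", z_star)
    print("implicit grad:", implicit_grad_w(z_star, x, w))
    print("unrolled grad:", unrolled_grad_w(x, w))
```

For this contractive example the two gradients agree to numerical precision, while the implicit version needs only z* and not the intermediate activations.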
@jiangao5652 4 years ago ⁺⁵
This work is amazing! When I saw GPT-3 use 175 billion parameters to build a language model, I just felt hopeless. It's fairer to compete for state-of-the-art performance based on model complexity.
@vishwajitkumarvishnu3878 3 years ago
Shouldn't the last partial derivative at 54:00 in the backward pass be d1(z*,x,theta)? It's written as d2(z*,x,theta).
@sippy_cups 4 years ago ⁺²
Awesome! Really well presented!
@elisim7 3 years ago
Great tutorial and notes!
@omarsharif4676 2 years ago ⁺²
Thank you for a very informative video. I have a very limited mathematics background and was wondering if there are any good resources to better understand differentiation in ODEs. If you see my comment and have such resources, please let me know.
Cheers!
@khuongnguyenduy2156 3 years ago
Thank you very much for sharing this amazing tutorial!
@ezamora1981 4 years ago
Very cool idea!! Congratulations! and thanks for the tutorial.
@kimchi_taco 4 years ago ⁺²
Awesome, but the closed captions are a little bit out of sync. Could you sync them?
@zicokolter9110 4 years ago ⁺³
Thanks for pointing this out! We've re-uploaded them to properly sync. They should work correctly now.
@CristianGarcia 4 years ago
Thanks for the tutorial!
I have a question about the representations created by DEQs. In normal deep networks, depth means you can compose features, and deeper layers are supposed to have higher-level representations. Does the same story apply to DEQs, or is there a similar way to understand their computation?
@ansha2221 4 years ago
Thank you for sharing this.
@ezamora1981 4 years ago
Hi Zico Kolter, great work! What about the inference time of DEQs w.r.t. DNNs? Are they similar? Another question: do you recommend using JAX instead of PyTorch or TensorFlow 2?
@adrianbergesenfedaque8016 3 years ago
Hi, I'm just getting started with DILs/DEQs but from what I can tell, their inference time tends to be x2 slower when compared to DNNs. Still, depending on your application it might not be important at all; e.g. in my case we are interested in processing requests on the minute, while a feed-forward DNN takes milliseconds to do inference, so doubling the milliseconds is not going to be a problem. In fact, our hope is that solving the optimization problem directly via this method will save time overall (compared to DNN + optimization algorithm).
@dominikklotz1035 4 years ago
Great Idea.
@DasGrosseFressen 4 years ago
Really cool. One question, though: what is the fuss about neural ODEs? Honestly, I think I am missing something. They look just like taking a firing rate model as an RNN... What is the difference?
@강수현-b4c 4 years ago ⁺¹
50:56
@CppExpedition 3 years ago ⁺¹
earned like + sub at min. 1.47
@rohullahalavi 3 years ago
like
