Deep Foundations
  • 82 videos
  • 67,628 views
Vector Quantization and Multi-Modal Models
110 views

Videos

Contrastive Coding
216 views · 3 months ago
This lecture describes contrastive coding, a fourth fundamental loss function of deep learning.
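
As a minimal sketch of the general idea, assuming an InfoNCE-style formulation with in-batch negatives (the lecture's exact loss may differ):

import torch
import torch.nn.functional as F

def info_nce_loss(z1, z2, temperature=0.1):
    """Contrastive (InfoNCE-style) loss for a batch of paired embeddings.

    z1, z2: [batch, dim] tensors; row i of z1 and row i of z2 form a positive
    pair, and every other row in the batch serves as a negative.
    """
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    logits = z1 @ z2.T / temperature    # [batch, batch] similarity scores
    labels = torch.arange(z1.size(0))   # positives sit on the diagonal
    return F.cross_entropy(logits, labels)

loss = info_nce_loss(torch.randn(8, 128), torch.randn(8, 128))  # usage sketch
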
Guidance for Diffusion Models
286 views · 3 months ago
In practice, diffusion models need a poorly understood alteration of their objective function called "guidance". This talk gives the timeline of the rapid development from the introduction of guidance in May 2021 to DALL-E 2 in March 2022.
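
For reference, one common form of this alteration is classifier-free guidance, which mixes conditional and unconditional noise estimates with a guidance scale w (illustrative notation; the talk's conventions may differ):

    \hat{\epsilon}(x_t, c) = (1 + w)\,\epsilon_\theta(x_t, c) - w\,\epsilon_\theta(x_t, \varnothing)

At w = 0 this reduces to the ordinary conditional model; larger w pushes samples toward the conditioning signal c at some cost in diversity.
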
Diffusion1
215 views · 4 months ago
This is an introduction to the mathematics of diffusion models.
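
A compact statement of the forward (noising) process that this mathematics is built around, with noise schedule \beta_t and \bar\alpha_t = \prod_{s \le t} (1 - \beta_s) (a standard formulation; notation may differ from the lecture):

    q(x_t \mid x_0) = \mathcal{N}\big(\sqrt{\bar\alpha_t}\, x_0,\; (1 - \bar\alpha_t)\, I\big)

The reverse (denoising) model is trained to invert this corruption one step at a time.
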
VAE
141 views · 4 months ago
This lecture describes the ELBO loss function that defines variational autoencoders (VAEs) as a third fundamental equation (together with the cross-entropy loss and the GAN adversarial objective).
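
For reference, the ELBO in its standard form (sign and notation conventions may differ in the lecture); the VAE is trained by maximizing the right-hand side over the encoder q_\phi and the decoder p_\theta:

    \log p_\theta(x) \ge \mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big] - \mathrm{KL}\big(q_\phi(z \mid x) \,\|\, p(z)\big)
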
GANs
83 views · 4 months ago
The formulation of GANs plus a variety of applications of GANs and discriminative loss.
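
The adversarial objective in its original minimax form (practical variants, such as the non-saturating generator loss, modify the generator's term):

    \min_G \max_D \; \mathbb{E}_{x \sim p_{\mathrm{data}}}[\log D(x)] + \mathbb{E}_{z \sim p(z)}[\log(1 - D(G(z)))]
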
Lecture 8b
158 views · 4 months ago
The Occam Guarantee (the Free Lunch Theorem), the PAC-Bayes Theorem (real-valued model parameters and L2 regularization guarantees), Implicit Regularization, Calibration, Ensembles, Double Descent, and Grokking.
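
A representative Occam-style guarantee of the kind covered here (stated loosely for a loss bounded in [0, 1]; the lecture's exact form and constants may differ): with probability at least 1 - \delta over a training sample of size N, simultaneously for all models h with prior probability P(h),

    L(h) \le \hat{L}(h) + \sqrt{\frac{\ln \frac{1}{P(h)} + \ln \frac{1}{\delta}}{2N}}
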
SDE
229 views · 5 months ago
Gradient Flow, Diffusion Processes (Brownian Motion), Langevin Dynamics and the Stochastic Differential Equation (SDE) model of SGD
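
As a small illustration of the Langevin-dynamics piece (a sketch, not the lecture's code), the discretized update adds Gaussian noise, scaled by a temperature-like parameter, to each gradient step:

import torch

def langevin_step(theta, grad, lr=1e-3, temperature=1.0):
    """One step of discretized Langevin dynamics.

    theta: parameter tensor; grad: gradient of the loss at theta.
    The injected noise has variance 2 * lr * temperature, the scaling under
    which the iterates approximate the continuous-time SDE.
    """
    noise = torch.randn_like(theta) * (2 * lr * temperature) ** 0.5
    return theta - lr * grad + noise

# usage sketch: noisy descent on the quadratic loss 0.5 * ||theta||^2
theta = torch.ones(10)
for _ in range(1000):
    theta = langevin_step(theta, grad=theta, lr=1e-2, temperature=0.1)
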
SGD
271 views · 5 months ago
A presentation of vanilla SGD, momentum, and Adam, with an analysis based on understanding temperature and its relationship to hyper-parameter tuning.
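
For concreteness, the three update rules in schematic form (a sketch with common default hyper-parameters; the lecture's temperature analysis is not reproduced here):

def sgd_step(p, g, lr):
    """Vanilla SGD: step against the gradient."""
    return p - lr * g

def momentum_step(p, g, v, lr, mu=0.9):
    """Momentum: keep a running average v of gradients and step along it."""
    v = mu * v + g
    return p - lr * v, v

def adam_step(p, g, m, v, t, lr, b1=0.9, b2=0.999, eps=1e-8):
    """Adam: per-coordinate step sizes from bias-corrected first/second moments."""
    m = b1 * m + (1 - b1) * g
    v = b2 * v + (1 - b2) * g * g
    m_hat = m / (1 - b1 ** t)   # t is the 1-based step count
    v_hat = v / (1 - b2 ** t)
    return p - lr * m_hat / (v_hat ** 0.5 + eps), m, v
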
Transformer
262 views · 5 months ago
Language Modeling, Self-Attention, and the Transformer.
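
As a compact sketch of the core operation, single-head scaled dot-product self-attention (the lecture covers the full multi-head transformer):

import torch
import torch.nn.functional as F

def self_attention(X, Wq, Wk, Wv):
    """Single-head self-attention over a sequence X of shape [seq, d]."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / K.shape[-1] ** 0.5   # [seq, seq] pairwise similarities
    weights = F.softmax(scores, dim=-1)     # each position attends over the sequence
    return weights @ V                      # weighted mixture of value vectors

d = 64
out = self_attention(torch.randn(10, d), torch.randn(d, d), torch.randn(d, d), torch.randn(d, d))
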
Some Fundamental Architectural Elements
221 views · 5 months ago
This lecture describes the motivation for ReLU, initialization methods, normalization layers, and residual connections.
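
A minimal sketch combining these elements, a pre-norm residual block with ReLU and layer normalization (the lecture's exact architecture may differ):

import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Pre-norm residual block: x + MLP(LayerNorm(x)) with a ReLU nonlinearity."""

    def __init__(self, d):
        super().__init__()
        self.norm = nn.LayerNorm(d)
        self.ff = nn.Sequential(nn.Linear(d, 4 * d), nn.ReLU(), nn.Linear(4 * d, d))

    def forward(self, x):
        # The identity path keeps gradients flowing even when the MLP is poorly scaled.
        return x + self.ff(self.norm(x))

y = ResidualBlock(64)(torch.randn(8, 64))
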
History to 2024
707 views · 6 months ago
This is an overview of the history of deep learning. It reviews the history starting from the introduction of the neural threshold unit in 1943, focusing mainly on the "current era" that begins in 2012 with AlexNet.
Lecture 5: Language Modeling and the Transformer
289 views · 1 year ago
Lecture 4: Initialization, Normalization, and Residual Connections
327 views · 1 year ago
Lecture 3: Einstein Notation and CNNs
410 views · 1 year ago
Lecture 1: A Survey of Deep Learning
770 views · 1 year ago
More Recent Developments
547 views · 2 years ago
Vector Quantized Variational Auto-Encoders (VQ-VAEs).
8K views · 2 years ago
Progressive VAEs
460 views · 2 years ago
Gaussian Models and the Perils of Differential Entropy
410 views · 2 years ago
Variational Auto-Encoders (VAEs)
820 views · 2 years ago
VAE Lecture 1
454 views · 2 years ago
SGD Lecture 1
273 views · 3 years ago
2021 Developments
518 views · 3 years ago
Back-Propagation
834 views · 3 years ago
Back-Propagation with Tensors
1K views · 3 years ago
The Educational Framework (EDF)
1K views · 3 years ago
Minibatching
562 views · 3 years ago
Trainability
527 views · 3 years ago
Einstein Notation
501 views · 3 years ago

Comments

  • @nserver109 · 1 month ago

    Wonderful!

  • @DorPolo-x5g · 1 month ago

    great video.

  • @ees7416 · 2 months ago

    fantastic course. thank you.

  • @saikalyan3966 · 3 months ago

    Uncanny

  • @stevecaya · 4 months ago

    At minute 16 is priceless. The teacher is not sure if there have been any big advances in AI other than this small thing called GPT-3. Ha ha ha ha. Nothing big about that model other than that it would turn out to be the biggest consumer app to 100 million users in history. And usher in the AI age to the general public. Dude, how did you miss that one…ouch.

  • @moormanjean5636 · 4 months ago

    This is such a helpful video thank you

  • @martinwafula1183 · 5 months ago

    Very timely tutorial

    • @solomonw5665 · 5 months ago

      *released 3 years ago 🫠

    • @quickpert1382 · 5 months ago

      @solomonw5665 Timed to show up in RUclips recommendations right after KANs were released. For him it was quite timely; for me, too late already.

  • @shivamsinghcuchd · 5 months ago

    This is gold!!

  • @aditya_a · 9 months ago

    Narrator: deep networks were NOT saturated lol

  • @K3pukk4 · 1 year ago

    what a legend!

  • @yorailevi6747 · 1 year ago

    Thanks, I was just searching for this idea.

  • @zeydabadi · 1 year ago

    Could you elaborate on “… j ranges over neurons at that position …” ?

  • @verystablegenius4720 · 1 year ago

    terrible exposition - doesn't seem to understand it himself either. "we should do the verification" even your notation is not clear. Also: "unary potential" is called a BIAS. Just read a stat. mech. book before making these videos, sigh.

  • @andrewluo6088 · 1 year ago

    After watching this video, I finally understand.

  • @stupidoge · 2 years ago

    Thanks for your interpretation. I have a clear understanding of how this equation works. (If possible, I'd still like some more detailed teaching on each part of the equation.) All in all, thanks for your help!!!

  • @AmitKumarPradhan57 · 2 years ago

    I understood that when Ps = Pop, the contrastive divergence goes to zero since the distributions of Y_hat and Y are the same. It's not clear to me why the gradient also goes to zero. Thank you in advance. PS: I took this course last quarter.

  • @Jootawallah · 3 years ago

    Another question: why is the gate function G(t) not just an independent parameter between 0 and 1? What do we gain from making it a function of h_t-1 and x? In the end, SGD would find good values for G(t) even if it were an independent parameter.

  • @Jootawallah · 3 years ago

    Is there an explanation for why the three gated RNN architectures here differ in performance? Why is the LSTM, the architecture with the most parameters, not the most effective one? In fact, neither is the simplest one the most effective. It's the intermediate one that takes the gold medal. But why?

    • @davidmcallester4973 · 3 years ago

      A fair comparison uses the same number of parameters for each architecture --- you can always increase the dimension of the hidden vectors. Some experiments have indicated that at the same number of parameters all the gated RNNs behave similarly. But there is no real analytic understanding.

  • @Jootawallah · 3 years ago

    I don't understand, what is the benefit of using a gated, i.e. residual, architecture? You talk about gates allowing forgetting or remembering, but why would we want to forget anyway? Also, whether G is zero or one, we always remember the previous state h_t-1 in some way! So I don't get it ...

    • @davidmcallester4973 · 3 years ago

      A vanilla RNN just after initialization does not remember the previous hidden state because the information is destroyed by the randomly initialized parameters. Vanilla RNNs could probably be improved with initializations that are better at remembering the previous state, but the structure of a gated RNN seems to cause SGD to find parameter settings with better memory than happens by running SGD on vanilla RNNs.

    • @Jootawallah · 3 years ago

      @davidmcallester4973 So is this again just a matter of residual architectures providing a lower bound on the gradient, and thus preventing it from vanishing?

  • @Jootawallah · 3 years ago

    On slide 11, shouldn't it be self.x.addgrad(self.x.grad*...) ? self.grad isn't defined, right?

  • @Jootawallah · 3 years ago

    Illuminating!

  • @Jootawallah · 3 years ago

    So if KL(p|q) = 0, does it mean that p = q up to a constant? Or are there other symmetries to take into account?

  • @jonathanyang2359 · 3 years ago

    Thanks! I don't attend this institution, but this was an extremely clear lecture :)

  • @addisonweatherhead2790 · 3 years ago

    The intuition around the 13 minute mark was really helpful! I've been trying to understand this paper for a few days now, and this has really made its goal and reasoning more succinct. Thanks!

  • @bernhard-bermeitinger · 3 years ago

    Thank you for this video, however, please don't call your variable ŝ 😆 (or at least don't say it out loud)

  • @kaizhang5796 · 3 years ago

    Great lecture! May I ask how to choose the conditional probability of node i given its neighbors in a continuous case? Thanks!

    • @davidmcallester4973 · 3 years ago

      If the node values are continuous and the edge potentials are Gaussian then the conditional probability of a node given its neighbors is also Gaussian.

    • @kaizhang5796 · 3 years ago

      David McAllester thanks! If each node has d-dimensional features and node i has k neighbors, how do we determine the parameters of this Gaussian p(i | k neighbors)?

    • @kaizhang5796 · 3 years ago

      Should I multiply k Gaussians together, where the mean of each Gaussian is one of the k neighbors?

  • @LyoshaTheZebra · 3 years ago

    Thanks for explaining that! Great job. Subscribed!

  • @sdfrtyhfds · 3 years ago

    Also, what if you skip the quantization during inference? Would you still get images that make sense?

    • @davidmcallester4973 · 3 years ago

      Do you mean "during generation"? During generation you can't skip the quantization because the pixel-CNN is defined to generate the quantized vectors (the symbols).

    • @sdfrtyhfds · 3 years ago

      @davidmcallester4973 I guess that during generation it wouldn't make much sense; I was thinking more in the direction of interpolating smoothly between two different symbols.

  • @sdfrtyhfds · 3 years ago

    Do you train the pixel CNN on the same data and just not update the VAE weights while training?

    • @davidmcallester4973 · 3 years ago

      Yes, the vector quantization is held constant as the pixel CNN is trained.

  • @bastienbatardiere1187 · 3 years ago

    You are not even taking the example of a graph with loops, which is the whole point of LBP. Moreover, please introduce the notation a little; we should be able to understand even if we did not watch your previous videos. Nevertheless, it's great that you teach such a method.

  • @mim8312 · 3 years ago

    Now the cutting-edge scientists are working on the future AIs. Creating an AI from a combination of multiple AIs, which reportedly is similar to how our brain functions, with different portions performing specific functions, and which can then understand and perform a set of completely different tasks better than humans? What could go wrong?

  • @mim8312 · 3 years ago

    I think that too many people are focusing on the game, which I also follow, as if this were an ordinary player. Since I have significant knowledge, and since I believe that Hawking and Musk were right, I am really anxious about the self-taught nature of this AI. This particular AI is not the worrisome thing, albeit it has obvious potential applications in military logistics, military strategy, etc. The really scary part is how fast this was developed after AlphaGo debuted. We are not creeping up on the goal of human-level intelligence. We are likely to shoot past that goal amazingly soon without even realizing it, if things continue progressing as they have. The early, true AIs will also be narrow and not very competent or threatening, even if they become "superhuman" in intelligence. They will also be harmless, idiot savants at first.

    Upcoming threat to humanity: the scary thing is the fact that computer speed (and thereby, probably eventually AI intelligence) doubles about every year, and will likely double faster when super-intelligent AIs start designing chips, working with quantum computers as co-processors, etc. How fast will our AIs progress to such levels that they become indispensable -- while their utility makes hopeless any attempts to regulate them or retroactively impose restrictions on beings that are smarter than their designers? At first, they may have only base functions, like the reptilian portion of our brain.

    However, when will they act like Nile crocodiles and react to any threat with aggression? Ever gone skinny dipping with Nile crocodiles? I fear that very soon, before we realize it, we will all be doing the equivalent of skinny dipping with Nile crocodiles, because of how fast AIs will develop by the time that the children born today reach their teens or middle age. Like crocodiles that are raised by humans, AIs may like us for a while. I sure hope that lasts. As the announcer in Jeopardy said long ago about a program that was probably not really an advanced AI, I, for one, welcome our future AI overlords.

  • @zv8369 · 3 years ago

    5:50 The reference was meant to be Poole et al. rather than Chen et al.? arxiv.org/abs/1905.06922

  • @zv8369 · 3 years ago

    Could you please provide a reference for your statement at 11:18: *"cross-entropy objective is an upper bound on the population entropy"*

    • @zv8369 · 3 years ago

      I think I got around to understanding why this is the case. Entropy, H(X), is the minimum number of bits required for representing X. Cross-entropy is minimized when q matches the true distribution p, and that minimum value is exactly the entropy of p; otherwise the cross-entropy is larger than the entropy. Therefore, cross-entropy is an upper bound on the entropy! I didn't do a good job describing this, but hope this helps!
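
      [In symbols: H(p, q) = H(p) + KL(p ‖ q) ≥ H(p), with equality exactly when q = p, so the cross-entropy reaches the entropy only at the population distribution.]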

    • @siyaowu7443 · 1 year ago

      @zv8369 Thanks! It helps me a lot!