- 37 videos
- 126,558 views
Machine Learning with Pytorch
Joined 2 Oct 2014
I’m Abdulsalam Bande, a computer scientist. This channel explains machine learning concepts in PyTorch, with a focus on PyTorch’s built-in functions and practical implementations.
For any collaborations, you can reach me at: annual03froze [at] icloud.com.
Markov Chains
📌 Understanding Markov Chains in Reinforcement Learning
Welcome back to the Reinforcement Learning Series! 🚀 This week, we’re diving into Markov Chains, a fundamental concept for modelling environments in RL.
🔹 What is a Markov Chain?
🔹 The Markov Property & Why It Matters
🔹 Sample Episode Walkthrough
🔹 Bridging to Markov Reward Processes
By the end of this video, you’ll understand how state transitions work and why Markov Chains make RL more efficient. Stay tuned for the next episode, where we introduce Markov Reward Processes (MRPs)!
#ReinforcementLearning #MarkovChains #MachineLearning #RL #DataScience
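To make the idea concrete, here is a minimal sketch (not the code from the video) of a Markov chain with hypothetical states and a hand-picked transition matrix, sampling one episode:

```python
import torch

# Hypothetical states and a hand-picked transition matrix (each row sums to 1).
states = ["Class 1", "Class 2", "Sleep"]
P = torch.tensor([
    [0.5, 0.4, 0.1],   # from Class 1
    [0.2, 0.5, 0.3],   # from Class 2
    [0.0, 0.0, 1.0],   # Sleep is terminal (absorbing)
])

torch.manual_seed(0)
s = 0                          # start in Class 1
episode = [states[s]]
while states[s] != "Sleep":
    # Markov property: the next state depends only on the current state.
    s = torch.distributions.Categorical(probs=P[s]).sample().item()
    episode.append(states[s])
print(" -> ".join(episode))
```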
Views: 44
Videos
Reinforcement Learning: The Decision Learning Problem
71 views • 1 day ago
In this introductory video, we explore the Decision Learning Problem, the foundation of Reinforcement Learning. Using a 1D cleaning robot, we start with states, actions, rewards, and terminal states.
Real-world applications: Robotics, Operations Research, AI
Next Up:
🔹 Markov Chains & MDPs
🔹 The Bellman Equation
📂 Jupyter Notebook: github.com/abdulsalam-bande/Pytorch-Neural-Network-Modules-Expl...
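As a rough illustration of these ingredients (states, actions, rewards, terminal states), here is a minimal sketch of a 1D grid environment; the class name, reward values and layout are made up for illustration and are not necessarily those used in the notebook:

```python
# Sketch of a 1D "cleaning robot" environment: states are cells 0..4,
# actions are move left / move right, and the rightmost cell is terminal.
class LineWorld:
    def __init__(self, n_cells=5):
        self.n_cells = n_cells
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):                       # action: 0 = left, 1 = right
        move = 1 if action == 1 else -1
        self.state = min(max(self.state + move, 0), self.n_cells - 1)
        done = self.state == self.n_cells - 1     # terminal state reached
        reward = 1.0 if done else -0.1            # hypothetical reward scheme
        return self.state, reward, done

env = LineWorld()
state, done = env.reset(), False
while not done:
    state, reward, done = env.step(1)             # a fixed "always move right" policy
    print(state, reward, done)
```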
Introduction
122 views • 2 months ago
Welcome to Part 1 of my Reinforcement Learning (RL) series! In this series, I’ll cover the foundational concepts of RL, from the Decision Learning Problem to key principles like Markov Processes, Dynamic Programming, and essential RL algorithms. My goal is to make these topics easy to follow with practical examples that bring each concept to life.
torch.nn.LayerNorm Explained
497 views • 2 months ago
This video explains how LayerNorm works and how PyTorch handles the dimensions. Unlike BatchNorm, which relies on statistics computed across the batch, LayerNorm normalizes the features of each sample. Having a good understanding of the dimensions really helps in understanding the neural network. Jupyter Notebook: github.com/abdulsalam-bande/Pytorch-Neural-Network-Modules-Explained/blob/main/torch.nn....
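A quick shape-level sketch of the behaviour described here, with made-up tensor sizes:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(2, 5, 8)    # (batch, sequence, features) with made-up sizes

# LayerNorm normalizes over the trailing dimension(s) given by normalized_shape,
# independently for every sample and every position in the sequence.
layer_norm = nn.LayerNorm(normalized_shape=8)
y = layer_norm(x)

print(y.shape)                                       # torch.Size([2, 5, 8])
print(y[0, 0].mean(), y[0, 0].std(unbiased=False))   # roughly 0 and 1 per feature vector
```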
torch.nn.BatchNorm2d Explained
388 views • 2 months ago
This video explains how BatchNorm2d works and how PyTorch handles the dimensions. Having a good understanding of the dimensions really helps in understanding the neural network. Jupyter Notebook: github.com/abdulsalam-bande/Pytorch-Neural-Network-Modules-Explained/blob/main/torch.nn.BatchNorm2d.ipynb Full Directory: github.com/abdulsalam-bande/Pytorch-Neural-Network-Modules...
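A minimal shape-level sketch, with made-up tensor sizes:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(4, 3, 8, 8)    # (N, C, H, W) with made-up sizes

# BatchNorm2d keeps one mean/variance (and gamma/beta) per channel C,
# computed over the N, H and W dimensions.
bn = nn.BatchNorm2d(num_features=3)
y = bn(x)

print(y.shape)                                        # torch.Size([4, 3, 8, 8])
print(y[:, 0].mean(), y[:, 0].std(unbiased=False))    # roughly 0 and 1 for channel 0
```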
torch.distributions.poisson.Poisson - Poisson Distribution Guided Synthetic Data Generation
117 views • 10 months ago
This video shows how the Poisson distribution can be leveraged to enhance the quality of synthetic data. PyTorch Poisson: pytorch.org/docs/stable/distributions.html#poisson Sampling: stats.stackexchange.com/questions/551568/sampling-from-a-poisson-distribution-infinite-support Jupyter Notebook: github.com/abdulsalam-bande/Pytorch-Neural-Network-Modules-Explained/blob/main/torch.distributions.po...
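A small sketch of sampling from torch.distributions.Poisson (the rate and sample size are arbitrary, and this is not the notebook's synthetic-data pipeline):

```python
import torch
from torch.distributions import Poisson

torch.manual_seed(0)

# Sample hypothetical count data (e.g. events per time window) with rate lambda = 4.
poisson = Poisson(rate=torch.tensor(4.0))
counts = poisson.sample((1000,))

print(counts[:10])     # a few integer-valued samples
print(counts.mean())   # close to the rate, roughly 4
```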
GPT: A Technical Training Unveiled #7 - Final Linear Layer and Softmax
111 views • 1 year ago
Linear Layer: ruclips.net/video/QpyXyenmtTA/видео.html Notebook: github.com/abdulsalam-bande/Pytorch-Neural-Network-Modules-Explained/blob/main/Mini Gpt Pretraining.ipynb Presentation: github.com/abdulsalam-bande/Pytorch-Neural-Network-Modules-Explained/blob/main/Mini Gpt.pdf
GPT: A Technical Training Unveiled #6 - Block Two of the Transformer Decoder
67 views • 1 year ago
This is the second block (layer 2), which repeats the Masked Multihead Attention and feedforward layers. Linear Layer: ruclips.net/video/QpyXyenmtTA/видео.html Layer Normalization: ruclips.net/video/G45TuC6zRf4/видео.html Notebook: github.com/abdulsalam-bande/Pytorch-Neural-Network-Modules-Explained/blob/main/Mini Gpt Pretraining.ipynb Presentation: github.com/abdulsalam-bande/Pytorch-Neural-Networ...
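A rough sketch of what "repeating the block" means, using hypothetical sizes and a simplified block definition rather than the exact notebook code:

```python
import torch
import torch.nn as nn

class DecoderBlock(nn.Module):
    """One GPT-style block: masked multi-head self-attention + feedforward, each with Add & Norm."""
    def __init__(self, d_model=64, n_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ff = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.ReLU(),
                                nn.Linear(4 * d_model, d_model))
        self.norm1, self.norm2 = nn.LayerNorm(d_model), nn.LayerNorm(d_model)

    def forward(self, x):
        T = x.size(1)
        mask = torch.triu(torch.ones(T, T, dtype=torch.bool), diagonal=1)  # hide future tokens
        attn_out, _ = self.attn(x, x, x, attn_mask=mask)
        x = self.norm1(x + attn_out)        # Add & Norm around attention
        x = self.norm2(x + self.ff(x))      # Add & Norm around the feedforward layer
        return x

# "Block two" is simply a second copy of the same block stacked on top of the first.
blocks = nn.Sequential(DecoderBlock(), DecoderBlock())
print(blocks(torch.randn(2, 10, 64)).shape)   # (batch, seq_len, d_model) -> torch.Size([2, 10, 64])
```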
GPT: A Technical Training Unveiled #5 - Feedforward, Add & Norm
85 views • 1 year ago
After the attention outputs for each head are computed, they are concatenated and then passed through a feedforward network. The Add and Norm steps involve adding the original input to the output of the attention or feedforward networks (a form of residual connection) and then normalizing the result. This helps in stabilizing the activations and aids in training deeper models. Linear Layer: ruc...
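In code form, the residual-plus-normalization pattern described here looks roughly like this (a sketch only: the attention block is replaced by a stand-in linear layer and all sizes are made up):

```python
import torch
import torch.nn as nn

d_model = 64
norm1, norm2 = nn.LayerNorm(d_model), nn.LayerNorm(d_model)
attn_block = nn.Linear(d_model, d_model)   # stand-in for the concatenated attention output
ffn = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.ReLU(),
                    nn.Linear(4 * d_model, d_model))

x = torch.randn(2, 10, d_model)            # (batch, seq_len, d_model)
x = norm1(x + attn_block(x))               # residual connection around attention, then LayerNorm
x = norm2(x + ffn(x))                      # residual connection around the feedforward network, then LayerNorm
print(x.shape)                             # torch.Size([2, 10, 64])
```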
GPT: A Technical Training Unveiled #4 - Masked Multihead Attention
161 views • 1 year ago
Detailed exposition of the attention mechanism with an example of key, query, and value matrices in transformer neural networks. The Multihead Attention mechanism allows the model to focus on different parts of the input sequence when producing an output sequence. The mechanism works by producing multiple sets (or "heads") of key, query, and value projections, and then combining them. For our s...
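A minimal single-head sketch of masked (causal) scaled dot-product attention with tiny, made-up matrices; multi-head attention repeats this with several independent key/query/value projections and concatenates the results:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
T, d_k = 4, 8                     # made-up sequence length and head size
Q, K, V = torch.randn(T, d_k), torch.randn(T, d_k), torch.randn(T, d_k)

scores = Q @ K.T / d_k ** 0.5     # scaled dot-product similarity between queries and keys
causal_mask = torch.triu(torch.ones(T, T, dtype=torch.bool), diagonal=1)
scores = scores.masked_fill(causal_mask, float("-inf"))   # hide future positions
weights = F.softmax(scores, dim=-1)                       # each row sums to 1
out = weights @ V                                         # weighted sum of values (one head)

print(weights)                    # upper triangle is zero: no attention to the future
print(out.shape)                  # torch.Size([4, 8])
```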
GPT: A Technical Training Unveiled #3 - Embedding and Positional Encoding
186 views • 1 year ago
Explanation of token embeddings and positional encodings in transformer models, showcasing their significance in AI training. Embeddings are a way of representing categorical data, like words or characters, as continuous vectors. So each character is embedded into a continuous vector space using an embedding layer. Positional encodings are added to give the model information about the relative ...
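A small sketch of both steps with made-up sizes: a learnable token embedding plus fixed sinusoidal positional encodings added on top:

```python
import math
import torch
import torch.nn as nn

vocab_size, d_model, seq_len = 65, 16, 10      # made-up sizes
tok_emb = nn.Embedding(vocab_size, d_model)    # one learnable vector per token id

# Fixed sinusoidal positional encodings: even columns use sin, odd columns use cos.
position = torch.arange(seq_len).unsqueeze(1).float()
div_term = torch.exp(torch.arange(0, d_model, 2).float() * (-math.log(10000.0) / d_model))
pe = torch.zeros(seq_len, d_model)
pe[:, 0::2] = torch.sin(position * div_term)
pe[:, 1::2] = torch.cos(position * div_term)

tokens = torch.randint(0, vocab_size, (1, seq_len))   # a batch with one sequence of token ids
x = tok_emb(tokens) + pe                              # (1, seq_len, d_model) + (seq_len, d_model)
print(x.shape)                                        # torch.Size([1, 10, 16])
```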
GPT: A Technical Training Unveiled #2 - Tokenization
107 views • 1 year ago
A demonstration of the tokenization process, detailing the conversion of text to tokens using character sets in language models. Tokenization is the process of converting a sequence of characters into a sequence of tokens. For example, given a small text dataset, every unique character in the text is treated as a token, leading to a vocabulary of unique characters. Wikipedia: en.wikipedia.org/wiki...
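A minimal character-level tokenizer sketch on a made-up string (not the dataset used in the video):

```python
# Character-level tokenization: every unique character becomes a token id.
text = "hello world"                       # tiny made-up corpus
vocab = sorted(set(text))                  # unique characters form the vocabulary
stoi = {ch: i for i, ch in enumerate(vocab)}
itos = {i: ch for ch, i in stoi.items()}

encode = lambda s: [stoi[c] for c in s]    # text -> list of token ids
decode = lambda ids: "".join(itos[i] for i in ids)

ids = encode("hello")
print(vocab)          # [' ', 'd', 'e', 'h', 'l', 'o', 'r', 'w']
print(ids)            # [3, 2, 4, 4, 5]
print(decode(ids))    # 'hello'
```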
GPT: A Technical Training Unveiled #1 - Introduction
173 views • 1 year ago
Andrej Karpathy Video: ruclips.net/video/kCc8FmEb1nY/видео.html Wikipedia: en.wikipedia.org/wiki/Generative_pre-trained_transformer Notebook: github.com/abdulsalam-bande/Pytorch-Neural-Network-Modules-Explained/blob/main/Mini Gpt Pretraining.ipynb Presentation: github.com/abdulsalam-bande/Pytorch-Neural-Network-Modules-Explained/blob/main/Mini Gpt.pdf
torch.nn.TransformerDecoderLayer - Part 4 - Multiple Linear Layers and Normalization
275 views • 1 year ago
This video contains the explanation of Multiple Linear Layers of the torch.nn.TransformerDecoderLayer module. Jupyter Notebook : github.com/abdulsalam-bande/Pytorch-Neural-Network-Modules-Explained.git Transformer Encoder Playlists: ruclips.net/video/oCWFyt2kWLg/видео.html&ab_channel=MachineLearningwithPytorch
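For reference, a minimal usage sketch of the module covered in this mini-series, with made-up shapes:

```python
import torch
import torch.nn as nn

d_model, n_heads = 32, 4
layer = nn.TransformerDecoderLayer(d_model=d_model, nhead=n_heads,
                                   dim_feedforward=64, batch_first=True)

tgt = torch.randn(2, 10, d_model)      # decoder input:  (batch, target length, d_model)
memory = torch.randn(2, 7, d_model)    # encoder output: (batch, source length, d_model)

out = layer(tgt, memory)               # self-attention, cross-attention, then the linear layers
print(out.shape)                       # torch.Size([2, 10, 32])
```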
torch.nn.TransformerDecoderLayer - Part 2 - Embedding, First Multi-Head attention and Normalization
395 views • 1 year ago
This video contains the explanation of the first Multi-head attention of the torch.nn.TransformerDecoderLayer module. Jupyter Notebook : github.com/abdulsalam-bande/Pytorch-Neural-Network-Modules-Explained.git Transformer Encoder Playlists: ruclips.net/video/oCWFyt2kWLg/видео.html&ab_channel=MachineLearningwithPytorch
torch.nn.TransformerDecoderLayer - Part 3 -Multi-Head attention and Normalization
251 views • 1 year ago
torch.nn.Embedding - How embedding weights are updated in Backpropagation
5K views • 1 year ago
Pytorch Backpropagation with Example 03 - Gradient Descent
212 views • 2 years ago
Pytorch Backpropagation With Example 02 - Backpropagation
190 views • 2 years ago
Pytorch Backpropagation With Example 01 - Forward-propagation
385 views • 2 years ago
torch.nn.CosineSimilarity explained and announcement!
853 views • 2 years ago
torch.nn.TransformerEncoderLayer - Part 5 - Transformer Encoder Second Layer Normalization
731 views • 2 years ago
torch.nn.TransformerEncoderLayer - Part 4 - Transformer Encoder Fully Connected Layers
865 views • 2 years ago
torch.nn.TransformerEncoderLayer - Part 3 - Transformer Layer Normalization
1.5K views • 2 years ago
torch.nn.TransformerEncoderLayer - Part 2 - Transformer Self Attention Layer
1.8K views • 3 years ago
torch.nn.TransformerEncoderLayer - Part 1 - Transformer Embedding and Position Encoding Layer
4.3K views • 3 years ago
torch.nn.TransformerEncoderLayer - Part 0 - Module Overview
3.2K views • 3 years ago
Transformer Positional Embeddings With A Numerical Example.
22K views • 3 years ago
Typo in the calculation at 5:06: i should be 0 for the first two sin and cos entries for "boy", but you have it as 1.
The division by 2 relates to splitting the embedding dimensions for sine and cosine computations and is independent of the number of sentences or words.
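For anyone following along with the arithmetic, here is a small worked example of the standard sinusoidal formula with made-up values (pos = 1, d_model = 4), showing how i indexes the sin/cos pairs:

```python
import math

d_model, pos = 4, 1   # made-up example: embedding size 4, token at position 1

# PE(pos, 2i)   = sin(pos / 10000^(2i / d_model))
# PE(pos, 2i+1) = cos(pos / 10000^(2i / d_model))
pe = []
for i in range(d_model // 2):      # i indexes the sin/cos pair, hence the division by 2
    angle = pos / (10000 ** (2 * i / d_model))
    pe.extend([math.sin(angle), math.cos(angle)])

print([round(v, 4) for v in pe])   # [0.8415, 0.5403, 0.01, 1.0]
```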
this is amazing, thanks!!
The first multi-head attention is masked, and the second is regular multi-head attention. I think you forgot to mention that the first one is masked: the decoder is autoregressive, feeding its own output back in as input, and since training has to mimic this inference/testing behaviour, the first attention is attention with a mask. Great videos overall!
Hi, thank you for starting the RL series. Would you also provide code :) and examples?
Of course! I don’t even see the point if there isn’t any code.
Wonderful explanation. Keep it up.
best video, ever
Thanks, I really appreciate your comment.
Thank you for the explanation
Hi, is this the CPU version? If so, where do we get the GPU version, and could you explain it a bit? Thanks
Here I have an observation: in the input data the number of features is 3 and the number of training samples is 2. That's why the input_data matrix's shape is 2 by 3. Also, the number of neurons in the input MLP layer equals the number of features of the input data, i.e. 3. Please correct me if my speculation is wrong. And thanks for the interactive video.
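A quick shape check along the lines of this comment (the output size of 2 is an arbitrary choice for illustration):

```python
import torch
import torch.nn as nn

x = torch.randn(2, 3)       # 2 training samples, 3 features each -> shape (2, 3)

# in_features must equal the 3 input features; out_features (here 2) is a free choice.
layer = nn.Linear(in_features=3, out_features=2)

print(layer.weight.shape)   # torch.Size([2, 3])
print(layer.bias.shape)     # torch.Size([2])
print(layer(x).shape)       # torch.Size([2, 2]) - one output row per sample
```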
I just found your channel and I can't wait to watch all of your videos. This is awesome, thanks!
Thank you
Why is the cosine of 0 equal to 0? I think there is an error.
Thanks a lot! Definitely cleared up a lot of things
Thanks for the details, keep going 👏
goat
Brother your videos are great! Very helpful thank you
All videos are great. Thanks a lot. Can you start a series or some videos on diffusion models, especially class conditioning and latent diffusion models (or other score-matching models)?
Thank you. My focus now is on reinforcement learning. I’m working on very high quality content.
Sorry, I don't understand the lesson. 1. Why does the activation layer for the 3 neurons in the second layer (each of which has three connections from the previous layer) have a 3x2 matrix? 2. Why do you disable the neurons in column 1, rows 2 and 3, and then the neuron in column 2, row 3? If p = 0.5, I assume you should disable two to three neurons at each training step, leaving only one or two neurons enabled.
Thanks for the details. I am learning the Transformer architecture, the attention mechanism, and the math behind them. I am from Moscow, Russia.
maybe a better microphone
Yo bro, you skipped what the model does at test time ~~
Good Explanation !
But in a fully connected layer a 1D vector is passed, so the 3x2 matrix will be flattened out, right? So how are we defining the shape of the weights and bias? It should be (6x1).
Very nice
Got rid of the jargon, straight to the point, great tutorial.
😊😊❤amazing tutorial man
thanks!
Love these hand writing style explanations ❤
Thanks for the video! It really did me a big favor.
The explanation is clear... great job... but the audio is a little bad.
Can you explain the tgt_key_padding_mask parameter in nn.TransformerDecoderLayer?
Great explanation, do you have videos on layer norm, instance norm and group norm?
Thank you for the clear math. I still couldn't get how the word order is preserved, though. Is there any visual representation or mathematical illustration of how the positions are preserved?
This video started off well, but it would have been better if it had shown the implied second line of Python code explicitly.
great job explaining this concept!
Your videos are game-changing for sure, thank you very much. You are a life saver.
Most underrated channel ever
I don't get why we divide by two.
Very good explanation, thanks. Can you do ViT (Vision Transformers) please?
Great job on explaining. Love your content!
I enjoyed this video. I don't know whether you are Hausa, but your voice sounds like a Hausa speaker's... May Allah increase your wisdom.
Excellent! Thank you so much! The best explanation ever for the embedding layer. We can't find it anywhere else on the web!
Thank you! I've been trying to understand that math unsuccessfully for a long time... I've seen lots of videos, but somehow yours explained it best.
Very good videos, in my opinion the best I've ever seen on the math perspective of transformers.
Amazing
May I ask what device you used to record this video?
It’s an iPhone X
Thank you! I didn't understand why the bias isn't of dimension 1 and this sorted it out for me
Tysm :)
Very good man, Keep posting ☺️