- 37 videos
- 126,558 views
Machine Learning with Pytorch
Joined 2 Oct 2014
I’m Abdulsalam Bande, a computer scientist. This channel explains machine learning concepts in PyTorch, with a focus on PyTorch’s built-in functions and practical implementations.
For any collaborations, you can reach me at: annual03froze [at] icloud.com.
Markov Chains
📌 Understanding Markov Chains in Reinforcement Learning
Welcome back to the Reinforcement Learning Series! 🚀 This week, we’re diving into Markov Chains, a fundamental concept for modelling environments in RL.
🔹 What is a Markov Chain?
🔹 The Markov Property & Why It Matters
🔹 Sample Episode Walkthrough
🔹 Bridging to Markov Reward Processes
By the end of this video, you’ll understand how state transitions work and why Markov Chains make RL more efficient. Stay tuned for the next episode, where we introduce Markov Reward Processes (MRPs)!
#ReinforcementLearning #MarkovChains #MachineLearning #RL #DataScience
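To make the idea concrete, here is a minimal sketch (not the code from the video) of a Markov chain with hypothetical states and a hand-picked transition matrix, sampling one episode:

```python
import torch

# Hypothetical states and a hand-picked transition matrix (each row sums to 1).
states = ["Class 1", "Class 2", "Sleep"]
P = torch.tensor([
    [0.5, 0.4, 0.1],   # from Class 1
    [0.2, 0.5, 0.3],   # from Class 2
    [0.0, 0.0, 1.0],   # Sleep is terminal (absorbing)
])

torch.manual_seed(0)
s = 0                          # start in Class 1
episode = [states[s]]
while states[s] != "Sleep":
    # Markov property: the next state depends only on the current state.
    s = torch.distributions.Categorical(probs=P[s]).sample().item()
    episode.append(states[s])
print(" -> ".join(episode))
```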
Views: 44
Videos
Reinforcement Learning: The Decision Learning Problem
71 views • 1 day ago
In this introductory video, we explore the Decision Learning Problem, the foundation of Reinforcement Learning. Using a 1D cleaning robot, we start with states, actions, rewards, and terminal states.
Real-world applications: Robotics, Operations Research, AI
Next Up:
🔹 Markov Chains & MDPs
🔹 The Bellman Equation
📂 Jupyter Notebook: github.com/abdulsalam-bande/Pytorch-Neural-Network-Modules-Expl...
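As a rough illustration of these ingredients (states, actions, rewards, terminal states), here is a minimal sketch of a 1D grid environment; the class name, reward values and layout are made up for illustration and are not necessarily those used in the notebook:

```python
# Sketch of a 1D "cleaning robot" environment: states are cells 0..4,
# actions are move left / move right, and the rightmost cell is terminal.
class LineWorld:
    def __init__(self, n_cells=5):
        self.n_cells = n_cells
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):                       # action: 0 = left, 1 = right
        move = 1 if action == 1 else -1
        self.state = min(max(self.state + move, 0), self.n_cells - 1)
        done = self.state == self.n_cells - 1     # terminal state reached
        reward = 1.0 if done else -0.1            # hypothetical reward scheme
        return self.state, reward, done

env = LineWorld()
state, done = env.reset(), False
while not done:
    state, reward, done = env.step(1)             # a fixed "always move right" policy
    print(state, reward, done)
```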
Introduction
122 views • 2 months ago
Welcome to Part 1 of my Reinforcement Learning (RL) series! In this series, I’ll cover the foundational concepts of RL, from the Decision Learning Problem to key principles like Markov Processes, Dynamic Programming, and essential RL algorithms. My goal is to make these topics easy to follow with practical examples that bring each concept to life.
torch.nn.LayerNorm Explained
497 views • 2 months ago
This video explains how LayerNorm works and how PyTorch handles the dimensions. Unlike BatchNorm, which relies on statistics computed across the batch, LayerNorm normalizes the features of each sample. Having a good understanding of the dimensions really helps in understanding the neural network. Jupyter Notebook: github.com/abdulsalam-bande/Pytorch-Neural-Network-Modules-Explained/blob/main/torch.nn....
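A quick shape-level sketch of the behaviour described here, with made-up tensor sizes:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(2, 5, 8)    # (batch, sequence, features) with made-up sizes

# LayerNorm normalizes over the trailing dimension(s) given by normalized_shape,
# independently for every sample and every position in the sequence.
layer_norm = nn.LayerNorm(normalized_shape=8)
y = layer_norm(x)

print(y.shape)                                       # torch.Size([2, 5, 8])
print(y[0, 0].mean(), y[0, 0].std(unbiased=False))   # roughly 0 and 1 per feature vector
```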
torch.nn.BatchNorm2d Explained
388 views • 2 months ago
This video explains how BatchNorm2d works and how PyTorch handles the dimensions. Having a good understanding of the dimensions really helps in understanding the neural network. Jupyter Notebook: github.com/abdulsalam-bande/Pytorch-Neural-Network-Modules-Explained/blob/main/torch.nn.BatchNorm2d.ipynb Full Directory: github.com/abdulsalam-bande/Pytorch-Neural-Network-Modules...
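A minimal shape-level sketch, with made-up tensor sizes:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(4, 3, 8, 8)    # (N, C, H, W) with made-up sizes

# BatchNorm2d keeps one mean/variance (and gamma/beta) per channel C,
# computed over the N, H and W dimensions.
bn = nn.BatchNorm2d(num_features=3)
y = bn(x)

print(y.shape)                                        # torch.Size([4, 3, 8, 8])
print(y[:, 0].mean(), y[:, 0].std(unbiased=False))    # roughly 0 and 1 for channel 0
```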
torch.distributions.poisson.Poisson - Poisson Distribution Guided Synthetic Data Generation
117 views • 10 months ago
This video shows how the Poisson distribution can be leveraged to enhance the quality of synthetic data. PyTorch Poisson: pytorch.org/docs/stable/distributions.html#poisson Sampling: stats.stackexchange.com/questions/551568/sampling-from-a-poisson-distribution-infinite-support Jupyter Notebook: github.com/abdulsalam-bande/Pytorch-Neural-Network-Modules-Explained/blob/main/torch.distributions.po...
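A small sketch of sampling from torch.distributions.Poisson (the rate and sample size are arbitrary, and this is not the notebook's synthetic-data pipeline):

```python
import torch
from torch.distributions import Poisson

torch.manual_seed(0)

# Sample hypothetical count data (e.g. events per time window) with rate lambda = 4.
poisson = Poisson(rate=torch.tensor(4.0))
counts = poisson.sample((1000,))

print(counts[:10])     # a few integer-valued samples
print(counts.mean())   # close to the rate, roughly 4
```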
GPT: A Technical Training Unveiled #7 - Final Linear Layer and Softmax
111 views • 1 year ago
Linear Layer: ruclips.net/video/QpyXyenmtTA/видео.html Notebook: github.com/abdulsalam-bande/Pytorch-Neural-Network-Modules-Explained/blob/main/Mini Gpt Pretraining.ipynb Presentation: github.com/abdulsalam-bande/Pytorch-Neural-Network-Modules-Explained/blob/main/Mini Gpt.pdf
GPT: A Technical Training Unveiled #6 - Block Two of the Transformer Decoder
67 views • 1 year ago
This is the second block (layer 2), which repeats the Masked Multihead Attention and feedforward layers. Linear Layer: ruclips.net/video/QpyXyenmtTA/видео.html Layer Normalization: ruclips.net/video/G45TuC6zRf4/видео.html Notebook: github.com/abdulsalam-bande/Pytorch-Neural-Network-Modules-Explained/blob/main/Mini Gpt Pretraining.ipynb Presentation: github.com/abdulsalam-bande/Pytorch-Neural-Networ...
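A rough sketch of what "repeating the block" means, using hypothetical sizes and a simplified block definition rather than the exact notebook code:

```python
import torch
import torch.nn as nn

class DecoderBlock(nn.Module):
    """One GPT-style block: masked multi-head self-attention + feedforward, each with Add & Norm."""
    def __init__(self, d_model=64, n_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ff = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.ReLU(),
                                nn.Linear(4 * d_model, d_model))
        self.norm1, self.norm2 = nn.LayerNorm(d_model), nn.LayerNorm(d_model)

    def forward(self, x):
        T = x.size(1)
        mask = torch.triu(torch.ones(T, T, dtype=torch.bool), diagonal=1)  # hide future tokens
        attn_out, _ = self.attn(x, x, x, attn_mask=mask)
        x = self.norm1(x + attn_out)        # Add & Norm around attention
        x = self.norm2(x + self.ff(x))      # Add & Norm around the feedforward layer
        return x

# "Block two" is simply a second copy of the same block stacked on top of the first.
blocks = nn.Sequential(DecoderBlock(), DecoderBlock())
print(blocks(torch.randn(2, 10, 64)).shape)   # (batch, seq_len, d_model) -> torch.Size([2, 10, 64])
```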
GPT: A Technical Training Unveiled #5 - Feedforward, Add & Norm
85 views • 1 year ago
After the attention outputs for each head are computed, they are concatenated and then passed through a feedforward network. The Add and Norm steps involve adding the original input to the output of the attention or feedforward networks (a form of residual connection) and then normalizing the result. This helps in stabilizing the activations and aids in training deeper models. Linear Layer: ruc...
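In code form, the residual-plus-normalization pattern described here looks roughly like this (a sketch only: the attention block is replaced by a stand-in linear layer and all sizes are made up):

```python
import torch
import torch.nn as nn

d_model = 64
norm1, norm2 = nn.LayerNorm(d_model), nn.LayerNorm(d_model)
attn_block = nn.Linear(d_model, d_model)   # stand-in for the concatenated attention output
ffn = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.ReLU(),
                    nn.Linear(4 * d_model, d_model))

x = torch.randn(2, 10, d_model)            # (batch, seq_len, d_model)
x = norm1(x + attn_block(x))               # residual connection around attention, then LayerNorm
x = norm2(x + ffn(x))                      # residual connection around the feedforward network, then LayerNorm
print(x.shape)                             # torch.Size([2, 10, 64])
```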
GPT: A Technical Training Unveiled #4 - Masked Multihead Attention
161 views • 1 year ago
Detailed exposition of the attention mechanism with an example of key, query, and value matrices in transformer neural networks. The Multihead Attention mechanism allows the model to focus on different parts of the input sequence when producing an output sequence. The mechanism works by producing multiple sets (or "heads") of key, query, and value projections, and then combining them. For our s...
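A minimal single-head sketch of masked (causal) scaled dot-product attention with tiny, made-up matrices; multi-head attention repeats this with several independent key/query/value projections and concatenates the results:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
T, d_k = 4, 8                     # made-up sequence length and head size
Q, K, V = torch.randn(T, d_k), torch.randn(T, d_k), torch.randn(T, d_k)

scores = Q @ K.T / d_k ** 0.5     # scaled dot-product similarity between queries and keys
causal_mask = torch.triu(torch.ones(T, T, dtype=torch.bool), diagonal=1)
scores = scores.masked_fill(causal_mask, float("-inf"))   # hide future positions
weights = F.softmax(scores, dim=-1)                       # each row sums to 1
out = weights @ V                                         # weighted sum of values (one head)

print(weights)                    # upper triangle is zero: no attention to the future
print(out.shape)                  # torch.Size([4, 8])
```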
GPT: A Technical Training Unveiled #3 - Embedding and Positional Encoding
186 views • 1 year ago
Explanation of token embeddings and positional encodings in transformer models, showcasing their significance in AI training. Embeddings are a way of representing categorical data, like words or characters, as continuous vectors. So each character is embedded into a continuous vector space using an embedding layer. Positional encodings are added to give the model information about the relative ...
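A small sketch of both steps with made-up sizes: a learnable token embedding plus fixed sinusoidal positional encodings added on top:

```python
import math
import torch
import torch.nn as nn

vocab_size, d_model, seq_len = 65, 16, 10      # made-up sizes
tok_emb = nn.Embedding(vocab_size, d_model)    # one learnable vector per token id

# Fixed sinusoidal positional encodings: even columns use sin, odd columns use cos.
position = torch.arange(seq_len).unsqueeze(1).float()
div_term = torch.exp(torch.arange(0, d_model, 2).float() * (-math.log(10000.0) / d_model))
pe = torch.zeros(seq_len, d_model)
pe[:, 0::2] = torch.sin(position * div_term)
pe[:, 1::2] = torch.cos(position * div_term)

tokens = torch.randint(0, vocab_size, (1, seq_len))   # a batch with one sequence of token ids
x = tok_emb(tokens) + pe                              # (1, seq_len, d_model) + (seq_len, d_model)
print(x.shape)                                        # torch.Size([1, 10, 16])
```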
GPT: A Technical Training Unveiled #2 - Tokenization
107 views • 1 year ago
A demonstration of the tokenization process, detailing the conversion of text to tokens using character sets in language models. Tokenization is the process of converting a sequence of characters into a sequence of tokens. For example, given a small text dataset, every unique character in the text is treated as a token, leading to a vocabulary of unique characters. Wikipedia: en.wikipedia.org/wiki...
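A minimal character-level tokenizer sketch on a made-up string (not the dataset used in the video):

```python
# Character-level tokenization: every unique character becomes a token id.
text = "hello world"                       # tiny made-up corpus
vocab = sorted(set(text))                  # unique characters form the vocabulary
stoi = {ch: i for i, ch in enumerate(vocab)}
itos = {i: ch for ch, i in stoi.items()}

encode = lambda s: [stoi[c] for c in s]    # text -> list of token ids
decode = lambda ids: "".join(itos[i] for i in ids)

ids = encode("hello")
print(vocab)          # [' ', 'd', 'e', 'h', 'l', 'o', 'r', 'w']
print(ids)            # [3, 2, 4, 4, 5]
print(decode(ids))    # 'hello'
```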
GPT: A Technical Training Unveiled #1 - Introduction
173 views • 1 year ago
Andrej Karpathy Video: ruclips.net/video/kCc8FmEb1nY/видео.html Wikipedia: en.wikipedia.org/wiki/Generative_pre-trained_transformer Notebook: github.com/abdulsalam-bande/Pytorch-Neural-Network-Modules-Explained/blob/main/Mini Gpt Pretraining.ipynb Presentation: github.com/abdulsalam-bande/Pytorch-Neural-Network-Modules-Explained/blob/main/Mini Gpt.pdf
torch.nn.TransformerDecoderLayer - Part 4 - Multiple Linear Layers and Normalization
275 views • 1 year ago
This video contains the explanation of Multiple Linear Layers of the torch.nn.TransformerDecoderLayer module. Jupyter Notebook : github.com/abdulsalam-bande/Pytorch-Neural-Network-Modules-Explained.git Transformer Encoder Playlists: ruclips.net/video/oCWFyt2kWLg/видео.html&ab_channel=MachineLearningwithPytorch
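For reference, a minimal usage sketch of the module covered in this mini-series, with made-up shapes:

```python
import torch
import torch.nn as nn

d_model, n_heads = 32, 4
layer = nn.TransformerDecoderLayer(d_model=d_model, nhead=n_heads,
                                   dim_feedforward=64, batch_first=True)

tgt = torch.randn(2, 10, d_model)      # decoder input:  (batch, target length, d_model)
memory = torch.randn(2, 7, d_model)    # encoder output: (batch, source length, d_model)

out = layer(tgt, memory)               # self-attention, cross-attention, then the linear layers
print(out.shape)                       # torch.Size([2, 10, 32])
```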
torch.nn.TransformerDecoderLayer - Part 2 - Embedding, First Multi-Head attention and Normalization
395 views • 1 year ago
This video contains the explanation of the first Multi-head attention of the torch.nn.TransformerDecoderLayer module. Jupyter Notebook : github.com/abdulsalam-bande/Pytorch-Neural-Network-Modules-Explained.git Transformer Encoder Playlists: ruclips.net/video/oCWFyt2kWLg/видео.html&ab_channel=MachineLearningwithPytorch
torch.nn.TransformerDecoderLayer - Part 3 -Multi-Head attention and Normalization
251 views • 1 year ago
torch.nn.Embedding - How embedding weights are updated in Backpropagation
5K views • 1 year ago
Pytorch Backpropagation with Example 03 - Gradient Descent
212 views • 2 years ago
Pytorch Backpropagation With Example 02 - Backpropagation
190 views • 2 years ago
Pytorch Backpropagation With Example 01 - Forward-propagation
385 views • 2 years ago
torch.nn.CosineSimilarity explained and announcement!
853 views • 2 years ago
torch.nn.TransformerEncoderLayer - Part 5 - Transformer Encoder Second Layer Normalization
731 views • 2 years ago
torch.nn.TransformerEncoderLayer - Part 4 - Transformer Encoder Fully Connected Layers
865 views • 2 years ago
torch.nn.TransformerEncoderLayer - Part 3 - Transformer Layer Normalization
1.5K views • 2 years ago
torch.nn.TransformerEncoderLayer - Part 2 - Transformer Self Attention Layer
1.8K views • 3 years ago
torch.nn.TransformerEncoderLayer - Part 1 - Transformer Embedding and Position Encoding Layer
4.3K views • 3 years ago
torch.nn.TransformerEncoderLayer - Part 0 - Module Overview
3.2K views • 3 years ago
Transformer Positional Embeddings With A Numerical Example.
22K views • 3 years ago
Typo in the calculation at 5:06: i should be 0 for the first two sin and cos entries for "boy", but you have it as 1.
The division by 2 relates to splitting the embedding dimensions for sine and cosine computations and is independent of the number of sentences or words.
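For anyone following along with the arithmetic, here is a small worked example of the standard sinusoidal formula with made-up values (pos = 1, d_model = 4), showing how i indexes the sin/cos pairs:

```python
import math

d_model, pos = 4, 1   # made-up example: embedding size 4, token at position 1

# PE(pos, 2i)   = sin(pos / 10000^(2i / d_model))
# PE(pos, 2i+1) = cos(pos / 10000^(2i / d_model))
pe = []
for i in range(d_model // 2):      # i indexes the sin/cos pair, hence the division by 2
    angle = pos / (10000 ** (2 * i / d_model))
    pe.extend([math.sin(angle), math.cos(angle)])

print([round(v, 4) for v in pe])   # [0.8415, 0.5403, 0.01, 1.0]
```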
this is amazing, thanks!!
The first multi-head attention is masked, and the second is regular multi-head attention. I think you forgot to mention that the first one is masked: the decoder is autoregressive, feeding its own output back in as input, and since training has to mimic this inference/testing behaviour, the first attention is attention with a mask. Great videos overall!
Hi, thank you for starting the RL series. Would you also provide code :) and examples?
Of course! I don’t even see the point if there isn’t any code.
Wonderful explanation. Keep it up.
best video, ever
Thanks, I really appreciate your comment.
Thank you for the explanation
Hi, is this the CPU version? If so, where do we get the GPU version, and could you explain it a bit? Thanks
Here I have an observation: in the input data the number of features is 3 and the number of training samples is 2. That's why the input_data matrix's shape is 2 by 3. Also, the number of neurons in the input MLP layer equals the number of features of the input data, i.e. 3. Please correct me if my speculation is wrong. And thanks for the interactive video.
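A quick shape check along the lines of this comment (the output size of 2 is an arbitrary choice for illustration):

```python
import torch
import torch.nn as nn

x = torch.randn(2, 3)       # 2 training samples, 3 features each -> shape (2, 3)

# in_features must equal the 3 input features; out_features (here 2) is a free choice.
layer = nn.Linear(in_features=3, out_features=2)

print(layer.weight.shape)   # torch.Size([2, 3])
print(layer.bias.shape)     # torch.Size([2])
print(layer(x).shape)       # torch.Size([2, 2]) - one output row per sample
```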
I just found your channel and I can't wait to watch all of your videos. This is awesome, thanks!
Thank you
Why is the cosine of 0 equal to 0? I think there is an error.
Thanks a lot! Definitely cleared up a lot of things
Thanks for the details, keep going 👏
goat
Brother your videos are great! Very helpful thank you
All videos are great. Thanks a lot. Can you start a series or some videos on diffusion models, especially class conditioning and latent diffusion models (or other score-matching models)?
Thank you. My focus now is on reinforcement learning. I’m working on very high quality content.
Sorry, I don't understand the lesson. 1. Why does the activation layer for the 3 neurons in the second layer (each of which has three connections from the previous layer) have a 3x2 matrix? 2. Why do you disable the neurons in column 1, rows 2 and 3, and then the neuron in column 2, row 3? If p = 0.5, I assume you should disable two to three neurons at each training step, leaving only one or two neurons enabled.
Thanks for the details. I am learning the Transformer architecture, the attention mechanism, and the math behind them. I am from Moscow, Russia.
maybe a better microphone
Yo bro, you skipped what the model does at test time ~~
Good Explanation !
But in a fully connected layer a 1D vector is passed, so the 3x2 matrix will be flattened out, right? So how are we defining the shape of the weights and bias? It should be (6x1).
Very nice
Got rid of the jargon, straight to the point, great tutorial.
😊😊❤amazing tutorial man
thanks!
Love these hand writing style explanations ❤
Thanks for the video! It really did me a big favor.
The explanation is clear... great job... but the audio is a little bad.
Can you explain the tgt_key_padding_mask parameter in nn.TransformerDecoderLayer?
Great explanation, do you have videos on layer norm, instance norm and group norm?
Thank you for the clear math. I still couldn't get how the word order is preserved, though. Is there any visual representation or mathematical illustration of how the positions are preserved?
This video started off well, but it would have been better if it had shown the implied second line of Python code explicitly.
great job explaining this concept!
Your videos are game-changing for sure, thank you very much. You are a life saver.
Most underrated channel ever
I don't get why we divide by two.
Very good explanation, thanks. Can you do ViT (Vision Transformers) please?
Great job on explaining. Love your content!
I enjoyed this video. I don't know whether you are Hausa, but your voice sounds like a Hausa speaker's... May Allah increase your wisdom.
Excellent! Thank you so much! The best explanation ever for the embedding layer. We can't find it anywhere else on the web!
Thank you! I've been trying to understand that math unsuccessfully for a long time... I've seen lots of videos, but somehow yours explained it best.
Very good videos, in my opinion the best I've ever seen on the math perspective of transformers.
Amazing
May I ask what device you used to record this video?
It’s an iPhone X
Thank you! I didn't understand why the bias isn't of dimension 1 and this sorted it out for me
Tysm :)
Very good man, Keep posting ☺️