- 128 videos
- 38,249 views
meanxai
South Korea
Joined 14 Jul 2023
This channel covers AI-related technologies. We will discuss the mathematical theory of AI algorithms and implement practical models using Python, TensorFlow, etc.
The source code can be found at:
(but the slides are private)
github.com/meanxai/machine_learning
github.com/meanxai/deep_learning
The topics we will cover are:
1. Machine Learning
2. Deep Learning
3. Recommendation System
4. Natural Language Processing
5. Reinforcement Learning
All videos are produced in Korean and translated into English, and the audio is generated by AI text-to-speech, so there may be some grammatical errors or awkward expressions.
[MXDL-13-03] Autoencoder [3/6] - Sparse autoencoder
In this video, we will look at a sparse autoencoder.
First, let's take a look at an overview of a sparse autoencoder and how to impose a sparsity constraint on an autoencoder.
Next, let's look at the L1 activity regularization and KL divergence regularization for a sparsity constraint.
Finally, let's implement a sparse autoencoder model to detect edges of MNIST images.
#Autoencoder #SparseAutoencoder #EdgeDetection #KLDivergenceRegularization
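For reference, a minimal Keras sketch of the L1 activity regularization approach described above (the layer sizes and the factor 1e-4 are assumptions, not the code from the video, which is in the GitHub repository):

# Sparse autoencoder sketch: the L1 activity regularizer pushes most hidden
# activations toward zero, which imposes the sparsity constraint.
from tensorflow.keras import layers, models, regularizers

x_in = layers.Input(shape=(784,))                  # flattened 28x28 MNIST image
code = layers.Dense(256, activation='sigmoid',
                    activity_regularizer=regularizers.l1(1e-4))(x_in)
x_out = layers.Dense(784, activation='sigmoid')(code)

model = models.Model(x_in, x_out)
model.compile(optimizer='adam', loss='mse')
# model.fit(x_train, x_train, epochs=10, batch_size=128)  # x_train scaled to [0, 1]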
Views: 46
Videos
[MXDL-13-02] Autoencoder [2/6] - Denoising, Deblurring Autoencoder
70 views · 21 hours ago
In this video, we will look at noise and blur removal using CNN-autoencoders and LSTM-autoencoders. Contents: 1. Structure of an autoencoder model for noise reduction. 2. Denoising Fashion MNIST images using a CNN-autoencoder. 3. Deblurring Fashion MNIST images using a CNN-autoencoder. 4. Denoising time series using an LSTM-autoencoder. #Autoencoder #NoiseReduction #CNNAutoencoder #LSTMAutoencoder
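As a rough sketch of the denoising setup described above (the architecture below is an assumption, not the video's exact model), the autoencoder is simply trained to map noisy images back to the clean originals:

# Minimal CNN denoising autoencoder sketch (assumed filter counts).
import numpy as np
from tensorflow.keras import layers, models

x_in = layers.Input(shape=(28, 28, 1))
h = layers.Conv2D(32, 3, activation='relu', padding='same')(x_in)
h = layers.MaxPooling2D(2)(h)                        # 28x28 -> 14x14
h = layers.Conv2D(32, 3, activation='relu', padding='same')(h)
h = layers.UpSampling2D(2)(h)                        # 14x14 -> 28x28
x_out = layers.Conv2D(1, 3, activation='sigmoid', padding='same')(h)

model = models.Model(x_in, x_out)
model.compile(optimizer='adam', loss='mse')

# x_clean: (n, 28, 28, 1) images scaled to [0, 1]; corrupt them for the input:
# x_noisy = np.clip(x_clean + 0.3 * np.random.randn(*x_clean.shape), 0.0, 1.0)
# model.fit(x_noisy, x_clean, epochs=10, batch_size=128)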
[MXDL-13-01] Autoencoder [1/6] - Dimensionality reduction
128 views · 14 days ago
In this series, we will look at autoencoders. In this video, we will look at the basics of autoencoders and dimensionality reduction using autoencoder models. Let's look at the full table of contents for this series. In Chapter 1, we will look at the basic concepts of autoencoders. In Chapter 2, we will look at dimensionality reduction using autoencoders. We will build fully connected autoencod...
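For orientation, a fully connected autoencoder that compresses 784-dimensional MNIST images to a 2-dimensional code might be sketched as follows (the layer sizes are assumptions; the video's actual code is in the GitHub repository):

# Fully connected autoencoder sketch for dimensionality reduction (assumed sizes).
from tensorflow.keras import layers, models

x_in = layers.Input(shape=(784,))
h = layers.Dense(128, activation='relu')(x_in)
z = layers.Dense(2, name='code')(h)                  # 2-D latent representation
h = layers.Dense(128, activation='relu')(z)
x_out = layers.Dense(784, activation='sigmoid')(h)

autoencoder = models.Model(x_in, x_out)
encoder = models.Model(x_in, z)                      # extracts the 2-D codes after training
autoencoder.compile(optimizer='adam', loss='mse')
# autoencoder.fit(x_train, x_train, epochs=20, batch_size=128)
# codes = encoder.predict(x_test)                    # (n, 2) points for visualization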
[MXDL-12-06] Convolutional Neural Networks (CNN) [6/6] - Convolutional LSTM
90 views · 21 days ago
In this video, we will look at the Convolutional LSTM. Convolutional LSTM is a type of neural network that combines the Convolutional Neural Network and the Long Short-Term Memory model. It is designed to learn from videos (i.e., image sequences), which are spatiotemporal data that contain both spatial and temporal dependencies. In the final chapter of the CNN series, we will look at the basic ...
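As a small illustrative sketch (the shapes and layer sizes below are made up, not the video's model), a ConvLSTM layer in Keras consumes a sequence of images and keeps both spatial and temporal structure:

# ConvLSTM sketch: classify short image sequences (assumed shapes).
from tensorflow.keras import layers, models

x_in = layers.Input(shape=(10, 64, 64, 1))           # 10 frames of 64x64 grayscale images
h = layers.ConvLSTM2D(16, kernel_size=3, padding='same',
                      return_sequences=False)(x_in)  # convolutional gates + LSTM memory
h = layers.GlobalAveragePooling2D()(h)
y = layers.Dense(5, activation='softmax')(h)         # e.g. 5 action classes

model = models.Model(x_in, y)
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
model.summary()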
[MXDL-12-05] Convolutional Neural Networks (CNN) [5/6] - 3D Convolution
97 views · 21 days ago
In this video, we will look at 3D convolution. 3D convolution can be used for 3D image slices, such as medical imaging, or for videos, such as action recognition data. In the last video, we looked at the Residual Neural Network using 2D convolution. In this video, we will look at the 3D convolution process and use it to build a 3D Residual Neural Network to classify 3D MNIST images. #Convoluti...
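A small shape-oriented sketch of the 3D convolution operation (the voxel grid size and filter counts are assumptions, not the video's 3D ResNet):

# 3D convolution sketch for volumetric data (assumed shapes).
from tensorflow.keras import layers, models

x_in = layers.Input(shape=(16, 16, 16, 1))           # a single-channel voxel grid
h = layers.Conv3D(32, kernel_size=3, activation='relu', padding='same')(x_in)
h = layers.MaxPooling3D(pool_size=2)(h)              # 16^3 -> 8^3
h = layers.Conv3D(64, kernel_size=3, activation='relu', padding='same')(h)
h = layers.GlobalAveragePooling3D()(h)
y = layers.Dense(10, activation='softmax')(h)        # 10 digit classes

model = models.Model(x_in, y)
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')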
[MXDL-12-04] Convolutional Neural Networks (CNN) [4/6] - Residual Neural Network
69 views · 1 month ago
Before moving on to 3D convolutions, let's look at the residual neural network, which can be used to build deeper CNNs. Let's look at the overall architecture for a Residual Network. And let's look at the structure of a small-scale Residual Network for the CIFAR10 dataset. Finally, let's implement a small ResNet model to classify CIFAR10 images. In 2015, Kaiming He et al. proposed the Residual ...
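The key idea of the residual block is the identity shortcut: the block learns a residual F(x) and outputs F(x) + x. A minimal Keras sketch (filter counts are assumptions, not the video's exact ResNet):

# Residual block sketch: output = F(x) + x (assumed sizes).
from tensorflow.keras import layers, models

def residual_block(x, filters):
    h = layers.Conv2D(filters, 3, padding='same')(x)     # F(x): two 3x3 convolutions
    h = layers.BatchNormalization()(h)
    h = layers.Activation('relu')(h)
    h = layers.Conv2D(filters, 3, padding='same')(h)
    h = layers.BatchNormalization()(h)
    h = layers.Add()([h, x])                             # identity shortcut
    return layers.Activation('relu')(h)

x_in = layers.Input(shape=(32, 32, 3))                   # a CIFAR10-sized image
h = layers.Conv2D(16, 3, padding='same', activation='relu')(x_in)
h = residual_block(h, 16)
h = residual_block(h, 16)
h = layers.GlobalAveragePooling2D()(h)
y = layers.Dense(10, activation='softmax')(h)

model = models.Model(x_in, y)
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')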
[MXDL-12-03] Convolutional Neural Networks (CNN) [3/6] - 2D convolutional layer and 2D pooling layer
115 views · 1 month ago
In this video, we will take a closer look at a CNN model using 2D convolutional layers. Let's take a look at how the two-dimensional convolutional layer works and at the structure of its input and output. We will also look at a two-dimensional pooling layer, and then use these layers to classify the CIFAR10 images. 2D convolution is typically ...
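To make the input and output shapes concrete, here is a small 2D CNN sketch for CIFAR10 (layer sizes are assumptions); model.summary() prints how each Conv2D and MaxPooling2D layer changes the (height, width, channels) shape:

# Small 2D CNN sketch for CIFAR10 (assumed layer sizes).
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(32, 32, 3)),                          # a CIFAR10 image
    layers.Conv2D(32, 3, activation='relu', padding='same'),  # -> (32, 32, 32)
    layers.MaxPooling2D(2),                                   # -> (16, 16, 32)
    layers.Conv2D(64, 3, activation='relu', padding='same'),  # -> (16, 16, 64)
    layers.MaxPooling2D(2),                                   # -> (8, 8, 64)
    layers.Flatten(),
    layers.Dense(10, activation='softmax'),
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.summary()                                               # shows every output shape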
[MXDL-12-02] Convolutional Neural Networks (CNN) [2/6] - 1D convolutional layer and 1D pooling layer
193 views · 1 month ago
In this video, we will look at 1D convolution. 1D convolution is typically used for sequence data such as time series and natural language data, and the filters slide in one dimension. Let's take a look at how the one-dimensional convolutional layer works. Let's take a look at the structure of the input and output of a 1D convolutional layer. And we will also look at a one-dimensional pooling l...
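For reference, a shape-oriented sketch of a 1D convolutional layer and a 1D pooling layer applied to a univariate sequence (the lengths and filter counts are assumptions):

# 1D convolution sketch for sequence data (assumed shapes).
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(100, 1)),                                         # 100 time steps, 1 feature
    layers.Conv1D(16, kernel_size=5, activation='relu', padding='same'),  # -> (100, 16)
    layers.MaxPooling1D(pool_size=2),                                     # -> (50, 16)
    layers.Conv1D(32, kernel_size=5, activation='relu', padding='same'),  # -> (50, 32)
    layers.GlobalAveragePooling1D(),                                      # -> (32,)
    layers.Dense(1),                                                      # one regression target
])
model.compile(optimizer='adam', loss='mse')
model.summary()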
[MXDL-12-01] Convolutional Neural Networks (CNN) [1/6] - The basics of CNN
249 views · 1 month ago
In this series, we will look at Convolutional Neural Networks, commonly referred to as CNNs. CNN is a specialized type of neural network designed to recognize images. It is widely used in computer vision tasks such as image classification, object detection, and image segmentation. In this video, we will look at the basics of convolutional neural networks. Let's look at the full table of content...
[MXDL-11-07] Attention Networks [7/7] - Stock price prediction using a Transformer model
422 views · 1 month ago
Before I end this series, I would like to write some code to predict stock prices using the Transformer model. Since stock prices are also time series, we can apply all the Seq2Seq, Attention, and Transformer models we have looked at in this series to try to predict stock prices. In this video, we will use the Transformer model to predict stock prices. Stock prices are difficult to predict...
[MXDL-11-06] Attention Networks [6/7] - Time series forecasting using a Transformer model
191 views · 1 month ago
In the last two chapters, we predicted time series using sequence-to-sequence based models. In this video, we will predict time series using a transformer model that only uses attention, rather than a sequence model. Instead of writing our own transformer code from scratch, we'll use the code posted on this site, github.com/suyash/transformer Since this code is for natural language processing, ...
[MXDL-11-05] Attention Networks [5/7] - Transformer model
161 views · 2 months ago
In the previous chapter, we looked at a sequence-to-sequence based attention model. In this video, we'll look at a Transformer model that uses only attention, rather than a sequence model. Eight data scientists working at Google published a groundbreaking research paper in the field of natural language processing in 2017 called “Attention is all you need.” This is the Transformer model. Transfo...
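At the core of the Transformer is scaled dot-product attention, Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V. A tiny NumPy sketch of just this equation (random toy matrices, not the paper's full multi-head layer):

# Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V
import numpy as np

def attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                        # query-key similarities
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)                     # softmax over the keys
    return w @ V                                           # weighted sum of the values

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))   # 4 queries of dimension d_k = 8
K = rng.normal(size=(6, 8))   # 6 keys
V = rng.normal(size=(6, 8))   # 6 values
print(attention(Q, K, V).shape)   # (4, 8)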
[MXDL-11-04] Attention Networks [4/7] - Seq2Seq-Attention model using input-feeding method
196 views · 2 months ago
In the last video, we implemented a simple Seq2Seq-Attention model to predict time series. In this video, we will add a feature called input-feeding method to the existing Seq2Seq-Attention model. The input-feeding approach is presented in section 3.3 of Luong's 2015 paper. In the existing Attention model we looked at in the previous video, Attention decisions are made independently, which is s...
[MXDL-11-03] Attention Networks [3/7] - Seq2Seq-Attention model for time series prediction
142 views · 2 months ago
In the last video, we implemented a sequence-to-sequence model to predict time series. In this video, we're going to add a feature called Attention to our sequence-to-sequence model. Let's take a look at the architecture of the Seq2Seq-Attention model and how to find attention scores and attention values. And let's implement this model with Keras and predict a time series. There are many papers...
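The attention scores and attention values mentioned above can be illustrated in a few lines of NumPy (toy shapes, dot-product score assumed): the decoder state is compared against every encoder state, the scores are softmaxed into weights, and the context vector is the weighted sum of the encoder states.

# Toy dot-product attention for one decoder step (assumed shapes).
import numpy as np

rng = np.random.default_rng(0)
enc_states = rng.normal(size=(10, 32))       # encoder outputs: 10 time steps, 32 units
dec_state = rng.normal(size=(32,))           # current decoder hidden state

scores = enc_states @ dec_state              # attention scores, shape (10,)
weights = np.exp(scores - scores.max())
weights /= weights.sum()                     # softmax -> attention weights
context = weights @ enc_states               # attention value (context vector), shape (32,)
print(weights.round(3), context.shape)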
[MXDL-11-02] Attention Networks [2/7] - Implementing a Seq2Seq model for time series forecasting
155 views · 2 months ago
In the last video, we looked at how a sequence-to-sequence model works and how to create a dataset for time series prediction. In this video, we will implement this model using Keras and predict a time series. #AttentionNetworks #Seq2Seq #SequenceToSequence #TeacherForcing #TimeSeriesForecasting
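For orientation only (the full code is in the GitHub repository), a bare-bones Keras Seq2Seq model with teacher forcing for time series could look roughly like this; the window lengths and unit counts are assumptions:

# Bare-bones Seq2Seq sketch with teacher forcing (assumed shapes).
from tensorflow.keras import layers, models

n_in, n_out, n_feat, units = 20, 5, 1, 64    # input window, output window, features, LSTM units

enc_in = layers.Input(shape=(n_in, n_feat))
_, h, c = layers.LSTM(units, return_state=True)(enc_in)       # keep only the final states

dec_in = layers.Input(shape=(n_out, n_feat))                  # teacher forcing: shifted targets
dec_seq = layers.LSTM(units, return_sequences=True)(dec_in, initial_state=[h, c])
dec_out = layers.TimeDistributed(layers.Dense(n_feat))(dec_seq)

model = models.Model([enc_in, dec_in], dec_out)
model.compile(optimizer='adam', loss='mse')
# Train with model.fit([x_enc, y_shifted], y_target, ...), where y_shifted is the
# target sequence delayed by one step (the teacher-forcing decoder input).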
[MXDL-11-01] Attention Networks [1/7] - Sequence-to-Sequence Networks (Seq2Seq)
195 views · 2 months ago
[MXDL-10-08] Recurrent Neural Networks (RNN) [8/8] - Multi-layer and Bi-directional RNN
244 views · 2 months ago
[MXDL-10-07] Recurrent Neural Networks (RNN) [7/8] - Gated Recurrent Unit (GRU)
143 views · 2 months ago
[MXDL-10-06] Recurrent Neural Networks (RNN) [6/8] - Peephole LSTM models and time series forecasting
117 views · 3 months ago
[MXDL-10-05] Recurrent Neural Networks (RNN) [5/8] - Build LSTM models for time series forecasting
225 views · 3 months ago
[MXDL-10-04] Recurrent Neural Networks (RNN) [4/8] - Long Short-Term Memory (LSTM)
138 views · 3 months ago
[MXDL-10-03] Recurrent Neural Networks (RNN) [3/8] - Build RNN models for time series forecasting
156 views · 3 months ago
[MXDL-10-02] Recurrent Neural Networks (RNN) [2/8] - Backpropagation Through Time (BPTT)
213 views · 3 months ago
[MXDL-10-01] Recurrent Neural Networks (RNN) [1/8] - Basics of RNNs and their data structures.
386 views · 3 months ago
[MXDL-9-01] Highway Networks [1/1] - Shortcut connections, implementing highway networks using Keras
104 views · 3 months ago
[MXDL-8-03] Weights Initialization [3/3] - Kaiming He Initializer
104 views · 3 months ago
[MXDL-8-02] Weights Initialization [2/3] - Xavier Glorot Initializer
202 views · 3 months ago
[MXDL-8-01] Weights Initialization [1/3] - Observation of the outputs of a hidden layer
101 views · 3 months ago
[MXDL-7-02] Batch Normalization [2/2] - Custom Batch Normalization layer using Keras
394 views · 4 months ago
[MXDL-7-01] Batch Normalization [1/2] - Training and Prediction stage
380 views · 4 months ago
Very nice and very clear explanation. Keep going please.
Very nice. Thanks
Thanks for all. The playlist is amazing. I have a dataset but I can reach only 54.8% accuracy. Can you help me please?
Thanks for your comment. Please email your data and code to meanxai@gmail.com and we will review it.
Very fast, clear, and easy to follow! Thank you!
Thanks for your comment.
Good. Mugo from Africa.
Great content, as always! Could you help me with something unrelated: I have a SafePal wallet with USDT, and I have the seed phrase. (alarm fetch churn bridge exercise tape speak race clerk couch crater letter). How should I go about transferring them to Binance?
I am sorry, I have no idea about that.
Very nice. Thanks a lot
Thanks for your comment.
Can you do time series videos on ARIMA, Prophet, BSTS?
Unfortunately, Time Series Analysis is not a topic covered in this channel.
Thank you for the clear and detailed illustration, but can you provide the presentation slides?
I am sorry, unfortunately the slides are not public yet.
Very clean and thorough!!!! Absolute gold!
Very nice. Thanks
Very nice, thanks
Thank you so much!
Thanks.
Thanks
Thanks
Very nice. Thanks
The codes can be found at github.com/meanxai/deep_learning
This is a really clear video. After watching it, I have a deeper understanding of LightGBM. Thanks!
Thanks for your comment. I am glad it was helpful!
very good video!
Do all weak learners have to come from the same family? Meaning all weak learners are DTs or SVMs or they can be different?
Same family. To keep the way we measure the epsilon consistent across rounds, it makes sense to use the same weak learner across rounds.
@@meanxai Okay. Thank you!
How many examples are taken from the training data during subsampling? Is it like random forests, i.e., equal to the number of training examples?
Typically, the sampled subset is the same size as the original data set and contains repeated data points. However, if your original data set is too large, you can generate smaller subsets. This is called "boosting-by-filtering". In this case, you need to consider a lower bound on the sample size that the model has to use in order to guarantee that the final hypothesis has error smaller than epsilon. The lower bound is presented in theorem 3.2 of Yoav Freund's 1995 paper. Unfortunately, I haven't figured this out.
@@meanxai Oh okay. Thanks, will check that out!
This is what I got from NotebookLM:

### Understanding Lambda, Gamma, and the Derivation of 'm'

The sub-sample size 'm' plays a pivotal role in boosting algorithms. Here's an explanation of lambda (λ), gamma (γ), and how the lower bound for 'm' is derived:

* **Lambda (λ) represents the reliability parameter.** This signifies the desired probability with which the final hypothesis outputted by the boosting algorithm should achieve the target accuracy (ε). In simpler terms, it's the confidence level that the final hypothesis will have the desired error rate.
* **Gamma (γ) is a measure of the weak learner's advantage over random guessing.** The boosting algorithm utilizes a weak learning algorithm, denoted as **WeakLearn**, that is assumed to perform slightly better than random guessing. Gamma quantifies this advantage.
* The sources provide a detailed analysis of a boosting algorithm that employs a "majority-vote game" analogy to illustrate its functionality. This game involves two players: a "chooser" who selects a weighting scheme for a set of points and an "adversary" who attempts to maximize the "loss" based on the chooser's strategy.
* The derivation of the lower bound for 'm' stems from the requirement that **the hypotheses generated by WeakLearn should have a high probability of having an error smaller than 1/2 - γ.** This condition ensures that the weak learner is indeed performing better than random chance.
* The sources state that the lower bound for 'm' is calculated using a formula that incorporates ε, λ, and other factors related to the weak learner's performance. This formula ensures that the sample size is sufficient to achieve the desired accuracy (ε) with the specified reliability (λ).
* **The sources provide a specific formula for calculating 'm':** **m' >= (2/γ^2) * ln(m/(λ * ε)).** This formula guarantees that the probability of the final hypothesis having an error larger than ε is at most λ.
* The choice of 'm' directly influences the number of training examples needed and the computational complexity of the boosting algorithm. A larger 'm' typically leads to higher accuracy but increases computational cost.

In summary, lambda and gamma are essential parameters in determining the sub-sample size 'm', which plays a crucial role in the performance of boosting algorithms. The lower bound for 'm' is derived to ensure that the weak learner's performance surpasses random guessing and the desired accuracy and reliability are achieved.

### Selecting Gamma and Lambda

Yes, **gamma (γ) and lambda (λ) are parameters that are typically chosen by the user.** There are factors that can help guide the selection of these parameters:

* **Desired Accuracy (ε):** The user's desired accuracy for the final hypothesis plays a significant role in choosing gamma and lambda. A higher desired accuracy might necessitate a smaller gamma, indicating a requirement for a weak learner with a greater advantage over random guessing. This, in turn, might influence the choice of lambda to maintain a balance between accuracy and reliability.
* **Computational Constraints:** The choice of gamma and lambda indirectly affects the computational complexity of the boosting algorithm through its influence on the sub-sample size 'm'. Users need to consider the available computational resources. A smaller gamma or a higher lambda might lead to a larger 'm', potentially increasing the computational cost.
* **Nature of the Weak Learner:** The characteristics of the weak learner employed in the boosting algorithm can provide insights into selecting appropriate values for gamma and lambda. If the weak learner is known to have a significant advantage over random guessing, a larger gamma might be suitable. Conversely, a weaker learner might require a smaller gamma.
* **Trade-off between Accuracy and Reliability:** The selection of gamma and lambda involves a trade-off between the desired accuracy and the reliability with which that accuracy should be achieved. A smaller gamma generally leads to better accuracy but might require a smaller lambda to maintain the desired reliability, potentially increasing the computational cost.

The sources provided do not offer specific guidelines or formulas for choosing gamma and lambda. These parameters depend on the specific application, the characteristics of the weak learner, and the user's priorities regarding accuracy, reliability, and computational resources. **It's important to note that the information beyond what is stated in the sources about choosing gamma and lambda is not from the sources and may need to be independently verified.**
@@KushJuvekar-j3f Thanks for the useful information. It helps me a lot to understand the lower bound of m.
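For what it's worth, here is a tiny Python check of the bound quoted above, m' >= (2/γ^2) * ln(m/(λ*ε)); the values of m, γ, λ and ε below are made-up examples, and the snippet only evaluates the quoted formula rather than anything taken from the paper itself:

# Evaluate the quoted lower bound m' >= (2 / gamma^2) * ln(m / (lambda * epsilon))
# with made-up example values.
import math

m, gamma, lam, eps = 10000, 0.1, 0.05, 0.1    # sample size, weak-learner edge, reliability, target error
m_prime = (2.0 / gamma ** 2) * math.log(m / (lam * eps))
print(f"required sub-sample size m' >= {m_prime:.0f}")   # about 2902 for these values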
Thanks. Very nice.
thanks.
Thanks. Keep going please.
👍👍👍❤️
Thanks
Interesting why the dot product is used as a measure of similarity over cosine similarity. Because if, for example, we compare [2,1] to [4,2] and [4,8], the resulting dot products would be 10 and 16 respectively, which is counterintuitive: the angle between [2,1] and [4,2] is zero, hence they must be pretty similar, but the dot product of [2,1] and [4,8] is higher because of the larger values of the latter vector...
I totally agree with you. As in your example, dot product similarity does not make intuitive sense. We cannot say that dot product similarity is better than cosine similarity. However, we also cannot say that the latter is always better than the former. Cosine similarity only cares about the angle difference, while the dot product cares about both the angle and the magnitude. In deep learning, the magnitude of a vector may actually contain information we are interested in, so there is no need to remove it. Dot product similarity is said to be especially useful for high-dimensional vectors.
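The numbers in this thread are easy to check; a short NumPy comparison of the two similarity measures for the vectors mentioned above:

# Dot product vs. cosine similarity for the vectors discussed above.
import numpy as np

a = np.array([2.0, 1.0])
b = np.array([4.0, 2.0])    # same direction as a (angle 0)
c = np.array([4.0, 8.0])    # different direction, larger magnitude

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

print(a @ b, a @ c)                  # dot products: 10.0 16.0
print(cosine(a, b), cosine(a, c))    # cosines: 1.0  0.8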
Great job!!!! Amazing tutorials!!!!
Thanks for your comment.
Great. Thanks
Thanks. amazing.
Thanks for your comment.
The codes can be found at github.com/meanxai/deep_learning
Nice. Thank you.
❤❤
The codes can be found at github.com/meanxai/deep_learning
Best. Thank you.
Thank you so much. ❤❤
Thanks
The codes can be found at github.com/meanxai/deep_learning
Thanks
Are these slides available for reference?
Sorry, the slides (PDF) are not public.
Please make a video on PyTorch.
Sorry, I am not familiar with PyTorch.
Thank you for the good explanation.
Thanks for your comment.
Can you provide the code please?
The code can be found at github.com/meanxai/deep_learning Thanks.
@@meanxai Thank you for sharing, but I typed the whole thing 🤣 and I also used the CIFAR10 dataset, but the val_accuracy wasn't good: only about 52, while the training accuracy was more than 90.
@@zoraizelya3975 I also got 52% accuracy, which is too low. I think this is because our highway network consists of basic feedforward networks, which are not suitable for challenging image datasets. Thanks for your comment.