VQ-VAE | Everything you need to know about it | Explanation and Implementation

  • Published: 17 Aug 2024
  • In this video I go over the Vector Quantised Variational Auto Encoder (VQVAE).
    Specifically, I talk about how it's different from a VAE, its theory, and its implementation.
    I also train a VQVAE model and show what the generation process looks like after we have trained it.
    Timestamps
    --------------------
    00:00 Intro
    00:27 Need and difference from VAE
    02:43 VQVAE Components
    04:31 Argmin and Straight Through Gradient Estimation
    07:11 Codebook and Commitment Loss
    08:45 KL Divergence Loss
    09:30 Implementation
    12:35 Visualization
    14:29 Generating Images
    16:37 Outro
    Paper Link - tinyurl.com/ex...
    Subscribe - tinyurl.com/ex...
    Inspiration for the visualization of embeddings taken from - • [VQ-VAE] Neural Discre...
    Github Repo Link - tinyurl.com/ex... (Will be updated soon)
    Background Track Fruits of Life by Jimena Contreras
    Email - explainingai.official@gmail.com

Comments • 45

  • @Explaining-AI
    @Explaining-AI  9 months ago

    Github Code - github.com/explainingai-code/VQVAE-Pytorch
    Note: The code at line 65 (@11:31) is wrong. It has a typo; it should actually be codebook_loss = torch.mean((quant_out - quant_input.detach())**2). The repo has the correct version.
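    For anyone following along in code, a minimal sketch (toy tensors, not from the video; the beta weight is an assumed value) of how the corrected codebook loss sits next to the commitment loss:

    import torch

    # Stand-ins for the encoder output and the nearest codebook vectors, shape (B, D)
    quant_input = torch.randn(4, 2, requires_grad=True)
    quant_out = torch.randn(4, 2, requires_grad=True)

    # Codebook loss: move the codebook vectors towards the (frozen) encoder outputs
    codebook_loss = torch.mean((quant_out - quant_input.detach()) ** 2)
    # Commitment loss: keep the encoder outputs close to the (frozen) codebook vectors
    commitment_loss = torch.mean((quant_input - quant_out.detach()) ** 2)

    beta = 0.2  # assumed commitment weight, not taken from the video
    quantize_loss = codebook_loss + beta * commitment_loss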

  • @Omsip123
    @Omsip123 5 days ago

    Your videos are so helpful. They are well explained, concise... I can't find the words to describe them. Unfortunately you do not get the millions of subscribers you deserve, but I hope it is rewarding to know that your videos are top quality (I have watched over 100 hours of AI videos) and very helpful for the learning community.

    • @Explaining-AI
      @Explaining-AI  4 days ago

      Thank you so much for your kind words :) Subs will come when they come; right now I am just happy to do my best to create videos that help people understand things a little bit better.

  • @mehdizahedi2810
    @mehdizahedi2810 16 days ago

    the best explanation of VQ-VAE, thanks.

  • @amirjodeiry7136
    @amirjodeiry7136 7 months ago +1

    Thank you for providing insightful perspectives on this topic.
    I appreciate your unique perspective and the effort you've put into providing valuable information, rather than simply copying from the paper. Keep up the great work!

  • @vikramsandu6054
    @vikramsandu6054 2 months ago

    Loved every bit of it. The amount of effort you put in to explain these complex concepts in a simple manner is NEXT LEVEL. This has become my favourite Deep Learning Channel. THANKS A LOT!! keep up the amazing work.

    • @Explaining-AI
      @Explaining-AI  2 months ago

      Thank you for the continuous encouragement and appreciation Vikram. It means a lot!
      Will keep trying my best to put out videos that are worthy of this.

  • @IgorAherne
    @IgorAherne 2 months ago

    Thank you so much for taking the time to make this beautiful lesson! It is very well made, and made the whole concept clear

    • @Explaining-AI
      @Explaining-AI  1 month ago +1

      Thank you! Really happy that you found the video helpful.

  • @drannoc9812
    @drannoc9812 3 months ago +1

    Thank you, the visuals really helped me understand, especially for the backpropagation part !

    • @Explaining-AI
      @Explaining-AI  3 months ago

      Happy that the video was of some help to you :)

  • @foppili
    @foppili 3 months ago

    Great explanation, covering theory and implementation. Nice visualisations. Thanks!

  • @leleogere
    @leleogere 9 months ago +1

    Very clear explanation! Thanks for the implementation + the visualization of the codebook!

  • @eddieberman4942
    @eddieberman4942 1 month ago

    Really useful for a project I'm working on, thanks!

  • @joegriffith1683
    @joegriffith1683 3 months ago

    Brilliant video, thanks so much!

  • @PrajwalSingh15
    @PrajwalSingh15 8 months ago

    Amazing explanation with easy-to-follow animations.

  • @inceptor1992
    @inceptor1992 7 months ago

    Dude your videos are absolutely amazing! Thank you!!!

  • @badermuteb1012
    @badermuteb1012 7 months ago

    How did you code the visualization?
    Thank you for the tutorial. This is by far the best on RUclips. Please keep it up.

    • @Explaining-AI
      @Explaining-AI  7 months ago

      Thank you!
      The visualization is not something I came up with myself; I saw it in a different video (link in description) and thought it would be much better to explain with that kind of visualization.
      This is roughly how I implemented it:
      -> Set the latent dimension to 2 and the codebook size to 3.
      -> Bound the VQVAE encoder outputs to a range using some activation at the final layer of the encoder, say -1 to 1 or 0 to 1.
      -> Map both dimensions to a color intensity value. So maybe the x axis is the green component (0-1 mapped to 0-255), the y axis is the red component, and blue is always 255. Then color each point as (R, G, B) -> (encoded_dimension_1_value*255, encoded_dimension_2_value*255, 255).
      -> Train the VQVAE, then take the codebook vectors of the trained model and the encoder outputs for an image.
      -> The points on the plot are the encoder outputs for each cell of the encoder output feature map, and e1, e2, e3 are the codebook vectors.
      -> Generate the quantized image using this mapping.
      I hope this gives some clarity on the implementation part.
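      A rough, hypothetical sketch of those steps (names and shapes are mine, not the repo's; it assumes the encoder outputs are already flattened to (N, 2) and bounded to 0-1):

      import torch
      import matplotlib.pyplot as plt

      # Stand-ins for a trained model: (N, 2) encoder outputs in [0, 1] and 3 codebook vectors
      enc_out = torch.sigmoid(torch.randn(64, 2)).numpy()
      codebook = torch.sigmoid(torch.randn(3, 2)).numpy()

      # Color each point from its own coordinates: dim 1 -> green, dim 2 -> red, blue fixed at maximum
      colors = [(y, x, 1.0) for x, y in enc_out]

      plt.scatter(enc_out[:, 0], enc_out[:, 1], c=colors)
      plt.scatter(codebook[:, 0], codebook[:, 1], c='black', marker='x', s=100)
      for i, (x, y) in enumerate(codebook):
          plt.annotate(f'e{i + 1}', (x, y))
      plt.xlabel('encoded dimension 1')
      plt.ylabel('encoded dimension 2')
      plt.show()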

  • @user-vd2vc6yg4t
    @user-vd2vc6yg4t 8 months ago

    Thanks for the explanation, this is very helpful!!!

    • @Explaining-AI
      @Explaining-AI  8 months ago

      Thank you! I'm glad that it ended up helping in any way.

  • @amirnasser7768
    @amirnasser7768 1 month ago

    Thanks so much for the informative video. I always used to ask myself what happened to the KL term 😅. BTW, have you thought about using a Gaussian prior instead of a uniform one? I mean, the prior of the real data is more likely to be Gaussian, so my gut feeling is that using the uniform prior may not be a better choice.

    • @Explaining-AI
      @Explaining-AI  1 month ago +1

      You are most welcome :)
      The paper uses the uniform prior, and that also simplifies the math by getting rid of the KL term completely. I haven't myself experimented with any other prior, but I am sure we could replace it with another one (we would just have to add the KL term corresponding to that new choice).
      But having said that, because VQVAE has a discrete latent space, I am not sure how exactly you would use a Gaussian prior and what the KL divergence term would evaluate to, given that q(z) is a one-hot vector. If possible, can you elaborate a bit on that?
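      To make the "getting rid of the KL term" point concrete: with the deterministic one-hot posterior and a uniform prior over the K codebook vectors, the KL term evaluates to a constant,

      \[
      D_{KL}\big(q(z \mid x)\,\|\,p(z)\big) = \sum_{k} q(z = e_k \mid x)\,\log\frac{q(z = e_k \mid x)}{1/K} = 1 \cdot \log\frac{1}{1/K} = \log K
      \]

      which does not depend on the model parameters and can therefore be dropped from the training objective.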

    • @amirnasser7768
      @amirnasser7768 1 month ago

      @@Explaining-AI I think you are right: it is not clear how one would use a Gaussian as the prior, and it is easier to use the uniform one. Maybe one way is to add an additional KL loss to minimize the distance between each codebook embedding and another embedding initialized from a Gaussian distribution with mean 0 and variance 1.

  • @yanlu914
    @yanlu914 19 days ago

    Hi, very helpful video! I want to ask what the colors in the quantization output mean. From what I understand, the quantization output has 2 channels (because the codebook embedding dimension is 2). Each pixel in the quantization output corresponds to one of three embeddings in the codebook, so does the color come from the combination of the 2 channels?

    • @Explaining-AI
      @Explaining-AI  19 days ago

      Thanks! For the visualizations, I bound the quantization output between 0 and 1 (using an activation at the end of the encoder).
      And then for colors, I just map the 2 dimensions to red and green, and get the red and green components of a point's color as encoded_dimension_value*255.
      The blue component I always fix to 255.
      If you are interested in the exact details of the visualization, I have described it here - ruclips.net/video/1ZHzAOutcnw/видео.html&lc=UgyASt6J38hMkqfdd3R4AaABAg.9yyad6YzcP49yyqxBiZkQZ

    • @yanlu914
      @yanlu914 18 days ago

      @@Explaining-AI Very clean explanation! Thank you!

  • @scotth.hawley1560
    @scotth.hawley1560 6 months ago

    Really nice. Thanks for posting. At 9:52, why are you using nn.Embedding instead of nn.Parameter(torch.randn((3, 2)))? I don't understand where the Embedding comes from.

    • @Explaining-AI
      @Explaining-AI  6 months ago +1

      Thank you! Actually they are both the same. nn.Embedding anyway just uses nn.Parameter with normal initialization.
      github.com/pytorch/pytorch/blob/d947b9d50011ebd75db2e90d86644a19c4fe6234/torch/nn/modules/sparse.py#L143
      So nn.Embedding just creates a wrapper, in the form of a lookup table over embeddings of a fixed dictionary size, on top of nn.Parameter. Hope it helps.
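      A quick sketch of the equivalence (toy 3 x 2 codebook matching the video's setup; the lookup indices are just for illustration):

      import torch
      import torch.nn as nn

      # Two ways to hold a learnable 3 x 2 codebook
      embedding = nn.Embedding(3, 2)                # lookup-table wrapper
      parameter = nn.Parameter(torch.randn(3, 2))   # raw learnable tensor

      # nn.Embedding stores its values as an nn.Parameter internally
      print(type(embedding.weight))                 # <class 'torch.nn.parameter.Parameter'>

      # The main convenience nn.Embedding adds is index-based lookup
      idx = torch.tensor([0, 2, 1])
      print(embedding(idx))                         # same values as embedding.weight[idx]
      print(parameter[idx])                         # manual indexing does the same job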

  • @linhnhut2134
    @linhnhut2134 5 months ago

    Thanks a lot for your video.
    Can you explain in more detail:
    quant_out = quant_input + (quant_out - quant_input).detach()
    Why not just
    quant_out = quant_out.detach()

    • @Explaining-AI
      @Explaining-AI  5 months ago +1

      Hello @linhnhut2134, what we want is for the gradients of quant_out to be used as if they were also the gradients for quant_input, kind of like copy-pasting gradients. So in the forward pass we want quant_out = quant_out, but in the backward pass we want quant_out = quant_input. The operation "quant_out = quant_input + (quant_out - quant_input).detach()" allows us to achieve that distinction between the forward and backward passes.
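      A small sketch of the difference (hypothetical toy tensors, just to show where the gradients end up):

      import torch

      quant_input = torch.randn(4, 2, requires_grad=True)   # encoder output
      quant_out = torch.randn(4, 2, requires_grad=True)      # selected codebook vectors

      # Straight-through trick: the forward value equals quant_out,
      # but gradients flow back to quant_input as if quantization were the identity
      st_out = quant_input + (quant_out - quant_input).detach()
      st_out.sum().backward()
      print(quant_input.grad)  # all ones: the decoder's gradient is copied to the encoder output
      print(quant_out.grad)    # None: nothing flows to the codebook through this path

      # A plain quant_out.detach() would instead cut the graph entirely,
      # so the encoder would get no gradient from the reconstruction loss at all.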

  • @danieltsao4005
    @danieltsao4005 2 months ago

    The code in line 65 is wrong. It should be
    codebook_loss = torch.mean((quant_out - quant_input.detach())**2)

    • @Explaining-AI
      @Explaining-AI  2 months ago

      Yes indeed. It's correct in the repo - github.com/explainingai-code/VQVAE-Pytorch/blob/main/run_simple_vqvae.py#L65 - but the video version has a typo: instead of torch.mean((quant_out - quant_input.detach())**2), it is incorrectly implemented as torch.mean((quant_out - quant_input.detach()**2)).
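      For anyone wondering how much the misplaced parenthesis changes things, a toy check (illustration only):

      import torch

      quant_input = torch.randn(4, 2)
      quant_out = torch.randn(4, 2)

      correct = torch.mean((quant_out - quant_input.detach()) ** 2)  # mean squared distance, always >= 0
      typo = torch.mean((quant_out - quant_input.detach() ** 2))     # squares only the input, can even go negative
      print(correct.item(), typo.item())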

  • @jakula8643
    @jakula8643 10 months ago

    link to code please

    • @Explaining-AI
      @Explaining-AI  10 months ago

      Hi @jakula8643, as a result of working on the implementation for the next video, I ended up modifying the VQVAE code and making it a bit messy. I will clean it up and push it here: github.com/explainingai-code/VQVAE-Pytorch in a couple of days' time. Apologies for missing this; I will let you know as soon as I do that.

    • @Explaining-AI
      @Explaining-AI  10 months ago

      Code is now pushed to the repo mentioned above

  • @PanicGiraffe
    @PanicGiraffe 9 months ago

    This video fuckin' rips.