Gaussian Mixture Model | Intuition & Introduction | TensorFlow Probability

  • Published: Dec 31, 2024

Comments • 16

  • @karanshah1698
    @karanshah1698 2 years ago +2

    Marginalizing the value conveyed by *your* playlist, to *my* understanding of this subject, is intractable. And if what I said is even remotely sensible, per the rules of probability, the whole credit goes to you.

  • @saikeerthi5673
    @saikeerthi5673 2 years ago +2

    I've been following your channel for a while and you've really helped me understand complicated probability concepts, thank you! One question: I didn't understand how the z variable is a latent one. Why can't it just be a parameter?

    • @MachineLearningSimulation
      @MachineLearningSimulation  2 years ago +1

      First of all: thanks for the feedback :). I am super glad I could help!
      I think it wouldn't make sense to have it as a parameter here. A parameter, at least in my understanding, is an adjustable value that defines the distribution of a random variable. Each data point you observe (and that you want to cluster) consists, at least under the assumptions of a Gaussian Mixture Model, of a class and a position. Both are random variables, meaning that a data point does not deterministically belong to one class and one position. Instead, there is a probability associated with the potential classes and potential positions in the observed space. The class variable is considered latent because, in the task of clustering, we do not know which class a certain point belongs to. Certainly, if we did a scatter plot, we could often figure this out by eye, but we want a more probabilistic/mathematical treatment: a point could also belong to a not-so-obvious class and just be an unlikely spread from that cluster's center.
      I hope that points you in the right direction; the small code sketch at the end of this reply makes the parameter/latent-variable distinction concrete. Please ask a follow-up question if something remains unclear.
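      As a minimal sketch of that distinction (my own illustration with made-up numbers, not code from the video): pi, mu and sigma below are parameters, i.e. fixed values that define distributions, while the class z and the position x are random variables; z is the latent one because we never observe it for a data point.

      import tensorflow as tf
      import tensorflow_probability as tfp

      tfd = tfp.distributions

      pi = [0.5, 0.5]     # parameter: mixture weights
      mu = [-1.0, 2.0]    # parameter: class means
      sigma = [0.7, 1.2]  # parameter: class standard deviations

      # z and x are random variables; z is the latent class, x the observed position.
      joint = tfd.JointDistributionNamed(dict(
          z=tfd.Categorical(probs=pi),
          x=lambda z: tfd.Normal(loc=tf.gather(mu, z),
                                 scale=tf.gather(sigma, z)),
      ))

      sample = joint.sample()  # draws a class and a position; only x would be observed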

    • @saikeerthi5673
      @saikeerthi5673 2 years ago +1

      @@MachineLearningSimulation That makes sense, thank you! I got confused about the nature of the latent variable because we infer it from the data, similar to how we fit the distribution's parameters.

  • @harshitjuneja7768
    @harshitjuneja7768 1 year ago +1

    Thanks a thousand!!

  • @user-or7ji5hv8y
    @user-or7ji5hv8y 3 years ago +1

    Is there a connection between using mixture coefficients to take a linear combination of two Gaussians and the DGM approach, where the parameters of the Gaussian change conditioned on the category? They seem like 2 very different approaches to arrive at the probability. (Let me know if my question is not clear.) Thanks.

    • @MachineLearningSimulation
      @MachineLearningSimulation  3 years ago +2

      Thanks for the question :)
      I would view it from two perspectives:
      1) Calculating the likelihood/probability density of one sample. Imagine you have the GMM with two classes from the video, and you have a sample, let's say at X = 3.0. In order to get the probability density of X, we have to marginalize over the latent class, because we don't know which class the sample belongs to. Hence
      p(X=3.0) = pi_0 * N(X=3.0; mu_0, sigma_0) + pi_1 * N(X=3.0; mu_1, sigma_1)
      In general, we would of course have a summation symbol, but since there are only two contributions to the sum, I wrote it out explicitly. This is of course a mixture, and you could also call it a linear combination of Normal densities.
      2) Sampling the GMM: Here you would first sample a latent class, then use the corresponding Normal to sample a point, and then "throw away" the latent class because it is not observed. Also take a look at this video after 14:10 ruclips.net/video/kMGjXVb8OzM/видео.html where I first do this process manually and then use TensorFlow Probability's built-in Mixture distribution. (A short code sketch at the end of this reply illustrates both perspectives.)
      A remark: In the special case of a Mixture Model in which you do observe the class (i.e., it is not latent), you would evaluate the joint p(Z, X) instead of the marginal: "If you have more information, then you should of course also use it." I think this could correspond to the second case you mentioned. However, since we commonly use GMMs for clustering, where we of course don't know the class, this case is not seen much in applications.
      I hope that helped :) Let me know if something was unclear.
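      Here is a minimal sketch of both perspectives in TensorFlow Probability (my own illustration, not code from the video; the weights, means and standard deviations are made up):

      import tensorflow as tf
      import tensorflow_probability as tfp

      tfd = tfp.distributions

      pi = [0.4, 0.6]      # mixture weights pi_0, pi_1
      mu = [-2.0, 3.5]     # component means mu_0, mu_1
      sigma = [1.0, 0.8]   # component standard deviations sigma_0, sigma_1

      # 1) Density of a single point X = 3.0 by marginalizing over the latent class
      x = 3.0
      manual = (pi[0] * tfd.Normal(mu[0], sigma[0]).prob(x)
                + pi[1] * tfd.Normal(mu[1], sigma[1]).prob(x))

      # The same density via the built-in mixture distribution
      gmm = tfd.MixtureSameFamily(
          mixture_distribution=tfd.Categorical(probs=pi),
          components_distribution=tfd.Normal(loc=mu, scale=sigma))
      print(manual.numpy(), gmm.prob(x).numpy())  # the two values agree

      # 2) Ancestral sampling: draw the latent class first, then the matching Normal
      z = tfd.Categorical(probs=pi).sample()  # latent class, later "thrown away"
      x_sample = tfd.Normal(loc=tf.gather(mu, z),
                            scale=tf.gather(sigma, z)).sample()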

  • @nickelandcopper5636
    @nickelandcopper5636 2 years ago +1

    Hey, another great video! Is the GMM PDF you show at the end normalized? Thanks!

    • @MachineLearningSimulation
      @MachineLearningSimulation  2 years ago

      Hey, thanks again :)
      Are you referring to what is shown in TensorFlow Probability? If so, then yes. The distribution is implemented as a proper mixture in TFP, so it integrates to 1 over the domain of possible values, which is the condition for normalization. Intuitively, each component Normal integrates to 1 and the mixture weights sum to 1, so their weighted sum integrates to 1 as well; see the small numerical check below.
      Is that what you were asking?
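      If you want to convince yourself numerically, here is a small sketch (my own check with made-up parameters, not code from the video) that integrates the mixture density on a grid:

      import numpy as np
      import tensorflow_probability as tfp

      tfd = tfp.distributions

      gmm = tfd.MixtureSameFamily(
          mixture_distribution=tfd.Categorical(probs=[0.3, 0.7]),
          components_distribution=tfd.Normal(loc=[-2.0, 3.0], scale=[1.0, 0.5]))

      xs = np.linspace(-15.0, 15.0, 100001).astype(np.float32)
      print(np.trapz(gmm.prob(xs).numpy(), xs))  # trapezoidal rule, result is ~1.0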

  • @engenglish610
    @engenglish610 3 years ago +1

    Thanks for this video. Can you make a video about multivariate case ?

    • @MachineLearningSimulation
      @MachineLearningSimulation  3 years ago +1

      Hey, thanks for the feedback ☺️
      Yes, that's already planned. I think it will go online in 3 to 4 weeks.

    • @MachineLearningSimulation
      @MachineLearningSimulation  3 years ago +2

      Unfortunately, I overestimated my video output :D So it took me a little longer, but here is the continuation for the Multivariate Case: ruclips.net/video/iqCfZEsNehQ/видео.html
      The videos on the EM derivation and its implementation in Python will follow.

  • @huat1998
    @huat1998 8 months ago

    Sorry, why is the categorical distribution P(Z) = Cat(pi) equal to the product of all the pi[0], pi[1]?

    • @MachineLearningSimulation
      @MachineLearningSimulation  5 months ago

      Hi, thanks for the question. Do you have a timestamp in the video that you are referring to?