Paper Club with Ben - Score-Based Generative Modeling Through Stochastic Differential Equations

  • Published: 4 Oct 2024

Comments • 12

  • @dippatel1739
    @dippatel1739 1 year ago +5

    Best discussion and presentation.
    Not sure if I can attend this Paper Club because it looks company-sponsored, but definitely keep posting these videos.

    • @vahan.hovhannisyan
      @vahan.hovhannisyan 1 year ago

      Thanks for your interest! nPlan's paper club is open to all, in person in London or online! Just search for nPlan paper club :)

  • @nikahosseini2244
    @nikahosseini2244 1 year ago +5

    Great explanation! Thanks

  • @Fr0z3nMus1k
    @Fr0z3nMus1k 3 months ago

    It would be really helpful if you could post your PDF files of the papers in the description, so we can read your notes and possibly achieve a better understanding of the papers. Thank you. If you can't post them, could you send them to me via email or something?

  • @alivecoding4995
    @alivecoding4995 5 months ago

    With respect to energy-based models, where we need Langevin dynamics to sample data from the model (p_theta(x)), what role do the 'empirical' and prior distributions play? Do we use training data as samples from the prior, and samples from our current model to model the empirical distribution?

    • @benboys_
      @benboys_ 3 months ago +1

      The empirical distribution is the training data: a mixture of point masses (look up 'Dirac delta') at the sample locations in sample space. You then match forward and reverse Markov chains that go from p_theta(x, t=0) to a normal distribution at t=T, which gives you a nice denoising score-matching objective that can be used to train energy-based models (train p_theta(x, t)) or score-based models (train the score grad_x log p_theta(x, t)). The training is done by noising samples from the empirical distribution and predicting the amount of noise added. Inductive bias and regularisation leave the learned score inaccurate, so after training you don't recover the empirical distribution but something more desirable to practitioners: a model that generalises and achieves good results on the metrics they care about, such as FID score.
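
      As a minimal sketch of that denoising objective (assuming a generic score network with signature score_model(x, sigma); this is illustrative, not the exact setup from the paper):

      ```python
      import torch

      def dsm_loss(score_model, x0, sigma_min=0.01, sigma_max=1.0):
          # Denoising score matching: perturb the data with Gaussian noise and
          # regress the score of the perturbation kernel. For x_t = x0 + sigma * eps,
          # the target is grad_x log N(x_t; x0, sigma^2 I) = -eps / sigma.
          u = torch.rand(x0.shape[0], *([1] * (x0.dim() - 1)))
          sigma = sigma_min * (sigma_max / sigma_min) ** u   # log-uniform noise levels
          eps = torch.randn_like(x0)
          xt = x0 + sigma * eps
          pred = score_model(xt, sigma)   # hypothetical model conditioned on sigma
          # Weighting by sigma^2 balances the loss across noise levels:
          # sigma^2 * ||pred - (-eps / sigma)||^2 = ||sigma * pred + eps||^2.
          return ((sigma * pred + eps) ** 2).flatten(1).sum(dim=1).mean()
      ```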

    • @alivecoding4995
      @alivecoding4995 20 days ago

      @benboys_ thank you very much!

    • @alivecoding4995
      @alivecoding4995 20 days ago

      I was wondering because I saw an explanation that said we need Langevin dynamics for sampling from the model, so that those samples can then be used in an MCMC estimator for the true likelihood of the model.
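
      For context on why that estimator needs model samples: the maximum-likelihood gradient of an energy-based model p_theta(x) = exp(-E_theta(x)) / Z(theta) is a difference of two expectations, and the second one is estimated with MCMC (e.g. Langevin) samples from the model. A hedged sketch, with illustrative names:

      ```python
      import torch

      def ebm_nll_surrogate(energy, x_data, x_model):
          # The gradient of the negative log-likelihood is
          #   grad_theta E_theta(x_data) - E_{x ~ p_theta}[grad_theta E_theta(x)],
          # so minimising this surrogate matches the MLE gradient, provided
          # x_model holds (approximate) samples from p_theta, e.g. via Langevin MCMC.
          return energy(x_data).mean() - energy(x_model).mean()
      ```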

  • @Zenchiyu
    @Zenchiyu 10 months ago

    Concavity instead of convexity? Since we try to push samples towards regions of high density (noisy gradient ascent).

    • @benboys_
      @benboys_ 5 months ago

      Yes, you're right: it's the same thing up to a sign change, and people usually refer to convex optimization or log-concave sampling (of a probability density).
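
      In code, that noisy gradient ascent is unadjusted Langevin dynamics. A minimal sketch, assuming a known (or learned) score function; the names are illustrative:

      ```python
      import torch

      def langevin_sample(score, x, step=1e-3, n_steps=1000):
          # Unadjusted Langevin dynamics: noisy gradient ascent on log p(x),
          #   x_{k+1} = x_k + step * score(x_k) + sqrt(2 * step) * z,  z ~ N(0, I).
          for _ in range(n_steps):
              z = torch.randn_like(x)
              x = x + step * score(x) + (2 * step) ** 0.5 * z
          return x
      ```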

  • @alivecoding4995
    @alivecoding4995 9 months ago

    Are you on Twitter, Ben?