Coding MCMC : Data Science Code

  • Published: 8 Sep 2024

Comments • 41

  • @xxshogunflames
    @xxshogunflames 3 years ago +25

    Man you blow these videos out of the park, it’s like surreal how good these videos are at tying the big picture together. Thank you for the content!!!

  • @MrSystemoutprintln
    @MrSystemoutprintln 3 years ago +9

    These MCMC videos (and of course others too) are just brilliant, can't thank you enough!

  • @achimkeks3638
    @achimkeks3638 2 years ago

    Love your videos! Great balance between simple explanations and giving a good overview of the topic.

  • @SameenIslam
    @SameenIslam 3 years ago +1

    Really cool, it would be great if you could cover Sequential Importance Sampling (SIS) too.

  • @user-or7ji5hv8y
    @user-or7ji5hv8y 3 years ago +5

    In practice, you have observed data. Could you do a video on how to use data with MCMC?

  • @salenatorresashton8016
    @salenatorresashton8016 1 year ago +1

    You do a great job with your Code with Me videos. I'd like to refer students to your videos; do you plan to make more of these?

  • @Afewwilliams
    @Afewwilliams 2 years ago

    These vids are excellent, thanks a lot

  • @bryanshi3774
    @bryanshi3774 8 months ago

    For the Wolfram integral, I think it should be "for x from 1 to infinity".

  • @skate456park
    @skate456park 3 years ago

    ABSOLUTE KING

  • @leonardofacchin1452
    @leonardofacchin1452 2 years ago

    The fact that the sample draws are correlated in the Metropolis (and especially in the more efficient Hamiltonian Monte Carlo) algorithm looks like a feature to me rather than an unfortunate necessity. It's what allows the algorithm to be efficient. As long as the final proportion of sample draws closely matches the posterior density, how the samples were obtained seems to be much less important. The samples are correlated because these algorithms are actually built to explore the sample space in a smart manner.

    • @user-zg1yr5ot2z
      @user-zg1yr5ot2z 8 months ago

      Generally you want independent samples or completely random sampling. This is usually handled by thinning (lagging) the chain, i.e., keeping only every 100th sample and discarding the rest, and then checking to make sure that you still have an approximation of the target distribution (see the sketch below).
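
      A minimal sketch of the thinning idea above, assuming the correlated draws already live in a NumPy array (the array and the lag of 100 are illustrative):

          import numpy as np

          rng = np.random.default_rng(0)
          # placeholder chain of draws; in practice this is your sampler's output
          chain = rng.standard_normal(1_000_000)

          lag = 100                # illustrative thinning interval
          thinned = chain[::lag]   # keep only every 100th draw, discard the rest

          # rough check that the thinned draws still approximate the target
          print(thinned.mean(), thinned.std())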

    • @leonardofacchin1452
      @leonardofacchin1452 8 months ago

      @@user-zg1yr5ot2z HMC aims to reproduce the posterior density distribution.
      The way it does it is by basically treating the (unknown) posterior as a scalar potential energy field, randomly selecting a starting position and then integrating the phase space trajectory of a point mass that is randomly kicked and subjected to that potential.
      Once "the point mass stops" its "position" is recorded before being kicked again and the process is iterated, as many times as necessary.
      Since the position of the mass at the start of the next step is its position at the end of the previous step it's inevitable for the samples to be correlated, to an extent. But it's exactly for that reason that the exploration speed of the unknown posterior density is much higher than in "classical" Metropolis-Hastings (simplifying the entire idea: the potential makes sure the mass visits regions of low potential energy/high probability density more often than other regions).
      The algorithm goes through an initialization phase that's meant to tune the parameters in a way that maximizes the speed of convergence while minimizing the inaccuracy.
      If the "energy" were actually conserved, as in a time-independent Hamiltonian, the acceptance probability would be one (that is, no samples would ever be rejected) but, as far as I remember, the convergence speed would not be the best.
      So, while completely random samples would obviously be ideal, it would take so many samples in order to thoroughly explore the posterior that obtaining an accurate result would be less practical.
      As long as we can be reasonably sure that the frequency of samples in each small parameter space subset is proportional to the actual unknown posterior density, it matters little how those samples were actually obtained and if they were random or correlated.
      The issue is devising a method that assuages our fear of obtaining a result that doesn't actually match the unknown posterior density.
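
      A minimal 1D sketch of the dynamics described above, with a toy standard-normal target; the step size, trajectory length, and function names are illustrative, not any particular library's API:

          import numpy as np

          rng = np.random.default_rng(0)

          def U(q):
              # potential energy = negative log of the (unnormalized) posterior
              return 0.5 * q**2   # toy target: standard normal

          def grad_U(q):
              return q

          def hmc_step(q, step_size=0.1, n_leapfrog=20):
              p = rng.normal()   # the random "kick": fresh momentum
              q_new, p_new = q, p
              # leapfrog integration of the Hamiltonian trajectory
              p_new -= 0.5 * step_size * grad_U(q_new)
              for _ in range(n_leapfrog - 1):
                  q_new += step_size * p_new
                  p_new -= step_size * grad_U(q_new)
              q_new += step_size * p_new
              p_new -= 0.5 * step_size * grad_U(q_new)
              # accept/reject corrects for numerical energy error; with exact
              # energy conservation this would always accept, as noted above
              h_current = U(q) + 0.5 * p**2
              h_proposed = U(q_new) + 0.5 * p_new**2
              return q_new if rng.random() < np.exp(h_current - h_proposed) else q

          q, samples = 0.0, []
          for _ in range(10_000):
              q = hmc_step(q)   # next start is the previous end, hence correlation
              samples.append(q)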

  • @sharmilakarumuri6050
    @sharmilakarumuri6050 3 years ago +2

    Can you please do a video on Hamiltonian Monte Carlo?

  • @seyyidemre
    @seyyidemre 2 years ago +2

    The code in cell #49 will always give 99.9% accuracy. Wouldn't it be better to subtract 1000 from n_accept and report that instead? The size of the retained_sample will always be 1 million, since you append a new value whether you accept or not (see the sketch below).
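
    A minimal sketch of the accounting issue raised here, with a toy target and illustrative names (not the video's exact cell):

        import numpy as np

        def f(x):
            # toy unnormalized target density
            return np.exp(-0.5 * x**2)

        rng = np.random.default_rng(0)
        n_iter, n_accept, current = 100_000, 0, 0.0
        retained_samples = []

        for _ in range(n_iter):
            candidate = current + rng.normal()
            if rng.random() < f(candidate) / f(current):
                current = candidate
                n_accept += 1
            retained_samples.append(current)   # appended whether we accept or not

        # len(retained_samples) == n_iter no matter what happened, so the
        # acceptance rate must come from the counter, not the list length:
        print(n_accept / n_iter)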

  • @user-or7ji5hv8y
    @user-or7ji5hv8y 3 years ago +2

    Is f(x) always greater than p(x) given the normalizing constant?

  • @user-or7ji5hv8y
    @user-or7ji5hv8y 3 years ago +4

    Why is that a drawback, that samples are correlated? Isn’t that the entire point behind MCMC?

    • @ElizabethCoxon-u9d
      @ElizabethCoxon-u9d 13 days ago

      I'd assume it's like cheating: you have a sample, and rather than drawing another random sample, you more or less copy the same one. Not quite, but nearly.

  • @teegnas
    @teegnas 3 years ago +1

    Small suggestion: you could try cropping some part of your video frame (mostly on the LHS), which would make the code more visible. Having said that, thanks for the video; I was looking for something like this!

    • @ritvikmath
      @ritvikmath  3 years ago

      Hey thanks for the suggestion. I'll try and remember to increase the font size as well for future videos!

    • @teegnas
      @teegnas 3 years ago

      @@ritvikmath The font size seems perfect. I meant your video, the one you have at the top right, which shows you sitting and coding; there is some extra space which you could remove.

    • @ritvikmath
      @ritvikmath  3 years ago +2

      @@teegnas ahh I see, good suggestion! thanks.

  • @Joy_SR
    @Joy_SR 2 years ago +5

    Why ">> if np.random.random() < prop " (at 10:43), the sample is accepted? what is the role of "np.random.random()" here?
    Thank you!

    • @AbdulGhaffar-lv3kp
      @AbdulGhaffar-lv3kp 2 years ago +3

      Basically, the np.random.random() step is used to make sure that samples are accepted with the probability "prob" calculated in the previous step. It's confusing for beginners though.

    • @ankushkothiyal5372
      @ankushkothiyal5372 2 years ago +3

      np.random.random() generates a random number uniformly from [0, 1). Here we have a probability "prob" of moving from sample 'a' to sample 'b', and the sampler should accept 'b' with exactly that probability.
      If "prob" is large (near 1), then most of the time np.random.random() will fall below it, resulting in acceptance; if "prob" is small (near 0), then most of the time it will result in rejection. Basically, this is how you incorporate selecting something with a given probability in an algorithm (see the sketch below).

  • @user-zg1yr5ot2z
    @user-zg1yr5ot2z 8 months ago

    Awesome video. No lag (thinning) though?

  • @sharmilakarumuri6050
    @sharmilakarumuri6050 3 years ago +1

    Awesome

  • @UsmanKhan-xs7hz
    @UsmanKhan-xs7hz 24 days ago

    How can MCMC be used in the realm of stocks and finance? I've been looking into making a stockbot as a personal project and landed on MCMC as a viable option.

  • @gajrajsingh51
    @gajrajsingh51 3 years ago +1

    How do you get this normalizing constant? At 1:32 the integration was from 0 to inf and from -inf to 0, while the function was defined for x >= 1?

    • @sikobuenos
      @sikobuenos 2 years ago

      I was wondering that too. Should it be x >= 0? (See the sketch below.)
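
      A minimal sketch of computing a normalizing constant numerically; the density here is hypothetical (chosen so it is defined for x >= 1, matching the question), not necessarily the one from the video:

          import numpy as np
          from scipy.integrate import quad

          def f(x):
              # hypothetical unnormalized density, defined only for x >= 1
              return 1.0 / x**2

          # the integration bounds must match the support of f
          Z, _ = quad(f, 1, np.inf)   # integrate x from 1 to infinity

          def p(x):
              # normalized density: integrates to 1 over [1, inf)
              return f(x) / Z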

  • @datorusfinance6659
    @datorusfinance6659 2 years ago +2

    What's so bad about the correlation in the Metropolis-Hastings method?

    • @user-zg1yr5ot2z
      @user-zg1yr5ot2z 8 months ago

      You can insert a lag (thinning) to make the samples approximately independent.

  • @kmshraddha5121
    @kmshraddha5121 1 year ago

    What is the difference between the tuning parameter and the standard deviation, and when do you use which?

  • @changkaizhao
    @changkaizhao 1 year ago

    10:43, in cell 48: why use f(candidate)/f(samples) as the acceptance ratio? Where are the transition terms according to MCMC?

    • @user-zg1yr5ot2z
      @user-zg1yr5ot2z 8 months ago

      This is the meat and potatoes of Metropolis-Hastings. If the candidate probability Pr(c) is higher than the previous Pr(prev), the candidate will always be accepted. If Pr(c) is lower than Pr(prev), it will be accepted a fraction Pr(c)/Pr(prev) of the time and rejected a fraction 1 - Pr(c)/Pr(prev) of the time. (With a symmetric proposal, the transition terms cancel out of the acceptance ratio, which is presumably why they don't appear here.) What this does is prevent the chain from getting stuck in local probability density maxima (if there are any). See the sketch below.
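
      A minimal sketch of that acceptance rule in a full loop, assuming a symmetric Gaussian proposal and a toy unnormalized target f (names are illustrative, not the video's exact code):

          import numpy as np

          def f(x):
              # toy unnormalized target density
              return np.exp(-0.5 * x**2)

          rng = np.random.default_rng(0)
          current, samples = 0.0, []

          for _ in range(100_000):
              candidate = current + rng.normal()   # symmetric proposal
              ratio = f(candidate) / f(current)    # transition terms cancel
              # if ratio >= 1 this always accepts; otherwise it accepts
              # with probability ratio, exactly as described above
              if rng.random() < ratio:
                  current = candidate
              samples.append(current)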

  • @danielscott6302
    @danielscott6302 1 year ago

    Hi.
    Thank you so much for all your effort and substance in the videos that you've released on YouTube. I'm a big fan, and you've really helped me in my Data Science/Analysis career to date 😊.
    I'm currently working on a research project that somewhat relates to MCMC: Variational Inference, which (to be very brief) is an alternative approach to MCMC. If you're familiar with this machine learning algorithm, would you be able to help me understand a niche branch of research related to it? That is, "Variational Inference augmented with Copulas".
    If you are able to help, please let me know how I could return the favour ❤.

  • @dryolymatics007
    @dryolymatics007 5 months ago

    Is it possible for you to comment on my MCMC notebook as to why it's so slow? That would be greatly appreciated.

  • @kmshraddha5121
    @kmshraddha5121 1 year ago

    Please do a video on transitional MCMC.

  • @JainmiahSk
    @JainmiahSk 3 years ago

    What is MCMC?

    • @SameenIslam
      @SameenIslam 3 years ago +1

      Markov Chain Monte Carlo