Intuitively Understanding the Shannon Entropy

  • Published: 11 Sep 2024

Comments • 61

  • @maxlehtinen4189
    @maxlehtinen4189 10 months ago +31

    For everyone trying to understand this concept even more thoroughly, towardsdatascience's article "The intuition behind Shannon’s Entropy" is amazing. It gives added insight into why information grows with the reciprocal of probability

  • @bluejays440
    @bluejays440 10 months ago +5

    Please make more videos! This is literally the only time I've ever seen entropy explained in a way that makes sense

  • @caleblo8498
    @caleblo8498 1 year ago +5

    some parts of the concept are confusing, but the rethinking process is helpful, and I can now summarise it as:
    entropy = the sum over outcomes of (the surprise of each outcome) * (its probability), i.e. the expected surprise (the lower the entropy, the less surprising the outcomes will be)
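
    As a quick worked check of that reading (using a fair coin, which is just an assumed example here, not one from the video), entropy is the probability-weighted surprise:

        H = \sum_i p_i \log_2 \tfrac{1}{p_i} = \tfrac{1}{2}\log_2 2 + \tfrac{1}{2}\log_2 2 = 1 \text{ bit}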

  • @zizo-ve8ib
    @zizo-ve8ib 8 months ago

    Bro really explained it in less than 10 mins when my professors don't bother even though it could be done in 5 secs. A true masterpiece, this video. Keep it up man 🔥🔥🔥

  • @nicholaselliott2484
    @nicholaselliott2484 9 months ago +1

    Dude, I took information theory from a rigorously academic and formal professor. I'm a little slow, and under the pressure of getting assignments done I couldn't always see the forest for the trees. The sentence "how much information, on average, would we need to encode an outcome from a distribution" just summed up the whole motivation and intuition. Thanks!

  • @charleswilliams8368
    @charleswilliams8368 5 months ago

    Three bits to tell the guy on the other side of the wall what happened, and it suddenly made sense. Thanks.

  • @xyzct
    @xyzct 3 months ago

    Excellent. Short and sweet.

  • @caleblo8498
    @caleblo8498 1 year ago +2

    each slice of probability requires log2(1/p_i) bits to represent, and the average over all outcomes (they call it entropy) is the probability-weighted sum of those bit counts. Each slice of probability is basically one of the possible outcomes, say, getting the combination ABCDEF in a six-letter scramble. (correct me if I am wrong)
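
    A minimal sketch of that reading in Python (the four-outcome distribution below is an assumed illustration, not from the video):

        import math

        # An assumed distribution over four outcomes, purely for illustration.
        p = {"A": 0.5, "B": 0.25, "C": 0.125, "D": 0.125}

        # Each outcome with probability p_i "costs" log2(1/p_i) bits to represent.
        bits = {outcome: math.log2(1 / prob) for outcome, prob in p.items()}

        # The entropy is the probability-weighted average of those bit costs.
        entropy = sum(prob * bits[outcome] for outcome, prob in p.items())

        print(bits)     # {'A': 1.0, 'B': 2.0, 'C': 3.0, 'D': 3.0}
        print(entropy)  # 1.75 bits on average per outcome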

  • @Sars78
    @Sars78 1 month ago

    Well done, Adian. I just found out (though I'm not surprised at all, in the Shannon sense 🤓) that you're doing a PhD at Cambridge. Congratulations! Best wishes for everything 🙂

  • @mathy642
    @mathy642 4 months ago

    Thank you for the best explanation

  • @kowpen
    @kowpen 2 years ago +9

    At 4:54, may I know the reason to consider 10 bits and triples? Why not any other combination? Thanks.

    • @adianliusie590
      @adianliusie590  2 years ago +8

      I was just showing arbitrary examples, but I could have chosen many different ones. The triples (when there were 8 outcomes) were to show this could easily be extended to any power of 2, and the 10 outcomes were to show that this generalises to non-powers of 2 as well.
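
      A small Python sketch of one standard way to see the non-power-of-2 case (my own illustration, not necessarily the video's construction): grouping several outcomes into one code word lets the bits per outcome approach log2(M) even when M is not a power of 2.

        import math

        M = 10                           # equally likely outcomes, not a power of 2
        print(math.log2(M))              # ~3.32 bits per outcome in the limit

        for block in (1, 3, 10, 100):
            # Encoding `block` outcomes at once needs ceil(log2(M**block)) bits in total.
            total_bits = math.ceil(block * math.log2(M))
            print(block, total_bits / block)   # bits per outcome: 4.0, 3.33..., 3.4, 3.33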

  • @murilopalomosebilla2999
    @murilopalomosebilla2999 3 years ago +6

    Nice explanation. Keep up the good work, man!

  • @prateekyadav7679
    @prateekyadav7679 1 year ago +3

    What I understand is that entropy is directly related to the number of outcomes, right? So I don't get why we need such a parameter/term when we could simply state the number of outcomes of a probability distribution? What new thing does entropy bring to the table?

    • @derickd6150
      @derickd6150 10 months ago +1

      Consider the case that a biased coin is flipped. There are two outcomes, just like an unbiased coin, but let's say this biased coin has a (0.1)^10000 chance of being heads. Do you have exactly the same information about the outcome beforehand as you do with an unbiased coin?

    • @maxlehtinen4189
      @maxlehtinen4189 10 months ago

      @@derickd6150 yes, it makes sense that a non-uniform distribution should have an effect on the uncertainty of a distribution, but can you explain how the bias affects the outcome via the entropy formula?

    • @derickd6150
      @derickd6150 10 months ago +2

      @@maxlehtinen4189 I'm not sure what you mean by bias here? Edit: Oh right, you're referring to my answer, not something in the video. Well, the entropy formula asks something along the lines of: "How many bits, on average, do we need to represent the outcome of the coin?" That is a very natural measure of how much information the outcome carries. If the coin is unbiased, you need one full bit. If it is as severely biased as I described above and you plug the numbers into the entropy formula, it will essentially tell you: "We hardly need any bits to describe the outcome, right? We're essentially certain it will be tails." Something intuitively along those lines. Edit 2: to see this, plot y(p) = -p log(p) - (1-p) log(1-p) for p in [0,1]. That is the expression for the entropy of the coin, whatever its bias. You will see that when p is very close to 1 or to 0 (which it is in my example), y(p) is almost 0. That is to say, you need almost no information to represent the outcome; it is essentially known. You need not transfer any information to someone, on the moon say, for that person to guess that the biased coin I described gives tails. However, when p is 0.5, the entropy is maximised, and you would need to transfer the most information to tell someone on the moon the outcome of the coin, because they cannot use their prior knowledge at all to make any kind of educated guess.
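
      A minimal sketch of that curve in plain Python (stdlib only; the sample probabilities are assumed just to illustrate the midpoint and the near-certain cases):

        import math

        def binary_entropy(p):
            # H(p) = -p*log2(p) - (1-p)*log2(1-p); valid for 0 < p < 1
            return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

        print(binary_entropy(0.5))    # 1.0 bit: a fair coin needs a full bit on average
        print(binary_entropy(0.999))  # ~0.011 bits: the outcome is almost certain
        print(binary_entropy(1e-9))   # ~3e-8 bits: essentially nothing needs to be sent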

  • @anusaxena971
    @anusaxena971 2 years ago +2

    You CERTAINLY DESERVE MORE VIEWS 👏 👍👍👍👍

  • @MissPiggyM976
    @MissPiggyM976 22 days ago

    Very good!

  • @mansoor9894
    @mansoor9894 1 year ago +1

    Fantastic job in explaining this.

  • @AdeshBenipal
    @AdeshBenipal 2 months ago

    Nice video

  • @nyx8017
    @nyx8017 9 months ago

    god this is an incredible video thank you so much

  • @user-wi1rj4iw9y
    @user-wi1rj4iw9y 2 years ago

    Thank you for your video. Keep it up! (Thanks for your video; keep up the good work!)

  • @tanjamikovic2739
    @tanjamikovic2739 1 year ago

    this is great! i hope you will film more!

  • @chetanwarke4658
    @chetanwarke4658 1 year ago +1

    Simple and precise!

  • @derickd6150
    @derickd6150 10 months ago

    Great video!

  • @RodrigodaMotta
    @RodrigodaMotta 3 years ago +1

    Blew my mind!

  • @MaximB
    @MaximB 1 year ago

    Great job. Thank you

  • @debasishraychawdhuri
    @debasishraychawdhuri 4 months ago

    It does not explain the most important part: how the formula for the non-uniform distribution came about

  • @avatar00001
    @avatar00001 3 months ago

    thank you codexchan

  • @lennerdsimon9117
    @lennerdsimon9117 2 years ago +1

    Great video, well explained!

  • @alixpetit2285
    @alixpetit2285 2 years ago

    Nice video! What do you think about set shaping theory (information theory)?

  • @sirelegant2002
    @sirelegant2002 8 months ago

    Thank you!

  • @morphos2
    @morphos2 1 year ago

    I didn't quite understand the 4:36 rationale.

  • @prateek4546
    @prateek4546 2 years ago

    wonderful explanation!!

  • @huibosa2780
    @huibosa2780 2 years ago

    excellent video, thank you!

  • @karlzhu99
    @karlzhu99 1 year ago

    Uncertainty is a confusing way to describe this. For the lottery example, wouldn't you be very certain of the outcome?

    • @TUMENG-TSUNGF
      @TUMENG-TSUNGF 1 year ago

      It’s about the numbers, not whether you win the lottery or not.

  • @azerack955
    @azerack955 1 year ago

    I don't quite understand the very last step. What does summing over all the possible outcomes give us?

    • @vibhanshugupta1729
      @vibhanshugupta1729 1 year ago +1

      That is the way we calculate expectation values. For a random variable X which takes values {x_i}, E(X) = sum_i P(x_i) * x_i
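
      A tiny sketch of that recipe in Python (the three-outcome distribution is assumed just for illustration), first for an ordinary random variable and then with the surprise log2(1/p_i) as the value being averaged:

        import math

        def expectation(probs, values):
            # E[X] = sum_i P(x_i) * x_i
            return sum(p * x for p, x in zip(probs, values))

        probs = [0.5, 0.25, 0.25]             # an assumed three-outcome distribution
        print(expectation(probs, [1, 2, 3]))  # 1.75, an ordinary expected value

        surprises = [math.log2(1 / p) for p in probs]
        print(expectation(probs, surprises))  # 1.5 bits: the entropy of this distribution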

    • @AkashGupta-th2nm
      @AkashGupta-th2nm 1 year ago

      Intuitively, you sum over it to get some understanding of the average uncertainty

  • @user-cf2yo5qf3h
    @user-cf2yo5qf3h 6 months ago

    Thankkkk youuuuu.

  • @zgz97
    @zgz97 2 years ago

    beautiful explanation :)

  • @robertwagner5506
    @robertwagner5506 2 years ago

    great video thank you

  • @aj7_gauss
    @aj7_gauss 9 months ago

    Can someone explain the triplets part?

  • @Justin-zw1hx
    @Justin-zw1hx 1 year ago

    awesome!

  • @bodwiser100
    @bodwiser100 8 months ago

    I appreciate your effort, but the video is quite confusing. For example, in the example about 8 football teams, you explain why 3 bits are required by flat out stating as a starting premise that 3 bits are required! It's a circular argument.

  • @corydkiser
    @corydkiser 2 years ago

    awesome

  • @AniketKumar-dl1ou
    @AniketKumar-dl1ou 11 months ago

    You should have written H[U(x)] = log(M) / M
    to better relate it to the entropy explanation.

  • @whoisray1680
    @whoisray1680 1 year ago +1

    Why 1/p?????????

    • @lsacy8347
      @lsacy8347 1 year ago

      I'm not too sure, but I think it's just a bit-wise expression of the M possible outcomes. Considering there are M outcomes with equal probability p, we have p = 1/M, which means 1/p = 1/(1/M) = M.
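
      As a worked check of that reasoning (a sketch, assuming the uniform case this reply describes): with M equally likely outcomes, p = 1/M, so both the per-outcome term log2(1/p) and the whole sum reduce to log2(M):

        H = \sum_{i=1}^{M} \frac{1}{M} \log_2 M = M \cdot \frac{1}{M} \log_2 M = \log_2 M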

  • @energy-tunes
    @energy-tunes 1 year ago

    This seems so intuitive; why did it take so long to get "discovered"?

  • @diy_mo
    @diy_mo 10 months ago

    I expected something else, but it's also ok.

  • @axonis2306
    @axonis2306 2 years ago +1

    Most of your understanding is good, but 4:50 is an unnecessary leap of logic. At a level this introductory, it is probably best to assume the number of outcomes is a power of 2 (2^n).

  • @2011djdanny
    @2011djdanny 2 years ago +1

    The example is even more difficult than the concept itself 🤦🏼‍♂️😃
    Nice try by the way

  • @mikes9012
    @mikes9012 2 years ago +4

    this sucks, really unintuitive