Markov Chain Compression (Ep 3, Compressor Head)

  • Published: 25 Oct 2024

Comments • 51

  • @BasantSiingh
    @BasantSiingh 10 years ago +53

    The second half of this episode went straight over my head.

    • @pbezunartea
      @pbezunartea 10 years ago +3

      Basant Singh Matharu Join the club! :P

    • @OmegaLok
      @OmegaLok 10 years ago +8

      This episode is not in line with the previous two. In the second half he's talking too fast.

    • @ajayramanathan7589
      @ajayramanathan7589 10 years ago

      Go read a paper; I think these videos are more of an intro to the topic. Something like a wiki.

    • @OmegaLok
      @OmegaLok 10 years ago

      Thank you for a very useful comment. Keep it constructive. Your contribution is very much appreciated.

    • @carlosmontano8277
      @carlosmontano8277 2 years ago

      I feel so good now that I know I'm not the only one, thanks for your very useful comment!

  • @GoogleDevelopers
    @GoogleDevelopers 10 years ago +116

    *Compressor Head Episode 3: Markov Chains*
    #compressorhead #everybitcounts #developers
    At the cutting edge of compression algorithms sits the lonely kingdom of Markov Chains. These algorithms take an Artificial Intelligence approach to compression by allowing the encoder and decoder to ‘predict’ what data is coming next. In this episode Colt McAnlis talks about how these magical algorithms compress data, and why some think that they are the future of compression.

    • @eunicelarson697
      @eunicelarson697 10 years ago +1

      What the heck.... 3.... chains, magical, artificial intelligence.....???????!!!!!!!!!
      I don't think I need that.
      I have my brain, thanks.
      Lolololol

    • @arpyan
      @arpyan 10 years ago +2

      It gets fast and complex in the end, but at least the fundamentals are easily understandable.
      It was a great show, I learned a lot of interesting and useful things, thank you Colt McAnlis.

    • @ZelenoJabko
      @ZelenoJabko 9 years ago

      Google Developers You didn't actually move from 12 to 3 bits, since in the first case you have a starting and ending state encoded, and in the second you do not.
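
    A minimal sketch of the "predict what data is coming next" idea from the episode description above, assuming a first-order Markov model that simply counts symbol-to-symbol transitions; the function names and the toy input are illustrative, not taken from the episode.

    from collections import defaultdict

    def build_transition_counts(data):
        """Count how often each symbol follows each other symbol (first-order Markov model)."""
        counts = defaultdict(lambda: defaultdict(int))
        for prev, cur in zip(data, data[1:]):
            counts[prev][cur] += 1
        return counts

    def predict_next(counts, prev):
        """Return the most likely next symbol given the previous one, or None if unseen."""
        followers = counts.get(prev)
        return max(followers, key=followers.get) if followers else None

    counts = build_transition_counts("AABABCABAB")
    print(predict_next(counts, "A"))  # 'B' follows 'A' most often in this toy input

    An encoder and decoder that both maintain a table like this can give short codes to the symbols the model currently predicts, which is the intuition the description is pointing at.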

  • @419
    @419 2 years ago

    I like how you take each section into a different room or area and display the information in different ways. That really helps make each piece of information unique and easier to remember.

  • @LeYoCosK
    @LeYoCosK 10 years ago

    I come from a heavy probability background, and this being my first time handling any compression problem, I'm glad to meet Markov Chains.
    Thanks for the insights, Colt McAnlis

  • @DaveWhoa
    @DaveWhoa 9 years ago +16

    good videos but a bit harder to learn from this one because you fly through everything so quickly; there's not much time to think about what you've said before you've already moved on to the next part

  • @CasperBang
    @CasperBang 10 years ago +12

    Love these videos; however, I had to give up on the baseball thing. Less culturally dependent examples, e.g. the states of a CD player, would cater to a non-American audience as well.

  • @johnb5033
    @johnb5033 10 years ago

    That was nice and clear! Thanks for taking the time to make this.

  • @Businessaccountingllc
    @Businessaccountingllc 10 years ago

    Awesome presentation on all three videos... love the content and the personality!! Keep on making them!

  • @denis419
    @denis419 7 years ago +1

    Looks like most of the video is a presentation of the "Adaptive Huffman Coding" technique, and most of the compression comes out of it, although the part with the Markov Chain is still important.

  • @alexparker7791
    @alexparker7791 7 years ago +1

    At the end, the whiteboard shows you're encoding "AABAC..." so I would expect to see A in a context 0 table and another A in a context 1 table (because of the transition from A→A)... but there doesn't appear to be anything like that. It almost seems like the encoder just magically knows when to descend into a new context and when to reset back to context 0.
    When do you reset back to context 0? If you never do, you'll always open a new context, start with an escape code, and output a symbol to the data stream for every input symbol, leading to a compression ratio of 9/8, a net loss.
    How compressible is the data stream itself?
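
    A minimal sketch of the bookkeeping this question is about, assuming a PPM-style scheme where a symbol unseen in the current context emits an escape and falls back to the context-0 table; the class name, the event tuples, and the two-level depth are illustrative assumptions, not the exact scheme on the whiteboard.

    class ContextModel:
        """Toy two-level context model: order-1 tables with escape fallback to order 0."""

        def __init__(self):
            self.order0 = {}   # symbol -> count, no context
            self.order1 = {}   # previous symbol -> {symbol -> count}

        def encode_symbol(self, prev, sym):
            """Return the events an encoder would emit for sym given the previous symbol."""
            events = []
            table = self.order1.setdefault(prev, {})
            if sym in table:
                events.append(("order1", prev, sym))
            else:
                events.append(("escape", prev))   # sym unseen in this context
                events.append(("order0", sym))    # fall back to the context-0 table
            # update after encoding so a decoder mirroring these updates stays in sync
            table[sym] = table.get(sym, 0) + 1
            self.order0[sym] = self.order0.get(sym, 0) + 1
            return events

    model = ContextModel()
    prev = None
    for sym in "AABAC":
        print(sym, model.encode_symbol(prev, sym))
        prev = sym

    In a sketch like this there is no reset to context 0; the model only falls back when the current context has never seen the symbol, so repetitive input stops paying the escape penalty once the tables fill in.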

  • @dARKf3n1Xx
    @dARKf3n1Xx 10 years ago +6

    You are a bit fast, man :(
    I am too dumb, need to catch up with you... :)

  • @aoeu256
    @aoeu256 5 years ago

    Is there a way that compression could use other files or shared data? For example, it could use an English dictionary for compression. For videos of, say, Mario, it could figure out how Mario is drawn and share that on the cloud, and it would be cached on the local computer, so if you frequently download Mario videos it would only have to download Mario once and reuse it for all Mario videos. With AI it could also replace Mario with a stick figure and let you play the video with stick figures if a shared Mario isn't available.
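
    The "shared data" idea in this comment exists in a simpler form today: zlib lets the compressor and decompressor agree on a preset dictionary out of band, so common content never has to travel with the payload. A minimal sketch; the dictionary bytes here are only an illustration.

    import zlib

    # Both sides must already have this dictionary (e.g. shipped with the app or cached locally).
    shared_dict = b"the quick brown fox jumps over the lazy dog"

    data = b"the lazy dog watched the quick brown fox"

    comp = zlib.compressobj(zdict=shared_dict)
    compressed = comp.compress(data) + comp.flush()

    decomp = zlib.decompressobj(zdict=shared_dict)
    assert decomp.decompress(compressed) == data

    print(len(data), "->", len(compressed), "bytes using the preset dictionary")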

  • @alexparker7791
    @alexparker7791 7 years ago

    If you wanted to precompute an initial set of Markov chains using a corpus in a given language, how would you combine those probabilities in the encoder and decoder with the statistical information from the input stream? Maybe some experimentation is in order.
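
    One common answer is to seed the model's counts from the corpus and then keep updating them adaptively, so the corpus acts as a prior that the actual input stream gradually overrides. A rough sketch, where the blending weight and all names are assumptions to experiment with, not a method from the video.

    from collections import Counter

    def seeded_model(corpus_counts, weight=10):
        """Start each context's counts from scaled corpus statistics (the prior)."""
        return {ctx: Counter({s: c * weight for s, c in dist.items()})
                for ctx, dist in corpus_counts.items()}

    def update(model, ctx, sym):
        """Adaptive update: observed symbols gradually outweigh the corpus prior."""
        model.setdefault(ctx, Counter())[sym] += 1

    def probability(model, ctx, sym):
        dist = model.get(ctx, Counter())
        total = sum(dist.values())
        return dist[sym] / total if total else 0.0

    corpus = {"q": {"u": 9, "a": 1}}   # e.g. counts gathered from an English corpus
    model = seeded_model(corpus)
    update(model, "q", "a")            # the live stream disagrees with the prior a little
    print(probability(model, "q", "u"))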

  • @jaypandya1154
    @jaypandya1154 10 years ago

    Colt McAnlis, in the last problem you said it's 46 bits long, but it looks like only 45 bits, and I didn't understand the part at 17:11 where you encoded ABC the first time

  • @wisdom538
    @wisdom538 2 months ago +1

    2024 here

  • @GuillermoValleCosmos
    @GuillermoValleCosmos 8 years ago

    The video isn't in sync with the voice at around 16m, right?

  • @NoctumusTV
    @NoctumusTV 8 years ago +2

    This video could easily be compressed to around 50% of its original size by simply removing all the places where he says the word "actually" ;)

  • @theyruinedyoutubeagain
    @theyruinedyoutubeagain 10 years ago +2

    Just lovely. Could've used some Silicon Valley (Mike Judge's new show) references, though. :p

    • @marcinmiszczyk
      @marcinmiszczyk 10 years ago +1

      And why do you think they are doing it in the first place? :)

    • @theyruinedyoutubeagain
      @theyruinedyoutubeagain 10 years ago

      Ah, touche.

    • @ColtMcAnlis
      @ColtMcAnlis 10 years ago +5

      I was doing compression _before_ it was cool ;)

    • @theyruinedyoutubeagain
      @theyruinedyoutubeagain 10 years ago

      Colt McAnlis I'm not a smart man

    • @marcinmiszczyk
      @marcinmiszczyk 10 years ago +1

      Colt McAnlis so do you think the series is somehow based on your life? :D I would be proud of myself

  • @vishalparmar988
    @vishalparmar988 4 years ago +1

    What is going on!! Can anyone tell me?? I don't understand Ep. 3, not at all.

  • @matkallada9316
    @matkallada9316 10 years ago +1

    Baha =), this series is awesome

  • @DerekHaasHaas
    @DerekHaasHaas 8 years ago +4

    The 9 people that disliked this were beaten up by Markov.

  • @SAnDAnGE
    @SAnDAnGE 10 years ago

    Colt McAnlis, at 12:43 shouldn't the output change to 100?

    • @ColtMcAnlis
      @ColtMcAnlis 10 years ago

      Nope. Once a VLC change occurs, we don't re-process the whole stream, we simply move forward with the encoding state at that point. This way, the encoder and decoder both know how to handle the problem.
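
      A sketch of that forward-only behaviour: the code table is updated after each symbol, and because encoder and decoder apply the same update rule at the same point, neither ever revisits earlier output. The update rule and the codes below are invented for illustration, not the ones used in the episode.

      def encode(stream, codes):
          """Adaptive VLC encoding: emit the current code, then let the table change going forward."""
          out = []
          counts = {s: 0 for s in codes}
          for sym in stream:
              out.append(codes[sym])
              counts[sym] += 1
              # illustrative rule: the most frequent symbol always gets the shortest code
              ranked = sorted(counts, key=counts.get, reverse=True)
              codes = dict(zip(ranked, ["0", "10", "110"]))
          return "".join(out)

      print(encode("ABAAB", {"A": "0", "B": "10", "C": "110"}))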

  • @DilSeDesee
    @DilSeDesee 10 years ago +4

    Bit complex :(

  • @diegonayalazo
    @diegonayalazo 3 years ago

    Thanks

  • @msclrhd
    @msclrhd 10 years ago

    For the baseball state example, can't you just use 1 bit to encode the data? You removed the transitions from the out/on-base states to the batting states, so also remove that from the data stream (it is redundant information). Thus, you only need to know whether a batter transitioned to on base (0) or out (1), which can be represented as a single bit. Everything else is implicit.
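
    A tiny sketch of the packing this comment suggests, assuming each plate appearance really does collapse to one of two outcomes; the names are illustrative.

    def pack_outcomes(outcomes):
        """Pack a sequence of two-state outcomes ('on_base' / 'out') into single bits."""
        bits = 0
        for i, outcome in enumerate(outcomes):
            if outcome == "out":
                bits |= 1 << i            # on_base stays 0, out becomes 1
        n = len(outcomes)
        return bits.to_bytes((n + 7) // 8, "little"), n

    packed, n = pack_outcomes(["on_base", "out", "out", "on_base"])
    print(n, "outcomes packed into", len(packed), "byte(s)")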

  • @macromeda3746
    @macromeda3746 10 years ago +2

    The hardest to understand so far.

  • @concavex
    @concavex 8 years ago +2

    09:22 T-BONER!!

    • @batlin
      @batlin 7 years ago +1

      The saga of T-BONER begins even earlier at 7:54 :D

  • @birajdpatel
    @birajdpatel 10 years ago +1

    Dude, you are very fast, couldn't understand the 2nd part of ep 03

  • @jameswasil
    @jameswasil 10 years ago +3

    These things in this video are just a simple rehash of what I've been doing privately and working on since I was 11 years old. This is a very basic video and leaves out a lot of important information about data logic and entropy.
    But trying to make it "hip" and "trendy" is not winning anyone over. And the "finger shaking" crap at the viewer via the camera is really annoying, dude. Seriously. We're already paying attention, so you don't need to wave a stupid cardboard sign at us to "pay attention" or annoy us further, either. Thanks.
    P.S.: You don't run out of memory for Markov chains if you write functions in advance to transition system memory to virtual memory and page that data in address spaces to disk (and then read it back to a memory buffer when needed).
    Yes, this requires some file organization and balancing of data, but it's worth it and far more efficient than expecting people to add gigabytes of RAM to their system just to analyze a file or series of data and complete the algorithm.
    Granted, it's not as fast (unless you have an SSD to make up for the read/write time), but it works and allows for much higher precision of your trees as they grow than being limited by available system RAM only.
    Whether you're using Markov chains, PPM, or hybrids with BWT and LZ, it's better to do it this way rather than all in RAM, unless you have to use restricted hardware (like PIC controllers or other embedded systems) where you can't expect storage space to exist beyond a finite set. For PCs and other systems, you have a lot more flexibility, and you should make use of it for efficiency's sake if you're not going just for speed in your algorithm.

    • @ColtMcAnlis
      @ColtMcAnlis 10 years ago +4

      Check out Episode 1 (g.co/compressorhead) which discusses those topics.
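
    A rough sketch of the disk-paging idea from the P.S. in the comment above, using a memory-mapped count table so the operating system pages the model between RAM and disk on demand; the file name, dtype, and byte-sized alphabet are assumptions, not a description of any particular implementation.

    import numpy as np

    ALPHABET = 256  # byte-oriented order-1 model

    # Transition counts live in a file; the OS pages them in and out as needed.
    counts = np.memmap("markov_counts.dat", dtype=np.uint32,
                       mode="w+", shape=(ALPHABET, ALPHABET))

    def update(data):
        for prev, cur in zip(data, data[1:]):
            counts[prev, cur] += 1

    update(b"ABRACADABRA")
    counts.flush()  # push dirty pages back to disk
    print(int(counts[ord("A"), ord("B")]))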

  • @Jirayu.Kaewprateep
    @Jirayu.Kaewprateep 1 year ago

    📺💬 We are moving from a 12-bit to a 3-bit transmission, and he explains how the Markov model helps in this scenario.
    🥺💬 In the Markov model the state of the message is important; you do not need to create multiple flags for information when the state of the messages is considered and the data flow is verified and correct.
    🧸💬 It is as simple as in AI: if your character is jumping, it cannot jump back on itself in the opposite direction. The message state is the same; when you request inquiry data you only have one message, accept the info or wait until the query set is ready.
    🐑💬 Compression at that rate is possible by encoding data into multiple states, ordering its sequences, and creating message flow diagrams; represented as a matrix it becomes a set of possibilities. (as explained in telecommunications computing)
    📺💬 To be clear, Yui is talking about the output streams.
    🥺💬 The next sequence is caused by the previous result, or by an update from that result.
    🧸💬 That is correct, and for the first letter you do not need to make the prediction 100% accurate, because a 3-4 syllable evaluation updates the entire word with a set of probabilities matching the word or sentence; that is how the Markov model wins at speech or text sequences.
    🐑💬 You may randomly say something wrong at the beginning or end of the sentence, but the Markov model tries to match the most recognizable word for your ears; applied to telecommunications the output bit stream gives error correction, as in ISBN code error correction.

  • @bonbonpony
    @bonbonpony 8 years ago +1

    00:18 Yeah... You should rather work on code which would allow me to find my own YouTube comments I wrote a week ago, because as of now it is impossible ;P (yeah, search engine my ass :P )

  • @heavensrevenge
    @heavensrevenge 10 years ago +2

    Ahem, this method of teaching is just terrible... he is speaking as if he's trying to inform a trucker or pro chef on how this stuff works; the way of explanation and the intended audience (hopefully programmers) is a huge impedance mismatch. I can't get any of this because I can't stand how he tries to present it, and I can't stoop down dumb enough to be on the same stupidity page as him. I'll go read about this the old-fashioned way.

  • @codycanaday3836
    @codycanaday3836 10 years ago +1

    Hey Google, DO YOUR JOB AND FIX YOUR MESSED UP APPS LIKE PLAY MUSIC, YouTube etc.

    • @SamuelAttard
      @SamuelAttard 10 years ago +3

      It's funny because Google or their developers will probably never read this. And even if they did, what you have said will not help them fix the supposed problem at all.
      If something truly doesn't work, submit a bug report with details of what is broken. Personally I have never found a problem with the Play Music and YouTube apps. They work perfectly :P