Nobody has been able to explain file compression to me....
....until this moment.
Thanx buddy
Better than any instructor I’ve ever had at explaining this.
for a brief moment I really thought this guy was Bighead from Silicon Valley lol
Channel Name is also machead..maybe his brother
😂Hhh me too
i just finished silicon valley s2 and decided to investigate compression
What an excellent example. Very easy to understand, thank you
Thankyou man. been thinking about how file compression works for almost twelve years. couldnt figure out. thought to look it up today. understood everything in one video here. 👍
Very helpful, I wonder why this doesn't have more views.
so true, i am happy to have found this video (4 years later)
I really enjoy your channel, you should really make more videos, the world needs them.
That red wall is awesome ! Nice explanation !
If dreamworks is ever looking to make a live action how to train your dragon , give them a call
I always like to think about how entropy kinda flows from one thing to another. In the first example the uncertainty about the next character was maximal, so the code had to carry all of the information. As soon as we find out, through some other channel, that some characters are more likely, the average code gets smaller because it needs to carry less information. Of course there's also entropy in the rules for decoding the stream.
I am glad that you mentioned entropy (technically cross-entropy in this case). I am considering making a follow up video where I discuss entropy in more detail. My motivation is that information entropy is extremely important for many sub-fields of machine learning, so I'd like to have something to refer people to if I ever do an ML series.
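For anyone curious about the entropy vs. cross-entropy distinction mentioned here, a toy sketch in Python (the distributions are made up for illustration, not taken from the video):

```python
import math

def entropy(p):
    """H(p): best possible average bits per symbol for distribution p."""
    return -sum(pi * math.log2(pi) for pi in p.values() if pi > 0)

def cross_entropy(p, q):
    """H(p, q): average bits per symbol when the code is designed for q
    but the symbols are actually drawn from p. Always >= H(p)."""
    return -sum(p[s] * math.log2(q[s]) for s in p)

# Skewed "true" distribution vs. a naive uniform assumption.
p = {"A": 0.5, "C": 0.25, "G": 0.125, "T": 0.125}
uniform = {s: 0.25 for s in p}

print(entropy(p))                  # 1.75 bits/symbol: the optimal average
print(cross_entropy(p, uniform))   # 2.0 bits/symbol: cost of assuming uniform
```

The gap between the two numbers is exactly the penalty you pay for designing a code around the wrong distribution.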
So you're saying scrabble is basically compression
Stephen Miller actually a decent analogy and joke, you would make a good teacher
quite good video and very easy to follow. thx
very well done! this was so helpful! but watching this in 2021, i just hope you have a better camera by now 😉
great video, you're good at explaining stuff, thank you
Can someone explain to me why you can't just make everything smaller? He said at 10:36 that you can't just make everything smaller. Is there some rule/law that won't let you compress everything as small as possible?
If you make every codeword shorter, eventually some codewords become prefixes of others, so different messages end up represented by the same bits and the decoder can't tell them apart. If you think about it carefully, it makes sense.
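There's also a simple counting (pigeonhole) argument for why no lossless compressor can shrink every input — a toy illustration in Python:

```python
# Pigeonhole argument: there are more n-bit inputs than there are
# strictly shorter bit strings, so no lossless scheme shrinks them all.
n = 8
inputs = 2 ** n                                   # 256 distinct 8-bit files
shorter_outputs = sum(2 ** k for k in range(n))   # 1 + 2 + ... + 128 = 255

# Two different inputs can't share one compressed output (we couldn't
# decompress unambiguously), so at least one input cannot get shorter.
print(inputs > shorter_outputs)  # True
```

Compression only works by giving the *likely* inputs short codes at the expense of the unlikely ones.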
Awesome! Thank you
Good video mate
Thank you so much.
Thank you man! You made it really clear.
Wow this is great, because I have been working with nibbles and a two-bit number system for a long time to test my compression algorithms. Nice Huffman encoding of DNA. I just realized that I get the best compression in the two-bit number system. Maybe that's why mother nature uses it? And it is good for testing chaos and randomness theories.
I loved the two-bit example for its simplicity, and I chose the "lumpy" probabilities specifically to ensure that the Huffman coding was optimal.
Hmm, it's funny, nature doesn't really utilize DNA efficiently. My understanding is that the four letters in DNA actually encode twenty amino acids (three letters of DNA => 1 amino acid) so the code is actually quite suboptimal (multiple 3-letter codes create the same amino acids; the code is degenerate).
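For anyone who wants to play with this, a minimal sketch of Huffman's algorithm in Python, fed the same kind of "lumpy" probabilities (the exact 0/1 assignments can vary between implementations, but the code lengths can't):

```python
import heapq

def huffman_code(freqs):
    """Build a Huffman code (symbol -> bit string) from symbol weights."""
    # Heap entries are (weight, tiebreak, partial code table) so that
    # equal weights never fall through to comparing dicts.
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)   # merge the two lightest subtrees
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, count, merged))
        count += 1
    return heap[0][2]

code = huffman_code({"A": 0.5, "C": 0.25, "G": 0.125, "T": 0.125})
print({s: len(c) for s, c in code.items()})  # lengths 1, 2, 3, 3
```

The average length works out to 0.5·1 + 0.25·2 + 0.125·3 + 0.125·3 = 1.75 bits per symbol, which matches the entropy of that distribution — that's the sense in which the "lumpy" choice makes Huffman optimal.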
I really enjoyed and fully understood it. Thank you!
you are awesome..
love ur vids
Thanks! So are you!
Very good. Thx.
Great video, thanks for your time !
Thanks, it was very helpful.
Amazing bro......Thank you from Arabia
Thanks!
Thank you.
Good idea.
that's very awesome and nice explanation..... like that
Thanks for the video. : )
I wonder why you stopped making videos. Hope you are doing well.
Very helpful video!
thanks man, really good explanation
Awesome Explanation! Thanks a lot! Can you recommend a book or article on compression algorithms?
I don't know about a book, but a few wikipedia pages will probably help you out. I'd look up Huffman coding and information entropy to get started.
Will do. Thanks again!
Nice and explanatory video, but I have a question. Is it possible to compress 15 kB of data down to 5 bits?
The minimum number of bits required to represent an arbitrary piece of information depends entirely on the entropy (or randomness) of the source. A 15 kB source is 120,000 bits; the minimum-entropy case is a single bit (1 or 0) repeated 120,000 times, and even that takes roughly 1 + ⌈log₂ 120,000⌉ = 18 bits to record which bit it is and how many times it repeats. So no, 5 bits isn't enough even in the most compressible case, and typical data sits nowhere near that extreme.
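The two extremes can be sketched in a few lines of Python (treating 15 kB as 15,000 bytes, as the 120,000 figure implies):

```python
import math

# A 15 kB file, measured in bits.
n_bits = 15 * 1000 * 8   # 120,000

# Lowest entropy: one bit value repeated. Record the value (1 bit)
# plus the run length, and you're done.
run_length_cost = 1 + math.ceil(math.log2(n_bits))
print(run_length_cost)   # 18 bits

# Highest entropy: uniformly random bits are incompressible on average,
# so the expected cost stays at the full 120,000 bits.
```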
Thanks, Nice wallpaper btw
thnx
Great Video! Thank you!
this is fucking awesome.
the towel box in the background XD
Wow, it's the only Apple user in the world who isn't completely computer illiterate.
Sabiancym lol I want to say it's not true. But you're right!😂
Sabiancym surprisingly the computational chemistry research team at my college all used Macs. They didn't use the GUI either, they worked in Linux.
Are you in college nowadays? I've seen all your terminal videos with the girly voice. I was discouraged from starting to learn computers at 22 because I thought I was too old to master it now. But I started anyway from your terminal series because I was always a bit curious about that; now I'm 23 and I have been making constant headway in learning this stuff.
I love hearing stories like yours! I am currently 19 and in college. When I made those terminal videos, I was probably 11 or 12. Good times.
good stuff
I get what you're saying, but how would we then know when 110 is in fact A and not CG? Clearly we have to maintain a bit length: 8-bit, 16-bit, etc... 000 010 110 111. Maybe I'm just being extra technical, but I'm sure this is how you expected us to interpret it.
Wayne Modz I'm interested in the answer to this also. Would be a much bigger issue with the whole alphabet too
I’m confused where you are seeing CG in the 110 sequence. In his example, CG should be coded as 100.
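The key is the prefix property: in a Huffman code, no codeword is the start of any other codeword, so the decoder never needs fixed-width symbols. A sketch in Python (these particular codewords are illustrative, not necessarily the video's exact assignments):

```python
def decode(bits, code):
    """Greedily decode a bit string with a prefix-free code."""
    inverse = {v: k for k, v in code.items()}
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in inverse:            # the first match is the only match,
            out.append(inverse[buf])  # because no codeword prefixes another
            buf = ""
    return "".join(out)

code = {"A": "0", "C": "10", "G": "110", "T": "111"}
print(decode("0101100", code))  # ACGA — only one way to split the stream
```

With this code, 110 can only ever mean G; a stream that meant C then G would have been written 10 110, and the greedy reader finds that split automatically.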
Incredible ^_^
Thanks :)
So like are you a programmer or something?
haha i guess you wouldn't know from this video
I still don't understand how this applies to the binary of a file.... will just leave with this idea for the next few years and try to crack it myself hahahah
Steve?
I know this is sexist but kleenex box always look suspicious in a dude's room lol
my laptop has 108GB spare out of 360GB... after compressing my data I've now got 190GB 🤯🤯 spare, but I'm also buying another 250GB SSD - shitty laptops, hey lol
Steve Jobs? :D
this is complicated, not for people without experience.
Ella Marie lol what how?
not sure if an inexperienced person would/should look up stuff like this in the first place
Just stop cursing if you wanna learn
Why is this titled "How File Compression Works" when you're speaking of DNA->binary encoding? Maybe you should re-title it "How DNA Is Used to Encode Binary Data," because this doesn't seem to have anything to do with file compression (like WinZip, WinRAR, or 7-Zip).
At the lowest level, data is stored using only 1s and 0s. By building a dictionary and substituting a code for each arrangement of bits, with the most common arrangements getting fewer bits than the uncommon ones, you can - and most likely will - reduce the total number of bits. This is how file compression works.
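As a tiny worked example of that substitution idea (the message and frequencies are made up for illustration):

```python
from collections import Counter

# A skewed 16-symbol message: A is common, G and T are rare.
text = "AAAAAAAACCCCGGTT"
print(Counter(text))  # A: 8, C: 4, G: 2, T: 2

fixed = {s: 2 for s in "ACGT"}               # naive: 2 bits per symbol
variable = {"A": 1, "C": 2, "G": 3, "T": 3}  # common symbols get fewer bits

fixed_bits = sum(fixed[s] for s in text)        # 16 * 2 = 32 bits
variable_bits = sum(variable[s] for s in text)  # 8 + 8 + 6 + 6 = 28 bits
print(fixed_bits, variable_bits)
```

The saving comes entirely from the skew: if all four symbols were equally common, the variable-length table would do no better than the fixed one.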
Pedro Britto Right.
But why does the video include DNA? Technically that's subject matter from a different field.
Because he's using an example so that you can understand a practical application of compression. The field of file compression is not limited to universal compression freeware like WinZip and WinRAR.