Python MD5 implementation

  • Published: Oct 5, 2024

Comments • 87

  • @royz_1
    @royz_1 3 years ago +66

    1:10 Linus would be proud of you

  • @memespdf
    @memespdf 3 years ago +202

    oh looks like you're using 3b1b's manim, looks good!

    • @mCoding
      @mCoding 3 years ago +86

      Learning it and trying to start incorporating it into my videos!

    • @john.dough.
      @john.dough. 3 years ago +39

      @mCoding It looks really good! Thanks for the extra effort

    • @mCoding
      @mCoding 3 years ago +38

      Thanks for being such great viewers!

    • @manimtirkey861
      @manimtirkey861 3 years ago +11

      I’m surprised that I share my name with a software tool.

    • @voxelfusion9894
      @voxelfusion9894 3 years ago +2

      @manimtirkey861 how do we know the software tool didn’t gain sentience and post this comment? 🤔

  • @FromTheMountain
    @FromTheMountain 3 years ago +159

    This was not very interesting to me. I think you should have spent less time reading the code aloud, and more time explaining the attacks and why they work well on MD5. For example, I felt like I did not get an answer to this question: what does a cryptographic hash function like SHA-256 do differently than MD5, and why does that make it so much safer?

    • @mCoding
      @mCoding 3 years ago +172

      Thanks for letting me know, I'll try to do better in the future.
      As for your question, unfortunately the answer is quite unsatisfactory: there are no proofs of the security of better hash functions like SHA-256 or other hash functions that are currently thought to be secure, and they don't really do anything better than MD5 except having a longer output (256 bits vs 128 for MD5) and more mixing or rounds of mixing. Take a look at the pictures on the wiki pages for SHA-256 and MD5: they are very similar, SHA-256 just has more parts. Yes, that's right, we trust the security of banking, government secrets, bitcoin, and many more things with not much more justification beyond "no one has published a collision yet, so it's probably hard". Hash functions are never proven to be secure, they are hoped to be secure until proven otherwise. At one time MD5 was considered to be secure, then it was broken. At one time SHA-1 was considered to be secure, then it was broken. Some day SHA-256 may be broken as well. How does this happen? More computing power helps, but that's not the main reason (although quantum algorithms may yet defeat entire classes of hash functions regardless of the bit length). Hash functions are not random, so they have some fixed structure to them. They are just fixed (but very complex) mathematical functions. The outputs may look very "random" but they are not. Researchers (or hackers) can exploit that structure. They find very special execution paths where, e.g., the bit mixing doesn't work as well as it should because of the special inputs they gave. String together enough of these bad mixing paths by tweaking some bits and you may just end up with a collision; this is how MD5 was broken. Then there's always the risk that a hash function has a mathematical property that no one noticed before, and once someone notices the property they could use it to break the hash function.
      As for actually going through an attack and explaining "why" it works in a video, such attacks have a very high amount of tedious detail in them and a very low amount of reasoning behind them. On the one hand, the tedium is completely necessary to do the attack; if you take out the tedium you end up with a generic bland statement like "MD5 is broken because if you try inputs with special differences then it doesn't mix well". On the other hand, tedium is tedium; no one wants to go line by line through an attack whose implementation is orders of magnitude more boring than even the implementation of MD5 itself. There are no cool revelations or mathematical insights; the justification for all of the known attacks is "I tried this and it worked". See, e.g., the "best public cryptanalysis" of MD5 linked in the description.
      Perhaps it would have been more prudent for me to cover an attack where there was a single "fatal flaw", but I really just wanted to talk about MD5 for a bit. Thanks for listening and again I will try to take your feedback into consideration for my future videos.

    • @StanislavStratiev
      @StanislavStratiev 3 years ago +10

      @mCoding Great response, thanks!

    • @cmuller1441
      @cmuller1441 3 years ago +19

      @mCoding All hash functions are insecure. There are always collisions because there are 2^n possible hashes (n is the length of the hash in bits) and an infinite number of possible messages of any length.
      The trick is just that we don't know how to create a collision, and trying by hand to test all messages until we find a collision would take too much time (like billions of years for billions of times the total processing power available on earth)

    • @Synthetica9
      @Synthetica9 3 years ago

      @mCoding Note that there are also hashing schemes that don't use the terrible Merkle-Damgård construction (where we need to use this terrible 'compression' function which we don't really understand the security requirements on). Schemes like SHA-3 use the much better understood permutation function.

    • @Ezechielpitau
      @Ezechielpitau 3 years ago +1

      I agree. I think this is the first mcoding-video I stopped watching. I only get half of what's going on but I just don't really care whether a is permuted with d or c with b. Way more focus on WHY this is bad pls
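
The pigeonhole argument in the replies above can be tried at small scale. The sketch below is an illustration only and is not the attack discussed in the video: it truncates MD5 to 16 bits (the helper `truncated_md5` and the 2-byte size are assumptions, chosen so the search finishes almost instantly) and brute-forces two different inputs with the same truncated digest. The same generic search against the full 128-bit output is what brute force cannot afford; the published MD5 collisions exploit structure in the function instead.

```python
import hashlib
from itertools import count


def truncated_md5(data: bytes, n_bytes: int = 2) -> bytes:
    """Hypothetical helper: keep only the first n_bytes of the MD5 digest."""
    return hashlib.md5(data).digest()[:n_bytes]


def find_truncated_collision(n_bytes: int = 2) -> tuple[bytes, bytes]:
    """Brute-force two distinct inputs whose truncated digests match."""
    seen: dict[bytes, bytes] = {}  # truncated digest -> first input that produced it
    for i in count():
        message = str(i).encode()
        tag = truncated_md5(message, n_bytes)
        if tag in seen:
            # With 2 bytes there are only 65536 possible tags, so this
            # branch must be reached within 65537 distinct inputs.
            return seen[tag], message
        seen[tag] = message


if __name__ == "__main__":
    a, b = find_truncated_collision()
    print(a, b, truncated_md5(a).hex())
    # Full digest sizes in bytes: 16 for MD5 (128 bits), 32 for SHA-256 (256 bits).
    print(hashlib.md5().digest_size, hashlib.sha256().digest_size)
```

Raising `n_bytes` shows how quickly the generic search grows: each extra byte multiplies the number of possible tags by 256.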

  • @kruksog
    @kruksog 3 years ago +41

    Your channel is so good. It really feels like there's a gap when it comes to coding/computer science YouTube. There certainly are a few good ones, but I feel like it's tiny when compared to something like (advanced) math YouTube. It's good to see someone filling that gap!

    • @mCoding
      @mCoding 3 years ago +9

      Wow, thanks! Glad you enjoyed!

    • @kruksog
      @kruksog 3 years ago +1

      @mCoding thanks for making interesting videos. Keep it up!

  • @voxelfusion9894
    @voxelfusion9894 3 years ago +6

    As someone who does know code, this was very interesting. Often the exact implementation of a hash function is simply glossed over, which was always quite frustrating. Having a brief overview and explanation of the different parts was greatly illuminating, thanks for making this video!

    • @mCoding
      @mCoding 3 years ago

      You are very welcome! I wanted to go a bit into the details, but it seems that most people found this video too "in the weeds". I'm glad that you appreciated the details!

  • @gareth2021
    @gareth2021 3 years ago +31

    Really great and informative video, however I missed some more information regarding the attacks.
    How exactly do you find two different messages with the same hash? Bruteforcing or reverse engineering the algorithm?

    • @A1rPun
      @A1rPun 3 years ago +3

      You can use rainbow tables for that.

  • @nine9nine99
    @nine9nine99 3 years ago +7

    Great video as always! I was just wondering what the assert statements were doing in the middle of the code. Aren't these supposed to be in the tests?

    • @mCoding
      @mCoding 3 years ago +7

      Those were just meant to help the reader understand the state of the system at those points. In a real library I would take them out or extract them to tests.
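
As a sketch of the kind of inline assert being discussed, here is MD5's padding step with the invariant stated right where it holds. This was written for illustration and is not the code from the video; `md5_pad` is a hypothetical name. The padding rule itself (a `0x80` byte, zero bytes up to 56 mod 64, then the original bit length as a 64-bit little-endian integer) comes from the MD5 specification, RFC 1321.

```python
def md5_pad(message: bytes) -> bytes:
    """Pad a message the way MD5 does before it is split into 64-byte blocks."""
    bit_length = (8 * len(message)) % 2**64  # the length is stored modulo 2**64
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded)) % 64)  # zero-fill to 56 mod 64
    padded += bit_length.to_bytes(8, "little")
    # Documents the state for the reader; a library would move this to a test.
    assert len(padded) % 64 == 0
    return padded


if __name__ == "__main__":
    print(len(md5_pad(b"")), len(md5_pad(b"a" * 56)))
```

Extracted to a test suite, the same check would read `assert len(md5_pad(b"abc")) % 64 == 0`, leaving the function body free of assertions.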

  • @denizberkinmis
    @denizberkinmis 3 years ago +2

    Manim is just perfect for this type of stuff, great work!

  • @georgesanderson918
    @georgesanderson918 3 years ago +2

    I love these videos so much! Best Python youtuber ever!

    • @mCoding
      @mCoding 3 years ago +1

      Wow, thanks!

  • @דניאלאביב-ו6ת
    @דניאלאביב-ו6ת 6 months ago

    Can you make a video about finding a collision in md5?
    Sounds really interesting!

  • @daleryanaldover6545
    @daleryanaldover6545 3 years ago

    This video might not be enjoyable to watch, but I like it more than shallow explanations of things. It gives an in-depth look at how MD5 is structured, which might take a whole lot of time to study on one's own. And yes, a secure algorithm is assumed or hoped to be secure until reality knocks on the door.

  • @fadeoffical_
    @fadeoffical_ 3 years ago +5

    since i am on the discord, i am legally obligated to write a comment containing "discord gang"

    • @mCoding
      @mCoding 3 years ago +2

      Thank you for your participation in the discord gang

    • @fadeoffical_
      @fadeoffical_ 3 years ago +1

      @mCoding -i was forced into it- ye ye.. no problem m8
      /s

  • @JohnWalz97
    @JohnWalz97 3 years ago +1

    MD5 is still convenient and useful for non-crypto-related tasks, like quickly generating IDs or nonces, or anything where you don't really care if it's not secure.

    • @mCoding
      @mCoding 3 years ago +2

      Indeed, for non crypto uses it's totally fine. I think there's something like figuring out hard drive partitions that is a common non crypto use.
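
A minimal sketch of the non-crypto use described in this thread, using the standard library's `hashlib`. The helper name `content_id` is hypothetical, and treating the digest as a cache key or deduplication ID is an assumption about the use case. The `usedforsecurity=False` keyword is available from Python 3.9 onward and simply declares the intent.

```python
import hashlib


def content_id(data: bytes) -> str:
    """Hypothetical helper: a short, stable ID for non-security uses such as cache keys."""
    # usedforsecurity=False (Python 3.9+) declares that this is not a security use.
    return hashlib.md5(data, usedforsecurity=False).hexdigest()


if __name__ == "__main__":
    print(content_id(b"hello"))  # 5d41402abc4b2a76b9719d911017c592
```

Because collisions can be constructed on purpose, this only fits situations where nobody gains anything by making two inputs share an ID.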

  • @amidfallen
    @amidfallen 3 years ago

    Would be nice to have video about incorporation of hashing for authentication in backend application :)

  • @staywithmesenpai
    @staywithmesenpai 3 years ago +8

    Discord gang

    • @mCoding
      @mCoding 3 years ago +4

      Thanks for your rapid support! :)

  • @vishalmishra3046
    @vishalmishra3046 3 years ago +1

    Cryptography algorithms get old with time and need to be replaced with modern algorithms, e.g. BLAKE3 is faster than MD5, more secure than SHA-2 and SHA-3, and most libraries use the type-safe Rust implementation. Migration to modern crypto is perceived as more expensive than the cost of all the data breaches over the past few years. Industry needs to change that.

  • @drygordspellweaver8761
    @drygordspellweaver8761 9 months ago

    So if I know a 128-bit MD5 origin and a 128-bit checksum, is it possible to find the original seed phrase being used?

  • @OldestHouse
    @OldestHouse 3 years ago +4

    can you explain to me why in EVERY video you make you always have if __name__ == '__main__'?? are you using the code somewhere else or are you just 'flexing'?

    • @alexismandelias
      @alexismandelias 3 years ago +5

      Mainly for code organisation and consistency

    • @sniggleboots
      @sniggleboots 3 years ago +1

      I think it's so the code doesn't run when it is imported into a different file

    • @nevermind2521
      @nevermind2521 3 years ago

      if __name__ == '__main__' is used so that when you import the file into another file, it's not going to immediately run the code.

    • @porcorosso4330
      @porcorosso4330 3 years ago

      @nevermind2521
      Does it not run any of the code? Or just what's inside the if condition?

    • @nevermind2521
      @nevermind2521 3 years ago

      @porcorosso4330 what? if __name__ == '__main__' gives you the option to run or not run the chunk of code when imported into another Python file.
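
To the question asked in this thread: importing a module runs all of its top-level code, and only the body of the guard is skipped. A minimal sketch, assuming a hypothetical file named `md5_demo.py`; the function names are illustrative as well.

```python
# md5_demo.py (hypothetical file name)
import hashlib


# Top-level statements such as this def always execute,
# both on import and when the file is run directly.
def hexdigest(text: str) -> str:
    return hashlib.md5(text.encode()).hexdigest()


def main() -> None:
    print(hexdigest("hello"))


# __name__ is "__main__" only when the file is run as a script
# (python md5_demo.py). Under `import md5_demo` it is "md5_demo",
# so main() is not called and nothing is printed.
if __name__ == "__main__":
    main()
```

Another file can then do `from md5_demo import hexdigest` and reuse the function without triggering the demo output.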

  • @ghouldrago360
    @ghouldrago360 3 years ago +1

    YouTube recommendations gang

  • @rakeeb1395
    @rakeeb1395 1 year ago

    Can you explain to me how to re-encode the md5# through the spectrum of the OC3 optical line

  • @ShaunPatterson
    @ShaunPatterson 3 years ago +9

    Non-discord gang

  • @hakoo2700
    @hakoo2700 3 years ago +2

    tnx! again >_

    • @mCoding
      @mCoding 3 years ago

      No problem 😊

  • @byronwatkins2565
    @byronwatkins2565 3 years ago

    Hash algorithms are inherently many to one mappings. There are far more ways to arrange even 1000 bits than there are 512 bits, so it is not entirely surprising that someone found two of those many inputs that yield the same output. You seem to be saying that MD5 is somehow susceptible to this prefix attack but you are not at all clear about why.

    • @mCoding
      @mCoding 3 years ago +2

      Just because a mapping is many to one does not mean it is feasible to actually find two inputs with the same output. For secure hash functions it should not be possible to do so in, say, the amount of time from the big bang until the heat death of the universe. MD5 has clearly failed at this property given the collision I showed. Indeed, it is also susceptible to a much more real world chosen prefix attack, though, as I mentioned in another comment, there is little explanation for "why" it is vulnerable to such an attack except that you can run the attack and it works.

    • @byronwatkins2565
      @byronwatkins2565 3 years ago

      @mCoding Perhaps you could demonstrate how to "run the attack."

  • @PierreSoubourou
    @PierreSoubourou 3 years ago

    almost unreadable on smartphone, although there is some space in the editor :-/

  • @mxskll
    @mxskll 3 years ago +1

    Am I the only one that thinks they sound like the fireship.io person? Haha it must be you! But in a less rushed mode. Great video!

    • @mCoding
      @mCoding 3 years ago

      My voice is pretty generic 😅

  • @pier-oliviermarquis3006
    @pier-oliviermarquis3006 2 years ago

    man, this is way too advanced for me

  • @pilyotuts
    @pilyotuts 2 years ago

    you have tutorial on getting seed of sha256?

  • @muhammad-k3d1j
    @muhammad-k3d1j 1 year ago

    how to print the result?

  • @1Crivella
    @1Crivella 2 years ago

    The launch the nukes part didn't age well

    • @mCoding
      @mCoding 2 years ago

      That segment is not meant to be a joke, and on the contrary it is more relevant today than ever. Hash functions are used to ensure the integrity of messages, including messages over the internet as well as within many critical systems. It has been a longstanding fear of academics as well as practitioners since the development of hash functions for this purpose that something like a chosen prefix attack could be used to fake messages from military commanders and cause devastation. After all, a launch operator could very well be in a different room than the decision-makers and receive orders by computer. In short, hash functions are very, very important, and so is understanding when they are broken.
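
A sketch of the message-integrity check described above, using only the standard library. Note the swap: this uses a keyed HMAC over SHA-256 rather than a bare hash, because a bare hash lets anyone who alters the message recompute a matching digest. The shared key, the function names `sign` and `verify`, and the sample messages are assumptions for illustration.

```python
import hashlib
import hmac


def sign(key: bytes, message: bytes) -> str:
    """Return an HMAC-SHA256 tag that travels alongside the message."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(key: bytes, message: bytes, tag: str) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(sign(key, message), tag)


if __name__ == "__main__":
    shared_key = b"example-key-not-for-real-use"  # placeholder assumption
    order = b"stand down"
    tag = sign(shared_key, order)
    print(verify(shared_key, order, tag))      # True
    print(verify(shared_key, b"launch", tag))  # False
```

The receiver accepts a message only if `verify` returns `True`, so an altered message is rejected unless the attacker also knows the key.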

  • @multiplysixbynine
    @multiplysixbynine 3 years ago

    Not bad. Would have been nice to hear a discussion of how the weaknesses were discovered and characteristics of the algorithm they exploited.

    • @voxelfusion9894
      @voxelfusion9894 3 years ago

      It really was just trial and error. No fancy mathematics, as disappointing as that is.

  • @vanish3408
    @vanish3408 3 years ago

    What does 2**123 theoretical time mean?

    • @mCoding
      @mCoding 3 years ago +1

      It means theoretically you could perform the attack using 2**123 applications of the md5 compress.

    • @vanish3408
      @vanish3408 3 years ago

      @mCoding Thanks, also awesome video!
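
To give "2**123 applications of the md5 compress" a sense of scale, the arithmetic can be sketched as below. The rate of 10**12 compress calls per second is an assumption picked purely for illustration, not a benchmark of any real hardware.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def years_needed(operations: int, ops_per_second: float) -> float:
    """Years to perform `operations` steps at a constant assumed rate."""
    return operations / ops_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    assumed_rate = 1e12  # hypothetical: 10**12 MD5 compress calls per second
    print(f"{years_needed(2**123, assumed_rate):.2e} years")
```

At that assumed rate the result is on the order of 10**17 years, which is why a 2**123 attack counts as a theoretical break only.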

  • @itzblinkzy1728
    @itzblinkzy1728 3 years ago

    Nice vid

  • @SiddharthGawande-bd8do
    @SiddharthGawande-bd8do 5 months ago

    u sound like fireship

  • @lizard450
    @lizard450 3 years ago

    Didn't even go over the google attack... For shame

    • @mCoding
      @mCoding 3 years ago

      I know! I actually had about 2 hours of footage that I cut down to the video you see. Many things get cut out in that process and the attack you mention was one of them. I decided that it was too similar to the more recent and real world applicable code signing attack that I did mention. But I encourage anyone interested to read the md5 wiki page for more details!

  • @КириллКириллович
    @КириллКириллович 3 years ago

    argon2-id forever

  • @axelanderson2030
    @axelanderson2030 1 year ago

    a hashing algo is something you definitely do not implement in python lmao.

  • @markcuello5
    @markcuello5 2 years ago

    HELP

  • @cyrilsli
    @cyrilsli 3 years ago +2

    third

    • @mCoding
      @mCoding 3 years ago +4

      Respectable, but no prize.

  • @fadeoffical_
    @fadeoffical_ 3 years ago +1

    uwu

  • @vlad1500
    @vlad1500 3 years ago +2

    i don't get why you are explaining each line of code. anyone who remotely understands what you are saying already knows what that code does anyway. just go straight to the point next time.

  • @mr2octavio
    @mr2octavio 3 years ago

    I didn't understand a thing

    • @ciniss
      @ciniss 3 years ago

      Me too... And I feel like "low level" python syntax is kind of confusing

  • @FalseDev
    @FalseDev 3 years ago +9

    Discord gang

    • @mCoding
      @mCoding 3 years ago +5

      Fastest comment in the west!