SHA: Secure Hashing Algorithm - Computerphile
- Published: 4 Jun 2024
- Secure Hashing Algorithm (SHA1) explained. Dr Mike Pound explains how files are used to generate seemingly random hash strings.
EXTRA BITS: • EXTRA BITS - SHA1 Prob...
Tom Scott on Hash Algorithms: • Hashing Algorithms and...
This video was filmed and edited by Sean Riley.
Computer Science at the University of Nottingham: bit.ly/nottscomputer
Computerphile is a sister project to Brady Haran's Numberphile. More at www.bradyharan.com
Mike Pound is by far my favorite person on this channel... he has the most interesting subjects, shines with crazy knowledge while still keeping the video fresh and dynamic.
I like him and his topics too, though the AI topics are interesting and the person explaining them is good too
he has great body language, tries to use it as much as possible
And a fair looker.
And the same accent as the 11th Doctor (Matt Smith)! :-D Where is that accent from?
Absolutely agree, Tom Scott is my second favourite, that guy is hilarious
Love how these videos get STRAIGHT to the point.
This is too much work, can’t we just trust each other?
That, my friend, is the real problem
How can I trust other people when I can't even trust myself
@Mohamed Seid GodisGood666!
Don't trust, verify
No Way!!!
I could sit and watch videos from this guy all day long, so informative and laid back
wrg
Been watching a whole bunch of Mike's videos as a complement to my introductory module on Security and Authentication. One of the best teachers I have come across!
Mike Pound is the best! I love hearing him explain things - keep em coming!
I've been trying to understand the concept for 3 days from the slides my teacher covered and the book she shared, and only ended up more confused. This video gave me a clear understanding in 10 minutes. Great job!
This is my favorite guy on this channel. I just love stuff like this.
I am at a hackathon in Chicago, Illinois, at Illinois Institute of Technology, and I have to use SHA-1 on some facts before I pass them to an API so I can make a project for the hackathon. You did a wonderful job telling me what SHA-1 was so I could understand the cryptic API documentation. Thank you very much.
Thanks, Dr Pound (if you read this). I find your demeanour easy to engage with, and you set me off on the journey of understanding fully (with much work!).
I've always loved your videos and now I study computer science and can watch your videos for studying, it's amazing
Roses are red
Violets are blue
Unexpected { on line 32
coding joke
A poetic compiler? I like that idea
Unresolved external symbol
Felt that on a spiritual level
Violets are blue
Roses are red
Your code isn't thread-safe
Use locks instead
The washing machine example really helped seal in this topic I was trying to understand and helped me on my final project. Thank you!!!
Mike, you are my favourite person to appear on this channel. I enjoy your clear explanations and like the quite recent topics like Google Deep Dream, Dijkstra and so on.
I love this channel so much...
I love these videos when Dr. Mike Pound is in them.
Hmm, so far this is fairly straightforward, but the interesting part would be how exactly these compression functions work. Will there be a follow-up video on that?
In essence, it generates 80 32-bit words derived from the bits of the plaintext. Then, in each round, the state goes through left circular shifts, some XORs, some bitwise ANDs, addition with the round word and round constant, and then a permutation between all the state variables.
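The round structure described in this comment can be sketched in a few lines of Python. This is only an illustrative, unoptimised sketch of SHA-1 for byte-aligned input; the helper name `rotl` and the function name `sha1` are made up for this example, and real code should simply call `hashlib.sha1`.

```python
import hashlib
import struct


def rotl(x, n):
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF


def sha1(message: bytes) -> str:
    # Initial state: the five 32-bit words h0..h4 (the "abcde" start values).
    h = [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0]

    # Padding: a single 1 bit, then zeros, then the 64-bit message length in bits.
    bit_len = len(message) * 8
    message += b"\x80"
    message += b"\x00" * ((56 - len(message) % 64) % 64)
    message += struct.pack(">Q", bit_len)

    for start in range(0, len(message), 64):
        # Message schedule: 16 words from the block, expanded to 80 words.
        w = list(struct.unpack(">16I", message[start:start + 64]))
        for t in range(16, 80):
            w.append(rotl(w[t - 3] ^ w[t - 8] ^ w[t - 14] ^ w[t - 16], 1))

        a, b, c, d, e = h
        for t in range(80):
            if t < 20:
                f, k = (b & c) | (~b & d), 0x5A827999
            elif t < 40:
                f, k = b ^ c ^ d, 0x6ED9EBA1
            elif t < 60:
                f, k = (b & c) | (b & d) | (c & d), 0x8F1BBCDC
            else:
                f, k = b ^ c ^ d, 0xCA62C1D6
            temp = (rotl(a, 5) + f + e + k + w[t]) & 0xFFFFFFFF
            # Shuffle the state words along, rotating b on the way.
            a, b, c, d, e = temp, a, rotl(b, 30), c, d

        # Add the block's result into the running state (mod 2^32).
        h = [(x + y) & 0xFFFFFFFF for x, y in zip(h, (a, b, c, d, e))]

    return "".join(f"{x:08x}" for x in h)


print(sha1(b"abc"))
print(hashlib.sha1(b"abc").hexdigest())  # should print the same digest
```

The 80 iterations of the inner loop are the "washing machine" from the video; each 512-bit block goes through it once, and the five state words carry over to the next block.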
@@liljuan206 thanks, this really helped clearing things up
It isn't compression he is describing; it is hashing, which is not the same thing as encryption. Hashing is what SHA does (note that the S stands for "secure").
@@liljuan206 how do they make it so it can't be reversed?
In essence, SHA-2 uses six primary functions: Choice and Majority, plus S0, S1, E0, and E1 (the small-sigma and big-sigma functions), all of which rotate, shift, and mix bits during compression.
Dr Mike Pound is the best! More videos with him please
I always wondered how these things work. Great video
Thank you so much. I had a hard time finding someone to explain it well
pound for pound Mike pound is the best narrator on computerphile
Re watched it at least 10 times. Thank you for this explanation
Can you talk about the colliding prefix issue? As I understand it, once I find a collision with a file, I can continue to create collisions by appending the same thing to both files, and somehow this allows me to create two meaningful files with the same hash value, where one might expect that any collision would be obviously fake because it would have to be made up of a bunch of random bits.
Love the Schildt on your wall!
I would love to see a video about the compression function! :)
Would you please explain the workings of the "washing machine"? ;-) I.e. the compression functions?
Thanks. I'll give this snippet a look. :-)
What I want to know, for no particular reason, is if there are cases where a hash of a hash equals itself, of course sticking with one particular algorithm and hash length.
easy-going video which explains just enough about SHA algo to keep it simple. The details are better learnt once you "get" the basic idea.
Excellent as usual, good learning resource
SHA Hashing Algorithm?
Secure Hashing Algorithm Hashing Algorithm
ATM Machine
RAS Syndrome
LAN Network
GNU's Not Unix...wait a minute
LCD Display
Note to self: Don't use a regular monitor as a touch screen
It's a university Flatron monitor, probably expendable.
Thought I was following until 9:35
He describes a way of padding that will produce the same padding string for messages with the same length - then says it's important that messages with the same length don't have the same padding string. Did something important end up on the editing room floor?
I'll check with Mike but I think it was just a slip of the tongue - ie The padding would be the same for messages of the same length but the messages would be different if they are different >Sean
No, "0010110" padded would be "0010110100000...", but "001011000" would be "001011000100000...", so the 1 (first bit of padding) would be later.
+Mat2095 He obviously meant if you just pad them with zeros.
My dealer need this.
😂😂😂😂😂
😆
🤣
It would be amazing a video how you can get tracked for example: ip, mac, canvas, hd serial number, etc
Thanks for your great work!
Haven't seen that computer pyjama paper you are writing on in quite a while. Is it still used or is that just redundant stock?
The video's shots are like Modern Family and that makes me happy! Also the information, so thanks!
Thank you very much for this video :) It was very helpful and educational!
I'm confused. What does "abcde" stand for, and why is the loop done 80 times?
And the text is 512 bits long, right? How do I convert it into H0-H4, which is 160 bits in total?
thanks
Actually, that process uses the XOR function. You can look up online how the abcde is changed into a different abcde; it is pretty interesting.
Good job! Your videos are excellent.
How do you know the "1000000..." padding bits are for padding purposes, and not part of the actual data/plaintext itself?
keeps me engaged great explanation
Another video explaining SHA-256 would be awesome.
How does the padding work if a block is 511 bits long?
aullik: Considering almost all real-world data is stored as a stream of bytes (8-bit values), that's incredibly unlikely to ever come up.
It could be 504 bits, but 511 is highly improbable.
If your padding has to add at least 8 bits (one byte), then the thing he described works fine.
Remember working with individual bits is almost unheard of in computing.
If you have to store individual bits for storage efficiency, you pack them into bytes.
(Similarly, if you store 7-bit values, you either store them in 8 bits and ignore a bit, or you pack them such that you store, say, 56-bit blocks: 7 x 8, e.g. 8 sets of 7 bits stored in 7 bytes.)
aullik: Exactly the question that came to my mind too :-) Since there aren't necessarily enough bits left in the block to include the length of the actual message.
You could add another block of 512 bits to the end to make it work.
+KuraIthys
Going with bytes, the longest message that could still be padded would be 496 bits long. 504 wouldn't work as you'd only have 8 bits left but 504 in binary is already 9 bits long.
+Kuralthys
I know that we usually work with bytes. But even if we say we have 512 - 8 = 504 bits, then we add one '1' bit to start the padding and now we only have 7 bits left. The message is 504 bits long, but 7 bits can only hold 128 distinct values.
The only answer is that we expand to 1024 bits. But the question would be how do we expand. What is the "syntax" for the lack of a better word
are the initial values important? any recommended readings on this?
Loved the washing machine demonstration!
Oh nice, string hashing via SHA1 is something I've been interested in.
Thank you! Made hashing much clearer for me now :)
never been this early for a computerphile, dope
Love these videos.
What's amazing is the Tom Scott "rocket" animation didn't show up on a video from Dr. Pound
Nice! Could you make a video about post-quantum cryptography please? It will be a great opportunity to learn more about this stuff
What would be the padding if the final chunk of message is only 502 - 511 bits?
I kinda want to make my own hashing algorithm now. It wouldn't be very good, it would just be some random jostling around of bits until it looks weird.
Elegant explanation. Thank you, Thank you, Thank you 😊👍
That 011001011 he wrote down is actually the start of the SHA hash value for "abd". I wonder if that was intentional, because the odds of that happening randomly are less than one percent.
3:17 And the reasons why the NSA came out with SHA-1 to replace the earlier SHA-0 (or just plain “SHA”) were not revealed publicly. But the weaknesses in the original SHA were discovered independently a few years later. This was part of a sequence of evidence indicating that the gap between public, unclassified crypto technology and what the NSA has was narrowing, and may not be significant any more.
I think it's widening because look at Pegasus and with Pegasus 2.0 you only need phone number to target a victim.
And, Pegasus is joint project between Israel and USA. Imagine what NSA would have kept to themselves.
It is a common understanding in the computer security field that if the government wants you, they have you.
So can two different strings produce the same result after going through the hashing function?
I know you're not 'languagephile', but is there a real reason for nought and zero being so stark in contrast?
Also: if you have a message between 502 and 511 bits (inclusive), the padding would try to tack on 10 extra bits. How is that resolved? (10 bits because 1, then the number of bits, which is 9 in length.)
Can u explain also the "Bundestrojaner"? #Backdoor:W32/R2D2.A #Staatstrojaner #mfc42ul.dll
What happens if your message is, say, 509 bits in length? How do you pad it if the length won't fit?
Excellent, finally a video with subtitles :)
Is that U of Nottingham cup supposed to be some kind of product placement? It's like the camera is trying to keep it in frame, and it doesn't even look like it has been drunk out of. Also, cool Rubik's cubes on the shelf.
Given the whole of Computerphile is to some extent an endorsement of the University of Nottingham it seems unlikely, or at least unnecessary. More likely it happened to be part of the initial framing shot the camera operator wanted to avoid drifting from too much.
I've always wondered what those books are. Would someone please tell me the names of the books on the shelf and their authors?
Isn't padding used even if the message is already a multiple of 512 bits, to avoid attacks?
The thumbnail made me think "OSHA" with the O as Dr Pound's head.
Hi, please explain how you get the new A B C D E. When you put 512 bits in with the initial A B C D E, you get a new 512 bits, is that right?
Thanks for the video :-). Maybe someone can help me with this question: what determines the resulting hash? On the one hand it is totally random; on the other hand it is consistent. Is it a super huge, complex formula, so that it is better to guess randomly than to solve the formula? Or is the NSA the only one who has the formula?
Can you do one of these for bcrypt as well?
This was very informative!
Question: Is there any significance to the initialization constants
h0 = 0x67452301
h1 = 0xEFCDAB89
h2 = 0x98BADCFE
h3 = 0x10325476
h4 = 0xC3D2E1F0
Or are they chosen "randomly"?
Thanks!
No, they could be any numbers. But the cryptographic community is very sceptical of numbers that come out of nowhere.
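As a small illustration of why these particular constants tend not to raise suspicion, the sketch below prints each word's bytes in little-endian order, which shows that they are just the hex digits counted up and then down. The helper name `little_endian_hex` is made up for this example.

```python
# SHA-1 initial state words, as quoted in the question above.
H = [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0]


def little_endian_hex(word: int) -> str:
    """Return the four bytes of a 32-bit word, least significant byte first."""
    return word.to_bytes(4, "little").hex()


patterns = [little_endian_hex(h) for h in H]
print(patterns)
# ['01234567', '89abcdef', 'fedcba98', '76543210', 'f0e1d2c3']
```

A pattern this plain leaves little room for the designer to have hidden anything in the choice, which is the point of the scepticism mentioned in the reply.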
It'd be amazing to see Dr.Pound reviewing some books from his collection. Get to know his technical interests apart from image analysis.
Superb video! Understood it even better with a lefty teaching me ;)
What's to stop someone from precomputing all of the possible hashes and saving them to a file that can be read as an array, then saving the inputs that were hashed to a different file in the same order? When someone wants the reverse of a hash, open the first file and look up the position of the hash within it, then look up that same position in the un-hashed file.
or is it faster to just generate all possible combinations on-the-fly until finding a hash that matches.
That is actually a possibility, called a rainbow table.
One way around it is to use a salt: when a user first creates an account, you generate a random string of characters, append it to the password and then hash it. The random string is stored in your db alongside the hash.
This also makes it so you have to crack each user's password individually.
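The salting scheme described in this reply might look roughly like the sketch below. It assumes SHA-256 and a hex salt appended to the password, as the comment describes; the function names `hash_password` and `check_password` are hypothetical.

```python
import hashlib
import hmac
import secrets


def hash_password(password: str) -> tuple[str, str]:
    """Return (salt, digest); both would be stored in the database."""
    salt = secrets.token_hex(16)  # fresh random salt per user
    digest = hashlib.sha256((password + salt).encode("utf-8")).hexdigest()
    return salt, digest


def check_password(password: str, salt: str, expected: str) -> bool:
    """Re-hash the attempt with the stored salt and compare."""
    candidate = hashlib.sha256((password + salt).encode("utf-8")).hexdigest()
    return hmac.compare_digest(candidate, expected)


salt, digest = hash_password("correct horse")
print(check_password("correct horse", salt, digest))  # True
print(check_password("wrong horse", salt, digest))    # False
```

Because every user gets a different salt, a single precomputed table no longer covers the whole database. In practice a deliberately slow function such as `hashlib.pbkdf2_hmac` or `hashlib.scrypt` is usually preferred over a single fast hash.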
I feel like a genius learning everything here!
So a hash function can protect against doctoring a message.
How do you prevent the insertion or deletion of a message in a stream of messages? Each can be hashed, but you could create a new message, hash it, send it, and it's deemed good.
Do you have a secure cryptographic sequence number that can be embedded in any way?
"How do you prevent the insertion or deletion of a message in stream of messages?"
Before sha'ing you just append a shared secret. That way someone intercepting the message on route won't be able to produce a valid hash for an altered message. The recipient verifies the integrity of the message by sha'ing the message with the shared secret appended to it.
"Do you have a secure cryptographic sequence number than can be embedded in any way?"
If you mean some "sequence" number that appears to change randomly from one message to another, yet is known/anticipated by the recipient, then that's basically their shared secret, except it's not static.
However, in this scenario getting out of sync would mean that all the following messages would fail their integrity checks, until some sort of reset. That makes it trivial to do a DoS attack on the protocol/exchange. One common way to counter this is to reset every minute or two, but then the communication would have to be (close to) real-time.
Such a sequence can be any sufficiently random pseudo-random number generator sequence.
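The shared-secret idea from this reply can be sketched as follows. Note the swap: instead of hashing the message with the secret appended, this uses HMAC from Python's standard `hmac` module, which is the standard keyed-hash construction, and a plain incrementing counter stands in for the sequence number. The names `SECRET`, `tag` and `verify` are hypothetical.

```python
import hashlib
import hmac

SECRET = b"example shared secret"  # hypothetical; agreed out of band


def tag(seq: int, message: bytes, secret: bytes = SECRET) -> str:
    """Authenticate the message together with its position in the stream."""
    data = seq.to_bytes(8, "big") + message
    return hmac.new(secret, data, hashlib.sha256).hexdigest()


def verify(seq: int, message: bytes, received_tag: str,
           secret: bytes = SECRET) -> bool:
    """Recompute the tag for the expected sequence number and compare."""
    return hmac.compare_digest(tag(seq, message, secret), received_tag)


t = tag(1, b"pay 10")
print(verify(1, b"pay 10", t))  # True: untouched and in order
print(verify(1, b"pay 99", t))  # False: message was altered
print(verify(2, b"pay 10", t))  # False: message is out of sequence
```

An attacker without the secret cannot produce a valid tag for a new message, and a deleted or replayed message shows up as a sequence-number mismatch.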
You explained everything except for the part that actually matters. :(
You may as well have said, sha works by shaing things.
Exactly my thought :/
That they explain complicated things in an easier to understand manner. Sorta like every other video they make.
Ah, I see now...it's a washing machine with some knobs that does the sha'ing.
The compression function of SHA is where it gets quite complicated, and I don't think it would've fit into the scope of one video, as explaining it to someone with no prior knowledge isn't trivial, there's quite a bit of complicated math involved, and very few people actually understand the details of it.
YES exactly this..
9:49 captions about Merkle-Damgard Construction are hilarious
Really interesting videos !
I remember when SHA1 was actually still secure, and people could get away with MD5 (although it was starting to be frowned upon). Now I feel old.
Apple once tried to get away with MD4.
what happens if I feed 511 bits? it's not a multiple for 512 but the space left is too short to save the length
Since SHA is deterministic, even though it is non-reversible, it is still possible to guess the hashes of some reasonably short messages. For example, string 'abc' ALWAYS produces ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad. If I have a large enough database plus computational power, I could probably guess some short messages, although not the entire novel.
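A minimal sketch of that guessing approach: precompute the SHA-256 digests of every lowercase string up to three letters, then look up the digest quoted above. The function name `build_table` is made up for this example, and the three-letter limit is only there to keep it quick.

```python
import hashlib
from itertools import product
from string import ascii_lowercase


def build_table(max_len: int) -> dict[str, str]:
    """Map SHA-256 hex digest -> original string for all short lowercase words."""
    table = {}
    for n in range(1, max_len + 1):
        for letters in product(ascii_lowercase, repeat=n):
            word = "".join(letters)
            table[hashlib.sha256(word.encode()).hexdigest()] = word
    return table


table = build_table(3)  # 26 + 26**2 + 26**3 = 18278 entries
target = "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"
print(table.get(target))  # abc
```

The table grows by a factor of 26 with every extra letter, which is why this works for short messages but not for an entire novel.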
That's exactly how most cracking is done. Hashed database against hashed database lol
awesome awesome awesome great explanation! ty
Isn't it unsafe to have a padding scheme that leads to pre-image collision? E.g., h(msg) = h(pad(msg)).
So basically it's a randomization function that is seeded with the data you give it, right?
Is it possible to superpose pseudo random number generators to increase the levels of randomness?
tell me which sha to use when finding duplicate files
9:40 I didn't quite understand how that padding scheme guarantees that messages with the same size would not share the same padding.
Mike is the best
So the padding is only denoted by the last one with a trail of zeroes and a length at the end? That is not a prefix and without some other way of indicating that padding is present it is indistinguishable from data.
After a quick google search it appears that the padding is always present so it doesn't need to be a prefix.
Really great! Thanks a lot
Thank you computerphile:-)...
What if the message has 159 bits? How can you add the padding with its length if you just have one available bit to do the padding?
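Then you pad up to 512 bits. The block size is 512 bits (160 bits is only the size of the output), so a 159-bit message leaves plenty of room for the padding and the length.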
I'm curious about the books on the shelf whose titles I can't read. They are the 4th, 6th,7th, and 11th books from the left. I don't think I care so much about the 12th book from the left. Does anyone know the titles of those books? I think I want those books.
So SHA-256 + salt is better for hashing, right?
what books are on the shelf?
What if the message is only a few bits shy of a block, not enough room for padding bits as described?
If there are fewer than 65 bits of space left in the final block for padding, you just pad into an extra block. For example, if your message is 480 bits, you add a one-bit, 479 zero-bits, and the 64-bit length, giving a total length of 1024 bits = 2 blocks.
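The arithmetic in this answer can be checked with a short sketch. It assumes 512-bit blocks and a 64-bit length field, as in SHA-1; the function names `padded_bit_length` and `zero_bits` are made up for this example.

```python
def padded_bit_length(message_bits: int, block: int = 512,
                      length_field: int = 64) -> int:
    """Total length after padding: message + '1' bit + zeros + length field."""
    needed = message_bits + 1 + length_field
    blocks = -(-needed // block)  # ceiling division
    return blocks * block


def zero_bits(message_bits: int, block: int = 512,
              length_field: int = 64) -> int:
    """Number of zero bits between the '1' bit and the length field."""
    total = padded_bit_length(message_bits, block, length_field)
    return total - message_bits - 1 - length_field


for n in (0, 159, 447, 448, 480, 511, 512):
    print(n, padded_bit_length(n), zero_bits(n))
```

For 480 bits this gives 1024 bits in total with 479 zero bits, matching the example above. A message of exactly 512 bits also pads out to 1024 bits, because the padding is always applied.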
Matthijs van Duin thanks
What happens if a message is smaller than 512 bits but long enough for the padding part to not have any space left to store the length of the message?
Then you pad to 1024 bits (including the message length).
What happens if I want to hash a message that looks exactly the same like the padded message? would it produce the same hash?
It wouldn't, because (I could be wrong) if it's a multiple of 512 bits, it's padded towards another block. So if your message is 512 bits, it's made into 1024 bits, therefore making it impossible to get the same message if it's equivalent to a padded message.
5:50 summarised the subject in 1 sentence ;-)
@5:21 "We might talk about that in a bit", proceeds to encrypt that bit in sha and turns it to 160 bits
How would the padding work if the final block of the message was long enough that you don't have enough padding room to state the number of bits in the message? So if the final block contained 510 bits, you would have to pad in 9 bits (111111110) to say that the message is 510 bits, but you would end up with more than 512 bits.
The length field has a fixed size (which is large enough) and is not optional. The length of the 10...0 run is chosen to account for the size of the length field, i.e. the padding can spill over into the next block if required.