"lets move into the office"
gotta show this b-roll of ducks first
From a programmer's perspective it's astounding that the memcpy part of the code was peer reviewed and passed all the checks without anyone thinking "But what if someone sends a length that is greater than the actual payload?". Also, whoever wrote that file needs to read up on variable naming. bp, pl, p, etc. Jeez.
Great video though, thanks for uploading!
I'm not like a pro yet, but from my experience some complex or more secure apps do have variables named like this. Idk, I'd guess it's security over readability maybe? And I'll say it's always easier to understand a vulnerability after it happens rather than before. Seems so simple to us, but who knows what they were thinking. Or maybe the complexity with the variables actually caused the issue lol
@@patrickconrad396 Security by obscurity isn't really security.
It's probably because for people who write this kind of code, it's kinda obvious.
p = pointer, bp = buffer pointer, pl = payload length
But I also don't like those short names.
@@mutzikatzi1 Totally agree
Yes. We are never supposed to trust the client
Great video. XKCD has a nice comic briefly explaining what the bug is (great for your non-tech friends), but this video goes just a little further in explaining how it works.
Fantastic work as always. Nice clear explanation of a fairly important subject.
An excellent look at Heartbleed and the nature of security bugs in-general.
I would really love to see more code reviews here. This is great stuff!
I'm not a programmer, but I can see how coding something which essentially completely trusts the data sent by the client to fit a format, without validating it, is a bad idea...
I found that to be the case across most of the web. That's why the MySpace worms broke out. With websites trusting whatever they're sent, you can do SQL injection and XSS.
You are, indeed, correct. It is always best practice to check that an email field fits the pattern *@*.*, that a password field is at least 6 characters long, or, if you're querying a database, that your result has more than 0 rows. Not only does it prevent unforeseen error messages, it prevents exploits such as this one.
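For instance, a couple of trivial checks of that sort, sketched in C (purely illustrative; real email validation is far messier than *@*.*):

#include <stdbool.h>
#include <string.h>

/* Very rough *@*.* shape check -- illustration only, not RFC-compliant. */
static bool looks_like_email(const char *s)
{
    const char *at = strchr(s, '@');
    return at != NULL && at != s && strchr(at + 1, '.') != NULL;
}

/* Password length check of the kind described above. */
static bool password_long_enough(const char *password)
{
    return strlen(password) >= 6;
}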
As a programmer I am deeply baffled how one could make this kind of error - the level of absolute incompetence is just staggering (programmer/s + QA). It is not even hidden under layers of other code! No validation of external data in security-critical code!?
Amazing.
You are completely right, but as a programmer I want at least to explain how bugs like this can occur:
If you are writing several thousand lines of code, it is rather likely that you forget a check on the data at one point or another. And it's even more likely for something like this to happen if you are coding protocols. (Network protocols usually need to be as performance-efficient as possible, so you try to accomplish your goal with as few lines of code as you can.)
This is literally the first lesson we learned in computer science classes beyond the basic "Intro to Programming" course; namely, don't trust the end-user. Assume they are either 1) a complete idiot who won't use the software correctly or 2) a malicious user who will exploit your program if possible. NEVER EVER trust data sent from a user without performing sanity checks and validating it
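To make that concrete, here's a minimal sketch in C of the kind of sanity check being described (the function and variable names are made up for illustration, not taken from OpenSSL): never copy based on a length the peer claims, only on what you actually received.

#include <stddef.h>
#include <string.h>

/* Hypothetical handler: 'claimed_len' is what the client says it sent,
 * 'received_len' is how many payload bytes actually arrived. */
int echo_payload(unsigned char *out, size_t out_size,
                 const unsigned char *payload, size_t received_len,
                 size_t claimed_len)
{
    /* Refuse the request if the claimed length exceeds what we really
     * got (or what our output buffer can hold). A mismatch means a
     * malformed or malicious packet, so drop it rather than guess. */
    if (claimed_len > received_len || claimed_len > out_size)
        return -1;

    memcpy(out, payload, claimed_len);   /* now provably in bounds */
    return 0;
}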
Amazing how this wasn't spotted much earlier
CelmorSmith
I believe it was purposely put in there at the behest of a government agency. It seems like a very obvious mistake. This is a first-year-university-level logic mistake, like a situation where the lecturer makes very elementary flaws in the code and students are given 15 minutes to correct them. As another poster mentioned, even someone without a programming background could see the inherent logic flaw. That is, trusting data sent WITHOUT VERIFYING IT. This is utterly unheard of in any programming practice.
So for this to have escaped professionals designing security... is highly suspect, to say the least. I think you have to include more people than we think in the "bad guys" group, unfortunately.
Some of those who run forces are the same who burn crosses ~ RATM
It has been known about for years. As with lots of bugs, academics and industry experts are aware of many of these, but it is simply too costly or not seen as worth fixing unless there is a known or presumed risk. You must remember that the majority of the population are extremely lazy and uneducated in the ways that computers work - and really, that is how security is maintained.
Very good explanation! I have seen lots of people try to explain this, and this is by far the easiest to understand for someone unfamiliar with SSL or C
One of your best videos yet Brady
A good explanation of the "heartbeat bug" and why it's so dangerous. I'm surprised that it lasted in the wild so long!
Best computerphile video to this date
I hope nowadays C programmers have learned to create understandable names for functions and members :|
***** you mean lpfstrHW doesn't tell you anything? ;)
lp from string ...hardware?
+Felype Rennan Nope.
I agree that Java can't contain C code, but C# allows for unsafe native code, yes, usage of native libraries and there is C++/CLI as well.
And naming conventions: they could have named things well in the C standard libraries, like the Pascal guys used to do, but they just chose not to.
+CaptainDuckman Hungarian Notation, the idea is that you include the type of every variable in its name. It makes it more obvious if you are using the wrong type.
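For anyone who hasn't seen it, a made-up example of what Hungarian-style names look like in C, next to plain descriptive names:

/* Hungarian notation: the prefix encodes the type. */
unsigned long  dwTimeout;     /* dw   = DWORD (32-bit unsigned)                 */
char          *lpszHostName;  /* lpsz = long pointer to a NUL-terminated string */
int            nRetryCount;   /* n    = integer count                           */

/* The same variables with plain descriptive names instead: */
unsigned long  timeout_ms;
char          *host_name;
int            retry_count;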
Tom did a great job of explaining this, I feel. But I guess some people are looking for more detailed stuff. Crazy how bugs like this are still getting through...
Despicable that bugs like this are getting through in the very part of the system designed to be extra secure.
Great explanation of the heartbleed bug!
Really good. Thank you.
I'd love to see the fix - the checks they added at 7:15. Or at least what type of things can be done.
Checking whether the payload is the length specified by the user would suffice.
Something like "if(payloadLength == payload.Length)" (but I'm not a C programmer) would be enough if the container has that method. And finding out the length would be easier with that method anyway.
Make a video about multi-core CPUs and the benefit of 64-bit architectures. I was wondering: if 8 bits were enough for instruction sets back in the day, what do we do with the 56 extra bits? Then I realized maybe it's for sending multiple instructions at once per processor core. So yeah, videos about processor architectures.
Dr. Bagley's shirts are fly as shit.
Very good reminder of how important it is to be defensive about your programming, especially in unsafe languages like C!
Well explained, Computerphile!
Best video of your channel! Keep them coming!
I gave an architectural talk about this building here in Germany! :) Nice to see it again so randomly.
Learning how Heartbleed makes the server send in random memory contents made me laugh so hard...
Fantastic video, thanks Dr. Bagley and Computerphile!
great explanation of heartbleed.
Interesting. Nice to see why there was so much noise about this online. Part of me wants to face-palm at this, but it's really quite a simple mistake to make.
"We're not going to give you the link for the exploit" - no, but you did tell us about it, and now all we need to do is search for it and we'll find it in 0.45 seconds.
Where was the opening filmed? It's beautiful
That's the University of Nottingham Jubilee Campus, home to their Computer Science building :) >Sean
***** Thanks Brady, I'll have to check out that campus!
+Sean Haggard Looks a lot like York's new place. Very similar to Nott's obviously.
This is disturbingly easy. How could it have gone unnoticed for such a long time?
This reminds me thematically of the RSA bug half a year ago... What I still don't understand with the Heartbleed bug, though, is why it is necessary to tell the server how long the message is. Can't it determine the length of the message on the basis of the message itself? I mean, C has been used for high-precision scientific computations in applied mathematics for decades, but it can't count how many bytes a message has? ò.Ô
Strings don't have a length parameter.
Say the next 6 letters: Badeth haha
Would be the same as
Say this: Badeth
@@natnew32 Yes, and strings aren't even a data type in C; they're just arrays of characters.
Short answer: No.
Long answer: The computer has no way of telling where an arbitrary sequence ends, unless it uses some sort of terminator value or a predefined size placed in front of the sequence.
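Both conventions show up in C, and a raw network buffer has neither unless the protocol supplies it. A small sketch (all names made up):

#include <stdio.h>
#include <string.h>

int main(void)
{
    /* 1) Terminator convention: a C string ends at the first NUL byte,
     *    so strlen() can walk along and count the bytes. */
    const char *greeting = "hello";
    printf("strlen sees %zu bytes\n", strlen(greeting));      /* prints 5 */

    /* 2) Length-prefix convention: raw bytes carry no terminator, so the
     *    sender states the length up front -- here as 2 big-endian bytes. */
    const unsigned char packet[] = { 0x00, 0x05, 'h', 'e', 'l', 'l', 'o' };
    unsigned int declared_len = (packet[0] << 8) | packet[1];
    printf("declared length: %u bytes\n", declared_len);      /* prints 5 */

    /* If the sender lies in that length field, nothing in the bytes
     * themselves gives it away; the receiver has to compare it against
     * how much data actually arrived. */
    return 0;
}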
Clearly there is a way to tell the actual size of the payload, since it was needed to apply the patch. The entire issue was caused by the code not checking whether the actual length of the payload matches the integer value provided by the client.
Shouldn't you link to the XKCD explanation? It's ingenious.
Beautiful, thanks. We need more videos from this gentleman.
I had never even heard of this bug before. Funny thing is, I saw the bug before he described it. See, this is why I would be really reluctant to write security code that messes around with memory like that. It's amazingly easy to mess it up when you don't have type protection. But I guess it's pretty easy to mess up even if you do, sometimes.
Another reason to always memset any temporary buffers in memory containing passwords/keys after you're finished using them. This includes local function variables allocated on the stack before you return.
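Worth adding that a plain memset on a buffer that's about to go out of scope can legally be optimized away, so in practice you want a scrubbing helper. A minimal sketch of one common approach (writing through a volatile function pointer); dedicated helpers such as explicit_bzero or OPENSSL_cleanse do the same job where they're available:

#include <stddef.h>
#include <string.h>

/* Calling memset through a volatile pointer stops the compiler from
 * proving the write is "dead" and deleting it. */
static void *(*const volatile force_memset)(void *, int, size_t) = memset;

static void scrub(void *buf, size_t len)
{
    force_memset(buf, 0, len);
}

void do_crypto_thing(void)
{
    unsigned char key[32];

    /* ... derive and use the key here ... */

    scrub(key, sizeof key);   /* wipe it before the stack frame is reused */
}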
How many processor cycles would that take, if you did it to every variable in your code?
The best explanation of Heartbleed I've found. Thank you very much!
Nice video. Great explanation.
Thank you for not dumbing it down! :D
Great explanation.. Clears it all up for me.. Thanks
Thanks for explaining this. I looked at some of the code to test for the heart-bleed bug but not knowing the server side code meant I was unsure why this happened.
Nice clear explanation and maybe we could have a video on networks and network protocols. By networks I mean like tor etc. and not just here is a star network and here is a bus network etc.
Thanks, very interesting to see an explanation of the code!
Absolutely loved your explanation.
please provide subtitles.
best content.
great explanation of the heartbleed bug
Very nice look at how it works
How did this pass testing? Giving a different payload size than the real one is something very basic, it's so weird it sounds intentional.
AWESOME video by the way, thank you!
Crazy bug! What gets me the most is how chronically underfunded OpenSSL apparently was. At least people are pitching in now. Hopefully other important open source projects won't have to go through that.
Wow, I get it now! Great explanation!
Ooooh! Very nice Ataris in the background! Cool! :D
Another great video. Where can I get a link to the code that Computerphile won't give us? For educational purposes, of course.
Thank you for that explanation. Helped me a lot.
I had no idea what this was about until 7 minutes in lol
Great video always amazing to see the exploits being exploited in action ;)
Brilliant as always.
My father was telling me that the company he worked for knew about this bug for several years but they only fixed it now when it was discovered by hackers.
That's a cool looking area! Where was this shot?
great explanation, didn't expect it to be that good :)
that was really informative and excellently explained!
Give me the 500 letters of "Tom has a cat": "Tom has a cat (other unrelated information)"
Thanks for this! I was wondering how the bug worked
Thank you for explaining, very interesting stuff and great video!
Very interesting stuff!
Nothing beats XKCD's explanation.
Brilliant video, I heard this on the news and wanted to find out how it actually worked
I like the scene with the Ducks.
The font is so lovely! Is it comic sans?
Thanks Steven Bagley.
The lack of ASLR on *BSD and the weak version in Linux also, I think, make this attack more successful. If not, please correct me.
Not really - ASLR doesn't help you in this instance. Even though the OS gives you memory pages with "random" starting addresses, you still get ~4 KB per page. That is, however, much more than a (typical) single variable needs, so you end up storing more than one variable per page. And this again is done sequentially, so the probability of reading actual data via this bug is pretty much the same with or without ASLR ;)
This has nothing to do with it: Address Space Layout Randomization randomizes the loading address of the program and its dynamic libraries, so that it's very difficult (almost impossible) to write shellcode to exploit a vulnerable program.
Heartbleed doesn't inject shellcode; it just tricks the vulnerable client/server into sending what it has in its writeable memory.
OpenBSD was actually the first mainstream operating system to integrate ASLR and activate it by default. libc support for ASLR doesn't help with this bug because of OpenSSL's use of an internal malloc.
Great video although I'm still trying to figure out the purpose of the printed code. Just something to give a visual?
was waiting for this!!!
Thanks for the info :)
I'm a programmer. I know that programmers make mistakes; it's pretty much unavoidable. A mistake like this is so incredibly easy to make, and when you're working on a piece of code that a percentage of the world's servers will be relying on to keep data secure, the cost of those mistakes is extreme. I pity the programmer(s) who made this mistake.
It is OpenSSL
Honestly, I don't, and I don't trust OpenSSL anymore if only one programmer wrote the code and checked its behaviour against just two scenarios: the right user input and the wrong user input.
Not a programmer, but that block of code about the unchecked payload seems easy for a programmer to understand. Was the exploit there for a long time?
Even if Heartbleed had never happened, do you guys change your password every once in a while? Like every half year or so. Most of the people I know don't change their password. Is it necessary to change it once in a while?
Great explanation!
Nice! Seeing the bug in action makes the news story way more interesting. TV stations, take note of this!
The bracket style is making me twitch. Let the holy war commence.
Why didn't the memcpy cause a segmentation fault when asked for more memory than the variable held? I suppose OpenSSL has to be running their own memory manager that allowed for this segmentation violation.
A segmentation fault happens when a process tries to access memory that doesn't belong to its accessible memory area, or "segment". The operating system catches this kind of error because it keeps track of which area belongs to which process. However, it doesn't, and couldn't in principle, "micromanage" whether the accessed memory belongs to a certain _variable_ or not.
Sorry for the ambiguity. I meant to say, "Why didn't the memcpy EVER cause a segmentation fault when asked for more memory than the variable held?" It seems that the bug would have been caught much earlier if segfaults occurred during malicious actions.
With the default malloc, the variable would eventually be randomly assigned near a border between two segments and the OS would throw the segfault.
I'm thinking they had a custom malloc implementation that placed the variable in front of a big chunk of data managed by that custom memory allocator.
But Psi Mayfield has a valid point, now that I think about it. After all, segfaults should happen even when reading from an "unallowed" location - and it certainly could try to read from such a location, I think?
Psi Mayfield memcpy is reading valid (mapped) memory, it just isn't memory that belongs to the payload. The client says "Imma gonna send 64k" but only sends a few bytes; the copy then reads 64k starting at that small payload and runs on into adjacent heap memory that OpenSSL's allocator already owns, so as far as the OS is concerned every access hits valid pages and no segfault is raised. That's the bug :)
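If you want to see why nothing blows up, here's a tiny self-contained illustration (not OpenSSL code; it keeps everything inside one allocation so it stays well-defined, but the effect is the same): every byte the oversized copy reads is memory the process already owns, so the OS never raises a segfault and the neighbouring "secret" simply leaks into the reply.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    /* Stand-in for a slab of heap memory the process already owns --
     * in the real bug this is OpenSSL's own allocation pool. */
    char *heap = calloc(1, 1024);
    if (heap == NULL)
        return 1;

    char *request = heap;          /* the 5 bytes the client really sent */
    char *secret  = heap + 64;     /* something sensitive living nearby  */
    memcpy(request, "HELLO", 5);
    strcpy(secret, "hunter2-session-key");

    size_t claimed_len = 200;      /* ...but the client *claims* 200     */

    char reply[256];
    /* The copy trusts the claimed length, so it runs straight past the
     * real request and scoops up the neighbouring "secret". Every byte
     * read is inside memory the process owns, hence no segfault. */
    memcpy(reply, request, claimed_len);

    fwrite(reply, 1, claimed_len, stdout);   /* the secret leaks out */
    putchar('\n');

    free(heap);
    return 0;
}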
It is really careless programming. Avoiding such a mistake is very easy if you read the manual.
The socket function recv, which is used to read data, takes the number of bytes you want to receive/read and returns the number of bytes it actually received. You tell it how much data you want, and then use the returned value to find out how much data you've actually got.
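Something along these lines (a sketch only; 'sock' is assumed to be an already-connected socket, and in real TLS the record arrives via the TLS layer rather than a bare recv):

#include <sys/types.h>
#include <sys/socket.h>
#include <stdio.h>

/* Read a message and report how much actually arrived. */
ssize_t read_message(int sock, unsigned char *buf, size_t buf_size)
{
    ssize_t got = recv(sock, buf, buf_size, 0);
    if (got < 0) {
        perror("recv");
        return -1;
    }

    /* 'got' is the ground truth. Any length field *inside* the message
     * must be checked against it before being used for a copy. */
    printf("asked for up to %zu bytes, actually received %zd\n",
           buf_size, got);
    return got;
}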
Yesterday I said "I wish computerphile would make a heartbleed video." I didn't think it would happen though!
Tom Scott also did a great one on his own channel.
Nice explanation. Well done.
Very informative! Thanks!
The problem is that languages like C, with pointer arithmetic, allow procedures to shoot past array boundaries and read into other parts of the heap.
Kudos for the Atari ST sitting in the background!
XKCD 1354 explains it REALLY simply...
Does this never cause an access violation in the OpenSSL process? I would think eventually it would run out of bounds and crash the server.
Brady, could you please keep the camera showing the code when it's being discussed? Or at least not make sudden cuts so often. It breaks focus. Other than that, wonderful video!
no.
Accessing other people's RAM over the Internet is awesome.
But if you are trying to read beyond your memory, shouldn't the program segfault occasionally?
Wow. I am quite surprised that whoever wrote that piece of code forgot the length checks to begin with. Seems like something pretty obvious to me anyways.
Comprehensive, this explanation. Thank you
how did I end up watching this ..... I have no Idea what he was talkin about lol
It helps to pay attention.
Isn't it possible to overwrite the sensitive memory after use by default? Obviously you will never know if someone reads the system's memory later.
what was that editor you were using earlier on your mac?
"Heartbleed" sounds like a great title for an anime series.
Geez, what took you guys so long.
Thanks so much for some actual journalism. Everyone else in the media are like "ermahgerd enternet ermergherdon".
Why can't the server just count the length itself?
+zgintasz
Because the server needs to know when to cut the connection once all the data has been sent, if packets are fragmented, or, when it hasn't been completely sent, tell the client to reset the connection.
Also, the server doesn't know how much of the packet is padding if it doesn't know the length of the actual data, which means useless padding might be treated as actual data.
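Roughly, the heartbeat record as described in RFC 6520 is 1 byte of type, a 2-byte payload length, the payload itself, then at least 16 bytes of random padding, so the only way to tell payload and padding apart is that length field. A small sketch (names are mine):

#include <stddef.h>

/* Everything after 1 (type) + 2 (length field) + payload_len bytes is
 * padding. Without the declared payload_len the receiver would have no
 * way of knowing where the real data stops and the padding begins. */
static size_t padding_length(size_t record_len, size_t payload_len)
{
    size_t header_and_payload = 1 + 2 + payload_len;
    return (record_len > header_and_payload)
               ? record_len - header_and_payload
               : 0;
}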
Isn't this functionality already implemented in UDP/TCP? I mean, the server/client can't get half a packet from each other.
I assume that the payload is there to let the requester validate the integrity of the reply, but what is the purpose of the padding?
Why do you need the padding? Aren't those 16 bytes just slowing down the protocol and adding cost (processing and network) uselessly on every single heartbeat?
nice colorscheme, which one is that?
Great explanation. Thanks
This misses the point that OPENSSL_malloc makes the problem even bigger (the recycled buffers almost always contain sensitive data, there's less chance for the OS to detect the illegal read, etc.)