why do hackers love strings?

  • Published: Nov 1, 2022
  • Hackers have been trying to steal information since the beginning of the information age, and buffer overflow attacks have been one of the most common ways they get in. By taking advantage of bugs in programs, hackers have been able to gain access to computers and steal information, which they later sell on the dark web.
    In C, strings are a little weird. Because no length is encoded in the string type, string functions in C are extremely easy to use incorrectly. When they are used unsafely, hackers can abuse the way functions call each other (by overwriting the return address on the stack) to gain access to your computer.

Comments • 566

  • @anon_y_mousse 1 year ago +1992

    Most important message to be conveyed here, *never* trust user input. Always check it, always restrict what you do with it.

    • @31redorange08 1 year ago +304

      Most important message to be conveyed here, never have users.

    • @MorningNapalm 1 year ago +40

      Most important message here: make your code safter and safter, until it is as saft as possible.

    • @peesicle 1 year ago +60

      @@31redorange08 Most important message to be conveyed here, never have code

    • @MizzMaster_ 1 year ago +27

      @@peesicle Most important message to be conveyed here, never have a programming experience

    • @peesicle 1 year ago +34

      @@MizzMaster_ Most important message to be conveyed here, never have a computer
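Jokes aside, the "always check it, always restrict what you do with it" advice at the top of this thread can be sketched in a few lines. This is only an illustration under assumptions: `parse_bounded` is a hypothetical helper name, and the rule it enforces (a decimal number inside a caller-chosen range, with nothing left over) is just one example of a validation policy.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: parse `text` as a decimal integer in [lo, hi].
 * Returns true and stores the value in *out only if strtol consumed
 * the entire string and the value is inside the allowed range. */
bool parse_bounded(const char *text, long lo, long hi, long *out)
{
    char *end = NULL;

    if (text == NULL || *text == '\0')
        return false;

    errno = 0;
    long value = strtol(text, &end, 10);

    if (errno == ERANGE)        /* value does not fit in a long */
        return false;
    if (*end != '\0')           /* trailing junk, or no digits at all */
        return false;
    if (value < lo || value > hi)
        return false;

    *out = value;
    return true;
}
```

A caller would reject the request outright when this returns false instead of trying to repair the input.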

  • @RobbCorp 1 year ago +678

    Be really fun to see your 'secure' server broken live and record the actual memory. Great video!

    • @LowLevelLearning 1 year ago +58

      Glad you enjoyed it!

    • @adammontgomery7980 1 year ago +16

      I don't know how that's done. As far as I know, the program is loaded into a random address so you would need to know the offset from the program counter to the desired function.

    • @geekzombie8795 7 months ago +3

      @@adammontgomery7980 Trial and error then? I’m not quite too sure…

    • @Zooiest 5 months ago +2

      ​@@adammontgomery7980 yeah, that's address space layout randomization. You'd probably need to exploit the global offset table (is there a Windows equivalent?) to get remote code execution, although I'm not sure if it's _completely_ impossible without it.

  • @eluraedae 1 year ago +159

    In this reality some hackers love strings more than physicists.

    • @patrickday4206 1 year ago +1

      Lol

    • @eluraedae 1 year ago +1

      @enrique amaya Bro tell that to the starving kids in other countries.

    • @ishid_anfarded_king 6 months ago +7

      I don't know why hackers need to like physicists, but okay.

    • @luvmakin9342 3 months ago +1

      @@ishid_anfarded_king playing a pun to "string theory"

    • @Anonymous-fr2op 2 months ago

      @@luvmakin9342 damn, it slipped through my mind entirely😂😂 f'cking genius

  • @sledgex9 1 year ago +338

    Technically you must read at most 63 bytes/characters. The 64th byte in the array is the null byte. And you need to remember to set it to null when creating the array.

    • @pawmeowzing2906 1 year ago +3

      I don't get it why at most 64?

    • @sledgex9 1 year ago +50

      @@pawmeowzing2906 I was referring to the example code in the video. He creates an array of 64 elements. The last element must be the null byte.

    • @damurichannel 1 year ago +8

      @@pawmeowzing2906 just dont code in C 😂. Use C++ or Java

    • @Gupatik 1 year ago +72

      @@damurichannel bro r u serious, the power of c is only limited by your skills

    • @def_not_here 1 year ago +74

      @@damurichannel Really? I'd love to see you write a program in Java for a microcontroller, which has like 16kb of memory

  • @mk72v2oq 1 year ago +363

    And all this just because someone decided that extra few bytes for storing the length is too expensive.
    Which is kinda strange, because in fact a pre-determined length can increase performance dramatically. There is a well-known recent story about a GTA Online modder who cut the game's loading time by about 70%, largely by eliminating repeated strlen() calls.

    • @rb1471 1 year ago +61

      Automation? In my C?

    • @jeffspaulding9834 1 year ago +165

      That someone is probably Dennis Ritchie, who created C for the purpose of porting assembly code between architectures for machines that would only ever be accessed by his fellow employees and never connect to a network (networking barely existed then). It was a useful way of getting around the hard coded size limits of strings in languages like Pascal. User accounts and passwords mostly existed to prevent people from modifying other people's files, 'cause accidents happen and some people are jerks. Security was handled by the guard at the gate and the lock on the computer room door.
      If he'd been able to see into the future and know how popular C (and computing in general) would become, maybe he'd have done it differently.

    • @asston712 1 year ago +41

      @@rb1471 Nahhhh no one would be so heartless as to make C *easier* (🤢) and *safer* (🤬).

    • @5FT6MAN 1 year ago +2

      how tf did he get the codellll

    • @taragnor 1 year ago +83

      The other thing to remember is that back then, those machines were super primitive, so optimizing things for speed and minimal memory usage was absolutely necessary. Back then the size of things was a lot more important than it is now, so adding that extra 2 bytes to store an integer length was something they probably weren't willing to commit to unless it was needed for the task. It's hard to imagine with a modern mindset where we can throw a few extra megabytes at something and not even care.
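The strlen() cost raised at the top of this thread comes from the fact that a null-terminated string has to be scanned to find its length. The sketch below shows the general pattern only; it is not the code from the GTA story, and the function names are made up for illustration.

```c
#include <stddef.h>
#include <string.h>

/* strlen() appears in the loop condition, so unless the compiler can
 * prove the string is unchanged and hoist the call, the whole string
 * is rescanned on every iteration. */
size_t count_char_slow(const char *s, char c)
{
    size_t n = 0;
    for (size_t i = 0; i < strlen(s); i++)
        if (s[i] == c)
            n++;
    return n;
}

/* The length is computed once and reused, so the loop is linear. */
size_t count_char_fast(const char *s, char c)
{
    size_t n = 0;
    size_t len = strlen(s);
    for (size_t i = 0; i < len; i++)
        if (s[i] == c)
            n++;
    return n;
}
```

Both functions return the same result; only the amount of scanning differs, which is where a stored length pays off on large inputs.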

  • @jorgeherrera1074 1 year ago +109

    Honestly there are no excuses for buffer overflows in your programs today. With all the tools available to devs you have no reason for this to still happen.

    • @williamdrum9899 8 months ago +2

      Hell, even in ASM it's not hard to prevent.

    • @aurelia8028 5 months ago

      I just tried creating the same program on my computer and it didn't even give me a segmentation fault, so I don't think it's possible to even do this anymore.

    • @williamdrum9899 5 months ago

      @@aurelia8028 Oh don't be so sure! Computers actually randomize the locations of things in memory just to make this exploit harder. But it is still very possible. x86 hasn't changed in that regard

    • @jp46614 5 months ago +18

      I have to disagree. As a security consultant, I see a lot of low level applications being developed and these mistakes happen all the time, even when using modern suites and tools which should in theory make this behaviour difficult or warn developers. But often low level APIs have to be used, especially when it comes to writing performant code (a lot of guaranteed safe alternatives for code often have a cost of performance) and that's where the protections go away and mistakes are made. Developers are also often under a lot of stress and pressure to meet deadlines, and corners have to be cut (think TODO's you never come back to).

    • @fw-190 2 months ago

      We are human; we make mistakes no matter our experience and skills. If you think I am wrong, look up on the internet why pilots started to use checklists.

  • @FreshSmog 1 year ago +89

    You can always implement strings as structs and store the length data. It's C, you can do anything. Unfortunately, you still need to get the data back out pretty frequently as the usual null terminated char arrays in order to use other functions.

    • @petrlaskevic1948 9 months ago +2

      Unless you modified other functions too

    • @nunnukanunnukalailailai1767 7 months ago +3

      That's a cool way to do it as it makes strlen an O(1) operation. It also renders str(n)cat pretty useless, as memcpy works for that, and fwrite can be used in place of printf. But like you said, most of string.h expects null termination. You also have to ensure that the length always stays in sync with the actual contents.

    • @poutineausyropderable7108 6 months ago +2

      struct for the wins.
      I don't know why some C coders don't like using them.

    • @EphemeralObsequious 5 months ago

      When you save the string at the outset you just always set the last slot in the array as \0 as a guarantee then never check again.
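The struct idea discussed in this thread might look roughly like the following. This is a minimal sketch, not an established library: `str_t`, `str_from_cstr`, `str_append`, and `str_free` are hypothetical names, and the buffer is kept NUL-terminated on purpose so it can still be handed to the usual string.h functions.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical length-carrying string type. */
typedef struct {
    size_t len;   /* bytes in data, not counting the terminator */
    char  *data;  /* heap buffer, kept NUL-terminated for libc interop */
} str_t;

/* Copy a C string into a str_t. Returns 0 on success, -1 on failure. */
int str_from_cstr(str_t *s, const char *src)
{
    size_t n = strlen(src);          /* the only scan for '\0' */
    char *p = malloc(n + 1);
    if (p == NULL)
        return -1;
    memcpy(p, src, n + 1);           /* copies the '\0' as well */
    s->len = n;
    s->data = p;
    return 0;
}

/* Append b to a using the stored lengths. a and b must be distinct
 * objects. Returns 0 on success, -1 on allocation failure. */
int str_append(str_t *a, const str_t *b)
{
    char *p = realloc(a->data, a->len + b->len + 1);
    if (p == NULL)
        return -1;
    memcpy(p + a->len, b->data, b->len + 1);
    a->data = p;
    a->len += b->len;                /* length and contents stay in sync */
    return 0;
}

void str_free(str_t *s)
{
    free(s->data);
    s->data = NULL;
    s->len = 0;
}
```

Keeping every mutation inside helpers like these is one way to address the "length must stay in sync" concern raised above.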

  • @Agryphos 1 year ago +101

    We should all be grateful that code is getting safter

    • @jollybobbyoger 1 year ago +15

      The safter the better!

    • @SkegAudio 1 year ago +29

      safter? I hardly know her

    • @williamdrum9899 8 months ago

      I don't like how it's getting more bloated

  • @ReptilianXHologram 1 year ago +134

    I think you should make a course on how to program in C securely/safely for beginners.

    • @mfaizsyahmi 1 year ago +42

      Abandon C. Become a Rustacean.

    • @SkegAudio 1 year ago +21

      Just learn Rust at this point

    • @ReptilianXHologram 1 year ago +22

      @@mfaizsyahmi Linux is still written in C so I have to learn it.

    • @ReptilianXHologram 1 year ago +12

      @Clemens Horn Wait what? Rust is more secure but what's the point if it is rarely used yet? I'm focused on learning things that will apply to what I will be doing for a job as I am going for Computer Science.

    • @wrong1029 1 year ago +11

      @@ReptilianXHologram rust will only grow from here on out. I graduated in 2020 and had no struggle finding a rust backend engineer position that pays super well. I'm literally getting spammed on LinkedIn by recruiters looking for rust devs

  • @donjindra 1 year ago +7

    As a C programmer I would never use gets() in a professional program. I always bounds-check when copying to buffers.

    • @hikefka8001 1 year ago

      Yeah and compilers should warn about it.

    • @donjindra 1 year ago

      @@hikefka8001 Good programmers shouldn't have to be warned.

  • @nick9198 1 year ago +52

    Remember to pass the -fno-stack-protector flag when compiling your C programs for added stack based security.

  • @coolbrotherf127 7 months ago +3

    That's why the newer, safer versions of these input functions also take a maximum size, so they can ignore any input beyond the intended amount, which makes them much more difficult to exploit with buffer overflows.

  • @younesmdarhrialaoui643 5 months ago +1

    I have to say this channel is very, very, very good. You really are delivering quality wise.

  • @RealNekoGamer 1 year ago +19

    In Pascal-style strings, the length is encoded as a byte at string[0], or sometimes the first 2 bytes (first 2 indices). This is a practice that the Macintosh pre-Intel era used in its API, and how strings are usually stored in binary file formats.

    • @BitwiseMobile 8 months ago

      ASCIIZ has been around since PDP-11 times. It was an assembler directive. Pascal does go back that far though. I believe it was released in the early 70s - maybe even late 60s I don't recall .... edit, I just looked it up - 1970 - so yeah, it was around during the PDP-11 heyday.

    • @williamdrum9899 8 months ago

      Who was writing strings longer than 255 bytes back then lmao
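For readers who have not met the Pascal-style layout described at the top of this thread, here is a small sketch of the single-length-byte variant. The function name `to_pascal` is made up for illustration, and the 255-byte ceiling is exactly the limit joked about in the reply above.

```c
#include <stddef.h>
#include <string.h>

/* Encode a C string as a Pascal-style string: dst[0] holds the length
 * and dst[1..len] hold the bytes, with no terminator. dst must have
 * room for 256 bytes. Returns 0 on success, or -1 if src is longer
 * than the 255 bytes a single length byte can describe. */
int to_pascal(unsigned char dst[256], const char *src)
{
    size_t n = strlen(src);
    if (n > 255)
        return -1;
    dst[0] = (unsigned char)n;
    memcpy(dst + 1, src, n);
    return 0;
}
```

Any reader of the encoded form knows the length before touching the payload, which is the property null-terminated strings lack.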

  • @quipyowert9933 11 months ago +2

    4:59 "Randomize Memory Data Structure" Perl 5 does this with hashes starting in 5.8.1 and the Perl porters improved the feature in version 5.18. It's in the perlsec manual under the section "Algorithmic Complexity Attacks".

  • @stevenc5140 1 year ago +7

    Coding is definitely getting SAFTER

  • @garyhalsey7693 1 year ago +10

    Having just completed my CompTIA Network+, Security+ & PenTest+, this is a perfect example of the need for sanitisation of user input!!! Great video and you’ve just got a new subscriber!!

    • @LowLevelLearning 1 year ago +2

      Glad it was helpful!

    • @noviccen388 8 months ago +1

      @@LowLevelLearning Please make videos on how to secure C from Buffer Overflow. :)

  • @unknownlordd 1 year ago +11

    more buffer overflow please because I've always watched videos about them but never have "clicked" as simple as this one

  • @davidgomez79 1 year ago +13

    Only thing more fun than a buffer overflow is a format string exploit. Stack canaries, ASLR, and non-executable stacks have made buffer overflows rare now.

    • @spambot7110 1 year ago +2

      no they don't. they make it harder / less likely for it to be possible to exploit a given buffer overflow, that's it.

    • @davidgomez79 1 year ago

      @@spambot7110 you're not contradicting me. I work so I type in a hurry. "rarer".

    • @williamdrum9899 8 months ago +1

      Never let the user write the format string
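The "never let the user write the format string" rule from this thread can be shown in a few lines. This is a sketch with a hypothetical function name, `format_message`; the point is only that the format argument is a constant and the user's text is passed as data.

```c
#include <stdio.h>

/* Unsafe pattern (shown only as a comment):
 *     snprintf(out, size, user_text);
 * Any '%' sequence in user_text would be treated as a conversion. */

/* Safer pattern: the format is a literal, and user_text is just the
 * argument for %s. Returns what snprintf returns, that is, the number
 * of characters the full output needs, excluding the terminator. */
int format_message(char *out, size_t size, const char *user_text)
{
    return snprintf(out, size, "%s", user_text);
}
```

With this shape, input such as "%x %n" is copied through literally instead of being interpreted.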

  • @vladislavkaras491 5 months ago

    It was informative!
    Thank you!

  • @adibemaxwell6111 7 months ago

    This is why when printing something, I was told it's always best to use fprintf() or sprintf() instead of basic printf().

  • @jamesleecoleman 1 year ago +5

    This is great! I wish I had this years ago when I was learning more about Buffer Overflows for a pentesting cert. It took me like a year to actually perform my first BO but it was still messed up but it worked lol.

  • @omegahaxors3306 55 minutes ago

    Even a small buffer over/underflow can result in complete arbitrary code execution.

  • @23trekkie 6 months ago +1

    That's why you're supposed to use fgets, which takes both the string and its size.
    fgets(string, sizeof(string), stdin);
    In the case of scanf, you can limit the number of characters...
    char array[64];
    scanf("%63s", array);
    But I guess there is a way to overflow those too... Not that I know about it.
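A slightly fuller version of the fgets approach from this comment is sketched below. The helper name `read_line` and its return codes are assumptions for illustration. With a 64-byte buffer, fgets stores at most 63 characters plus the terminator, which matches the 63-character limit mentioned elsewhere in the comments.

```c
#include <stdio.h>
#include <string.h>

/* Read one line into buf (capacity `size`, terminator included).
 * Returns 1 if a complete line was read, 0 if the line was too long
 * (the excess is discarded), and -1 on EOF or error with nothing read. */
int read_line(FILE *in, char *buf, size_t size)
{
    if (size == 0 || fgets(buf, (int)size, in) == NULL)
        return -1;

    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';          /* strip the newline */
        return 1;
    }

    /* No newline stored: the line was too long, or the input ended. */
    int c, truncated = 0;
    while ((c = getc(in)) != EOF && c != '\n')
        truncated = 1;                /* throw away the rest of the line */
    return truncated ? 0 : 1;
}
```

Discarding the leftover characters keeps an oversized line from being picked up as the next input, which is a common surprise with bare fgets.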

  • @HenrikBerg1965 1 year ago +4

    If someone reads this, know that strncpy is broken and should not be used.
    If the input is as large as the buffer or larger, no NUL is added to the buffer. Never use strncpy without this line after it:
    buffer[sizeof(buffer) - 1] = '\0';
    If you don't like writing this every time, use strlcpy.
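The strncpy-plus-terminator idiom from this comment can be wrapped once so the extra line is never forgotten. This is a sketch; `copy_terminated` is a hypothetical name, and strlcpy is not used here because it is not available in every C library.

```c
#include <string.h>

/* Copy src into dst (capacity `size`) and always terminate.
 * Returns 1 if src fit completely, 0 if it was truncated. */
int copy_terminated(char *dst, size_t size, const char *src)
{
    if (size == 0)
        return 0;
    strncpy(dst, src, size - 1);   /* may leave dst without a '\0'... */
    dst[size - 1] = '\0';          /* ...so terminate explicitly */
    return strlen(src) < size;
}
```

Returning a truncation flag lets the caller decide whether a shortened string is acceptable or should be treated as an error.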

  • @jonsunderland7708 1 year ago +6

    Current GCC won't even compile gets()
    It will explicitly tell you "Don't be a dumbass, use fgets()"

  • @vanlepthien6768 5 months ago

    The problem comes down to - in a large part - null terminated strings. I thought it was a lousy string implementation back in the dark ages. Nothing since has changed my mind.

  • @rodtronics771 1 year ago

    I am so glad your videos are short. Theyre so good.

  • @hanzo2228 1 year ago +5

    great video ! how did you get the function description in c 2:30 ? as a beginner that would really help me understanding the functions in depth

    • @surv5k 7 months ago +3

      youve probably found the answer by now, but on a unix-like system you can type `man (name of function)` and it will show a manual page for it

  • @rebok232 8 months ago

    To avoid buffer overflows there is a simple way: ditch low-level languages for languages like Kotlin or Rust.

  • @ZeroDayEx 3 months ago +1

    why is the function still there if even the dev said to not use it????

  • @sp3ct3r71 1 year ago +6

    crystal clear explanation.. could you please make a video about basics of assembly also??

    • @stopper0203 1 year ago +1

      I would love to know the assembly code for Intel's processors. I only learnt the assembly code for Atmel's microcontrollers at university, so knowing a more widespread assembly language would be great.

    • @deang5622 1 year ago +1

      It's easy. Just go look up the instruction set for the processor you are interested in.
      You don't need someone else to do that for you. And you will learn more by obtaining the instruction set for yourself.
      And over the decades authors have written books for each microprocessor teaching you the instruction set and how to program it. "Programming the xxxx" where xxxx is the part number of the microprocessor you are interested in, was one popular family of books.

    • @williamdrum9899 1 year ago +2

      Chibiakumas has a few good videos on it. Watching the Z80 or 8086 tutorials will give you a good idea about how Intel CPUs work

    • @stopper0203 1 year ago

      @@williamdrum9899 Thanks, I'll definitely check these out

  • @Endar92 3 months ago

    Interesting fact: nowadays, as memory is relatively cheap, programmers reserve more bytes for a string than would be needed. Instead of 64 bytes, they reserve 1024 bytes and so on. This simple trick can be helpful in mitigating the overflow attack, as the buffer is harder to overflow.

    • @ZT1ST 1 month ago

      Not just mitigating the overflow, but making it significantly more obvious if someone is trying to overflow it - because they'll be using all 1024 bytes worth of characters.

  • @hiftu 1 year ago +35

    The real issue is that a huge number of projects rely on C libraries.
    It would take decades to change all the programs.
    Then you have to check all the C++ libraries as well.
    After that, all the libraries from other unsafe programming languages.

    • @5FT6MAN 1 year ago

      examples?

    • @trayambakrai 1 year ago +9

      Most of these, if not all can be mitigated via simply taking care of what you're writing -- no need to use a "safe" programming language like Rust.

    • @techsupport1294 1 year ago

      If you know how those libraries are used then you'd know the library simply changing a gets isn't going to require changes in your code.

    • @deang5622 1 year ago +6

      No. Buffer overflows occur in C because a string is just a pointer to a block of memory and there is no bounds checking done, which is down to you as the programmer to code up, and check you don't go beyond the end of the allocated memory for the string.
      So it's within your hands to go through the code and identify everywhere you are writing into a string or block of malloc'ed memory and ensure you don't write beyond it.
      You don't need to edit libraries. Now if those library functions are fairly sophisticated and are doing tasks such as getting the input and writing it into the string for you, then you have a problem.

    • @white-bunny 1 year ago +1

      @@trayambakrai You can use a library like GLib, as it has the GString struct which is used in a lot of code in C (as long as you're not writing for upstream kernel or upstream projects).

  • @user-rh2xc4eq7d 4 months ago

    What I don't understand is how the hacker would know what value to set the return address to. They would have to have access to the code or decompile the binary or something.

  • @mostafaseyedashor8768 1 year ago +3

    what a good and neat explanation and not a fast and nerdy voice! thank you

  • @alfredomenezes8814 1 year ago +2

    Very interesting, please do a video about the mitigations!

    • @ytg6663 1 year ago

      Use modern compilers; they already warn you when you use unsafe functions

    • @JorgetePanete 1 year ago

      Going to Rust

  • @stopper0203 1 year ago +6

    I absolutely LOVED this video! Keep up the good work 😊!

  • @Jmcgee1125 1 year ago +6

    Read the gets(3) man page if you feel like having some fun. It's not long. Opens with "Never use this function" and ends with "ISO C11 removes the specification of gets() from the C language." It also mentions that "there can be no guarantees that the function will even return" - which is due to the return address manipulation exploit in this video.
    5:14 I wouldn't recommend using strncpy(3). It does not guarantee that the result will be null-terminated, so it's just another thing to keep in mind when copying. Use snprintf(3) instead, even if strncpy(3) is technically safe.

    • @williamdrum9899 8 months ago

      I suppose it was a product of its time, if gets() is used correctly it's fine but there are bad people on the internet who turn it into goto()
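The snprintf(3) suggestion at the top of this thread can be sketched as a small copy helper. `copy_snprintf` is a hypothetical name; the behaviour relied on is that snprintf always terminates the output when the size is nonzero and returns the length the full string would have needed.

```c
#include <stdio.h>

/* Copy src into dst (capacity `size`) via snprintf.
 * Returns 1 if src fit completely, 0 if it was truncated or if
 * snprintf reported an error. */
int copy_snprintf(char *dst, size_t size, const char *src)
{
    int needed = snprintf(dst, size, "%s", src);
    return needed >= 0 && (size_t)needed < size;
}
```

Comparing the return value against the buffer size is what tells a truncated copy apart from a complete one.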

  • @eduardostarz 24 days ago

    3:50 it's written "incldue" instead of "include" in line two that's why it's unsafe code, hope it helps
    great video btw

  • @shadowchasernql 9 months ago +1

    I love how c is inexplicably the scapegoat for every single buffer overflow ever

    • @williamdrum9899 8 months ago

      I don't even see assembly get any grief for it either lol

    • @shadowchasernql 8 months ago

      @@williamdrum9899 It's funny because the only languages I have ever experienced a memory overflow in have been in ones without manual memory management.

    • @williamdrum9899 8 months ago

      @@shadowchasernql You mean like Java and its NullPointerException?

    • @shadowchasernql 8 months ago

      @@williamdrum9899 Not quite

    • @williamdrum9899 8 months ago

      @@shadowchasernql I'm curious how that's even possible anyway since the language won't let you fix it

  • @platin2148 1 year ago +7

    Zero termination was a dumb idea; Pascal-style strings have a bit more safety. Also, in C you typically use the n functions to minimize potential damage. Store strings in stretchy buffers and never use functions that don't let you pass the max length.
    I doubt this add will push anything to the stack; I suspect it will use registers directly, as there are still plenty of them free and passing via the stack is really expensive in comparison. The macOS arm64 calling convention is a bit of a weird one when we then get to structures.

    • @williamdrum9899 8 months ago

      Until you add 2 to your string pointer and now you have no idea where it ends

    • @platin2148 8 months ago

      @@williamdrum9899 That's why you don't pass a pointer to the memory of the string but just a slice/view, which is nothing more than ptr+offset. It doesn't matter if you lose track down the line, as you aren't the owner of the memory in the first place. Also, never ever move a pointer in place.

  • @shadowchasernql 9 months ago

    I find it funny that c is the scapegoat for string overflow

  • @fabienb3432 1 year ago +10

    Awesome video ! Thanks a lot for explaning about buffer overflows with an example.

  • @christiangamers2254 4 months ago +1

    In C++, we have a similar problem when we use a char pointer.
    One solution I found was to use cin.width(length) and cin.ignore(some large number)
    These two calls ensure that the string is read only up to length and that the rest of the characters entered are ignored. Is there a problem with this approach?

  • @TheGeckoIsKing 11 months ago

    So happy “Code is getting safter”

  • @ancientgearsynchro 5 months ago

    I always questioned why my teacher told me to put hard limits on the length of passwords my users can even input, let alone check. Guess this was one of the reasons.

    • @ThatJay283 3 months ago

      another safe way is to use heap allocated strings with %ms. for example, using scanf with %ms, you can pass a pointer to an unallocated string, and the result will be allocated by scanf.
      also ideally, there should even be a limit for how long a password should be. how long a password is shouldn't matter, because you shouldn't be storing passwords anyways. processing passwords properly will make them a fixed size.
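The %ms conversion mentioned in this reply can be sketched as follows. This assumes a C library that implements the POSIX.1-2008 `m` assignment-allocation modifier (glibc does); it is not part of ISO C, so treat it as platform-specific. The helper name `first_word` is made up, and sscanf is used instead of scanf so the example does not depend on stdin.

```c
#include <stdio.h>
#include <stdlib.h>

/* Extract the first whitespace-delimited word of `text` into a buffer
 * that sscanf allocates itself (POSIX "m" modifier, assumed available).
 * The caller must free() the result. Returns NULL if no word is found. */
char *first_word(const char *text)
{
    char *word = NULL;
    if (sscanf(text, "%ms", &word) != 1)
        return NULL;
    return word;
}
```

Because the library sizes the buffer, there is no fixed array to overflow, although a real program would probably still cap how much input it is willing to accept.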

  • @MisterDan 1 year ago +1

    Nice and clear presentation

  • @mastershooter64 1 year ago +2

    Didnt know hackers were interested in theories of quantum gravity as well ;)

  • @rb1471 1 year ago +2

    I'd be interested in how they know the address of the function they would want

  • @semibiotic 1 year ago

    Unfortunately, strncpy(3) is risky too (and strncat(3) is easy to misuse). strncpy adds no NUL if the source string length is equal to or greater than the "n" argument.
    You should use strlcpy(3) instead, if implemented (or invoke strncpy(buffer, src, sizeof(buffer)-1); and make sure that buffer[sizeof(buffer)-1] = '\0';).
    Also, strncpy(3) fills the rest of the buffer with NULs, which creates a performance issue.

  • @US-Air-Forces 5 months ago

    Just don't keep sensitive information in memory unsecured

  • @wsurferdude_ct 1 year ago

    Well explained! Thanks

  • @MegaCrossfire007 5 months ago

    I don't quite get it. If we change the return address with the buffer overflow we are outside the main function, but that doesn't automatically mean that we end up in the debug function, does it? And why does the buffer overflow change the return address? Shouldn't it simply change the value at the address which comes after the memory allocated for the string?

  • @jackpatteeuw9244 1 year ago +2

    I come from the world of VAX/VMS. They handled strings/pointers VERY DIFFERENTLY, likely because SECURITY WAS A PRIORITY during development of the heavily intertwined hardware and software!
    This was the early 80s, so VAX was a 32-bit computer. A "descriptor" (fancy C pointer) was FOUR entities: an 8-bit "class", an 8-bit "data type", a 16-bit "length", and the address of the data. A CLASS could be a fixed or variable length entity. Fixed length was most common. The data type (DTYPE) could be many things, making it a descriptor to a byte, word, long word, quad word, etc. So CLASS=1 and DTYPE=24 was a fixed length string.
    Buffers were ALWAYS defined to have a fixed length and any system library/service routine respected that length!
    There were other hardware mechanisms that prevented access to parts of memory that did not belong to your process.

    • @williamdrum9899 1 year ago

      I guess C was meant to be as lean as possible. I code in assembly and I found myself creating something similar to your method when programming a neo geo game. For sprite data I stored the width and height at the start of the array and used those as loop counters to copy what came after into video memory, that way the routine always stopped when it was supposed to

  • @joeldoonan-ketteringham5174 2 months ago

    Never trust a user, always assume that they are trying to hack you and your other users.

  • @GamersGoneExtinct 1 year ago +7

    Would love to hear more about the other mitigations and how to use them

  • @guilherme5094 1 year ago +1

    Thanks again👍!

  • @noam2802 2 months ago

    but isnt it solvable like in 3 lines of code?
    In a project im currently doing I just did a loop that kept reading from the buffer until it reached '\0'
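One caveat on the loop described here: reading until '\0' is only safe if a terminator is guaranteed to exist inside the buffer, and the overflow in the video happens on the write side, before any such loop runs. A bounded version of the loop might look like this sketch (`bounded_len` is a hypothetical name; it behaves much like POSIX strnlen).

```c
#include <stddef.h>

/* Length of the string in buf, examining no more than cap bytes.
 * Returns cap if no '\0' was found, which signals "unterminated". */
size_t bounded_len(const char *buf, size_t cap)
{
    size_t i = 0;
    while (i < cap && buf[i] != '\0')
        i++;
    return i;
}
```

A caller that gets back the full capacity knows the data was not terminated and can refuse to treat it as a string.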

  • @Masa-san 5 months ago

    3:00 - this is incorrect in AArch64/64-bit ARM. Without compiler optimizations turned on, arguments would be pushed to first and second general registers. With optimizations turned on, this function would be completely ignored and compiler would just calculate it in compile time. So this issue is defined by different rules under different architectures, and I guess (since I'm not working with X86 assembly) what you have pointed out is X86_64 thing.

  • @raptoress6131 1 year ago +4

    I'm just going to assume that all users are criminals

  • @allumallu6580 5 months ago

    Now i get it ! thx

  • @spacespyder 1 year ago

    Great Video thank you for explanation :)

  • @Stratelier 5 months ago

    Makes me wonder why gets() is even still supported in any compilers if it's _so_ deprecated that its documentation literally says "don't do this". (Okay, apparently it was removed entirely from the C11 standard, but anything based on C99 is still more or less required to include it.)
    And why null-terminated strings became the preferred implementation is weird when (at least for common strings less than 256 chars) adding a separate byte to specify the length up front not only avoids much of the buffer overflow scenarios, but doesn't even add any memory cost! You'll have to reserve one extra byte in memory either way (either a null byte, or a length byte), and how much space you might need to (dynamically) allocate to hold a string is a separate problem.

  • @markchristophergemzon1052 1 year ago +2

    If you are trying to overflow the buffer and placing the address of the 'debug' function as the return address, how do you find out the address of the debug function? Also, how do you know when to input the address of the debug function so that it exactly goes into the return address?

    • @dzhimy6266 1 year ago +1

      You know when the return address is on the stack because the stack frame will (in almost every single case) be the same size on subsequent executions of that function.
      How you actually get the address of the debug function depends. Without modern exploit mitigations, you could simply hard-code the address into the overflowed buffer.
      With modern exploit mitigations, you will need to chain the buffer overflow vulnerability with a memory address leak vulnerability that reveals an address somewhere within the program's image (for example, in the .data section). One of the many modern mitigations is Address Space Layout Randomisation (ASLR), which randomises the virtual address at which the executable code is loaded into memory on each run of the program.
      There are other exploit mitigations similar to this, but with just ASLR, you will need to leak a memory address, and then in the same runtime of that program use that address in the buffer overflow in another part of the program to successfully redirect code execution to the debug function.

    • @nick9198
      @nick9198 Год назад

      Use a debugger. If there is a potential overflow, you can overflow the buffer with a recognizable pattern of characters; the program will crash, and the debugger will show which part of the string you supplied ended up in the return address. From that you can calculate the distance from the start of the buffer to the return address on the stack. Getting the address of the function is harder (if not impossible) because of ASLR (address space layout randomization). There are ways to get around this, though; one popular method is known as ROP (return-oriented programming), which can in some instances allow you to circumvent ASLR and execute arbitrary code.

    • @williamdrum9899
      @williamdrum9899 Год назад

      For your second question, it depends on the CPU architecture. Typically, in x86-64 assembly the caller will CALL the function (which pushes RIP), and the function will then PUSH RBP at its start. So right after that, the offset RSP+8 is where the return address will be.

  • @smeboo9044
    @smeboo9044 Год назад +3

    If the string has to be local to the return instruction on the stack, how would you ensure that where you’re storing is actually on the stack and not in a register? I’ve had this question for a while now and I feel like I’m missing some important concept.

    • @Ronaldo-se3ff
      @Ronaldo-se3ff Год назад +2

      it's how a value in a subroutine is returned at the machine level. some insight into how subroutines work in assembly language should clear this up.

    • @daryl9915
      @daryl9915 Год назад +2

      The compiler usually adds instructions to do that. I think I saw a machine code/compiler tutorial somewhere years ago that explains how the stack works, but I can't remember exactly where now.
      Either way, registers are too small for most strings (usually 8 bytes for general purpose registers in 64-bit architectures); it's reasonable to expect any compiler to store strings somewhere in memory - stack or heap

    • @dzhimy6266
      @dzhimy6266 Год назад +2

      Strings are generally not stored in a register unless you pack the data in such a way that they make up the bytes of a multi-byte integer. Long-lived strings are often stored on the heap, while local character arrays live on the stack. Some compiler configurations, such as Clang with its optional SafeStack feature (-fsanitize=safe-stack), store local buffers such as strings on a secondary stack that does not hold return addresses and therefore cannot be used to redirect code execution with the return-address overwrite method. To allocate data on the stack explicitly, you can use a function called "alloca", which is similar to malloc, except that you don't have to free the address it returns because that address is on the stack. All this function does is move the stack pointer further down the stack by the size you provide (usually this is compiled inline, but it can appear as a runtime call too).

    • @smeboo9044
      @smeboo9044 Год назад +1

      Okay! That makes sense. I might have been overthinking it a bit then, but just to clear up in my head a bit, if a function took 3 arguments; int x, string y, int z, is it correct to assume that %rdi would hold x, %rsi would hold a pointer to the stack where y is stored, and %rdx would hold z.
      That is, if we’re using the same x86 ISA as when I “learned” all this? Lol

    • @daryl9915
      @daryl9915 Год назад +2

      Pretty much. Depending on the language, string y in the function parameters is most likely a pointer to either a char array allocated on the heap (via malloc() or new) or an object/instance of the language's string class, again probably allocated from heap memory rather than on the stack. Been a long time since I actually did any programming though, especially in C

  • @jaszczurtd
    @jaszczurtd Год назад +10

    Nice video, but it did not actually show how buffer overflow can be used to steal data. All that was shown was that buffer overflow while overwriting memory causes the application to crash, because a segmentation fault occurs and.... that's the end of the hacker's adventure in stealing information.

    • @reroman
      @reroman Год назад +1

      A segmentation fault occurs when you try to access memory you're not allowed to. A buffer overflow can overwrite values within the same program, altering its behavior without crashing.

    • @williamdrum9899
      @williamdrum9899 Год назад +1

      But if you overflow just the right amount you can turn that gets() into any function call the user wants
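
The point made in this thread, that an overflow can silently change nearby data instead of crashing, can be sketched as follows. This is a simulation under stated assumptions: `struct frame`, `is_admin`, and `unchecked_fill` are hypothetical names, and the copy deliberately stays inside one struct so the demo itself has defined behavior. A real overflow past an array is undefined behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simulated "stack frame": a buffer followed by a value the program trusts. */
struct frame {
    char name[8];
    int  is_admin;
};

/* Deliberately unchecked copy into the start of the frame, standing in for
 * gets(). The copy never leaves the struct, so the simulation is well defined. */
void unchecked_fill(struct frame *f, const char *input, size_t len)
{
    assert(f != NULL && input != NULL);
    assert(len <= sizeof *f);          /* keep the simulation in bounds */
    memcpy((char *)f, input, len);
}
```

Filling the frame with more than 8 bytes leaves `name` full and `is_admin` non-zero, with no crash at all.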

  • @hodayfa000h
    @hodayfa000h 2 месяца назад

    🤯
    I never knew it was like this

  • @ExecutionErrorSmough
    @ExecutionErrorSmough 8 месяцев назад

    I got a good laugh out of the stock video at 0:10

  • @19adhin
    @19adhin Год назад +1

    How would a hacker find the address to the debug function? @4:15

  • @bdidue6998
    @bdidue6998 Год назад +3

    At this point in time, buffer overflows should be inexcusable.

  • @LoesserOf2Evils
    @LoesserOf2Evils Год назад +3

    Please: more on the buffer overflow; and some on mitigating against it and protecting against it.

    • @JorgetePanete
      @JorgetePanete Год назад +1

      Going to Rust fixes it

    • @williamdrum9899
      @williamdrum9899 Год назад +1

      Easiest way is to use a language that manages it for you. If that's not an option, the 2nd best way is to restrict user input. If you're expecting a 10 byte string, don't let the user overfill it.
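
A minimal sketch of "don't let the user overfill it" using the standard `fgets` function instead of `gets`. The `read_line` wrapper is a hypothetical helper written for this illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Read one line into buf, never writing more than size bytes (including the
 * terminating NUL). Returns 1 on success, 0 on end of file or error. */
int read_line(FILE *in, char *buf, size_t size)
{
    assert(in != NULL && buf != NULL && size > 0);
    if (fgets(buf, (int)size, in) == NULL)
        return 0;
    buf[strcspn(buf, "\n")] = '\0';    /* drop the trailing newline, if any */
    return 1;
}
```

Typical use would be `char buf[10]; read_line(stdin, buf, sizeof buf);`. Extra input stays in the stream rather than spilling past `buf`, so the caller still has to decide whether to discard or reject it.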

  • @DeeArr
    @DeeArr Год назад

    ..Good reason to use a managed language such as open source C#

  • @nathanoy_
    @nathanoy_ Год назад

    0:09 did someone notice the RUclips video playing on this mans computer in the stock footage?

  • @0xfrijolito
    @0xfrijolito Год назад

    4:45 printf is also a vulnerable function if you directly pass a string without a format

    • @erikkonstas
      @erikkonstas Год назад

      Eh, I'm not very sure about that... the danger comes if your string might have "%" characters in it, but I don't know of any format specifier in any implementation that could execute code or write to memory.

    • @piaIy
      @piaIy 8 месяцев назад

      @@erikkonstas %n writes to memory

    • @erikkonstas
      @erikkonstas 8 месяцев назад

      @@piaIy Hm, you'd have to have the correct pointers very strategically in the stack, and also potentially a very, very long string to make the value from %n large enough... oh, and you also cannot decrement that value.

    • @williamdrum9899
      @williamdrum9899 8 месяцев назад

      It's vulnerable if you allow the end user to write the format string
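
A small sketch of the difference discussed in this thread: the user's text is passed as an argument to a fixed format string, never as the format string itself. The `format_user_text` name is hypothetical, and `snprintf` is used in place of `printf` only so the result is easy to inspect.

```c
#include <assert.h>
#include <stdio.h>

/* Format untrusted text into out. The format string is a fixed literal, so
 * any '%' characters in user_text are copied as plain data. */
int format_user_text(char *out, size_t out_size, const char *user_text)
{
    assert(out != NULL && user_text != NULL);
    /* Risky alternative (do not use): snprintf(out, out_size, user_text); */
    return snprintf(out, out_size, "%s", user_text);
}
```

With this shape, input such as `%x %n %s` comes out unchanged instead of being interpreted as conversion specifiers.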

  • @xr.spedtech
    @xr.spedtech Год назад +1

    No more null bytes from now on ..
    Use unsigned for a counter ...

  • @anonymous_4276
    @anonymous_4276 Год назад +1

    So if the return address is corrupted thanks to buffer overflow in C, why does the program enter the debug function? I don't see the connection there. It seems to me like the program should just crash in case the return address is overwritten as it now has nowhere to return after encountering the return statement.

    • @YouAreUnimportant
      @YouAreUnimportant Год назад

      You are right. You have to place the right number at the right position on the stack for that to work. Otherwise you will most probably access memory you don't have permission to, and the operating system will stop you.

    • @williamdrum9899
      @williamdrum9899 Год назад

      If a hacker knows the address of the code they want to run, putting it on the stack where the old return address was will cause the return command to act as a goto. Effectively the RET instruction on most CPUs is "goto TopOfStack"
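
The "return acts as a goto" idea can be simulated safely with a function pointer stored right after a buffer. This is only a sketch under stated assumptions: `struct fake_frame`, `normal_return`, and `simulate_overflow` are hypothetical, the real saved return address is never touched, and the copy stays inside one struct so the behavior is defined.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static int debug_ran = 0;

static void normal_return(void) { /* where the program meant to go */ }
static void debug(void)         { debug_ran = 1; }

/* Simulated frame: a buffer with a "return address" stored right after it. */
struct fake_frame {
    char buf[8];
    void (*ret)(void);
};

/* Build an attacker-style payload (filler up to the saved address, then the
 * address of debug), copy it over the frame unchecked, then "return". */
void simulate_overflow(void)
{
    struct fake_frame f = { "", normal_return };
    unsigned char payload[sizeof f];
    void (*target)(void) = debug;

    memset(payload, 'A', sizeof payload);
    memcpy(payload + offsetof(struct fake_frame, ret), &target, sizeof target);
    memcpy(&f, payload, sizeof payload);   /* stands in for the overflow */

    assert(f.ret != normal_return);        /* the stored address was replaced */
    f.ret();                               /* like RET: jump to the stored address */
}
```

Here the filler length comes from `offsetof`, which plays the role of the distance between the buffer and the return address that an attacker would have to work out.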

  • @liquidsnake6879
    @liquidsnake6879 2 месяца назад

    Sucks that strings make up like 80% of all programming lol seriously, command line tools, compilers, web browsers everything relies so heavily on regexes and string comparisons it's insane

  • @nitsuj1001
    @nitsuj1001 Год назад

    Covering this mitigation wouldn't take just one video, but rather two full university courses

  • @372leonard
    @372leonard Год назад +2

    Is this just a security issue with C and C++, or are other languages like JavaScript and C# also vulnerable to it?

    • @v01d_r34l1ty
      @v01d_r34l1ty Год назад

      They can exist everywhere, however 99% of them are probably just from C programs. C operates on a very low level with a very minimalist standard library, which makes it very hard to secure and optimize. With modern development standards and procedures, languages like C++ and Rust make it much, much less likely to be a problem. Even higher-level languages can still suffer from it, but this typically comes down to a flaw in the compiler or interpreter rather than your code itself.

    • @ailijic
      @ailijic Год назад +2

      This might be a bit pedantic. At this time, all of the CPUs that we mere mortals will encounter are stack machines, so they are all vulnerable to buffer overflow attacks. That being said, we have come up with a lot of ways to mitigate this type of attack; try searching for address space layout randomization (ASLR). An attack is composed of multiple parts. In simple terms these would be the vulnerability, the exploit, and the payload; I might be missing one or two. The buffer overflow is the vulnerability. Next is the exploit: the attacker has to figure out how to use the vulnerability to run other parts of the code. Finally, you have the payload: the code that is run via the exploit. It could be a virus, a worm, ransomware, or code that gives you access to the system. Moral of the story: there is no such thing as a nontrivial bug-free program. You can reduce the number of bugs, but you never get rid of all of them. It's an arms race; hackers and security go back and forth.

    • @v01d_r34l1ty
      @v01d_r34l1ty Год назад

      @@ailijic +1 for mentioning ASLR

    • @sledgex9
      @sledgex9 Год назад

      Not a problem with C++, unless you specifically decide to do strings the C way in C++.

    • @v01d_r34l1ty
      @v01d_r34l1ty Год назад

      @@sledgex9 automatic allocators in C++ are a godsend

  • @oglothenerd
    @oglothenerd 2 месяца назад +1

    Hey Rust, we need your help.

  • @lambiwins4499
    @lambiwins4499 Год назад +1

    Guys, I started studying at a university (computer science and programming), so I have a question: how can I detect if my program is getting a buffer overflow?

    • @williamdrum9899
      @williamdrum9899 Год назад

      Some compilers add extra code that uses what's called a stack canary. It's a secret value stored between your string and the return address, which the programmer has no access to at compile time. Its value is set at the start of a function and checked at the end. If it has changed, the function won't be allowed to return and the program will be aborted instead.
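
The canary idea can be imitated by hand to see how the check works. This is only an illustration: real stack protectors are inserted by the compiler and use a random value chosen at program start, whereas the fixed `CANARY` constant, `struct guarded`, and `copy_and_check` below are assumptions made for the demo.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CANARY 0x5AFEC0DEu   /* fixed value for the demo; real canaries are random */

struct guarded {
    char     buf[8];
    unsigned canary;         /* sits right after the buffer */
};

/* Copy len bytes over the guarded region with no bounds check on buf, then
 * report whether the canary survived: 1 if intact, 0 if it was overwritten. */
int copy_and_check(const char *input, size_t len)
{
    struct guarded g;
    assert(input != NULL && len <= sizeof g);  /* keep the simulation in bounds */
    g.canary = CANARY;                 /* set at "function entry" */
    memcpy((char *)&g, input, len);    /* unchecked with respect to buf */
    return g.canary == CANARY;         /* checked at "function exit" */
}
```

A short input leaves the canary alone, while anything longer than 8 bytes changes it, which is the signal a real stack protector uses to stop the program before the corrupted return address is used.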

  • @JarppaGuru
    @JarppaGuru Год назад +1

    4:26 Yeah, let's change the gets function to not accept more than the length. Fixed! And if you write more than that, then the password is wrong and it would not run debug. Kind of a bad example. An overflow would not produce the correct password; it would return to the correct position anyway.

  • @aaaaanh
    @aaaaanh Год назад +1

    Basically the user/threat actor gaslights the vulnerable program into submission and giving up the data it has

    • @williamdrum9899
      @williamdrum9899 Год назад

      It's more like they're in the passenger seat while the computer is using a GPS to drive home, and the hacker takes the GPS while the driver isn't looking and sets the destination to somebody else's house.

  • @IamSholiSJ
    @IamSholiSJ Год назад +1

    Are these vulnerabilities present in C++ iostream library functions or not?

    • @tbyz2572
      @tbyz2572 Год назад

      Whether they exist depends on how you write your code

  • @mobslicer1529
    @mobslicer1529 Год назад +2

    i love me a good buffer overflow

  • @insoft_uk
    @insoft_uk 8 месяцев назад

    It’s not that printf and gets were bad at the time, just sloppy programming that didn't check before using them. Newer, safer replacements have since been added to enforce checks, though the majority of programmers are still not fully checking. I don’t think this will ever change, and we will forever have exploits that should have never been.

  • @-21_26
    @-21_26 Год назад +2

    Hey, I am in my freshman year studying automation of atomic and energy processes at my university. I was told never to use gets(), only fgets(), but they didn't tell me about this part. I hope to see more videos about C from you.

    • @williamdrum9899
      @williamdrum9899 Год назад

      This is pretty much the reason they tell you not to use gets()

  • @awesomesauce1157
    @awesomesauce1157 7 месяцев назад

    bro your thumbnails are absolutely hilarious

  • @THEBESTPINEAPPLEY
    @THEBESTPINEAPPLEY 5 месяцев назад +1

    4:10 Line 2 Syntax Error: Did you mean "#include"?

  • @TonyDaExpert
    @TonyDaExpert Год назад +1

    Using sprintf while watching this 💀
    Was just trying to convert a float to string on a micro controller 😢

    • @reroman
      @reroman Год назад

      Use snprintf function.

    • @TonyDaExpert
      @TonyDaExpert Год назад

      @@reroman the compiler I was using had an issue with using that lol
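
A minimal sketch of the `snprintf` suggestion for converting a float to a string with a bounded buffer. The `float_to_str` helper is hypothetical, and two decimal places is an arbitrary choice. One assumption to check on a microcontroller: some embedded C libraries ship their printf family without floating-point support unless it is enabled at build time.

```c
#include <assert.h>
#include <stdio.h>

/* Format value with two decimal places into buf. Returns 1 if the whole
 * result fit, 0 if it was truncated or formatting failed. */
int float_to_str(char *buf, size_t size, float value)
{
    assert(buf != NULL && size > 0);
    int n = snprintf(buf, size, "%.2f", (double)value);
    return n >= 0 && (size_t)n < size;
}
```

Unlike `sprintf`, the call never writes past `size` bytes, and the return value tells the caller when the buffer was too small.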

  • @enderger5308
    @enderger5308 Год назад +4

    That's one of Zig's nicer improvements: it uses fat pointers for strings so that a class of buffer overflows can be ignored (and, provided you pre-allocate buffers, most user input is safe by virtue of using the buffer size).

  • @thescientist8599
    @thescientist8599 Год назад +1

    Lol, I literally had a lesson about the NUL an hour ago

  • @tonym5857
    @tonym5857 Год назад

    Small and powerful video 👏👏👏👏

  • @yepee1
    @yepee1 Год назад

    Why do we still have gets() if it's broken??

  • @mahsumsaetgareev
    @mahsumsaetgareev Год назад +2

    4:16 but we should know the address of the debug function in memory right? How can we do that? (for learning purpose only)

    • @DerMigi
      @DerMigi Год назад

      I would like to know that as well!

    • @jonshouse1
      @jonshouse1 Год назад

      printf("Address of function debug is %p\n", (void *)debug);

    • @jonshouse1
      @jonshouse1 Год назад +1

      PS: If you mean "know" the address externally from the executable, the answer is you can't. The best you could do if it was open source would be to compile it on the same architecture (or dig into the binary) and guess the most probable address. See "Address_space_layout_randomization" on Wikipedia for why that probably won't help much (in the case of Linux). That is ignoring the minor issue that physical-to-virtual mappings are also dynamic, so guessing an offset into a binary would still fail to give an absolute address even if the stack use were not also obfuscated.

    • @LowLevelLearning
      @LowLevelLearning  Год назад

      If the binary is not compiled position independent, the address of debug is the same every time regardless of ASLR.

    • @user-dh8oi2mk4f
      @user-dh8oi2mk4f Год назад

      @@LowLevelLearning So is it still possible to get the address of debug() if it is compiled position independent? And how would you get the address even if it wasn't?

  • @vikingthedude
    @vikingthedude 8 месяцев назад

    Does Zig have these problems?

  • @danielcoffman1022
    @danielcoffman1022 Год назад +1

    Would you say the C++ equivalent ( getline (std::cin,UserIn); ) has the same vulnerability?

    • @sledgex9
      @sledgex9 Год назад +3

      No. getline() uses std::string. std::string will automatically grow to hold extra characters AND it will know how many characters it actually holds.

    • @danielcoffman1022
      @danielcoffman1022 Год назад

      @@sledgex9 that’s good to know. Would you also say that it is better, from the perspective you describe in this video, C++ programs are better for security than a straight C program?

    • @sledgex9
      @sledgex9 Год назад

      @@danielcoffman1022 *Generally* speaking and in the context of the video, C++ should be more secure IF you use containers to manage objects/memory, e.g. std::string and std::vector, and also IF you use smart pointers where applicable. Manually managing stuff is possible in C++ too. However, every time you do stuff entirely manually, you run the risk of "off-by-one" errors.

    • @danielcoffman1022
      @danielcoffman1022 Год назад

      @@sledgex9 I see, so if you (the person) are handling this stuff yourself, it should be thought of as a risk. But these std and smart pointers are just a safer way to go?

    • @sledgex9
      @sledgex9 Год назад

      @@danielcoffman1022 Yes, the STL (Standard Template Library (the std stuff)) and smart pointers are generally the way to go. They greatly reduce the possibility of coding errors. However, they don't always eliminate it. If you're persistent enough, you can still make errors with them that will result in crashes or security issues.

  • @Enter_channel_name
    @Enter_channel_name 8 месяцев назад

    Is it really that hard to bounds check user input?

  • @ejonesss
    @ejonesss 5 месяцев назад

    If gets is so dangerous, then why is it still included?
    Unless it is to maintain backward compatibility, as some binaries still use it, so it was left in the OS to support the older binaries.
    For example, you may have a binary blob from the 80s or 90s that still serves a valid function today, but the source code is lost or proprietary, so the blob may be included to provide the function since a replacement is not around.
    It could be a print driver for an interface like a parallel port, where there is no USB driver and not even a USB-to-parallel adaptor.