Advanced C: The UB and optimizations that trick good programmers.

  • Published: 28 Sep 2024
  • This is a video that will talk about some lesser-known things in the programming language C, and how these things impact optimizations and the kinds of bugs that they can produce. This is not a video for beginner programmers.
    twitter:
    @quelsolaar (professional)
    @eskilsteenberg (personal)
    www.quelsolaar.com
    www.gamepipelin...

Comments • 378

  • @TotalTimoTime
    @TotalTimoTime 1 year ago +410

    I like that you chose a dark color scheme for the slides but the random white flashes in between really hurt my eyes because of the stark contrast to the rest of the video

    • @faultboy
      @faultboy 1 year ago +40

      It is like a very effective Flashbang

    • @malusmundus-9605
      @malusmundus-9605 8 months ago +20

      It's my 2nd favorite part of the video. The whole time I was like, "lmao off someone is probably really pissed about this...".

    • @macicoinc9363
      @macicoinc9363 7 months ago +2

      Dark color schemes hurt my eyes. I can't even look at them for a minute without getting a horrible headache.

    • @roymarshall_
      @roymarshall_ 3 months ago

      @@macicoinc9363 your brain terrifies me

    • @Sinthoras155
      @Sinthoras155 3 months ago +2

      I once blinked while he was trying to flashbang me that was funny

  • @catcatcatcatcatcatcatcatcatca
    @catcatcatcatcatcatcatcatcatca 3 months ago +36

    As someone learning C/CPP this is a true goldmine. I feel like I have managed to at least experience time travel and issues caused by volatile values not being declared as such, when writing code for arduino.
    I just wish GCC was more helpful. This might be a RTFM issue on my part, but it would be nice to get a hint like ”maybe you meant to write a function that has defined behaviour?” or something.

    • @xugro
      @xugro 3 months ago +1

      I guess checking for undefined behaviour is slow but I wish there was an option to warn if they exist. (At least the known ones)

    • @noctiflorous1337
      @noctiflorous1337 3 months ago +7

      Not sure if that's what you mean, but you can use the flag "-fsanitize=undefined" with gcc. And also don't forget to add the same thing to the linker flags, if you're doing separate compilation-linking.

    • @henrikoldcorn
      @henrikoldcorn 1 month ago +1

      Bash can tell me I probably meant a different command when I typo but gcc can’t tell me that I forgot to close a bracket, fantastic.

  • @dickheadrecs
    @dickheadrecs 2 years ago +889

    so many “what is a pointer?” videos and not much else. thanks for adding some interesting matter from the depths

    • @puppergump4117
      @puppergump4117 1 year ago +17

      On the bright side, I'm having a very easy time making a program that has to juggle pointers around.

    • @CounterStrik614
      @CounterStrik614 1 year ago +5

      I still don't know what pointer is

    • @undeniablySomeGuy
      @undeniablySomeGuy 1 year ago +43

      @@CounterStrik614 it points

    • @CounterStrik614
      @CounterStrik614 1 year ago +6

      @@undeniablySomeGuy holy variables! will know that, thank you

    • @dickheadrecs
      @dickheadrecs 1 year ago +35

      @@CounterStrik614 i cast to a void pointer once and i haven’t seen the cat since

  • @b4rnm4n
    @b4rnm4n 1 year ago +62

    Why am I getting flashbanged on a video about c

  • @grimvian
    @grimvian 1 year ago +34

    Fantastic - the only video I know of that reaches the level of "How I program C".
    And no music and other disturbing video stuff - just pure and clean.

  • @codewizard58
    @codewizard58 1 year ago +91

    Wrote my first C compiler in 1982 for CDC6400 machine. 60 bit words so 60 bit chars, pointers, ints. Just enough memory to do simple constant folding of expressions.

  • @jonsunderland7708
    @jonsunderland7708 2 years ago +15

    I wish they had a C con the way they have Cpp cons. C is like a fine wine and I wish there were conferences.

  • @autumn-876
    @autumn-876 1 year ago +10

    Ty for making a video abt this that doesnt feel like it relies on peoples short attention span. This is exactly what I'm looking for when I look for a coding video on youtube

  • @damercy
    @damercy 1 year ago +6

    Native Android dev here. Such good explanation, you captured my attention. Thanks for this!

  • @Fasteroid
    @Fasteroid 1 year ago +8

    49:29 bamboozled me a lot. Binging YouTube in bed on my iPad, apparently with it about half an arm's length away, BOTH my blind spots converge on the closing curly brace when I look at the second _i_ in the for loop.
    Was kinda freaky seeing an instance of UB in my own retinas after you talked about instances of it in C so much.
    Sub earned.

  • @roboticbrain2027
    @roboticbrain2027 1 year ago +23

    How is omitting the malloc() == NULL check not a compiler bug?
    The standard clearly defines this to be a possible error case which has to be checked against?
    Edit: the real issue seems to be that the compiler optimizes the malloc itself away, because it knows the memory is never used.
    Therefore it can assume it always succeeds because it never called it in the first place.

    • @zombie_pigdragon
      @zombie_pigdragon 1 year ago +2

      I didn't realize that was the issue! Thanks for pointing it out, was also confused why it was misbehaving.

  • @agehall
    @agehall 1 year ago +8

    I thought I knew C on an above-average-level at least but after watching this video, the only thing I know is that I’m scared of C now…

  • @ruslikaici
    @ruslikaici 1 year ago +4

    The whole talk I had a feeling that it's John Carmack talking (funny fact, he also mentioned he prefers Visual Studio debugger)

  • @greyfade
    @greyfade 1 year ago +17

    Two things: C23 now requires VLAs again, rather ridiculously. And, GDB has a TUI mode that is a little buggy, but quite good, and gives you a visual debugger featureset.

    • @RobBCactive
      @RobBCactive 11 months ago

      Why are VLA requirements ridiculous? What's feasible for implementations can change with time.
      The first cc(1) I used had =+ & =- and didn't even support K&R C or the C widely published in books.
      BTW the VLA inclusion unbroke the single error I made in an exam at Uni. which cost me a 100% result long before the C standard inclusion so you need a really good rationale.

    • @greyfade
      @greyfade 11 months ago +1

      @@RobBCactive It's ridiculous because only GCC properly supports it, and the feature was added and then deprecated and then re-added to the standard. This is an absurd thing to do, especially for a committee that is so overwhelmingly committed to keeping the language as much the same as possible over the decades.

    • @RobBCactive
      @RobBCactive 11 months ago

      @@greyfade compilers didn't support ANSI C until they did, obviously function prototypes are absurd by your reasoning.

    • @greyfade
      @greyfade 11 months ago +1

      @@RobBCactive That's a disingenuous argument. The situation is not comparable. C didn't add function prototypes to the standard and then remove them in the next version and then add them back in the next version. They didn't do that with any feature except VLAs. And they haven't done that with any other compiler-specific feature, either. They didn't add an MSVC-specific extension or a Clang-specific extension or a Sun extension that no one else implemented. They only did that with GCC's VLAs.

    • @RobBCactive
      @RobBCactive 11 months ago

      @@greyfade nope, VLA is implementable and understandable by competent people.
      You've made no case why VLA is not useful or impractical.

  • @kiseitai2
    @kiseitai2 1 year ago +4

    The optimizations flags in gcc bit me a long time ago. My code had no bugs without optimization flags on but then would develop a bug after O2. I don’t recall what the exact issue was, but from then on I would run my unit tests with and without optimization flags to minimize the potential for aggressive optimizations or missing a keyword to force the compiler to be more careful with a function.

    • @eskilsteenberg
      @eskilsteenberg  1 year ago +15

      Your code was buggy before you turned on the optimization flags; the flags just revealed the bugs. Your strategy of testing in multiple different optimization modes is the right one!

  • @blakebaird119
    @blakebaird119 1 year ago +8

    Definitely request more of these detailed C videos from Eskil. It's a space where there just is not a lot of content

    • @eskilsteenberg
      @eskilsteenberg  1 year ago +7

      I'll try, but I'm not a YouTuber, so I have limited time.

    • @blakebaird119
      @blakebaird119 1 year ago +5

      @@eskilsteenberg understood just if you ever feel a little inspired like this one and How I Program C, a lot of fans will appreciate it I think 👍

    • @sophiacristina
      @sophiacristina 1 year ago +4

      Agreed, so cool that i accidentally stumbled on this video!

  • @porky1118
    @porky1118 1 year ago +2

    48:30 That's why the Rust rules for mutable references are so nice.

  • @raihanulbashirhridoy6122
    @raihanulbashirhridoy6122 2 years ago +6

    Those white flashes (when changing slides) are hurting my eyes 😕

  • @puppergump4117
    @puppergump4117 1 year ago +2

    At least I definitely know when the slide changes

  • @georgederleres8489
    @georgederleres8489 3 months ago +1

    I now have a clearer understanding as to why C is :
    a) Fast
    b) Dangerous
    :D

    • @eskilsteenberg
      @eskilsteenberg  3 months ago +1

      That's what the video is meant to do! Thank you!

  • @felixfourcolor
    @felixfourcolor 1 year ago +1

    I don't even know C but I find this extremely entertaining

  • @porky1118
    @porky1118 1 year ago +3

    35:38 C without automatic casting would be nice, I guess. Especially when having such weird casting rules.

  • @epsi
    @epsi 1 year ago +1

    39:04
    There's a new proposal document, N3128, on the WG14 site that actually wants to stop this exact thing because of how observable behavior (printf) is affected.

  • @TomStorey96
    @TomStorey96 1 year ago +2

    Great video, but the flashes between slides are quite irritating.

  • @puncherinokripperino2500
    @puncherinokripperino2500 2 years ago +2

    hmm, weird that slides are not text files in Visual Studio (-:

  • @rogo7330
    @rogo7330 1 year ago

    The bit where if(x==NULL) printf("Error\n"); does not happen makes perfect sense. We are not avoiding the access to memory at the NULL address, so the compiler assumes that x is not NULL, because otherwise the access would segfault. If we called goto or put the access to x inside an else block, we would avoid this issue.

    • @ABaumstumpf
      @ABaumstumpf 1 year ago

      But that is not what is happening. His claim about the optimiser being allowed to assume malloc always returns memory is strictly wrong. You can easily check that by looking at things like compiler-explorer.
      The problem is with Linux, as the kernel will return a memory address even if it does not have any more free memory.

  • @raywang2061
    @raywang2061 1 year ago +1

    Great video, although it would be better not to have those white flashes in between slides. Really hurts my eyes when watching this at night.

  • @odissey2
    @odissey2 10 months ago

    What puzzles me is that a+b operation takes one clock, but 'if (a

  • @LogicEu
    @LogicEu 2 years ago

    Thank you very much for this! Absolutely love your talks

  • @martixy2
    @martixy2 1 year ago

    God, I hope no epileptics find this video.
    Recommend you watch the video in a very small window.

  • @abenarroch
    @abenarroch 2 years ago +1

    Soo many, inexplicable behavior explained!

  • @Bolpat
    @Bolpat 1 year ago

    34:40, Wow I knew about promotion, but not that small unsigned types promote to a signed int. That’s -stupid- really surprising and inconvenient.

  • @MrTrollland
    @MrTrollland 2 months ago

    At 47:29 why is this not optimizable because of aliasing concerns? `count` will be derefed once, used for a calculation and the result will be passed to `memset` by value. Even if while `memset` runs `*count` gets overwritten, that should not affect the behavior, provided `*count` was valid to begin with

    • @ronald3836
      @ronald3836 21 days ago

      The point is that the code shown before the memset() code cannot be optimized into the memset() code.

  • @dudamoos
    @dudamoos 1 year ago

    One thing I see/hear often regarding C++ is that the compiler defines the behavior of your program in terms of the "Abstract Machine". UB and the "as-if" rule are consequences of this machine's behavior, even if it would be ok on real hardware. Does C have a similar concept? For example, what you say at 55:46: In the C++ Abstract Machine, every allocation is effectively its own address space. This has important consequences: no allocation can be reached by a pointer to another allocation, comparison of pointers to different allocations is not well defined, etc.

    • @CruelNoise
      @CruelNoise 10 months ago

      Yes, the C language standard also uses the abstract machine concept.

  • @andrewj1939
    @andrewj1939 2 years ago +1

    Great video Eskil!

  • @RC-1290
    @RC-1290 2 years ago +1

    1:11:07 You seem to vastly overestimate my reading speed, especially when I try to understand what I'm reading.

  • @ahdog8
    @ahdog8 3 months ago +1

    26:37 OK, that's a pretty cool idea, but surely compilers aren't this smart, right? That seems like it would be hard af to deduce

    • @ronald3836
      @ronald3836 21 days ago +1

      Compilers have no problem deducing that x < 5 and using that to eliminate the conditional statement.

  • @porky1118
    @porky1118 1 year ago +2

    25:12 In Rust, wrapping is an error. But the compiler might optimize it, so there won't be an error. I wonder if it also takes advantage of such optimizations.

    • @porky1118
      @porky1118 1 year ago

      I think, I need to use unchecked (unsafe) functions.

  • @Bolpat
    @Bolpat 1 year ago

    33:10 This is imprecise. You cannot test whether an operation has -wrapped- _overflown_ (that would be UB), but you very much can test if an operation would -wrap- _overflow_ and if it would, do something else. In the case of addition, subtraction, and multiplication, you can check if they would -wrap- _overflow._ The compiler will optimize this to actually do what an unaware programmer would have put into code (namely do the operation and then check if it did overflow), which is really ironic.

    • @eskilsteenberg
      @eskilsteenberg  1 year ago

      That's a behaviour supported by some implementations but not by the C standard. Some implementations and C2x do support specific functions that let users detect wrapping.

    • @Bolpat
      @Bolpat 1 year ago

      @@eskilsteenberg I don't know what you mean.
      For addition, subtraction, and multiplication, you can check using INT_MAX and INT_MIN if the actual operation would wrap.
      Out of the blue, I'd say it's something like
      If ((a < 0) != (b < 0)
      || a > 0 && INT_MAX - a >= b
      || a < 0 && INT_MIN - a

    • @eskilsteenberg
      @eskilsteenberg  1 year ago +1

      @@Bolpat There is no wrapping behaviour for signed int. It's UB. If you write if(a + 10 < a), to test if a + 10 will overflow, then the compiler can just look at that and say "you can't make a number smaller by adding 10 to it, so this will always be false and I can therefore optimize it away."

    • @Bolpat
      @Bolpat 1 year ago

      @@eskilsteenberg When I wrote “wrapping,” I meant overflow (which is UB). This is on me, I didn't have the nomenclature right. I meant to say: “You can check if overflow would happen if the addition were executed and do something else like report an error.”
      “Wrapping around” is what unsigned integers are guaranteed to do.
      Correct me, if I'm wrong, because I'm not actually a C programmer, but a professional C++ programmer. I also use C# for making custom GUI tools and I really like the _checked_ vs. _unchecked_ system.

  • @sharpfang
    @sharpfang 3 months ago

    41:30 this makes me angry. buf is volatile. How the heck does the compiler know at the time of assigning buf[0] to c1 and c2 it hasn't been assigned by an external resource? It makes assumptions about the value despite volatile!

  • @flippert0
    @flippert0 9 months ago

    23:05 for me, this doesn't illustrate the power of C, but how vague the semantics of this language are. It probably also will work differently in debug and release/optimized builds.

    • @eskilsteenberg
      @eskilsteenberg  9 months ago

      What isn't defined isn't defined in any mode.

  • @flatfingertuning727
    @flatfingertuning727 1 year ago

    While your claim at 56:00 that programmers aren't allowed to "guess" at addresses may be consistent with the way clang and gcc actually behave, it's not consistent with what N1570 6.5.9 actually says: "Two pointers compare equal if and only if both are null pointers, both are pointers to the same object (including a pointer to an object and a subobject at its beginning) or function, both are pointers to one past the last element of the same array object, or one is a pointer to one past the end of one array object and the other is a pointer to the start of a different array object that happens to immediately follow the first array object in the address space."
    Also, I think a good talk about UB should consider the fact that gcc in C++ mode, and clang in either C or C++ mode, will interpret side-effect-free endless loops as an invitation to trigger arbitrary side effects and corrupt memory (as opposed to merely deciding that if no individual action within a loop is observably sequenced before some later operation, the loop as a whole need not be treated as sequenced either). If programmers could rely upon compilers to limit themselves to the latter optimization, it could allow programs to be more efficient, but under the former semantics, such optimization opportunities would generally only exist in incorrect programs.

  • @porky1118
    @porky1118 1 year ago +2

    I couldn't find part A and B of this advanced series.

  • @grimvian
    @grimvian 9 months ago +1

    Hi Eskil, just a funny thought. I think your name should have been Skill. :o)

  • @GeorgeTsiros
    @GeorgeTsiros 3 months ago +1

    why does your voice go away, starts croaking, at the end of every sentence?

  • @dengan699
    @dengan699 1 year ago +2

    It's absolutely insane how compilers went from "detect errors and report them to user" to "You know what? User is right, take what he/she wrote at face value".
    That's a complete, 180° turn in paradigm. I don't understand how it was even decided in the first place. If GCC was a student project which said "hey I am doing this weird trick because look, fast!" it would be laughed at and got an F | 5/20 | 2 whatever your grading system is

    • @eskilsteenberg
      @eskilsteenberg  1 year ago

      You are right that few people know about this shift, but it's worth noting that over the last 20 years, compilers have improved C performance by about 25%. That's probably a trillion dollars of hardware, and untold energy/carbon emission savings.

  • @zeratulofaiur2589
    @zeratulofaiur2589 2 years ago +1

    Most problems in the world are the result of some people trying to be smart on your behalf.

  • @shoulderstack5527
    @shoulderstack5527 1 year ago

    My favourite is @1:04:04. The compiler assumes malloc can't return null when it literally can!? Am I understanding that correctly!?
    I wonder, were there ever compiler wars? Like the browser wars that gave us so much crap.
    There's that saying in coding: "You should throw the first one away"; I'm beginning to think it applies to the whole industry. We just need to learn from our mistakes and design a new one.

    • @ABaumstumpf
      @ABaumstumpf 1 year ago

      It just isn't true: the compiler can not and DOES NOT assume that malloc always returns a non-null value. But malloc is performing syscalls to ask the OS for dynamic memory, and things like the Linux memory allocation scheme are opportunistic: it will always give you a valid address, and only when trying to access that memory will you know if you can really use it.
      But that is not a problem of C but of Linux.

    • @shoulderstack5527
      @shoulderstack5527 1 year ago

      I don't understand!? Who highlighted your reply to my post? If it was Eskil Steenberg then he seems to be disagreeing with his own statement at 1:04:04.
      What's going on? BTW, I don't claim to know the answer, I was commenting on the statement in the video, and assuming it was something Eskil Steenberg had experienced.
      @@ABaumstumpf

  • @dleiferives
    @dleiferives 1 year ago

    Thank you for this video

  • @IIJamesII
    @IIJamesII 2 years ago +2

    Wow this is insane.

  • @sandessharma8195
    @sandessharma8195 1 year ago

    Eskil Steenberg: "This is stupid right. ? " Me: If you say so.. 😅.. Excellent Video.. thank you very much.. Learned a whole lot..

  • @dmitry.shpakov
    @dmitry.shpakov 1 year ago

    Observation principle for me looks like wave function collapse in quantum mechanics, weird

  • @nevafrost956
    @nevafrost956 2 years ago +3

    Amazing video. May I ask what resources you recommend for understanding memory, cache and optimization? Where did you learn this stuff? I know it is probably mostly from experience, but are there any good books or resources you recommend?

    • @eskilsteenberg
      @eskilsteenberg  2 years ago +5

      I strongly recommend "What every programmer should know about memory". Google it and you will find the PDF. It's long and deep, but it's the bible!

    • @nevafrost956
      @nevafrost956 2 years ago

      @@eskilsteenberg Thanks a lot !

  • @dreanwave
    @dreanwave 2 years ago

    Oh yes, the content I crave.

  • @623-x7b
    @623-x7b 2 years ago

    45:38 shouldn't that be sizeof(Mystruct) ?

    • @Spiderboydk
      @Spiderboydk 2 years ago

      Yes.

    • @micromashor
      @micromashor 2 years ago +6

      No. It is better to use *s because they produce the exact same output, but if s changes type later on, the code won't break because it's tied to "whatever the type s points to" rather than the hard-coded "MyStruct".

  • @tomaszstanislawski457
    @tomaszstanislawski457 1 year ago

    I don't agree with ruclips.net/video/w3_e9vZj7D8/видео.html. The strict aliasing rule 6.5p7 tells that the `unsigned int` object can be changed via l-value expression of a union that contains `unsigned int`. So `convert.b=...` modifies object of union type that contains `unsigned int`. Therefore other l-value `unsigned` expression like `*p` should observe the effect as well. What is wrong in my thinking?

  • @SB-rf2ye
    @SB-rf2ye 2 years ago

    a[-5] is very useful in mathematical or scientific codes.
    I don't want the compiler to optimize it away, or cause bad behavior. I hope you notice this and convey it to your colleagues in the standard.

    • @trungngo9169
      @trungngo9169 1 year ago

      If you use it on a pointer it will be fine, because the compiler cannot see where the memory block starts and ends. But if you use it on a static array and the compiler can see that, something might break. For example: int a[1]; int b = 5; a[-1] = 6; If you then read b, the compiler may assume that nothing changed the value of b (indexing out of bounds is UB), so it just gives you the value 5.

  • @lauri268
    @lauri268 2 years ago

    Note for me:
    45:02 memset
    MISRA C

  • @vmannn4259
    @vmannn4259 1 year ago

    you kinda sound like todd howard

  • @UGPepe
    @UGPepe 2 years ago +3

    all this complexity in the compiler that increases compile times, adds cognitive load and makes us paranoid in order to optimize some straw man code that nobody writes, great engineering

    • @eskilsteenberg
      @eskilsteenberg  2 years ago +13

      This stuff saves a huge amount of power, time and CO2. It's incredibly valuable for the world to have languages that allow for optimizations like these. It is great engineering.

    • @UGPepe
      @UGPepe 2 years ago +1

      Eskil Steenberg I very much doubt these claims. Huge amounts of power? These optimizations are at the very tail end of the things that programmers can do to increase performance and that they very seldom do anyway.

    • @eskilsteenberg
      @eskilsteenberg  2 years ago +7

      @@UGPepe it obviously depends on the application. Most of these are small optimizations, but some like Aliasing can in rare cases mean 100x performance. It does not need to be that big to be worth it. A single 1% of increased performance can be worth hundreds of millions of dollars if you operate at scale. Imagine making all of Google/Microsoft/AWS just one percent more efficient. Imagine how much power that is. So much of the software that isn't written in C relies on C implementations. By making C 1% faster, Python, Java, Linux, PHP, Curl, Apache, SQL, SSL and almost every other software becomes 1% faster. It really matters.

    • @UGPepe
      @UGPepe 2 years ago +1

      @@eskilsteenberg I like your videos very much but when it comes to UB I think your viewers would benefit from knowing that there's another side of this debate and that not everybody is on-board with this whole UB thing. For instance, see my comments on Chandler's 2016 video on UB (ruclips.net/video/yG1OZ69H_-o/видео.html) you'll see that many people agreed with me (my tone was not the best but I was angry).

    • @UGPepe
      @UGPepe 2 years ago +1

      @@eskilsteenberg Another thing, at 3:00 sure that's a misconception today, but people should know that the original intent was to make a hi-level assembler and keep programming the actual hardware not some abstract machine. That idea came later from different people with different mentality and goals, that not everybody agrees with.

  • @logangraham2956
    @logangraham2956 1 year ago

    EH! so it's all witchcraft, all the way down the tree.
    good to know.

  • @v1Broadcaster
    @v1Broadcaster 1 year ago

    thank god for rust. C gives me nightmares, literally prefer ASM

  • @ross9263
    @ross9263 1 year ago

    Isnt rust faster now 😌😌😌

    • @ABaumstumpf
      @ABaumstumpf 1 year ago

      For many things sure. Not always.

  • @10e999
    @10e999 2 years ago +202

    Nice to see a new C lang focussed video.
    Your "How I program C" video was great.

    • @aziskgarion378
      @aziskgarion378 2 years ago +12

      It's probably the best video on how to structure C programs, in history. I have downloaded it and kept it in every media files I got. Hopefully Eskil realizes how changing that video is. Even though I don't program in C in anything, it's really the best general programming ethics guide.

  • @user-hk3ej4hk7m
    @user-hk3ej4hk7m 3 months ago +28

    C: I'm not going to crash therefore I don't need a seatbelt

  • @Sarsanoa
    @Sarsanoa 1 year ago +194

    I am a bit baffled that when a C compiler encounters user code that does the impossible (such as a range check that always passes/fails at compile time, or guaranteed undefined behaviour detectable at compile time) that its first instinct is "how can I exploit this to make the code run faster" rather than "tell the user their code probably has a bug".

    • @eskilsteenberg
      @eskilsteenberg  1 year ago +79

      I agree that compilers should be a lot better at explaining what they are doing, for instance by syntax-highlighting code deletion. However, they should also do the optimizations that the standard affords them.

    • @cosmic3689
      @cosmic3689 1 year ago +32

      @@eskilsteenberg yeah it would be great if the compiler gave some notice that its just ignoring code because it thinks its pointless, like 'hey maybe use volatile' or 'this expression is always true' etc.
      its been a while so maybe they are warnings now but it doesnt sound like it lol

    • @zombie_pigdragon
      @zombie_pigdragon 1 year ago +37

      The story here isn't actually too hard to explain! If you remember back when GCC and clang/LLVM were at each other's throats for being the "better compiler", the number one issue was speed- the faster compiler, the one that won all the benchmarks, was expected to win the compiler holy war. Therefore, compiler developers put massive numbers of hours into making their compiler generate the fastest code possible. Until shockingly recently, they didn't really worry about the effects this would have on developers, so they didn't put nearly as many hours into warnings and heuristics that warn when the code exerts unexpected behavior. As a result, the warnings that exist are mostly for simple rule breaks, and there's just not enough reporting infrastructure for the optimizer to report that some function is being optimized out of existence in a way that's probably not what the programmer intended. The fix is to put pressure on the devs- either make the patches on your own and contribute them to the projects (the best option!), or repeatedly ask for improved UB detection and ask others to advocate with you.

    • @Hauketal
      @Hauketal 1 year ago +11

      @@cosmic3689 Right, more warnings about those strange optimizations are wanted. But there is a catch: macros sometimes result in such code, especially when used with literal arguments. So at the same time, there must be some method to avoid overwhelming the developer with such warnings.

    • @RobBCactive
      @RobBCactive 11 months ago +1

      @@Hauketal The compiler invokes the pre-processor, it can report a few instances of an error, then summarise repetitions.

  • @diketarogg
    @diketarogg 2 years ago +30

    Don't do the flashing white screen. It hurts my eyes and is annoying in general.

    • @DrGreenGiant
      @DrGreenGiant 1 year ago

      I had to stop watching about 10 mins in because of it. Seizure inducing.

  • @tomaspecl1082
    @tomaspecl1082 1 year ago +33

    This was interesting! I did not know that the compiler did (or could do) such weird and scary optimizations. Now I appreciate that I know assembly even more because at least there you know what you write is gonna stay there no matter what. Or at least I can debug C code by viewing the assembly.

  • @lennyphoenixc
    @lennyphoenixc 11 months ago +20

    The C compiler is really that guy that says "oh yeah buddy you didnt mean to do that did you, lemme get that for ya *deletes code block*"

  • @kkpdk
    @kkpdk 1 year ago +114

    Thank you for making this. As someone who gets asked when 'the compiler does weird easily biodegradable matter', being able to point people to this is gold. Restrict is something I miss in C++, it is so useful for SIMD intrinsics.

    • @smellytaint
      @smellytaint 10 months ago

      most c++ compilers allow for a restrict extension like __restrict for g++ and clang.........

    • @69696969696969666
      @69696969696969666 6 months ago +1

      Restrict, or an equivalent, is available in all major C++ compilers. That said, restrict itself is a woefully inadequate tool for working with aliasing semantics. It hasn't been standardized in C++ because it's fundamentally a dead end.
      Also, C++ is leaps and bounds better for SIMD programming relative to C. Libraries like E.V.E. or Eigen are literally impossible to write in C.

  • @pierreollivier1
    @pierreollivier1 1 year ago +4

    ahah so basically the compiler is just a YouTube comment troll that looks at your code and responds with "Ahaha too long didn't read".

  • @andrasfogarasi5014
    @andrasfogarasi5014 1 year ago +9

    40:53 Assembly jumpscare. I'm damn terrified.

  • @dashl5069
    @dashl5069 1 year ago +4

    Is the explanation at 7:57 actually correct? I would have assumed the problem is that *a* can change elsewhere, meaning x and y are not necessarily the same, not that x can change elsewhere, causing y to be equal to x but not a.

  • @canaDavid1
    @canaDavid1 1 year ago +19

    33:00 another way to understand this issue is:
    The multiplication of a and b first multiply as shorts, wrapping if needed, and then is cast to an unsigned int. This means that the highest 16 bits will always be 0, and it will eliminate the if.

    • @adityajain2839
      @adityajain2839 1 month ago

      This can explain the branch decision. But the result won't be 4 billion this way (?)
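
A note on the 33:00 example, as a sketch rather than the video's own code: under C's integer promotions, two `unsigned short` operands are converted to (signed) `int` before the multiplication whenever `int` can hold every `unsigned short` value, so a large product overflows a signed type and is undefined. Assuming a 32-bit `int`, casting one operand to an unsigned 32-bit type first keeps the arithmetic well defined. The name `mul_u16` is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: multiplies two 16-bit values without signed overflow.
   Assumes int is 32 bits wide, as on most desktop targets. */
uint32_t mul_u16(uint16_t a, uint16_t b)
{
    /* Without the cast, a and b are both promoted to signed int, and
       65535 * 65535 does not fit in a 32-bit int (undefined behavior).
       With the cast, the multiplication is done in unsigned arithmetic. */
    return (uint32_t)a * b;
}
```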

  • @michaelutech4786
    @michaelutech4786 2 years ago +66

    I spend my time working with people who ponder sources of truth and believe that there is one true dogma that will save our souls (keep it simple).
    I learned C some 30 years ago and when I feel nostalgia watching this video it's not because I miss C. What I miss are people who actually know what they're talking about and why, people like Eskil.

    • @BenVanCamp
      @BenVanCamp 16 days ago +1

      Jesus is that truth.

  • @michaelzomsuv3631
    @michaelzomsuv3631 2 years ago +5

    Hi Mister Steenberg! If you happen to read this message, would you consider doing a video about C23? I'd like to hear what you think about the new features coming in C23.

  • @RC-1290
    @RC-1290 2 years ago +4

    24:42 Subtitles aren't helping me here, because it also hears both signed and unsigned having possible optimizations. I *think* the second one is "can't", but let's just say it's not clearly defined ;)

  • @deniszaika9534
    @deniszaika9534 2 years ago +8

    Clean and constructive talk about great language.

  • @Lantalia
    @Lantalia 1 year ago +20

    I've found I prefer Rust these days, but I have fond memories of the C and C++ standards from 20 years ago, thanks for the fun video

    • @Heater-v1.0.0
      @Heater-v1.0.0 10 months ago +2

      Rust is my language of choice these last three years or so. However I still love C and would be happy to use it where needed. I love it for its hard core simplicity. I love it because it has hardly changed in decades and I hope that remains the case. However I have also used C++ a lot and absolutely refuse to ever go back to that deranged monster.

    • @69696969696969666
      @69696969696969666 6 months ago

      @@Heater-v1.0.0 Say what you will about C++, you'll have to square it with the fact that even the major C implementations (Clang/GCC/MSVC/etc) choose the "deranged monster" of C++ over the "hard core simplicity" of C. Simply put, the fact is that C++ is more popular than ever because it's actually *more* insane to use C lmfao

    • @Heater-v1.0.0
      @Heater-v1.0.0 6 months ago

      @@69696969696969666 That is true. Most of the world's C compilers were written in C. C++ evolved from C and the compiler implementations followed. All seems quite reasonable. I agree that C++ offers a lot of conveniences that can make life much easier than C, although I'm still happy to use C or the C subset of C++ where appropriate. It is possible to write nice C++ code if one stays away from much of the ugliness of the language. Unfortunately it's hard to do that on a large project with many people working on it, as they tend to start introducing all kinds of C++ weirdness.
      Anyway, all that long and tortuous history does not mean we have ended up in a good place with C++. Many agree with me. Like Herb Sutter with his cppfront work. And Herb is on the C++ committee!

  • @tomaszstanislawski457
    @tomaszstanislawski457 1 year ago +4

    Even though VLA objects with automatic storage (stack-allocated) are not very useful in practice, the VLA **types** are really useful for handling multidimensional arrays.
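
To illustrate the VLA-types point above, here is a small sketch assuming a compiler that supports variably modified types (mandatory in C99, an optional feature in C11). The function name `sum_matrix` is invented for this example.

```c
#include <stddef.h>

/* Hypothetical helper: sums a rows x cols matrix. The parameter type
   double m[rows][cols] is a variably modified type, so m[r][c] indexes
   correctly without manual r * cols + c arithmetic. No VLA object is
   created on the stack here; only the type depends on runtime values. */
double sum_matrix(size_t rows, size_t cols, double m[rows][cols])
{
    double total = 0.0;
    for (size_t r = 0; r < rows; r++)
        for (size_t c = 0; c < cols; c++)
            total += m[r][c];
    return total;
}
```

The same type works for heap storage, for example `double (*m)[cols] = malloc(rows * sizeof *m);`, which gives two-dimensional indexing over a single allocation.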

  • @Ryan-in3ot
    @Ryan-in3ot 1 year ago +3

    this is maximum anxiety for everything I've ever written. At first it was like "alright, perhaps I should reorganise some things for better performance" and then it was "oh god, I hope I didn't implicitly assume that the padding in my structs would be persistent."

  • @viacheslav1392
    @viacheslav1392 1 year ago +2

    Great video, except the nonsense on 43:20

  • @zabotheother423
    @zabotheother423 2 years ago +10

    Interesting talk! It’s always fun seeing C code and realizing that it’s undefined :)
    One thing I don’t understand is: in what scenario would you ever free an array and then check that you didn’t reallocate the same block? I kind of get if thread A allocates, thread B does some calculation, thread A frees and reallocates, then thread B checks if it’s already done the calculation for the current block. Seems like a flawed architecture though, if this is the case then A should trigger B on a reallocation and B will wait otherwise. Maybe I just don’t get it though

    • @eskilsteenberg
      @eskilsteenberg  2 years ago +13

      There is a common pattern using a mechanism called "compare and exchange". Let's say you have a shared counter and many threads want to increment its value. Each thread wants to access this value and add one to it. To do this you read the value, add one to it, and write it back. The problem with this is that between reading and writing it back some other thread may have incremented the value. So if thread one reads the value 5 and adds one to it, then thread two reads the value and adds one to it, and then both write it back, the value is set to 6, not 7, even though 2 threads have added 1 to 5.
      To deal with this, processors have a set of instructions called "compare and exchange". They let a thread say "if this value is X, change it to Y". So our threads use that to say: if the shared value is still 5, change it to 6. If two threads try to change 5 to 6, the first one will succeed, and the second one will fail and will have to re-read the value and try again.
      This technique is often used with pointer swaps. You have a pointer to some data that describes a state; you read that state, create some new state, and then use compare and exchange to swap out the pointer for one to the new state. In this case you are using the pointer to see whether the state has changed since you read it, and this is where an ABA bug can happen, if two different states end up at the same pointer.
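
The counter example described above can be sketched with C11 atomics. This is an illustrative sketch, not code from the video; `shared_counter`, `increment_counter`, and `read_counter` are made-up names, and it assumes a toolchain that provides `<stdatomic.h>`.

```c
#include <stdatomic.h>

/* Hypothetical shared counter, incremented with a compare-and-exchange loop.
   A static atomic_int starts at zero. */
static atomic_int shared_counter;

void increment_counter(void)
{
    int seen = atomic_load(&shared_counter);
    /* Try to replace 'seen' with 'seen + 1'. If another thread changed the
       counter first (or the weak form fails spuriously), the call returns
       false, stores the current value in 'seen', and we retry with it. */
    while (!atomic_compare_exchange_weak(&shared_counter, &seen, seen + 1)) {
        /* retry with the refreshed 'seen' */
    }
}

int read_counter(void)
{
    return atomic_load(&shared_counter);
}
```

For a plain counter, `atomic_fetch_add` would be simpler; the loop is shown because it is the same shape as the pointer-swap pattern, where the new value has to be computed from the old one.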

    • @deniszaika9534
      @deniszaika9534 2 years ago

      Yes, some kind of smart pointers can be easily implemented with C.

    • @zabotheother423
      @zabotheother423 2 years ago

      @@eskilsteenberg why is this advantageous to using a lock? Seems like a rather roundabout way to solve the shared resource problem

    • @eskilsteenberg
      @eskilsteenberg  2 years ago +20

      @@zabotheother423 Lockless algorithms are generally faster because they don't require any operating system intervention. Mutexes are convenient because if you use a function that locks them, any thread that gets stuck on a lock will sleep until the lock is available, and the operating system can wake up the thread when the lock gets unlocked. This OS intervention is good, because threads don't take up CPU while waiting for each other. On the other hand, sleeping and waking threads take many cycles, so if you really want good performance it's better not to have a sleeping lock but just do a spin lock if you expect to wait for a few cycles for the resource to become available. This means that you can only hold things for very short amounts of time, so it's harder to design lockless systems, but also more fun!
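
As a rough illustration of the spin lock mentioned above, here is a minimal sketch built on C11 `atomic_flag`. The type and function names are invented for this example, and a production lock would usually add a pause or back-off inside the loop.

```c
#include <stdatomic.h>

/* Hypothetical minimal spin lock built on C11 atomic_flag. */
typedef struct { atomic_flag busy; } spin_lock;

void spin_lock_acquire(spin_lock *l)
{
    /* test_and_set returns the previous value: keep spinning while it was
       already set, i.e. while another thread holds the lock. */
    while (atomic_flag_test_and_set_explicit(&l->busy, memory_order_acquire)) {
        /* busy-wait: no sleeping and no operating system involvement */
    }
}

void spin_lock_release(spin_lock *l)
{
    atomic_flag_clear_explicit(&l->busy, memory_order_release);
}
```

A lock is declared as `spin_lock l = { ATOMIC_FLAG_INIT };`. Because waiting threads burn CPU, this only pays off when the lock is held for a handful of cycles, as the comment says.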

    • @zabotheother423
      @zabotheother423 2 years ago +5

      @@eskilsteenberg interesting. I’ve heard of lockless designs before but never really explored them. Thanks

  • @xplinux22
    @xplinux22 1 year ago +6

    Such an amazing video! I loved all these fascinating tidbits about C (and compiler design in general) and you held my attention the entire time. I think I'll watch it a few more times to really grok the material. Bravo!

  • @kellybmackenzie
    @kellybmackenzie 1 year ago +4

    Thank you so much, this video is awesome! I appreciate this a lot

  • @svaira
    @svaira 1 year ago +7

    30:30 I think here it might be better to define an enum with values 0,1,2,3 and to cast a to that type / to have it that type. With -Wswitch, I would hope that means that the value being outside of that enum should also be UB / unreachable (although I would have to look it up, it also depends on how the compiler warnings work here). I would prefer that since it doesn't depend on compiler intrinsics, and it also doesn't let you skip values in between (at least if it's a sensible enum like "enum value_t {A,B,C,D};" and not something strange like "enum weird_t {A=55, B=17, C=1, D= -1854};").

    • @ronald3836
      @ronald3836 21 days ago +1

      in C, it is not undefined behavior for an enum to have a value that is not enumerated. Basically enums are just ints or whatever integer type you picked.

    • @svaira
      @svaira 21 days ago

      @@ronald3836 absolutely, that's why I pointed to -Wswitch, which makes it a warning (hopefully). It's not in the standard, but it is a pretty typical optional limitation of what you can do in most compilers.
      Also I should say that I usually use -Werror with lots of warnings turned on. I know many people are not as diligent tho
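
A small sketch of the enum point in this thread; `value_t`, its enumerators, and `describe` are hypothetical names. In C an enum object can hold a value that matches no enumerator, so the fall-through path after the switch stays reachable. GCC's `-Wswitch` warns about enumerators that lack a `case`, not about out-of-range values.

```c
/* Hypothetical enum and dispatcher illustrating the point. */
enum value_t { VAL_A, VAL_B, VAL_C, VAL_D };

int describe(enum value_t v)
{
    switch (v) {
    case VAL_A: return 10;
    case VAL_B: return 20;
    case VAL_C: return 30;
    case VAL_D: return 40;
    }
    /* Reachable in C: v may hold a value that is not one of the enumerators
       (for example after a cast), so the compiler cannot assume this away. */
    return -1;
}
```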

  • @thesenamesaretaken
    @thesenamesaretaken 1 year ago +5

    I'm looking forward to compilers optimising away array index checks, assuming programmers are too clever to make mistakes is obviously the way forward.

    • @69696969696969666
      @69696969696969666 6 months ago +1

      The compiler can't optimize away my bounds checks because I don't check in the first place. Hopefully in the long term the undefined behavior in my out-of-bounds array accesses will result in even greater performance. Ideally compilers will become sophisticated enough to replace my entire code base with "return 0".

    • @ronald3836
      @ronald3836 21 days ago

      If you check array bounds and then access the array ANYWAY, then the compiler is indeed free to remove the bounds check.

  • @chriss3404
    @chriss3404 1 year ago +12

    So much genuinely valuable information that contextualizes and explains many C intuitions that I've built over time.
    Seriously one of the best quality videos I've seen on this platform in recent memory.

  • @turun_ambartanen
    @turun_ambartanen 1 year ago +3

    The white flashes whenever the slide changes make this impossible to watch.

  • @codegeek98
    @codegeek98 1 year ago +4

    34:00 Would be fun to see this run on an architecture that uses something other than 2's complement for hardware acceleration of signed integer operations

  • @kalebdodge775
    @kalebdodge775 3 months ago +1

    Moral of the story: Initialize your variables.

  • @zxuiji
    @zxuiji 1 year ago +2

    29:35, I actually have a better way to write that code: make member 0 a default function that does nothing or at least assumes invalid input, then MULTIPLY a against its boolean check, so in this example it would be
    a *= (a >= 0 && a < 4);
    func[a]();
    Notice how there's no if statement that would result in a jump instruction which in turn slows down the code. If the functions are all in the same memory chunk, then even if the CPU assumes a is not 0 it only has to read backwards a bit to get the correct function, and from my understanding reading backwards in memory is faster than reading forwards.
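
The two lines above can be fleshed out into a compilable sketch. The handler names and the `last_called` variable are hypothetical, added only so the behavior can be observed. Whether the generated code is actually branch-free depends on the compiler and target.

```c
/* Hypothetical handlers; slot 0 is the "invalid input" default. */
static int last_called = -1;
static void on_invalid(void) { last_called = 0; }
static void on_one(void)     { last_called = 1; }
static void on_two(void)     { last_called = 2; }
static void on_three(void)   { last_called = 3; }

static void (*const func[4])(void) = { on_invalid, on_one, on_two, on_three };

void dispatch(int a)
{
    /* The comparison yields 0 or 1, so any out-of-range a is multiplied
       by 0 and collapses to slot 0 before the table is indexed. */
    a *= (a >= 0 && a < 4);
    func[a]();
}
```

Unlike the version in the video, this keeps every index defined, so the compiler gets no out-of-range assumption to exploit.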

  • @SB-rf2ye
    @SB-rf2ye 2 years ago +4

    c is awesome! please make more about EVERYTHING you would like to share! 🥺

  • @michaelclift6849
    @michaelclift6849 2 years ago +39

    Thank you, this is terrifying. Compilers are amazing. So many times I think I've found a faster way to do something, then the compiler just shakes its head at me and produces the same binary.
    @23:47 Sometimes I depend on overlap. Splitting the operation into multiple statements ie. x *= 2; x /= 2; has always produced the behaviour I want. It is interesting that x *= 2; x/= 2; is not always the same as x = (x*2)/2.
    @34:32 I'm sceptical that this can happen. I can't reproduce it on GCC 8.3.0, even if I add the casts!
    @51:08 there's something wrong with your newline here ;-)

    • @gregorymorse8423
      @gregorymorse8423 1 year ago +1

      If you write nonsense code that gets into language or compiler details unnecessarily, you are not doing anyone any favors. Clearing the high bit can be done by masking e.g. x &= (1

    • @michaelclift6849
      @michaelclift6849 1 year ago +2

      @@gregorymorse8423 I don't. It was a bad example. I would never intentionally overflow a multiply. The only times I depend on overflow are for addition and subtraction. In 8 bits 2-251=7. This is necessary if you want to calculate the elapsed time of a free running 8 bit timer. People tend to think of number ranges as lines, which is why overflow causes some confusion. For addition and subtraction it can help to think of number ranges as circular, or dials. Then the boundaries become irrelevant.
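
The 8-bit timer arithmetic above (2 - 251 = 7) can be sketched as follows; `elapsed_ticks` is a made-up name. The subtraction happens in `int` after promotion, and the conversion back to `uint8_t` reduces the result modulo 256, which is well defined for unsigned types.

```c
#include <stdint.h>

/* Hypothetical helper: ticks elapsed on a free-running 8-bit timer. */
uint8_t elapsed_ticks(uint8_t start, uint8_t now)
{
    /* now - start is computed in int after promotion (so 2 - 251 is -249);
       converting back to uint8_t wraps it modulo 256, giving 7. */
    return (uint8_t)(now - start);
}
```

This only works while fewer than 256 ticks pass between the two readings, since longer intervals alias to shorter ones.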

    • @gregorymorse8423
      @gregorymorse8423 1 year ago +3

      @Michael Clift overflows are well defined behavior in two's complement number systems. And applications like cryptography rely on this, and deliberately overflowing multiplication when doing modular arithmetic is practically vital to achieve performance. That C has tried to be low level but introduced bizarre undefined behavior concepts all over to capture generality that is useless is beyond me. The formal concept is beyond the dial analogy that a+b is e.g. for 32 bit unsigned (a+b) % 2^32 or likewise for multiplication. C does seem to respect this for unsigned numbers in fact, it's signed ones that are trickier to describe so they chickened out.

    • @nim64
      @nim64 1 year ago +2

      with regards to 34:32, copying the code as written in the video and compiling with just "gcc -O3 t.c -o t" reproduced the result for me on gcc 9.3.0 (ubuntu, wsl)

    • @michaelclift6849
      @michaelclift6849 1 year ago +2

      @@nim64 Thanks nim. I tried it with -O3 and now I see the symptom too (still on GCC 8.3.0). It appears to happen with any optimisation level apart from -O0

  • @robert36902
    @robert36902 2 years ago +18

    I loved this video, thanks for sharing! As someone who started programming on the x86 processor, which I think has a more forgiving memory model, it's great to review the acquire/release semantics and other little things that may trip me up.
    Regarding undefined behavior: Do you have an estimate on how often the compiler will raise a warning before relying on the UB to delete a bunch of code? To me it seems most or all of these should be a big red flag that there's an error in the program, even though the C language assumes the programmer knows what they're doing.

  • @nrncproductions
    @nrncproductions 2 years ago +6

    Thank you for this new C lesson. Great as always.

  • @regularsalamander
    @regularsalamander 1 year ago +3

    growing up is realizing C is the best programming language

  • @thomasfink2385
    @thomasfink2385 1 year ago +3

    It seems to start boring and thick and slow but it gets interesting fast. Excellent.

  • @sheeftz
    @sheeftz 1 year ago +4

    Finally a good explanation of what the volatile keyword means in C/C++.
    Just finished watching. VERY GOOD stuff here. It's a shame that there's no mention of how these things relate to C++. Is it the same or different in C++? I wish I had a video of the same quality about C++.

    • @ABaumstumpf
      @ABaumstumpf 1 year ago

      Yeah, I was surprised that he started explaining volatile accurately. So often, even from very brilliant people, you hear rants about volatile and how it does not mean what we think it means, and then it turns out they themselves are giving false explanations.

  • @ismbks
    @ismbks 3 months ago +2

    please put a seizure warning next time