This is *so* C++. You have a lot of nice tools available to safely do an operation. There are several variants so you can pick the one that fits your usecase best. At the cost of language spec size, sure, but that's an understandable tradeoff. And then you have the one way that is easiest to write, inherited from C and also the most dangerous option of all.
Welcome to the world of embedded. Even when you want to go with C++ you will end up using some C because all the drivers are C-only. Your precious vectors will be cast to scary pointers (and DMA doesn't know any other way anyway). Also say hi to bitfields and magic bytes.
@@brylozketrzyn Fortunately, even if a library is written in C, this mess of a language can be wrapped into something useful. Unfortunately, the C code quality and practices are often so bad that the best you can do (if possible) is to rewrite it from scratch and not even bother wrapping it.
I love that you mention how obvious it is that public_cast should never be used, because I 100% agree with you - and yet, I've actually run into a problem once in the past where using it was legitimately my best option (and I ended up solving it in an even more horrible way)! 😅 So the situation came up when we remastered an old video game, and in the process, we needed to update the Havok physics middleware used in it to a newer version. This is always a horrible process (and one of many reasons I hate using Havok, or middleware in general), but it was made especially difficult for us thanks to another problem: The binary assets for the game had Havok data embedded directly into them, and since the majority of the original game's converters were no longer usable for us, we couldn't just recreate those assets in the new format. Yet we also couldn't use the original game assets without modifications in the game, since the new Havok version would just reject the old serialized data. What we had to do to solve the problem was to create a new converter that reads in the game's old binary assets, deserializes the Havok data within them using the old Havok version, re-serializes it using the new Havok version, and finally writes the entire asset back out to a file. However, there was a small problem there: This kind of migration process wasn't built directly into Havok. I know it has some support for version migration, but not for this specific use case. So instead of migrating the data directly, we had to create new Havok data from scratch by using the raw deserialized data as a base. Except, there was yet another problem: The old data we needed access to in order to create the new data was completely private in the headers of the old Havok version - and that's where dirty tricks come into play, like your public_cast shown here.
Another option would have been to simply edit the headers of the legacy Havok version (since we really didn't care about legacy Havok anymore). Instead, I ultimately decided to go with an even nastier (and even funnier) solution: #define private public #include #undef private Yes, this actually worked - and I was laughing my ass off for hours after writing it! 😄 Since this specific converter was just a quick and dirty solution to a very ugly problem and we didn't even intend to maintain it long-term, we were fine with getting our hands a little dirty there. I guess this goes to show that no matter how obviously horrible something is in coding, there might always come a time when using it is legitimately your best option.
Thanks for the scary story prequel. :) This sounds like a nightmare! I’m glad you heightened the comedy by cracking it in the most cursed way possible. Desperate times call for desperate measures!
There's a worse version of the C-style cast example. In this version, S isn't refactored to remove the IFoo superclass. S doesn't change at all. Instead, you just happen to do a cast from a source file that only has a forward declaration of S. That is, you have some code like this: void foo(S *s) { bar((IFoo *)s); } If S is fully defined, then this will compile as a static_cast, offsetting the pointer. If S is only forward declared, then it will compile as a reinterpret_cast, not offsetting the pointer. For bonus points, `foo` might be an inline function defined in a header, in which case it might mean different things when compiled in different source files! (Which is also undefined behavior due to being an ODR violation.)
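A minimal sketch of the difference described above, assuming a mainstream ABI where the second base class sits at a non-zero offset. `S` and `IFoo` come from the comment; `Base`, its member, and the helper function names are made up for illustration. The sketch only compares addresses, because calling through the unadjusted pointer would be undefined behavior.

```cpp
struct Base { int b = 1; virtual ~Base() = default; };
struct IFoo { virtual int foo() const = 0; virtual ~IFoo() = default; };
struct S : Base, IFoo { int foo() const override { return 7; } };

// What (IFoo*)s means when S is fully defined: a static_cast that
// adjusts the pointer to the IFoo subobject inside S.
inline IFoo* adjusted(S* s) { return static_cast<IFoo*>(s); }

// True when the static_cast produced a different address than s itself.
inline bool static_cast_moves_pointer(S* s) {
    return static_cast<void*>(adjusted(s)) != static_cast<void*>(s);
}

// What (IFoo*)s means when S is only forward declared: a reinterpret_cast
// that reuses the address unchanged. The result is never dereferenced here.
inline bool reinterpret_cast_keeps_address(S* s) {
    return static_cast<void*>(reinterpret_cast<IFoo*>(s)) == static_cast<void*>(s);
}
```

With single inheritance the two casts typically agree on the address, which is why the bug tends to stay hidden until a second base appears.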
IIRC, the standards committee agree that: it *is* allowed, but that it *shouldn't* be allowed, but they also haven't been able to do anything about it since 2015 when this cursed technique was discovered.
Ok ok ok, I'm done using C-style casts (not that i ever really used them). Instead, you've convinced me to implement a c_style_cast(...) template to hide away everything I'm doing wrong into one place. Thanks!
So glad I use C-style casts all willy-nilly. Now that I'm seeing more proper C++ casts I can feel like a man when I swing out a C-style cast in accordance to my Principle of Most Power.
That violates the principle of most power. You should be opening /proc/[PID]/mem and copying the bits on the stack over instead. Bonus points: It even bypasses Rust safety guidelines, so you can piss off two language committees at once.
sounds good! use Rust instead if you have the chance, it's so much nicer and you're relieved of a lot of the burden C++ places on you (for example things like thinking about move constructors, copy constructors etc. and generally being concerned about memory safety)
Great video. I was amused when you talked about omitting constexpr, nodiscard, ... Notwithstanding the craziness of casts, I think that the ridiculous number of decorators we add to code is a huge danger to the future of the language. It's starting to become quite unreadable, and the barrier to entry for a junior to understand all this stuff is insane just to "correctly" write a function that adds two numbers together, for example.
I’m with you 100% on this. It’s getting ridiculous the number of magic words you have to add to your function declarations to fulfill all the best practices. Not to mention that they’re easy to get wrong.
@@_noisecode Or just plainly forget.... Btw, shouldn't the argument of that function be an r-value reference to be completely correct? One of those things I guess..
@@peterbonnema8913 I wouldn't make it an rvalue reference--that would add some syntactic and type system noise and not really change the semantics in any beneficial way (except I think for pathological cases like implicit_cast(move(existingValue))?). But, yes, even having to ask the question and painstakingly reason through the answer is definitely a symptom of C++ being overly complicated. ;)
As a C++ expert, “public_cast” and even C-casts aren’t the most evil ones possible. For example, with c++20 concepts, partial template specialization, and constexpr/consteval I can write a cast whose behavior changes based on both the type AND value. And I can even make it recursive. I can write a relative of “public_cast” that alters the input object’s identity by swapping out vtable pointers with a related class’s (yes this is technically undefined in the spec, but it’ll work on all 3 major compilers). And there are others. See, what you have to remember about C and C++ is that 1) at the bottom, everything is ultimately a bag of bytes and 2) their design is all about giving you as many tools as possible, rather than preventing misuse. It’s a classic case of freedom+ power = potential for misuse. If you think the abuses that c++ enables or permits are bad, take a look at what’s possible with straight assembly.
6:04 I’m so glad that literally every popular compiler doesn’t care what the standard says and makes this implementation defined behavior. So much production code takes advantage of reinterpret_cast for bit shenanigans
@@noamtashma617 Well, among others, reinterpret_cast has one primary usage, which is (quoting cppreference) converting "an expression of integral, enumeration, pointer, or pointer-to-member type to its own type. The resulting value is the same as the value of expression." Beyond that identity conversion, you can convert a pointer to an integer type that is large enough to hold any pointer value, and converting it back to the original pointer type gives you the original pointer. So you can cast the address of an int to a std::uintptr_t: int i = 7; std::uintptr_t v1 = reinterpret_cast<std::uintptr_t>(&i); is a legal use of reinterpret_cast. If you use a static_cast you get a compile error, because static_cast cannot convert from int * to std::uintptr_t. They give a good example of another use with the following code: struct S1 { int a; } s1; int* p1 = reinterpret_cast<int*>(&s1); This is legal because s1 and its first member s1.a are pointer-interconvertible, so p1 ends up pointing to s1.a. All of that said, sometimes you just want a quick and easy fix to interpret the bytes of a float as an int and vice versa, like in Quake. Since most production code is older than C++20 (which added std::bit_cast), if that was ever required and you _really_ wanted to avoid std::memcpy, then this was the way to do it.
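The two uses described in this comment can be sketched as follows. `S1` mirrors the cppreference example quoted above; the helper function names are hypothetical.

```cpp
#include <cstdint>

struct S1 { int a; };

// Pointer -> integer wide enough to hold it -> back to the same pointer type.
// The round trip is guaranteed to give back the original pointer value.
inline bool round_trips(int* p) {
    std::uintptr_t v = reinterpret_cast<std::uintptr_t>(p);
    return reinterpret_cast<int*>(v) == p;
}

// S1 is a standard-layout class, so an S1 object and its first member are
// pointer-interconvertible: the result points at s->a.
inline int* first_member(S1* s) {
    return reinterpret_cast<int*>(s);
}
```

Neither helper reads an object through a pointer of an unrelated type, which is the part that strict aliasing forbids.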
well, it actually is not undefined behaviour it is unspecified behaviour, which means each compiler might do whatever it finds easier to implement. Unspecified Behaviour may be read as try to use something else, or if you know your compiler implementation good enough
@@noamtashma617 See GCC, and linus' rant on strict aliasing. GCC explicitly doesn't care about this rule by default because it would break their most compiled project of all time (the linux kernel)
@@Caellyan so many times I've had to copy/paste an entire class only to be able to change/access one private member instead of a simple inheritance... I really wish libraries just used protected. But then again, in my line of work I have to constantly modify things like cryptography classes that explicitly don't want the average user messing with their members.
const_cast is useful if you have to use an old C library that was made without using the const keyword, when you know that it doesn't modify some function parameters.
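A small sketch of that pattern. `legacy_length` is a made-up stand-in for a C function that was declared without const, not a real library call. The wrapper is only sound because the wrapped function is known never to write through its argument.

```cpp
#include <cstddef>

// Stand-in for an old C API: it only reads the string, but its signature
// predates const, so it takes a plain char*.
inline std::size_t legacy_length(char* s) {
    std::size_t n = 0;
    while (s[n] != '\0') ++n;
    return n;
}

// Const-correct wrapper. The const_cast is confined to this one place,
// next to the comment explaining why it is safe.
inline std::size_t length(const char* s) {
    return legacy_length(const_cast<char*>(s));
}
```

If the legacy function ever did write through the pointer, calling the wrapper on a string literal or a const object would be undefined behavior.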
3:56 Fun fact: the Typescript compiler has no qualms about taking the inverse of a template type to do type deduction. This is actually quite useful, except for the part where there's absolutely no spec for when it will or won't work.
I think the issue is that the result is not deterministic which doesn't matter for TS because it doesn't have to do much with type results and it can resolve to `A | B | ...` while C++ doesn't have an implicit type union and has to know the actual type as the result will affect the behavior of the program. I'm not 100% sure I got that part of the video but that's why if I did.
So the 0x5f3759df from the Quake inverse squareroot algorithm has to do with the constants associated with a Taylor series expansion of the logarithm of 1/sqrt(x), if I remember correctly. The bit shift is done as a more efficient (read lossy) way of multiplying/dividing by 2 depending on shifting to the left or the right. There was a video that went over the algorithm in great detail that made it super easy to follow.
@@TrimeshSZ Unless you mix signed and unsigned integers. In this case, make sure you are doing the good casts at the good times 🙂(whatever cast type it is... I don't see C-casts as "cursed", but the C++ language is entirely "cursed", lol)
My understanding is that doing that trick on a float, like is being done in the algorithm, will lose at least some information. If it were an unsigned int or whatever it would be exact, but I didn't think that was the case for floats. @@TrimeshSZ
The key insight is probably that halving the exponent gives the approximate square root, to within a factor of 2. Then the rest is just finding the best approximation
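For reference, a sketch of the algorithm being discussed, written with std::memcpy instead of pointer punning so that it stays within defined behavior (std::bit_cast would do the same job in C++20). It assumes a 32-bit IEEE 754 float, which the static_assert checks; the function name is made up.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(std::numeric_limits<float>::is_iec559 &&
                  sizeof(float) == sizeof(std::uint32_t),
              "this sketch assumes a 32-bit IEEE 754 float");

inline float fast_inv_sqrt(float x) {
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);   // read the float's bit pattern
    // Shifting right roughly halves the exponent; subtracting from the
    // magic constant negates it, giving a first guess at x^(-1/2).
    bits = 0x5f3759dfu - (bits >> 1);
    float y;
    std::memcpy(&y, &bits, sizeof y);      // reinterpret the bits as a float
    return y * (1.5f - 0.5f * x * y * y);  // one Newton-Raphson refinement step
}
```

The result is an approximation: for an input of 4.0f it lands close to, but not exactly on, 0.5.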
C-Style casts also cast away volatile (and yes volatile is useful in some contexts, namely for variables that are expected to be modified externally by e.g. interrupts, IPC, network callbacks, so the optimizer shouldn't reason about their value).
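A sketch of how quietly that happens. `data_ready` is a hypothetical flag of the kind an interrupt handler might set. Only the pointer values are compared here, since reading a volatile object through the non-volatile pointer would be undefined behavior.

```cpp
// Hypothetical flag modified outside the normal flow of the program.
volatile int data_ready = 0;

// The C-style cast compiles without complaint and drops volatile.
inline int* drop_volatile_c_style(volatile int* p) { return (int*)p; }

// const_cast is the only named cast allowed to remove the qualifier,
// so the intent is at least visible and searchable.
inline int* drop_volatile_explicit(volatile int* p) { return const_cast<int*>(p); }

// static_cast<int*>(p) and reinterpret_cast<int*>(p) are both rejected
// by the compiler because they may not cast away cv-qualifiers.
```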
6:40 is not entirely correct. The C++ standard has an explicit note about accessing an inactive union member for special cases like struct with common beginnings /* One special guarantee is made in order to simplify the use of unions: If a standard-layout union contains several standard-layout structs that share a common initial sequence (11.4), and if a non-static data member of an object of this standard-layout union type is active and is one of the standard-layout structs, it is permitted to inspect the common initial sequence of any of the standard-layout struct members; see 11.4. - end note */ This doesn't work in constexpr context though :) Also, for 11:30 it's probably doable with stateful metaprogramming amazing video btw :)
It doesn't work in constexpr because it's still UB, you are incorrect. This isn't what the common initial sequence rule refers to, it's for when you have a union of 2 structs, and both structs have a common member -of the same type-, you can access that member through either type, regardless of which struct is the active member of the union. I dont fault you for misunderstanding this rule, it's a bit of a doozy, even language lawyers have a hard time with this one.
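The common initial sequence rule that both comments refer to can be sketched like this. All type and function names are made up for illustration. Both structs are standard-layout and start with a member of the same type, so that member may be read through either one.

```cpp
enum ShapeKind { kCircle = 1, kRect = 2 };

struct Circle { int kind; float radius; };
struct Rect   { int kind; float w; float h; };

union Shape {
    Circle circle;
    Rect rect;
};

// Makes `rect` the active member of the union.
inline Shape make_rect(float w, float h) {
    Shape s;
    s.rect = Rect{kRect, w, h};
    return s;
}

// Reads `kind` through `circle` even when `rect` is the active member.
// This is permitted because `kind` is in the common initial sequence.
inline int kind_of(const Shape& s) {
    return s.circle.kind;
}
```

Reading `s.circle.radius` while `rect` is active happens to be covered too in this layout (both second members are float), but a member outside the shared prefix, such as `h`, has no counterpart in `Circle` and gets no such guarantee.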
The C-style cast is just how reinterpret_cast should've been. If I'm willing to cast away the actual type and treat your data like the series of bits it is, do you think I care about it being const? No, not a bit!
Heh. I was very careful to call the second base class `IFoo` in order to make it a case of classical OOP--one base class, followed by multiple interfaces. That's a flavor of "multiple inheritance" that's supported by plenty of languages and considered safe because it doesn't suffer from the diamond problem... but in C++, it still has this footgun surrounding C-style casts.
I have to say, aggregate multiple inheritance never made much sense for me in C++. It causes endless complexities trying to hide the dirty trick that it is implemented as composition, which ruins address identity.
@@noamtashma617 No, the pointer doesn't offset with single inheritance. The pointer to Base and the pointer to Derived are the one and the same. Only the length of the object is different.
1:05 - 4:36 This is one reason why type inference is so much easier when you don't have subtyping. In languages like Haskell and Rust, the bidirectional type inference can in principle allow you to omit every single type annotation and still figure out the types everywhere. now, rust requires you to specify types on (non-lambda) functions, but that's just because it's good practice to do so, not because of any technical limitations. This is because the lack of subtypes allows there to always be a single most generic canonical type of any value, so you will never have to try to guess which of the possible types (including upcasting) a value should have, so the "implicit cast" function is not needed.
@PeterAuto1 is correct. Rust *does* have subtyping via its trait system, so it is sometimes impossible to infer with missing type annotations (i.e. sometimes type annotations are required).
@@strager_ but from what I can tell, subtyping via parametric polymorphism and unification still makes it a lot easier to make type inference. Even in the case PeterAuto1 mentioned, there is still a principal (canonical most generic) type, the only issue is that there's no canonical monomorphic type if it's also consumed by a polymorphic function. Which does indeed mean we need to specify the type there, but not because there's no principal type in general, just because the principal type is polymorphic
The public cast as an idea almost seemed like something good when I didn't know much about programming, but the more I learned, the more I appreciate local variables. Scoped variables are everything
public_cast reminds me of C# reflection which enables every private property and private field to be gettable and settable, even from other assemblies. It's interesting that some of that is possible in C++.
You can do the same sort of f*ckery in java. that one time a year when you need it it's wonderful. That time someone uses it because it would be fun? NOT wonderful... :D
If you are super explicit about these things, they may be very convenient. But the moment developers start to make too much magic, I say it's time to stop. I usually hate implicit things, with rare exceptions.
The part where the c-style cast "changes its meaning without telling you about it" seems to be the reason why C++ has created this "zoo" of different casts in the first place. A valid design decision, but I guess it's not for everyone.
I mean you can still do the 'reinterpret cast' in C by just casting a pointer, for example float f = *(float*)&i; Where i is an integer. So there are at least two types of casts in C, direct one and the one reinterpreting pointer (equivalent to reinterpret cast on C++)
When you first started talking about pun_cast I was thinking "isn't that just bit_cast? How would it be different?" Guess it wasn't different at all lol
I genuinely have a hard time understanding why bit_cast is better than reinterpret_cast. I've never seen people use bit_cast on pointers or references, only values. I didn't even know reinterpret_cast could be used on values.
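One way to see the difference: bit_cast copies the bytes of a value into a new object of the target type, while the reinterpret_cast idiom reads the original object through a pointer of the wrong type, which breaks the aliasing rules. Below is a rough pre-C++20 sketch of the value-copy approach. The name `bit_cast_sketch` is made up, and unlike std::bit_cast it is not constexpr and needs a default-constructible target type.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

template <typename To, typename From>
To bit_cast_sketch(const From& from) {
    static_assert(sizeof(To) == sizeof(From), "sizes must match");
    static_assert(std::is_trivially_copyable<To>::value &&
                      std::is_trivially_copyable<From>::value,
                  "both types must be trivially copyable");
    To to;
    std::memcpy(&to, &from, sizeof(To));  // copy the object representation
    return to;
}
```

On a platform with IEEE 754 floats, `bit_cast_sketch<std::uint32_t>(1.0f)` yields `0x3f800000`. Compilers generally optimize the memcpy away, so this usually costs nothing compared with the pointer cast.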
and yet i hear it all around "but i can write correct C code" while writing C++, with blissful ignorance of the dangers that lurk under the C, or maybe just they know their stuff and do it anyway because it is such a nice touch!
C++ has "better ways of doing things than C", but in a significantly more complex environment, while retaining backwards compatibility with the C way, its 90s way of doing things and an extra random way that got immediately deprecated 2 standards later. C casting works fine in C because you can't have ultra fancy/complex types, so the reinterpretation is always pretty simple. C function pointers are ugly and incomprehensible, but you can typedef. You have to use raw pointers in C, but since it's the only way to pass things by reference, the code is very self-documenting and stays simple. Overall, C is much simpler than C++
@@vercolit I often see this argument that C is simple while C++ is not, therefore C++ is more unsafe. And frankly I don't think this makes any sense. While C is simple, as in the language definition is small, that doesn't mean making programs is simple. Imagine you're trying to build a car. On one hand, you can have a simple toolbox - just a screwdriver. On the other, a complex toolbox, including various tools. But which toolbox makes constructing the car simpler? Counterintuitively... not the the simple toolbox. You can do it, in very hacky ways. It is common place to see extremely hacky solutions in C that just perform basic C++ actions, but in an incredibly unsafe and hard to reason about ways. For example, OOP. OOP is useful, particularly for GUIs. Now we have GTK, that hacks together objects in C using function pointers. But... the compiler enforces very little type checking on function pointers. And, there's no access modifiers. And, now a simple function call can actually result in memory errors, like segfaults. All to achieve what's trivial (and safe) in C++. Or, how about generic functions? Seems simple enough... take libc qsort(). We want to take in a generic function for comparison. But, we have to use void pointers. Now, we have circumvented the type system. We have pushed errors to runtime. Not only is this horribly inefficient in every sense of the word, it also doesn't make sense. 99% of the time the function we want to use is known at compile time, but we HAVE to force runtime behavior. Again we've opened the door to memory errors that have no reason to exist, and as a nice side effect our sort is ~10x slower than an equivalent algorithm in C++. Whoops! Or, what about generic containers? Also a very useful concept. Trivial, type safe, and memory safe in C++. In C...? Well, we could use void pointers. But this is verbose to work with, and again pushes behavior to run time. 
If we want a vector of int, we should enforce everything in the vector is an int... you can't safely do that with void pointers. So we need code generation. Ah, macros! Except, macros aren't templates. They aren't type aware. They're primitive. Yet again we're side stepping the type system and hoping that users use the macro correctly. If they don't, then whoops! We can use the newish _Generic... but these are fake generics. They're not general purpose. So we end up writing 10x as much C code to do simple and easy things in C++. And, along the way, we sacrifice our safety. All in the name of a "simple" language. It's not like these are cherry picked examples, these will pop up in literally every non-trivial piece of software. There are countless things that are trivial (and commonplace) to implement in C++, but nightmare-ish in C.
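The qsort point can be illustrated with a short side-by-side sketch. The function names are made up, and no performance claim is implied by the code itself.

```cpp
#include <algorithm>
#include <cstdlib>
#include <vector>

// C style: the comparator receives untyped pointers. Nothing stops a caller
// from pairing this comparator with an array of some other element type.
inline int compare_ints(const void* a, const void* b) {
    int x = *static_cast<const int*>(a);
    int y = *static_cast<const int*>(b);
    return (x > y) - (x < y);
}

inline void sort_c_style(std::vector<int>& v) {
    std::qsort(v.data(), v.size(), sizeof(int), compare_ints);
}

// C++ style: the element type is known at compile time, so the comparison
// is type checked and can be inlined.
inline void sort_cpp_style(std::vector<int>& v) {
    std::sort(v.begin(), v.end());
}
```

Passing the wrong element size or a mismatched comparator to qsort compiles fine and fails at run time, whereas the equivalent mistake with std::sort is a compile error.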
The problem is multiple inheritance. The more I learned of the "magic" going on in the background to make it possible, the more I think it was not the best idea.
Somehow C-style 'type(value)' casts always felt most aesthetic to me since it just looks like a function you give some other data type as an argument that then turns into the data type the function is named after.
This is why I like Nim’s cast system. When using Nim, anything with the word “cast” both appears like a sore thumb in the code, and is guaranteed to be unsafe.
Who would have thought that making your language more complicated would necessitate even more complexity to fix the mess? As someone who primarily uses C for his projects, I can proudly say I am sticking with C-style casts.
For the quake example, you should probably static assert on std::numeric_limits::is_iec559 Although it is unlikely to find an implementation where it is false.
I wasn't aware that the "standard" C++ behavior was to only allow reading from the one member of a union was most recently written. Who came up with that idea? That defeats one of the main purposes of unions.
Yes! After learning about how the C-style cast works in C++ I had a small and slightly incredulous but ultimately easily resolved conversation with my manager in a PR, and my guess was that it would top this list. So easy to *think* you know what you're doing with it and so easy to break things.
I imagine that most people "learn" c++ by seeing enough of the short videos and second-hand books of someone trying to describe individual details about the language, and then try to put what they remember into their code, hoping to have expressed what they wanted, before duckduckgoing corrections to errors to get their project to compile.
For the pun_cast implementation you mentioned UB in C++ when reading an inactive union member. While this is true, the standard enforces this but technically most if not all modern C++ compilers support it, and even use it in their implementations (short string optimization). I guess it depends on how portable you need your implementation to be.
I do believe type punning through unions tends to be “supported” by compilers yeah, but mostly just due to necessity, since so much code relies on it. If we want to one day allow compilers to have true freedom to do all the amazing optimizations they want to do to our code, we need to move away from “eh, it’s not technically allowed but it works in practice,” and toward what is strictly allowed. That sentiment does break down in lots of cases though, since C++ doesn’t even always give you the tools to express yourself both efficiently and legally. It seems like the committee is working hard on this though; the object model has been a big focus of both C++20 and 23. I’m curious if SBO string implementations actually type pun through unions. I understand that a union is likely used for the storage, but since SBO is either active or it isn’t, it seems like the access to the different variants would be non-overlapping. I’m curious why an implementation would ever access data from the inactive state.
@@_noisecode If I'm not mistaken, SBO is specifically for vectors, where the type it is storing is not known, and thus restricts any funny "exploits" you can do. But SSO is specifically for strings and has the unique ability to only care about the char type (well all the different char types I guess). There is incredible bit manipulation done to achieve this, reserving the LSb as a flag to check whether it's short or long, and so between the 2, you only allocate even lengths to satisfy this (which also turns out to be more efficient anyway). There's much more but it wouldn't suffice trying to explain it in a YouTube comment. I believe libc++ implements it this way but MSTL and libstdc++ don't and instead use an approach almost identical to SBO. If you're interested I tried to write a legible version: github.com/PremanJeyabalan/stdcpplib/blob/main/src/String.h , but please know I'm just an undergrad student who doesn't write C++ professionally (yet). I'd love to hear your thoughts if any.
Ah right, I knew about this trick at one point but had forgotten. For what it's worth, I believe this could be done without undefined behavior; since in your implementation, the size is the first byte of the short variant, and it's legal to reinterpret_cast to inspect the byte representation of any object, you could just reinterpret_cast(the_entire_union) and then inspect the resulting first byte, which will be the value of your short_t::size_flag, accessed legally even when the string is long. Nice string implementation! It's awesome that you're digging into this stuff so deeply at an undergrad level, and I'm sure any number of places will very eagerly hire you up as soon as they can. :) Lastly, the term SSO is just a special case of SBO--they both just refer to storing data inline under a certain capacity rather than spilling to the heap. You're right that if you have knowledge of the specific types involved you can often do better than a generic implementation (so a string can be better optimized than a small_vector), but terminology-wise, it's correct to say either that std::string has a Small Buffer Optimization or a Small String Optimization (or a Short String Optimization or whatever else); they're all interchangeable terms.
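A sketch of the byte-inspection idea from this reply. The layout below is an assumption made for illustration, not the layout of the linked string class or of any standard library. Only `short_t::size_flag` is named in the thread.

```cpp
#include <cstddef>

// Hypothetical long and short representations sharing the same storage.
struct long_t  { std::size_t capacity; std::size_t size; char* data; };
struct short_t { unsigned char size_flag; char buffer[sizeof(long_t) - 1]; };

union storage_t {
    long_t l;
    short_t s;
};

// Inspects the first byte of the union's object representation through an
// unsigned char pointer, without naming either union member. Byte access of
// this kind is exempt from the strict aliasing rule.
inline unsigned char first_byte(const storage_t& u) {
    return *reinterpret_cast<const unsigned char*>(&u);
}
```

Which bits of the long representation land in that first byte depends on endianness, so a real implementation has to choose its flag bit with the target platform in mind.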
@@_noisecode You are definitely right! I will make it a point to change that to use a reinterpret_cast instead, never thought of that before. Ah your SBO explanation helped alot as I was struggling to realise the relationship between it and SSO. A string is just a char buffer anyway so it makes sense that the umbrella term is SBO and SSO being a special case of that. Thanks so much for taking a look at my code btw, and love the vids you've made so far. I close a video pretty quickly when people unnecessarily start bashing C++ for whatever reason and why Rust is better (ThePrimeagen would be a good example), but the way you cover the concepts are fun to follow and I always gain a deeper understanding of the language after watching them. Personally, the pedantic nature of C++ is exactly the reason I love it, except when it then comes to adding 3rd Party Libs to a project. Look forward to watching whatever you release next! Cheers.
public_cast is actually occasionally useful, I use it for debugging code where I don't want to have to modify the original code (though I could), and the "alternative" is to #define private public
#define public private is UB. If you define any keyword or name in the standard library or use any name anywhere starting with underscore + capital letter except user defined literals, or any name with two underscores, that's UB for you.
@@Bolpat no shit, Sherlock, some compilers even flat out refuse to do it. And that's why I mention that public_cast is sometimes useful, but with a very narrow scope and a very solid asterisk.
@@Bolpat Don't fear the UB, embrace it! If you know *why* it's UB you can safely use it. For example the underscore names are UB because the standard library uses them, so defining them could break it. But so does defining any name that any header uses that you include. So it's actually a very well defined behavior.
@@EvanOfTheDarkness "If you know why it's UB you can safely use it." Only today. Tomorrow, you upgrade your compiler, or switch operating systems, and you need to redo your careful analysis. That doesn't sound "safe" to me.
I thought I knew the dangers of C-style casts. My compiler throws an error when C-style casts interfere with const-ness. I am appalled that this practice isn't standard!
Great story, and nicely told too, but I have to disagree. The most evil cast is the implicit cast we didn't ask for, silently converting something to another type, not because it's the right thing to do, but just because it could be done.
So bit cast ist how you also do things like receiving bytewise messages and interpret them as a struct or convert a struct into a byte array. I usually use memcopy or a union with the byte array and the target struct. Before I knew these techniques I just casted the pointer type into a struct pointer (In C).
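The memcpy variant mentioned here can be sketched as below. `Message` and its fields are hypothetical. The approach assumes a trivially copyable struct and that sender and receiver agree on padding and byte order, which is a real concern for wire formats.

```cpp
#include <array>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical message layout, used only for illustration.
struct Message {
    std::uint8_t id;
    std::uint8_t flags;
    std::uint16_t length;
};
static_assert(std::is_trivially_copyable<Message>::value,
              "memcpy-based conversion needs a trivially copyable type");

using MessageBytes = std::array<unsigned char, sizeof(Message)>;

// Struct -> raw bytes.
inline MessageBytes to_bytes(const Message& m) {
    MessageBytes out{};
    std::memcpy(out.data(), &m, sizeof(Message));
    return out;
}

// Raw bytes -> struct, without casting the buffer pointer to Message*.
inline Message from_bytes(const MessageBytes& in) {
    Message m;
    std::memcpy(&m, in.data(), sizeof(Message));
    return m;
}
```

Unlike casting the buffer pointer to a struct pointer, this also works when the byte buffer is not suitably aligned for the struct.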
Nice video If you did another video about casts you may want to talk about const_cast a little bit, they're really funny if the pointed memory was really not writable
This video was beautiful. I suppose it makes sense, but it is *really truly unfortunate* that the C casts also become the most tempting ones to use lmfao.
only for most performance critical engineering and scientific applications and the backend of every web browser and most drivers and to write the runtime of higher level languages 😂 when it’s not cpp it’s c which does as much cursed stuff with less.
First let me thank you for that public_cast code. I never knew Cpp allowed that. I had a raw_cast that I had to implement class by class. Now I can just use public_cast and not have to manually implement anything. Second thing is; I always use C style cast, unless I need a specific cast. And I always use C style cast as a static cast or dynamic cast.
Careful with that… C-style casts never do dynamic_casts, so if you’re looking for the safe+checked behavior of dynamic_cast, you won’t find it. All your C-style casts will always “succeed” but may give you a pointer you can’t safely dereference, and you won’t have any way to tell.
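To make the difference concrete, here is a small sketch with made-up class names. Only the checked version is executed; the C-style equivalent is left as a comment because performing it on the wrong dynamic type is undefined behavior.

```cpp
struct Animal { virtual ~Animal() = default; };
struct Dog : Animal { int bones = 3; };
struct Cat : Animal { int lives = 9; };

// Checked downcast: returns nullptr when `a` does not point to a Dog.
inline Dog* as_dog(Animal* a) {
    return dynamic_cast<Dog*>(a);
}

// The C-style spelling, (Dog*)a, compiles as a static_cast. Given a Cat it
// performs no check and hands back a non-null Dog* that must not be used.
```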
public_cast can be implemented with friend functions, i.e. declaring the function outside of the class then doing a friend definition from within the class where the pointer to the private member is returned
Then it has nothing to do with generic public_cast, because you, as an author of the class, explicitely granted access to the internal structure of your class.
@@antagonista8122 oh, you don't declare it friend in the class with the private member. You declare a friend method in a helper class that returns the member and then somehow call that through argument dependent lookup.
@@antagonista8122 Not from inside of the class you're stealing from, but having a separate class that takes a template parameter and defines an external function using friend. Then do the explicit instantiation of the class with the pointer to member and it will work.
my implementation: template class access_private { constexpr friend auto public_cast(K) { return M; }}; class CxSecret { constexpr friend auto public_cast(CxSecret); }; template class access_private; C c; int x = c.*public_cast(CxSecret{}); nice! this approach also allows avoiding the type through auto and making it constexpr.
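The angle-bracket arguments in the snippet above appear to have been lost, so here is one possible complete version of the same friend-injection idea, assuming C++17 for the `auto` template parameter. The class `C`, its member `secret_`, the template parameter list, and `read_secret` are guesses made for illustration, not the commenter's actual code.

```cpp
// Hypothetical class with a private member we want to reach.
class C {
    int secret_ = 42;
};

// Tag type. Declaring the friend here lets argument-dependent lookup
// find public_cast when it is called with a CxSecret.
struct CxSecret {
    constexpr friend auto public_cast(CxSecret);
};

// Defines public_cast(K) as a friend that returns the member pointer M.
template <typename K, auto M>
struct access_private {
    constexpr friend auto public_cast(K) { return M; }
};

// Access checks are not applied to names used in an explicit instantiation,
// so &C::secret_ is accepted here even though secret_ is private.
template struct access_private<CxSecret, &C::secret_>;

// Uses the leaked pointer-to-member to read the private field.
inline int read_secret(C& c) {
    return c.*public_cast(CxSecret{});
}
```

The call in `read_secret` has to come after the explicit instantiation, since that is what supplies the definition the `auto` return type is deduced from.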
I use the Chapter 3 one relatively often (more than once, I guess) to force my way into the game engine's node system. It's nice when you're certain it's safe.
@@TankorSmash Safe, in the programming sense, is that part of the language in which no UB can happen. For example, memcpy is not a safe function because it can have UB when called incorrectly. C++ is generally very unsafe, almost anything can give you UB when used improperly, the primary example being int overflow. As someone once told me unrelated to programming, “safe” is when you know it's harmless, not when it happens to be harmless; and I agree. If you really understand that game engine, it might actually be safe. Otherwise it's harmless (happens to be not UB) at best.
@@TankorSmash I've worked with a programming language called D which has a fairly large safe part, while also offering to go low-level like C++. It's generally similar to C++, e.g. it has templates, static typing, constexpr, but also has an explicit notion of safe, as well as other properties functions can have, such as pure (referential transparency). It really opened my eyes to a lot of interesting ideas and fueled my creativity having worked with it. The main downside is you start hating working with anything else and you hate it when something is supposed to work, but doesn't due to a bug.
I am sooo happy that compilers allow warning on any use of C style casts. Combine that with -Werror and that particular problem is solved. Then again: Just use Rust 😉
"It's the reason I don't cut my sandwich in half with a chainsaw, it's a reason I don't hang out in my computer as root..." Me who just migrated from Windows and I didn't make a root profile so I'm technically always as root: *nervous laughter*
@@_noisecode, haha. I have decided that you are right and no one should wield such power. It was funny how the code broke Intellisense in Visual Studio 2022, though.
@@randomcatdude I see what you're saying but I still think "can't" and "less likely" are worlds apart, in this example of casts. As for the rest of the complexity of C++ vs C, it's insane and getting worse, which C isn't. (much as it pains me to say as a C++ lover lol)
@@DrGreenGiant Unlike C++, C does not change the address of pointers when doing casts, because it doesn't have inheritance, let alone multiple inheritance. C++ changed its behaviour.
@@randomcatdude There are absolutely insanely complex fancy types in C. They're just written using Macros and void pointers. Wow, much better and easier to reason about!
The type punning cast from float to int and back could be done in C++14 constexpr without undefined behavior (maybe even in C++11 constexpr). Because int could be "deconstructed" to individual bytes by repeatedly shifting by eight and masking by hex ff. You can do this to float too, but it is much, much more complicated. Basically converting int into float mantissa equivalent and then repeatedly dividing by ten or two. It should work for any float implementation - I'm thinking here IEEE754 and VAX floats.
Every time I have to write C++ I hate its complexity and end up writing a mix of C and C++ because it is simpler that way and I am kinda dumb... I like to live dangerously :D
I have been learning C++ recently, and also like made up human languages. There is this thing called "Agma Schwa's Cursed Conlang Circus" going around, where people create intentionally obtuse and painful languages, with idiotic rules like having grammatical genders based on Belgian government branches, and somehow C++ is still more bonkers inane aneurysm-inflicting than all of the entries together. 😂
@@Greeem jokes aside grammatical gender is when words have different forms or are used differently in sentences depending on some category of the thing they apply to. Classic example is gender gender (English actor/actress etc) but this conlang had different forms depending on what part of Belgium government had authority over the thing, if any. ruclips.net/video/Wr_tyM8pdXk/видео.html Or something like that I am not a linguist
C-style cast will break your code, IF you use inheritance*** I rarely use inheritance in my projects. Is there a problem to use c-style cast in this case?
No. It only breaks your code, if you rely solely on the compile errors, to tell you how to refactor your class. Which you should not do, for many reasons.
@@EvanOfTheDarkness Counter-point, the compiler should aid you as much as possible. Ideally everything that can be determined at compiletime, should. Errors are much easier to find that way. We should build a better language so that we CAN rely on the compiler. It's the frontline defense against ill-formed programs. That means... avoiding C-style casts. It's nonsensical to push errors that can trivially be found at compiletime to runtime.
@@lucass8119 You can avoid C style cast, if you want. Might as well avoid function overloads because adding a new overload can break existing ones (shocking!). And templates are probably not something you should even attempt to do since they can break all the time, when you change something. Actually you might just wanna avoid C++ altogether and use Ada or Rust, so you can never write anything that might break someday (allegedly).
@@EvanOfTheDarkness Lol, templates break at compile time. It's a counter point to you... that's one of the reasons templates are superior to runtime generics like in Java or C#. Obviously nothing will be perfect, but C-style casts are just bad. Period. They only have disadvantages. There's valid reasons to use templates, or function overloads, or whatever. None for C-style casts tho... just use C++ casts. They exist for a reason. They're safer, usually faster too.
@@lucass8119 Not all templates break at compile time, but that was not my point. Simple function overloads: you delete one, the code can simply call a different one, silently. And also he's very wrong in thinking that "private" members or base classes are internal to a class and don't matter. Yes, that's how it _should_ be, but in C++ access checking happens _last._ Basically, public or private doesn't matter for overloads and type casting, it is *only* used to tell when the compiler needs to give you an error. Function-style casts (which do the same as C-style casts) remain the shortest, and most readable casts in the language, and that's what matters to me. If you need training wheels, go for the overly verbose C++ casts, but they do the same in the end.
The public_cast might have great potential for evil, but sometimes it's ok to do what the ruling class calls evil in order to free an internal API that was greedily imprisoned ten namespaces deep in code no mere developer is permitted to modify. It's a better option, at least, than punning the class into a duplicated copy you've written to have the same byte layout, or memcpying at blind pointer offsets.
Fun fact: one of the main reasons i picked C++ up again after coding my home projects in Go for some time was to get a basic reinterpret cast working because fucking Go won't let me do that. i just want to shrink the 16 bits of my array items down to 8 without iterating through the array and reallocating the memory, neither of which i want to (or can) do, because we're edging at the limit of what fits into my 32GB of RAM with the 8 bit values alone
I propose slightly better syntax for public_cast. The only thing I don't like is that one still has to specify the member type when declaring the accessor. template struct static_constructor { struct value_type{ value_type(){ initialize();} void Do(){} }; static value_type construct; }; template typename static_constructor::value_type static_constructor::construct; class X{ int a= 42; double b= 43.2; float c; friend void print(const X&b) { std::cout
from a later comment of mine: template class access_private { constexpr friend auto public_cast(K) { return M; }}; class CxSecret { constexpr friend auto public_cast(CxSecret); }; template class access_private; C c; int x = c.*public_cast(CxSecret{});
I concur. I had a bug like that after making a class (which used to be a base) inherit from a new base. Compiler happily built but because the class also inherited a mix-in, the C-style casts got the wrong addresses and Bad Things happened; the mix-in was now sizeof(ActualBase) offset. Oh C++ how could you 😪
i've got a way to avoid specifying the type in public_cast! and this version is also constexpr. the idea is to create a function returning the value, which allows us to use auto. template class access_private { constexpr friend auto public_cast(K) { return M; }}; class CxSecret { constexpr friend auto public_cast(CxSecret); }; template class access_private; C c; int x = c.*public_cast(CxSecret{}); i tried making K a template type for public_cast to allow typing it as public_cast, but i haven't managed to do that. this uses 0 extra memory and is constexpr. the compiler properly inlines everything as if this was normal access to a public variable.
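Since the angle brackets got eaten in the snippets above, here is a hedged reconstruction of how the friend-injection idea can look in full. It assumes C++17 (for the `auto` non-type template parameter); the victim class `C`, its member `x`, and the helper `read_x` are placeholders invented for this sketch, and the template parameter list is a guess at what the original comment intended.

```cpp
// Hypothetical class we want to peek into; `x` is private.
class C {
    int x = 42;
};

// Tag type. It declares (but does not define) a friend function that ADL
// can find through the tag argument.
struct CxSecret {
    friend constexpr auto public_cast(CxSecret);
};

// Instantiating this template defines the friend declared above.
// Member is a pointer-to-member passed as a template argument.
template <typename Key, auto Member>
struct access_private {
    friend constexpr auto public_cast(Key) { return Member; }
};

// Explicit instantiation: access checks are not applied to the names used
// here, so naming the private member &C::x is accepted.
template struct access_private<CxSecret, &C::x>;

// Usage: public_cast(CxSecret{}) yields an `int C::*` pointing at the private member.
int read_x(C& c) {
    return c.*public_cast(CxSecret{});
}
```

The explicit instantiation is the part doing the dirty work; everything else is just plumbing to get the pointer-to-member back out without spelling its type.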
@8:40 This feels like a real hostile code design view on the nature of private/public variables lol @14:00 Again, I feel like it's the user's job to know what they're doing if they're accessing the internals of a library, but this frames it as if it's a conflict between the library developer and the library user or whatever instead of a collaborative effort.
This is *so* C++. You have a lot of nice tools available to safely do an operation. There are several variants so you can pick the one that fits your usecase best. At the cost of language spec size, sure, but that's an understandable tradeoff.
And then you have the one way that is easiest to write, inherited from C and also the most dangerous option of all.
Your function can either take an std::vector::iterator, or a T*...
Yeah? I've never used C or C++, but like, what the hell? I think i'm just gonna stick to Java...
Welcome to the world of embedded. Even when you want to go with C++ you will end up using some C because all the drivers are C-only. Your precious vectors will be casted to scary pointers (and DMA doesn't know any other way anyways). Also say hi to bitfields and magic bytes
@@brylozketrzyn there's no such thing as "c only". C++ was originally implemented by transpiling it to C and this can still be done.
@@brylozketrzyn Fortunately even if a library is written in C, this mess of a language can be wrapped into something useful. Unfortunately, many times the C code quality/practices can be so bad that the best you can do (if possible) is to rewrite it from scratch and not even bother wrapping it.
I love that you mention how obvious it is that public_cast should never be used, because I 100% agree with you - and yet, I've actually run into a problem once in the past where using it was legitimately my best option (and I ended up solving it in an even more horrible way)! 😅
So the situation came up when we remastered an old video game, and in the process, we needed to update the Havok physics middleware used in it to a newer version. This is always a horrible process (and one of many reasons I hate using Havok, or middleware in general), but it was made especially difficult for us thanks to another problem: The binary assets for the game had Havok data embedded directly into them, and since the majority of the original game's converters were no longer usable for us, we couldn't just recreate those assets in the new format. Yet we also couldn't use the original game assets without modifications in the game, since the new Havok version would just reject the old serialized data.
What we had to do to solve the problem was to create a new converter that just reads in the game's old binary assets, deserializes the Havok data within them using the old Havok version, re-serializes it using the new Havok version, and finally writes the entire asset back out to a file. However, there was a small problem there: This kind of migration process wasn't built directly into Havok. I know it has some support for version migration, but not for this specific use case. So instead of migrating the data directly, we had to create new Havok data from scratch by using the raw deserialized data as a base.
Except, there was yet another problem: The old data we needed access to in order to create the new data was completely private in the headers of the old Havok version - and that's where dirty tricks come into play, like your public_cast shown here. Another option would have been to simply edit the headers of the legacy Havok version (since we really didn't care about legacy Havok anymore). Instead, I ultimately decided to go with an even nastier (and even funnier) solution:
#define private public
#include
#undef private
Yes, this actually worked - and I was laughing my ass off for hours after writing it! 😄
Since this specific converter was just a quick and dirty solution to a very ugly problem and we didn't even intend to maintain it long-term, we were fine with getting our hands a little dirty there. I guess this goes to show that no matter how obviously horrible something is in coding, there might always come a time when using it is legitimately your best option.
Thanks for the scary story prequel. :) This sounds like a nightmare! I’m glad you heightened the comedy by cracking it in the most cursed way possible. Desperate times call for desperate measures!
@@_noisecode We coders do what coders gotta do, even if it means getting our hands dirty along the way! 😤
Holy shit lol. Actually laughed out loud..
visibility was always merely a suggestion... :)
Sounds like the most diabolical laugh ever! Were there thunderclaps and ominous organ music in the background too? 😄
There's a worse version of the C-style cast example. In this version, S isn't refactored to remove the IFoo superclass. S doesn't change at all. Instead, you just happen to do a cast from a source file that only has a forward declaration of S.
That is, you have some code like this:
void foo(S *s) {
bar((IFoo *)s);
}
If S is fully defined, then this will compile as a static_cast, offsetting the pointer.
If S is only forward declared, then it will compile as a reinterpret_cast, not offsetting the pointer.
For bonus points, `foo` might be an inline function defined in a header, in which case it might mean different things when compiled in different source files! (Which is also undefined behavior due to being an ODR violation.)
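A rough sketch of why the two interpretations differ. The class bodies below are invented for illustration (the comment only names `S` and `IFoo`), and the two casts are spelled out explicitly rather than relying on a forward declaration, so the whole thing fits in one file.

```cpp
// Hypothetical layout: S has a non-empty primary base followed by an interface.
struct Base {
    int payload = 0;
    virtual ~Base() = default;
};
struct IFoo {
    virtual void foo() {}
    virtual ~IFoo() = default;
};
struct S : Base, IFoo {};

// What the C-style cast means when S is fully defined:
// the pointer is adjusted to the IFoo subobject inside S.
IFoo* as_seen_with_definition(S* s) {
    return static_cast<IFoo*>(s);
}

// What the same C-style cast means when S is only forward declared:
// the address is passed through unchanged. The result must not be used as an IFoo.
IFoo* as_seen_with_forward_declaration(S* s) {
    return reinterpret_cast<IFoo*>(s);
}

// True when the two interpretations produce different addresses.
bool interpretations_disagree(S* s) {
    void* adjusted = as_seen_with_definition(s);
    void* raw = as_seen_with_forward_declaration(s);
    return adjusted != raw;
}
```

Because `Base` is not empty, the `IFoo` subobject cannot sit at the start of `S`, so the two functions return different addresses for the same object.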
My god...
Jesus Christ
That public_cast is truly one of the most cursed things I've ever seen. It makes me feel a deep unease.
IIRC, the standards committee agree that: it *is* allowed, but that it *shouldn't* be allowed, but they also haven't been able to do anything about it since 2015 when this cursed technique was discovered.
@@ultradude5410 ultimately who cares, #define private public is a much simpler, much more effective technique of defeating encapsulation.
@@ultradude5410 There's something unreasonably funny to me about "but they also haven't been able to do anything about it" 🤣
@@ZeroPlayerGame I am almost sure that defining any keyword is UB.
@@ZeroPlayerGame #define 0 1
Ok ok ok, I'm done using C-style casts (not that i ever really used them). Instead, you've convinced me to implement a c_style_cast(...) template to hide away everything I'm doing wrong into one place. Thanks!
A perfectly C++ solution!
This comment wins.
Now, implement c_style_cast *without* using a C-style cast.
@@gblargg nobody has to know
So glad I use C-style casts all willy-nilly. Now that I'm seeing more proper C++ casts I can feel like a man when I swing out a C-style cast in accordance to my Principle of Most Power.
😂😂😂
Same, ever since I discovered void *, I only use this type, for every function parameter and return value, and cast them inside.
@@briannormant3622 the real man's va_list lmao
That violates the principle of most power. You should be opening /proc/[PID]/mem and copying the bits on the stack over instead.
Bonus points: It even bypasses Rust safety guidelines, so you can piss off two language committees at once.
@@briannormant3622 void * is basically a char * that doesn't let you do pointer arithmetics
This is terrifying. I'll make sure to recount it in detail to all my friends for Halloween.
You know, you're not so much scaring me away from particular usage, rather you're scaring me away from C++ altogether.
5 years of commercial software engineering, still scared of C/C++
id just embrace the chaos at this point
@@not_herobrine3752 I'd rather use Rust to protect my sanity
sounds good! use Rust instead if you have the chance, it's so much nicer and you're relieved of a lot of the burden C++ places on you (for example things like thinking about move constructors, copy constructors etc. and generally being concerned about memory safety)
@@asdfghyter No. Go away shill.
Great video. I was amused when you said about omitting constexpr, nodiscard, ... Notwithstanding the craziness of casts, I think that the ridiculous number of decorators we add to code is a huge danger to the future of the language. It's starting to become quite unreadable and the barrier to entry to understand all this stuff for a junior is insane to "correctly" write a function that adds two numbers together, for example.
I’m with you 100% on this. It’s getting ridiculous the number of magic words you have to add to your function declarations to fulfill all the best practices. Not to mention that they’re easy to get wrong.
@@_noisecode Or just plainly forget....
Btw, shouldn't the argument of that function be an r-value reference to be completely correct?
One of those things I guess..
@@peterbonnema8913 I wouldn't make it an rvalue reference--that would add some syntactic and type system noise and not really change the semantics in any beneficial way (except I think for pathological cases like implicit_cast(move(existingValue))?).
But, yes, even having to ask the question and painstakingly reason through the answer is definitely a symptom of C++ being overly complicated. ;)
Great video! Love the description of all the stuff, especially refactoring the fast inverse square root not to be UB by using bit_cast
I have no idea what was going on during the public_cast section
Thanks, felt the same. The rest makes perfect sense but chapter 3 - wtf 😂
Arcane magic
As a C++ expert, “public_cast” and even C-casts aren’t the most evil ones possible.
For example, with c++20 concepts, partial template specialization, and constexpr/consteval I can write a cast whose behavior changes based on both the type AND value. And I can even make it recursive. I can write a relative of “public_cast” that alters the input object’s identity by swapping out vtable pointers with a related class’s (yes this is technically undefined in the spec, but it’ll work on all 3 major compilers). And there are others.
See, what you have to remember about C and C++ is that 1) at the bottom, everything is ultimately a bag of bytes and 2) their design is all about giving you as many tools as possible, rather than preventing misuse. It’s a classic case of freedom+ power = potential for misuse. If you think the abuses that c++ enables or permits are bad, take a look at what’s possible with straight assembly.
The C++ standard doesn't allow more.
At least, assembly doesn't have scary casts.
I shivered looking at the thumbnail ☠️
Edit after watching the video: truly terrifying stuff 😮
6:04 I’m so glad that literally every popular compiler doesn’t care what the standard says and makes this implementation defined behavior. So much production code takes advantage of reinterpret_cast for bit shenanigans
Interesting... can you give some reference to that?
@@noamtashma617 Well among others reinterpret_cast has one primary usage which is (quoting CPP-Reference) converting "an expression of integral, enumeration, pointer, or pointer-to-member type to its own type. The resulting value is the same as the value of expression.". Meaning legally speaking it is only defined if you cast an object back to its original type. You can also use this to cast to an object that is able to hold any value from the type you are casting from.
Meaning you could cast the address of an int to a std::uintptr_t.
To understand this, it basically means that
int i = 7;
std::uintptr_t v1 = reinterpret_cast<std::uintptr_t>(&i);
is a legal use of reinterpret_cast. If you use a static cast you would get a compile error because it cannot cast from type int * to a std::uintptr_t. So you are left with a reinterpret_cast to convert the address of an int to an integer of type std::uintptr_t, which fulfills the second criterion (casting to a type that can hold any value of another type).
They give a good example for another use with the following code:
struct S1 { int a; } s1;
int* p1 = reinterpret_cast<int*>(&s1);
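This is legal to do because S1 is a standard-layout struct whose first member is an int, so a pointer to s1 and a pointer to s1.a refer to the same address. So no harm is done by converting the types. This is referred to as "pointer-interconvertible objects".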
All of that said, sometimes you just want a quick and easy fix to interpret the bytes of a float as an int and vice versa, like in Quake.
Since most production code is older than C++20, if that was ever required and you _really_ wanted to avoid std::memcpy then this was the way to do it.
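For comparison, here is a hedged sketch of the std::memcpy route applied to the Quake-style inverse square root. The function name `q_rsqrt` and the exact structure are this sketch's own choices, not code from the video; it assumes a 32-bit IEEE 754 `float`.

```cpp
#include <cstdint>
#include <cstring>

// Approximate 1/sqrt(number) using the well-known magic constant,
// copying the bytes with std::memcpy instead of aliasing through a cast.
float q_rsqrt(float number) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "this sketch assumes a 32-bit float");

    std::uint32_t bits;
    std::memcpy(&bits, &number, sizeof bits);   // float bytes -> integer

    bits = 0x5f3759df - (bits >> 1);            // initial guess from the bit pattern

    float y;
    std::memcpy(&y, &bits, sizeof y);           // integer bytes -> float

    // One Newton-Raphson step to refine the guess.
    y = y * (1.5f - 0.5f * number * y * y);
    return y;
}
```

Compilers typically turn the two `memcpy` calls into plain register moves, so this usually costs nothing compared to the pointer-cast version.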
well, it actually is not undefined behaviour, it is unspecified behaviour, which means each compiler might do whatever it finds easier to implement. Unspecified behaviour may be read as "try to use something else, unless you know your compiler implementation well enough".
@@noamtashma617 See GCC, and Linus' rant on strict aliasing. GCC explicitly doesn't care about this rule by default because it would break their most compiled project of all time (the Linux kernel)
Before bit_cast, there was no portable option. That's why bit_cast was added, after 22 years.
I always thought that const_cast was the most insane thing. And then I learned that you can cast away "private".
I think casting away private is something that's incredibly useful when you need to read the data that the API didn't expose for some unknown reason.
@@Caellyan so many times I've had to copy/paste an entire class only to be able to change/access one private member instead of a simple inheritance... I really wish libraries just used protected. But then again, in my line of work I have to constantly modify things like cryptography classes that explicitly don't want the average user messing with their members.
const_cast is useful if you have to use an old C library that was made without using the const keyword, when you know that it doesn't modify some function parameters.
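A small sketch of that situation. `legacy_length` stands in for an old C function that takes a non-const pointer but never writes through it; it is invented for this example and is not a real library call.

```cpp
#include <cstddef>

// Stand-in for a legacy C API: the parameter is not const,
// even though the function only reads from it.
std::size_t legacy_length(char* text) {
    std::size_t n = 0;
    while (text[n] != '\0') {
        ++n;
    }
    return n;
}

// Const-correct wrapper. The const_cast is acceptable only because we know
// legacy_length never modifies the buffer; writing through the result of a
// const_cast on a truly const object would be undefined behavior.
std::size_t safe_length(const char* text) {
    return legacy_length(const_cast<char*>(text));
}
```

Keeping the cast inside one wrapper means the rest of the code can stay const-correct and the assumption about the legacy function is documented in a single place.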
"Undefined behaviour" words to stop the weak from treading where the strong stride.
I'm compiling using a compiler, not a coding standard!
3:56
Fun fact: the Typescript compiler has no qualms about taking the inverse of a template type to do type deduction. This is actually quite useful, except for the part where there's absolutely no spec for when it will or won't work.
God that is the most JavaScript shit
I think the issue is that the result is not deterministic, which doesn't matter for TS because it doesn't have to do much with type results and it can resolve to `A | B | ...`, while C++ doesn't have an implicit type union and has to know the actual type, as the result will affect the behavior of the program. I'm not 100% sure I got that part of the video, but if I did, that's why.
So the 0x5f3759df from the Quake inverse squareroot algorithm has to do with the constants associated with a Taylor series expansion of the logarithm of 1/sqrt(x), if I remember correctly. The bit shift is done as a more efficient (read lossy) way of multiplying/dividing by 2 depending on shifting to the left or the right. There was a video that went over the algorithm in great detail that made it super easy to follow.
Nothing lossy about implementing a multiplication or division by 2 on a binary machine using a shift - the result is exact.
@@TrimeshSZ Unless you mix signed and unsigned integers. In this case, make sure you are doing the right casts at the right times 🙂(whatever cast type it is... I don't see C-casts as "cursed", but the C++ language is entirely "cursed", lol)
My understanding is that doing that trick on a float, like is being done in the algorithm, will lose at least some information. If it were an unsigned int or whatever it would be exact, but I didn't think that was the case for floats.
@@TrimeshSZ
thank you for the correction, I couldn't remember exactly which step was which. I should have checked before commenting.@@leeroyjenkins0
The key insight is probably that halving the exponent gives the approximate square root, to within a factor of 2. Then the rest is just finding the best approximation
The "explicitly blessed by the C++ standard" phrase made me subscribe to your channel
C-Style casts also cast away volatile (and yes volatile is useful in some contexts, namely for variables that are expected to be modified externally by e.g. interrupts, IPC, network callbacks, so the optimizer shouldn't reason about their value).
This was a fantastic video - great story - great technical detail
I loved the way this video was structured, amazing narrative
truly one of the best c++ videos ive seen. amazing :)
6:40 is not entirely correct. The C++ standard has an explicit note about accessing an inactive union member for special cases like struct with common beginnings
/*
One special guarantee is made in order to simplify the use of unions: If a standard-layout union contains several standard-layout structs that share a common initial sequence (11.4), and if a non-static data member of an object of this standard-layout union type is active and is one of the standard-layout structs, it is permitted to inspect the common initial sequence of any of the standard-layout struct members; see 11.4. - end note
*/
This doesn't work in constexpr context though :)
Also, for 11:30 it's probably doable with stateful metaprogramming
amazing video btw :)
Ah yes, the "C inheritance" exception to union access. Thanks for the correction. :)
It doesn't work in constexpr because it's still UB, you are incorrect. This isn't what the common initial sequence rule refers to, it's for when you have a union of 2 structs, and both structs have a common member -of the same type-, you can access that member through either type, regardless of which struct is the active member of the union.
I don't fault you for misunderstanding this rule, it's a bit of a doozy, even language lawyers have a hard time with this one.
@@alexb5594 Isn't this essentially what I said? By "common beginnings" I meant "structs that start with members of the same type".
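A minimal sketch of the common-initial-sequence case being argued about. The `Circle`/`Label`/`Shape` types are hypothetical; the point is only that both structs are standard-layout and start with a member of the same type.

```cpp
// Two standard-layout structs sharing a common initial sequence (`int kind`).
struct Circle {
    int kind;
    float radius;
};
struct Label {
    int kind;
    char text[8];
};

// A union of the two structs.
union Shape {
    Circle circle;
    Label label;
};

// Reads `kind` through `label` even when `circle` is the active member.
// This is permitted because `kind` is part of the common initial sequence;
// reading `label.text` here would not be.
int kind_of(const Shape& s) {
    return s.label.kind;
}
```

Anything past the shared prefix, such as `radius` versus `text`, is outside the rule, which is why this does not rescue float/int punning through a union.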
Alright, you got me, I'm adding a new line to my personal style guide.
I've been using c++ for years at this point. Had no clue how scary type casting can be. Great video!
The C-style cast is just how reinterpret_cast should've been. If I'm willing to cast away the actual type and treat your data like the series of bits it is, do you think I care about it being const? No, not a bit!
@@EvanOfTheDarkness But then how do you do the reinterpret_cast discussed at 15:34? Would I need two casts or something?
The scariest thing about any of this code is using multiple inheritance.
Heh. I was very careful to call the second base class `IFoo` in order to make it a case of classical OOP--one base class, followed by multiple interfaces. That's a flavor of "multiple inheritance" that's supported by plenty of languages and considered safe because it doesn't suffer from the diamond problem... but in C++, it still has this footgun surrounding C-style casts.
It doesn't really matter because you have the same problem even with only single inheritance
I have to say, aggregate multiple inheritance never made much sense for me in C++.
It causes endless complexities trying to hide the dirty trick that it is implemented as composition, which ruins address identity.
@@noamtashma617 No, the pointer doesn't offset with single inheritance. The pointer to Base and the pointer to Derived are the one and the same. Only the length of the object is different.
@@Carewolf Ah!
Does that mean that, with single inheritance, the object and sub object share the same v-table?
1:05 - 4:36 This is one reason why type inference is so much easier when you don't have subtyping. In languages like Haskell and Rust, the bidirectional type inference can in principle allow you to omit every single type annotation and still figure out the types everywhere. now, rust requires you to specify types on (non-lambda) functions, but that's just because it's good practice to do so, not because of any technical limitations. This is because the lack of subtypes allows there to always be a single most generic canonical type of any value, so you will never have to try to guess which of the possible types (including upcasting) a value should have, so the "implicit cast" function is not needed.
there is an exception. there are methods that allow multiple return types, like .into()
in that case the return type has to be stated.
@PeterAuto1 is correct. Rust *does* have subtyping via its trait system, so it is sometimes impossible to infer with missing type annotations (i.e. sometimes type annotations are required).
@@strager_ but from what I can tell, subtyping via parametric polymorphism and unification still makes it a lot easier to make type inference. Even in the case PeterAuto1 mentioned, there is still a principal (canonical most generic) type, the only issue is that there's no canonical monomorphic type if it's also consumed by a polymorphic function. Which does indeed mean we need to specify the type there, but not because there's no principal type in general, just because the principal type is polymorphic
The public cast as an idea almost seemed like something good when I didn't know much about programming, but the more I learned, the more I appreciate local variables. Scoped variables are everything
If the reason you thought public cast makes sense was ignorance, then why make your variables private in the first place?
I love type punning (the word, not the action).
Amazing video, I love your presentation style.
public_cast reminds me of C# reflection which enables every private property and private field to be gettable and settable, even from other assemblies. It's interesting that some of that is possible in C++.
You can do the same sort of f*ckery in java. that one time a year when you need it it's wonderful.
That time someone uses it because it would be fun? NOT wonderful... :D
If you are super explicit about these things, they may be very convenient. But the moment developers start to make too much magic, I say it's time to stop. I usually hate implicit things, with rare exceptions
@@TheRPGminer I firmly believe doing magic is a good thing if it will make life easier and code run faster for the next person in line, and it's fun 😂
Unofficially and therefore not guaranteed to work, you can do it with pointer arithmetic.
@@Caellyan yes, until the code actually needs to be maintained and the original developer has left...
Another excellent video! Please keep up the great work.
The first example could be called like `foo (base, derived);`, which isn't ideal but is at least simple and avoids casting
This is exactly what I was looking for in the comments. Why would you start casting when the solution is so simple.
You failed to convince me, I will continue to use the c-style cast.
The part where the c-style cast "changes its meaning without telling you about it" seems to be the reason why C++ has created this "zoo" of different casts in the first place. A valid design decision, but I guess it's not for everyone.
I mean you can still do the 'reinterpret cast' in C by just casting a pointer, for example
float f = *(float*)&i;
Where i is an integer. So there are at least two types of casts in C, direct one and the one reinterpreting pointer (equivalent to reinterpret cast on C++)
When you first started talking about pun_cast I was thinking "isn't that just bit_cast? How would it be different?" Guess it wasn't different at all lol
I genuinely have a hard time understanding why bit_cast is better than reinterpret_cast. I've never seen people use bit_cast on pointers or references, only values. I didn't even know reinterpret_cast could be used on values.
C++ is the definition of inventing new solutions for problems caused by previous "solutions"
This was my favorite Halloween story
and yet i hear it all around "but i can write correct C code" while writing C++, with blissful ignorance of the dangers that lurk under the C, or maybe just they know their stuff and do it anyway because it is such a nice touch!
C++ has "better ways of doing things than C", but in a significantly more complex environment, while retaining backwards compatibility with the C way, its 90s way of doing things and an extra random way that got immediately deprecated 2 standards later. C casting works fine in C because you can't have ultra fancy/complex types, so the reinterpretation is always pretty simple. C function pointers are ugly and incomprehensible, but you can typedef. You have to use raw pointers in C, but since it's the only way to pass things by reference, the code is very self-documenting and stays simple. Overall, C is much simpler than C++
@@vercolit I often see this argument that C is simple while C++ is not, therefore C++ is more unsafe. And frankly I don't think this makes any sense. While C is simple, as in the language definition is small, that doesn't mean making programs is simple. Imagine you're trying to build a car. On one hand, you can have a simple toolbox - just a screwdriver. On the other, a complex toolbox, including various tools. But which toolbox makes constructing the car simpler? Counterintuitively... not the simple toolbox. You can do it, in very hacky ways.
It is commonplace to see extremely hacky solutions in C that just perform basic C++ actions, but in incredibly unsafe and hard-to-reason-about ways. For example, OOP. OOP is useful, particularly for GUIs. Now we have GTK, that hacks together objects in C using function pointers. But... the compiler enforces very little type checking on function pointers. And, there's no access modifiers. And, now a simple function call can actually result in memory errors, like segfaults. All to achieve what's trivial (and safe) in C++.
Or, how about generic functions? Seems simple enough... take libc qsort(). We want to take in a generic function for comparison. But, we have to use void pointers. Now, we have circumvented the type system. We have pushed errors to runtime. Not only is this horribly inefficient in every sense of the word, it also doesn't make sense. 99% of the time the function we want to use is known at compile time, but we HAVE to force runtime behavior. Again we've opened the door to memory errors that have no reason to exist, and as a nice side effect our sort is ~10x slower than an equivalent algorithm in C++. Whoops!
Or, what about generic containers? Also a very useful concept. Trivial, type safe, and memory safe in C++. In C...? Well, we could use void pointers. But this is verbose to work with, and again pushes behavior to run time. If we want a vector of int, we should enforce everything in the vector is an int... you can't safely do that with void pointers. So we need code generation. Ah, macros! Except, macros aren't templates. They aren't type aware. They're primitive. Yet again we're side stepping the type system and hoping that users use the macro correctly. If they don't, then whoops! We can use the newish _Generic... but these are fake generics. They're not general purpose.
So we end up writing 10x as much C code to do simple and easy things in C++. And, along the way, we sacrifice our safety. All in the name of a "simple" language. It's not like these are cherry picked examples, these will pop up in literally every non-trivial piece of software. There are countless things that are trivial (and commonplace) to implement in C++, but nightmare-ish in C.
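The `qsort()` point above can be sketched roughly like this. The helper names are invented for illustration, and no performance numbers are claimed here; the sketch only shows where the type information disappears in the C interface and where it is kept in the C++ one.

```cpp
#include <algorithm>
#include <cstdlib>
#include <vector>

// C style: the comparator only sees void pointers, so the compiler cannot
// check that it matches the element type of the array being sorted.
inline int compare_ints(const void* a, const void* b) {
    const int lhs = *static_cast<const int*>(a);
    const int rhs = *static_cast<const int*>(b);
    return (lhs > rhs) - (lhs < rhs);
}

inline void sort_c_style(std::vector<int>& v) {
    // Element size and comparator are passed separately; a mismatch is a
    // run-time bug, not a compile error.
    std::qsort(v.data(), v.size(), sizeof(int), compare_ints);
}

// C++ style: the element type is known at compile time, so the comparison
// is type checked and can be inlined by the compiler.
inline void sort_cpp_style(std::vector<int>& v) {
    std::sort(v.begin(), v.end());
}
```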
you know I'm just gonna continue writing C
The problem is multiple inheritance. The more I learned of the "magic" going on in the background to make it possible, the more I think it was not the best idea.
As a Rust programmer the words "const_cast" scare me even more than the C style cast.
I agree, and as a Rust programmer myself, we always have `std::mem::transmute` …but also miri to tell us not to.
Somehow C-style 'type(value)' casts always felt most aesthetic to me since it just looks like a function you give some other data type as an argument that then turns into the data type the function is named after.
This is why I like Nim’s cast system. When using Nim, anything with the word “cast” both appears like a sore thumb in the code, and is guaranteed to be unsafe.
This is a quality video discussing fairly advanced issues and got all the technical details correct.
Really well made video
Who would have thought that making your language more complicated would necessitate even more complexity to fix the mess? As someone who primarily uses C for his projects, I can proudly say I am sticking with C-style casts.
For the Quake example, you should probably static_assert on std::numeric_limits<float>::is_iec559, although it is unlikely to find an implementation where it is false.
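A rough sketch of what that check could look like next to the well-known fast inverse square root. This version uses `std::memcpy` so it builds as C++17; with C++20, `std::bit_cast` could take its place. The function name is an assumption, not taken from the video.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

// The bit trick below only makes sense for IEEE 754 binary32 floats.
static_assert(std::numeric_limits<float>::is_iec559, "requires IEEE 754 floats");
static_assert(sizeof(float) == sizeof(std::uint32_t), "requires a 32-bit float");

// Approximates 1/sqrt(x) for positive x (hypothetical helper name).
inline float fast_inv_sqrt(float x) {
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);    // float -> raw bits
    bits = 0x5f3759dfu - (bits >> 1);       // magic-constant initial guess
    float y;
    std::memcpy(&y, &bits, sizeof y);       // raw bits -> float
    return y * (1.5f - 0.5f * x * y * y);   // one Newton-Raphson refinement
}
```

For example, `fast_inv_sqrt(4.0f)` lands close to 0.5, within the usual fraction-of-a-percent error of this approximation.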
C is for building houses
C++ is for playing with legos
public cast can be great fun
once used it to add iterators to a third party container class
I wasn't aware that the "standard" C++ behavior was to only allow reading from the one member of a union that was most recently written. Who came up with that idea? That defeats one of the main purposes of unions.
Yes! After learning about how the C-style cast works in C++ I had a small and slightly incredulous but ultimately easily resolved conversation with my manager in a PR, and my guess was that it would top this list. So easy to *think* you know what you're doing with it and so easy to break things.
I imagine that most people "learn" c++ by seeing enough of the short videos and second-hand books of someone trying to describe individual details about the language, and then try to put what they remember into their code, hoping to have expressed what they wanted, before duckduckgoing corrections to errors to get their project to compile.
I've written c++ for years for my job and this video convinced me to advocate to rewrite that whole pos project in rust
as a kotlin developer, I am literally crying right now.
are you trying to learn C++?
That's some good shit bro.
unsafe { mem::transmute(val) } goes brrrr
unsafe fn poke<T>(addr: usize, val: T) {
    *std::mem::transmute::<usize, *mut T>(addr) = val;
}
For the pun_cast implementation you mentioned UB in C++ when reading an inactive union member. That is what the standard says, but in practice most if not all modern C++ compilers support it, and even use it in their implementations (short string optimization). I guess it depends on how portable you need your implementation to be.
I do believe type punning through unions tends to be “supported” by compilers yeah, but mostly just due to necessity, since so much code relies on it. If we want to one day allow compilers to have true freedom to do all the amazing optimizations they want to do to our code, we need to move away from “eh, it’s not technically allowed but it works in practice,” and toward what is strictly allowed. That sentiment does break down in lots of cases though, since C++ doesn’t even always give you the tools to express yourself both efficiently and legally. It seems like the committee is working hard on this though; the object model has been a big focus of both C++20 and 23.
I’m curious if SBO string implementations actually type pun through unions. I understand that a union is likely used for the storage, but since SBO is either active or it isn’t, it seems like the access to the different variants would be non-overlapping. I’m curious why an implementation would ever access data from the inactive state.
@@_noisecode If I'm not mistaken, SBO is specifically for vectors, where the type it is storing is not known, and thus restricts any funny "exploits " you can do.
But SSO is specifically for strings and has the unique ability to only care about the char type (well all the different char types I guess). There is incredible bit manipulation done to achieve this, reserving the LSb as a flag to check whether it's short or long, and so between the 2, you only allocate even lengths to satisfy this (which also turns out to be more efficient anyway). There's much more but it wouldn't suffice trying to explain it in a YouTube comment.
I believe libc++ implements it this way but MSTL and libstdc++ don't and instead use an approach almost identical to SBO.
If you're interested I tried to write a legible version: github.com/PremanJeyabalan/stdcpplib/blob/main/src/String.h , but please know I'm just a undergrad student who doesn't write C++ professionally (yet). I'd love to hear your thoughts if any.
Ah right, I knew about this trick at one point but had forgotten. For what it's worth, I believe this could be done without undefined behavior; since in your implementation, the size is the first byte of the short variant, and it's legal to reinterpret_cast to inspect the byte representation of any object, you could just reinterpret_cast<unsigned char&>(the_entire_union) and then inspect the resulting first byte, which will be the value of your short_t::size_flag, accessed legally even when the string is long.
Nice string implementation! It's awesome that you're digging into this stuff so deeply at an undergrad level, and I'm sure any number of places will very eagerly hire you up as soon as they can. :)
Lastly, the term SSO is just a special case of SBO--they both just refer to storing data inline under a certain capacity rather than spilling to the heap. You're right that if you have knowledge of the specific types involved you can often do better than a generic implementation (so a string can be better optimized than a small_vector), but terminology-wise, it's correct to say either that std::string has a Small Buffer Optimization or a Small String Optimization (or a Short String Optimization or whatever else); they're all interchangeable terms.
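The byte-inspection idea from this reply could look something like the sketch below. The layout and every name in it (`long_rep`, `short_rep`, `string_storage`, `first_byte`) are hypothetical stand-ins, not the code from the linked repository or from any standard library.

```cpp
#include <cstddef>

// Hypothetical SSO-style layout, invented for illustration only.
struct long_rep {
    std::size_t capacity;
    std::size_t size;
    char*       data;
};

struct short_rep {
    unsigned char size_flag;                  // first byte of the storage
    char          buffer[sizeof(long_rep) - 1];
};

union string_storage {
    long_rep  l;
    short_rep s;
};

// Reads the first byte of the storage through an unsigned char pointer,
// without naming either union member, so it does not matter which member
// is currently active.
inline unsigned char first_byte(const string_storage& u) {
    return *reinterpret_cast<const unsigned char*>(&u);
}
```

What the first byte means while the long representation is active depends on the layout and the byte order, which is why real implementations place their flag bit carefully.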
@@_noisecode You are definitely right! I will make it a point to change that to use a reinterpret_cast instead, never thought of that before.
Ah your SBO explanation helped a lot as I was struggling to realise the relationship between it and SSO. A string is just a char buffer anyway so it makes sense that the umbrella term is SBO and SSO being a special case of that.
Thanks so much for taking a look at my code btw, and love the vids you've made so far. I close a video pretty quickly when people unnecessarily start bashing C++ for whatever reason and why Rust is better (ThePrimeagen would be a good example), but the way you cover the concepts is fun to follow and I always gain a deeper understanding of the language after watching them. Personally, the pedantic nature of C++ is exactly the reason I love it, except when it then comes to adding 3rd Party Libs to a project.
Look forward to watching whatever you release next! Cheers.
The content is great and all, but the writing though, insane
cool, well-made video! more please
Great video!
public_cast is actually occasionally useful, I use it for debugging code where I don't want to have to modify the original code (though I could), and the "alternative" is to #define private public
#define public private is UB. If you define any keyword or name in the standard library or use any name anywhere starting with underscore + capital letter except user defined literals, or any name with two underscores, that's UB for you.
@@Bolpat no shit, Sherlock, some compilers even flat out refuse to do it.
And that’s why I mention the public cast is sometimes useful, but with a very narrow scope and a very solid asterisk.
@@Bolpat Don't fear the UB, embrace it! If you know *why* it's UB you can safely use it.
For example the underscore names are UB because the standard library uses them, so defining them could break it. But so does defining any name that any header uses that you include. So it's actually a very well defined behavior.
@@Bolpat Doesn't preprocessing happen first? All the compiler proper sees is "public"
@@EvanOfTheDarkness "If you know why it's UB you can safely use it."
Only today. Tomorrow, you upgrade your compiler, or switch operating systems, and you need to redo your careful analysis.
That doesn't sound "safe" to me.
I thought I knew the dangers of C-style casts.
My compiler throws an error when C-style casts interfere with const-ness. I am appalled that this practice isn't standard!
Great story, and nicely told too, but I have to disagree. The most evil cast is the implicit cast we didn't ask for, silently converting something to another type, not because it's the right thing to do, but just because it could be done.
So bit cast is how you also do things like receiving bytewise messages and interpreting them as a struct, or converting a struct into a byte array. I usually use memcpy or a union with the byte array and the target struct. Before I knew these techniques I just cast the pointer type into a struct pointer (in C).
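A small sketch of the `memcpy` route for turning received bytes into a struct and back. The `Message` type and its fields are made up for illustration, and the code assumes the sender uses the same layout and byte order as the host.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical wire message; the field names are invented for this example.
struct Message {
    std::uint8_t  kind;
    std::uint8_t  flags;
    std::uint16_t length;
};

static_assert(std::is_trivially_copyable<Message>::value,
              "memcpy is only valid for trivially copyable types");
static_assert(sizeof(Message) == 4, "layout assumption: no padding");

// Copies the raw bytes into a properly typed and aligned object instead of
// casting the byte pointer to a struct pointer.
inline Message parse_message(const unsigned char* bytes) {
    Message m;
    std::memcpy(&m, bytes, sizeof m);
    return m;
}

// The reverse direction: writes the object representation into a buffer
// that must hold at least sizeof(Message) bytes.
inline void serialize_message(const Message& m, unsigned char* out) {
    std::memcpy(out, &m, sizeof m);
}
```

Unlike the pointer cast, this also works when the incoming buffer is not suitably aligned for the struct.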
Nice video
If you did another video about casts you may want to talk about const_cast a little bit, they're really funny if the pointed memory was really not writable
Eh, too easy of a target. :) It's simply UB to write to memory that's not writable (or more generally variables that were 'born const').
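To make the 'born const' distinction concrete, here is a minimal sketch with invented function names. `const_cast` itself never changes an object; what matters is whether the object being written to was declared `const` in the first place.

```cpp
// Stand-in for a legacy API that takes a non-const pointer but, by
// assumption, only reads through it.
inline int legacy_sum(int* values, int count) {
    int total = 0;
    for (int i = 0; i < count; ++i) total += values[i];
    return total;
}

// Fine: const is cast away only to satisfy the signature, nothing is written.
inline int sum(const int* values, int count) {
    return legacy_sum(const_cast<int*>(values), count);
}

// Writes through a const reference. This is well defined only if the
// referenced object was not declared const; calling it with a reference to
// a `const int` object is undefined behavior.
inline void bump(const int& ref) {
    ++const_cast<int&>(ref);
}
```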
@@_noisecode That's true. But maybe it's worth mentioning because UB are not known enough 🙂
@@JUMPINGxxJEFF I Like the UB that spins the CPU fan backwards and spawns demons. Did you know that one?
@@JuddMan03 I don't, but it sounds like lots of fun. I'd really love to hear more about this one
This video was beautiful. I suppose it makes sense, but it is *really truly unfortunate* that the C casts also become the most tempting ones to use lmfao.
Is this a language that's really used in the wild?
only for most performance critical engineering and scientific applications and the backend of every web browser and most drivers and to write the runtime of higher level languages 😂
when it’s not cpp it’s c which does as much cursed stuff with less.
People write high-load websites in this.
Thankfully not. In real world scenarios C++ is useless
Yes.
@@colep14💀
Speaking of const_cast : I totally love how the ISO standard basically says if you're using const_cast, you've basically f***** up....
First let me thank you for that public_cast code. I never knew Cpp allowed that. I had a raw_cast that I had to implement class by class. Now I can just use public_cast and not have to manually implement anything.
Second thing is; I always use C style cast, unless I need a specific cast. And I always use C style cast as a static cast or dynamic cast.
Careful with that… C-style casts never do dynamic_casts, so if you’re looking for the safe+checked behavior of dynamic_cast, you won’t find it. All your C-style casts will always “succeed” but may give you a pointer you can’t safely dereference, and you won’t have any way to tell.
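A short sketch of the difference, using made-up types. The checked version can report failure; the C-style version has no way to do so.

```cpp
struct Animal { virtual ~Animal() = default; };
struct Dog : Animal { int bones = 3; };
struct Cat : Animal { int lives = 9; };

// Checked downcast: returns nullptr when the object is not actually a Dog.
inline Dog* as_dog_checked(Animal* a) {
    return dynamic_cast<Dog*>(a);
}

// Unchecked downcast: here the C-style cast behaves like static_cast. It
// never returns nullptr for a non-null input, and if the object is not
// really a Dog the behavior is undefined.
inline Dog* as_dog_unchecked(Animal* a) {
    return (Dog*)a;
}
```

Passing a `Cat` to `as_dog_checked` yields `nullptr`, which the caller can test for; passing it to `as_dog_unchecked` gives no such signal.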
public_cast can be implemented with friend functions, i.e. declaring the function outside of the class then doing a friend definition from within the class where the pointer to the private member is returned
Then it has nothing to do with generic public_cast, because you, as an author of the class, explicitly granted access to the internal structure of your class.
@@antagonista8122 oh you don't declare it friend in the class with the private member. You declare a friend method in a helper class that returns the member and then somehow call that through argument dependent lookup.
@@antagonista8122 Not from inside of the class you're stealing from, but having a separate class that takes a template parameter and defines an external function using friend. Then do the explicit instantiation of the class with the pointer to member and it will work.
my implementation:
template<auto M, typename K>
class access_private { constexpr friend auto public_cast(K) { return M; }};
class CxSecret { constexpr friend auto public_cast(CxSecret); };
template class access_private<&C::secret, CxSecret>; // C::secret stands in for the private member's name
C c;
int x = c.*public_cast(CxSecret{});
nice! this approach also allows avoiding the type through auto and making it constexpr.
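For readers who want to try the friend-injection approach described in this thread, here is a self-contained sketch assuming C++17 (for `template <auto>`). The class `C`, its member `secret`, and the helper `read_secret` are hypothetical names added to make the example complete; the trick relies on the rule that access checks are not applied to names used in an explicit instantiation.

```cpp
// Hypothetical class whose private member we want to reach.
class C {
    int secret = 42;
};

// Instantiating access_private<M, K> defines the friend function
// public_cast(K), which simply returns the member pointer M.
template <auto M, typename K>
class access_private {
    constexpr friend auto public_cast(K) { return M; }
};

// Tag type. Declaring the friend here is what lets argument-dependent
// lookup find public_cast(CxSecret) later on.
class CxSecret {
    constexpr friend auto public_cast(CxSecret);
};

// Explicit instantiation: &C::secret is accepted here even though the
// member is private.
template class access_private<&C::secret, CxSecret>;

// Reads the private member through the pointer-to-member obtained above.
inline int read_secret(const C& c) {
    return c.*public_cast(CxSecret{});
}
```

As the rest of the thread points out, this is best kept to narrow uses such as tests or debugging.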
I use the Chapter 3 one relatively often (more than once, I guess) to force my way into the game engine's node system. It's nice when you're certain it's safe.
It's not safe. It's only not UB.
Could you help me understand what you mean by safe? @@Bolpat
@@TankorSmash Safe, in the programming sense, is that part of the language in which no UB can happen. For example, memcpy is not a safe function because it can have UB when called incorrectly. C++ is generally very unsafe, almost anything can give you UB when used improperly, the primary example being int overflow.
As someone once told me unrelated to programming, “safe” is when you know it's harmless, not when it happens to be harmless; and I agree.
If you really understand that game engine, it might actually be safe. Otherwise it's harmless (happens to be not UB) at best.
I can confirm that it is safe in that definition. Thank you for clarifying your point!@@Bolpat
@@TankorSmash I've worked with a programming language called D which has a fairly large safe part, while also offering to go low-level like C++. It's generally similar to C++, e.g. it has templates, static typing, constexpr, but also has an explicit notion of safe, as well as other properties functions can have, such as pure (referential transparency). It really opened my eyes to a lot of interesting ideas and fueled my creativity having worked with it. The main downside is you start hating working with anything else and you hate it when something is supposed to work, but doesn't due to a bug.
I am sooo happy that compilers allow warning on any use of C style casts. Combine that with -Werror and that particular problem is solved.
Then again: Just use Rust 😉
Rust Traits are so much better than all these memory layout based shenanigans
"It's the reason I don't cut my sandwich in half with a chainsaw, it's a reason I don't hang out in my computer as root..."
Me who just migrated from Windows and I didn't make a root profile so I'm technically always as root: *nervous laughter*
Returning to this video to look up public_cast implementation so that I can access internals of a class inside a test case for verification.
In my effort to do good, I’ve unleashed great evil upon the world
@@_noisecode, haha. I have decided that you are right and no one should wield such power. It was funny how the code broke Intellisense in Visual Studio 2022, though.
Yeah, this is a little-known feature of Intellisense. It doubles as a linter because it craps the bed when your code becomes too cursed.
As someone who has written a lot of code in C, this is why I'm scared of C++
Surely this video makes you more scared of C since you have less control of the cast you intended?
@@DrGreenGiant C-style casts are much less likely to go wrong in C though, since there are no insanely fancy complex types
@@randomcatdude I see what you're saying but I still think "can't" and "less likely" are worlds apart, in this example of casts. As for the rest of the complexity of C++ vs C, it's insane and getting worse, which C isn't. (much as it pains me to say as a C++ lover lol)
@@DrGreenGiant Unlike C++, C does not change the address of pointers when doing casts, because it doesn't have inheritance, let alone multiple inheritance. C++ changed its behaviour.
@@randomcatdude There are absolutely insanely complex fancy types in C. They're just written using Macros and void pointers. Wow, much better and easier to reason about!
The most evil was always the inheritance.
And double the inheritance, double the evil.
The type punning cast from float to int and back could be done in C++14 constexpr without undefined behavior (maybe even in C++11 constexpr), because an int can be "deconstructed" into individual bytes by repeatedly shifting by eight and masking with hex ff. You can do this to float too, but it is much more complicated. Basically converting int into float mantissa equivalent and then repeatedly dividing by ten or two. It should work for any float implementation - I'm thinking here IEEE754 and VAX floats.
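The integer half of that idea is easy to sketch. The helper names are invented, and the much harder float half is not attempted here. Note that shifting and masking yields bytes by numeric significance, independent of how the value is laid out in memory.

```cpp
#include <cstdint>

// Extracts byte `index` (0 = least significant) of a 32-bit value using
// only a shift and a mask, so it is usable in constant expressions.
constexpr std::uint8_t byte_of(std::uint32_t value, unsigned index) {
    return static_cast<std::uint8_t>((value >> (8u * index)) & 0xFFu);
}

// Rebuilds the 32-bit value from its four bytes, least significant first.
constexpr std::uint32_t from_bytes(std::uint8_t b0, std::uint8_t b1,
                                   std::uint8_t b2, std::uint8_t b3) {
    return static_cast<std::uint32_t>(b0)
         | (static_cast<std::uint32_t>(b1) << 8)
         | (static_cast<std::uint32_t>(b2) << 16)
         | (static_cast<std::uint32_t>(b3) << 24);
}

// Evaluated entirely at compile time.
static_assert(byte_of(0x11223344u, 0) == 0x44, "least significant byte");
static_assert(byte_of(0x11223344u, 3) == 0x11, "most significant byte");
```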
Every time I have to write C++ I hate its complexity and end up writing a mix of C and C++ because it is simpler that way and I am kinda dumb...
I like to live dangerously :D
This is the reason I would never use C++ for anything except if I had a damn good reason to
I used public_cast to write unit tests for private functions.
Hah, the C-style cast laughs in the face of private inheritance!
think-cell public range library has some varied casts including numeric casts, up and down casts, implicit_cast
I have been learning C++ recently, and also like made up human languages. There is this thing called "Agma Schwa's Cursed Conlang Circus" going around, where people create intentionally obtuse and painful languages, with idiotic rules like having grammatical genders based on Belgian government branches, and somehow C++ is still more bonkers inane aneurysm-inflicting than all of the entries together. 😂
"grammatical genders based on Belgian government branches" what does this even mean lmao
@@Greeem I have no fucking idea, but I understand it better than public_cast lol
@@Greeem jokes aside grammatical gender is when words have different forms or are used differently in sentences depending on some category of the thing they apply to.
Classic example is gender gender (English actor/actress etc) but this conlang had different forms depending on what part of Belgium government had authority over the thing, if any.
ruclips.net/video/Wr_tyM8pdXk/видео.html
Or something like that I am not a linguist
public_cast gives me the Java mixin heebie-jeebies
C-style cast will break your code, IF you use inheritance***
I rarely use inheritance in my projects. Is there a problem to use c-style cast in this case?
No. It only breaks your code, if you rely solely on the compile errors, to tell you how to refactor your class. Which you should not do, for many reasons.
@@EvanOfTheDarkness Counter-point, the compiler should aid you as much as possible. Ideally everything that can be determined at compile time, should. Errors are much easier to find that way. We should build a better language so that we CAN rely on the compiler. It's the frontline defense against ill-formed programs. That means... avoiding C-style casts. It's nonsensical to push errors that can trivially be found at compile time to runtime.
@@lucass8119 You can avoid C style cast, if you want. Might as well avoid function overloads because adding a new overload can break existing ones (shocking!). And templates are probably not something you should even attempt to do since they can break all the time, when you change something.
Actually you might just wanna avoid C++ all together and use Ada or Rust, so you can never write anything that might break someday (allegedly).
@@EvanOfTheDarkness Lol, templates break at compile time. It's a counterpoint to you... that's one of the reasons templates are superior to runtime generics like in Java or C#.
Obviously nothing will be perfect, but C-style casts are just bad. Period. They only have disadvantages.
There's valid reasons to use templates, or function overloads, or whatever. None for C-style casts tho... just use C++ casts. They exist for a reason. They're safer, usually faster too.
@@lucass8119 Not all templates break at compile time, but that was not my point. Simple function overloads: you delete one, the code can simply call a different one, silently.
And also he's very wrong in thinking that "private" members or base classes are internal to a class and don't matter. Yes, that's how it _should_ be, but in C++ access checking happens _last._ Basically, public or private doesn't matter for overloads and type casting, it is *only* used to tell when the compiler needs to give you an error.
Function-style casts (which do the same as C-style casts) remain the shortest, and most readable casts in the language, and that's what matters to me. If you need training wheels, go for the overly verbose C++ casts, but they do the same in the end.
i wish i could flag the compiler to try only static_cast for c style casts and if that doesn't work, error.
The public_cast might have great potential for evil, but sometimes it's ok to do what the ruling class calls evil in order to free an internal API that was greedily imprisoned ten namespaces deep in code no mere developer is permitted to modify.
It's a better option, at least, than punning the class into a duplicated copy you've written to have the same byte layout, or memcpying at blind pointer offsets.
Sometimes two wrongs make a right....
Fun fact: one of the main reasons I picked C++ up again after coding my home projects in Go for some time was to get a basic reinterpret cast working, because fucking Go won't let me do that. I just want to narrow my array items from 16 bits down to 8 without iterating through the array and reallocating the memory, neither of which I want to do (or can), because we're edging at the limit of what fits into my 32GB of RAM with the 8-bit values alone
14:05 "Uber static cast" - what does it mean? And what the difference with the "normal" static cast?
I don't think I can sleep now...
I propose slightly better syntax for public_cast. The only thing I don't like is that one still has to specify the member type when declaring the accessor.
template<void (*initialize)()>
struct static_constructor
{
struct value_type{
value_type(){ initialize();}
void Do(){}
};
static value_type construct;
};
template<void (*initialize)()>
typename static_constructor<initialize>::value_type static_constructor<initialize>::construct;
class X{
int a= 42;
double b= 43.2;
float c;
friend void print(const X&b)
{
std::cout
from a later comment of mine:
template<auto M, typename K>
class access_private { constexpr friend auto public_cast(K) { return M; }};
class CxSecret { constexpr friend auto public_cast(CxSecret); };
template class access_private<&C::secret, CxSecret>; // C::secret stands in for the private member's name
C c;
int x = c.*public_cast(CxSecret{});
I concur. I had a bug like that after making a class (which used to be a base) inherit from a new base. Compiler happily built but because the class also inherited a mix-in, the C-style casts got the wrong addresses and Bad Things happened; the mix-in was now sizeof(ActualBase) offset. Oh C++ how could you 😪
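A minimal sketch of the kind of offset problem described here, with invented type names. `static_cast` knows the class layout and adjusts the pointer to the base subobject. `reinterpret_cast` does not, and a C-style cast falls back to `reinterpret_cast` behavior when it cannot use the inheritance relationship (for example, when the types are unrelated).

```cpp
struct ActualBase { int base_data  = 1; };
struct MixIn      { int mixin_data = 2; };

// MixIn is the second base, so its subobject normally sits after ActualBase.
struct Widget : ActualBase, MixIn { int own_data = 3; };

// Adjusts the address so the result points at the MixIn subobject.
inline MixIn* adjusted(Widget* w) {
    return static_cast<MixIn*>(w);
}

// Keeps the address unchanged. The result does not point at the MixIn
// subobject and must not be dereferenced.
inline MixIn* unadjusted(Widget* w) {
    return reinterpret_cast<MixIn*>(w);
}
```

On typical implementations the two results differ by `sizeof(ActualBase)`, which matches the offset mentioned in the comment above.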
man this is why love this language, so much fun stuff to play around with!
public_cast, for when blowing your leg off is not enough
That last example convinced me to never use inheritance ever again!
composition for the win
i've got a way to avoid specifying the type in public_cast! and this version is also constexpr. the idea is to create a function returning the value, which allows us to use auto.
template<auto M, typename K>
class access_private { constexpr friend auto public_cast(K) { return M; }};
class CxSecret { constexpr friend auto public_cast(CxSecret); };
template class access_private<&C::secret, CxSecret>; // C::secret stands in for the private member's name
C c;
int x = c.*public_cast(CxSecret{});
i tried making K a template type for public_cast to allow typing it as public_cast<CxSecret>(), but i haven't managed to do that.
this uses 0 extra memory and is constexpr. the compiler properly inlines everything as if this was normal access to a public variable.
cpp is trippy. people who use all this are wizards...
NO THEY ARE BLIND AND HAVE ZERO UNDERSTANDING WHAT THEY DOING AND WHY
Yeah, these are examples of things that probably should not be used.
@8:40 This feels like a real hostile code design view on the nature of private/public variables lol
@14:00 Again, I feel like it's the user's job to know what they're doing if they're accessing the internals of a library, but this frames it as if it's a conflict between the library developer and the library user or whatever instead of a collaborative effort.
This is the difference between soft dev and soft eng. I must face the fact that I will never be this good.