Dave, can you do me a favor? I'll even donate $100 CAD to a charity of your choice. In the FastLED library there's a UDL (user-defined literal) in the Tetris game written by Aaron Liddiment. For the life of me, I do not understand the bitwise UDL code he has written to define the Tetris blocks. I get what he's doing, but I can't find any other similar cases of bitwise code using hex on unsigned long data the way he's done it. Scott Manley made a wearable matrix using his code years ago, which helped me see what was going on under the hood. I don't like using code I don't understand, and it's driven me nuts ever since. I too am on the spectrum, and I have a software degree I never used until I had to medically retire a while back. I love your channel and have been around since the beginning, so even if you could just have a quick look and recommend a book or something if you don't have the time to make a video. It's one of those things that gets stuck in my head from time to time, and it kills me that I do not fully understand what he's doing. I suspect it's intentionally obfuscated, and as someone who spends his retirement doing charity work and learning, it's been a thorn in my side. I am 45 and plan on using my past credits toward a compsci or EE degree (as a bucket-list item, and to help me tutor programmers and develop free courses for a local non-profit makerspace). I've googled, bought books, and stopped short of ChatGPT because I don't want the info handed to me on a silver platter; I prefer to learn hands-on. I understand how transistors perform math using different binary rules because I like to do things by coding and then trying the same with passive circuit components. He rewrote the sprite class so it no longer uses the UDL, but it still drives me nuts that I can't figure it out. I'm also a fellow Canadian, though I hail from the NS side of the NFLD ferry.
I'm waiting for funding approval for a 7-week inpatient program on living with PTSD, and I'd like to spend my free time coding C++ while I'm away. I was going to bring a guitar, but I'd rather work on a skill I'm not as good at while I decide which degree to pursue. I have a year and a half of credits that transfer to compsci or EE, but I don't want to drag around my overpriced EE lab when I could write code instead on a decently powerful laptop. I plan to travel to Utah this summer to camp under the Milky Way before I hit the books again. I've spent the last 4 years donating half my disposable income from my pension to others so they can learn, and to put a proper electronics section in the local makerspace. My PTSD keeps me from being able to leave the house and teach there in person, or attend the labs I'll have to attend when I go back to school. Sorry, I ramble. I figured you'd get what I mean when something is stuck in your head and you can't move past it. I could link the Git repository if you like. I'd be extremely thankful and would make good on my donation.
Please note: constexpr just declares that it is POSSIBLE to calculate the value at compile time. It does not actually guarantee it unless the context demands it (as a template argument, for example). The compiler will do its best, but it might just call the function. C++20 added consteval, which forces compile-time evaluation.
Thank you for pointing that out! I'm sure many are aware of this, yet just as many aren't, and for those who are new to it, this is really good information. As a side note, the behavior could also be compiler dependent (implementation defined); I'm not 100% sure based on the wording of the standard, so don't quote me on it. MSVC vs. Clang vs. GCC vs. Intel vs. MinGW may all produce the same results if they support and implement this feature of the language, which by now I'm sure all major compilers do. Yet their methods of doing so can and will vary, and also depend on the compiler flags used during compilation. It all comes down to how each compiler parses, tokenizes, and analyzes the source code it is given, as well as how the programmer instructs or hints the compiler to do its job. For simple expressions I would assume almost all compilers generate the same assembly; for more complex expressions, the results can and will vary. Again, this is a really good and underrated comment. It is truly good advice!
We use constexpr a lot for FIR/DCT/DFT (Fourier transform, wavelet) constant generation; this removes literally thousands of magic float numbers in the code, which are so error prone: miss a number, flip a digit, mix up the digit order. No code reviewer would ever notice if the 17th digit of pi is 3 or 2. The code generation only uses sqrt/sin/cos etc., for which we have constexpr implementations. A constexpr variable guarantees that its value is never, ever calculated at runtime, and that's the real beauty of it.
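As a hedged sketch of the kind of constexpr math helper described above (not the commenter's actual code), a compile-time square root via Newton's method might look like this:

```cpp
// Newton-Raphson square root, evaluable entirely at compile time.
// 64 iterations is overkill for doubles but guarantees convergence.
constexpr double constexprSqrt(double x) {
    double guess = x > 1.0 ? x : 1.0;
    for (int i = 0; i < 64; ++i)
        guess = 0.5 * (guess + x / guess);
    return guess;
}

// The "magic number" sqrt(2) is now generated, not hand-typed.
constexpr double kRoot2 = constexprSqrt(2.0);
static_assert(kRoot2 > 1.41421356 && kRoot2 < 1.41421357);
```

With helpers like this, every table entry traces back to a formula the reviewer can check, instead of a 17-digit literal nobody will ever proofread.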
This is not an optimization, this is metaprogramming. Your algorithm doesn't become faster because of the constexpr. It runs just as slow, but during compilation.
@@sergeykolesnik1171 Algorithms don't have "speed", only complexity. "Fast" algos merely trade time complexity for space complexity and shit themselves the moment you start to hit RAM limits. E.g. a typical hash table has assumptions about the maximum size of the key storage, and upon hitting that limit it crashes in the best case; in the worst case, it rears its ugly side effects.
I had a Verilog class where a question on the exam was: create a component that calculates the factorial of 32 (or some number, I don't remember). I hardcoded the result and returned the value :) Got an A.
Yeah, typical unclear requirements... It should have been more generic. Something like: given an 8-bit input N, a handshake start/result_rdy interface, clk, reset, and a 31-bit output, calculate N! when start is asserted, then assert result_rdy when the correct value is done.
@@pilotashish Actually, there was nothing stupid about this question. The question was to compute the factorial of 32, not to make a function that computes the factorial of any given integer (32 in this particular case), and the program did exactly that.
I think it would be more accurate if we said that the `constexpr` specifier indicates that the expression *may* be evaluated at compile time, in a constant expression context, but not necessarily. And a `constexpr` function specifier may return such a value. If we want to force the function to produce a compile time constant expression (under some constraints), we must use `consteval`.
@@Malaka1802 I'm not sure I understand your question. Compile-time evaluation is completely different than run-time: you must know every value beforehand. The size of the binary is not the main concern here, the goal is to pre-compute as much as you can before the runtime stage. Think of it as a _kind of_ "memoisation".
@@Malaka1802 Yes-ish; you generally want to compute everything at compile time if you can. Sometimes you just can't know the values until runtime (user input, data from a file, etc.), or the function exists for its side effects (the point isn't the result, it's what it's doing, like `main()` or `println()`).
@@SforSamPlays Thanks for the reply. So the compiler accepts that the expression is constant, but checks whether it is altered during runtime and, if so, makes it a variable? So basically "make it constant if possible"?
Aside from performance gains, the best use of constexpr is unit testing. First, it's important to note that, unlike inlined internal functions, the constexpr functions themselves _are_ included in the program itself, unless stripped by a later step (such as `--whole-program` in g++), so you can have a program that uses constexpr to automatically return any values below some threshold and fall back to runtime evaluation _of the same code_ if in excess of that value. This means you can write fairly complicated state machines (first case I saw was a complete 6502 emulator) with a complete set of unit tests using `constexpr`. Then, they'll read the "real" data at runtime. By embedding the unit tests in the main compilation unit, you get the neat property that if it compiles, it's correct (so long as you don't encounter a compiler bug). You can even give it simple integration tests (like 10 steps of Conway's game of life on the virtual 6502).
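A minimal sketch of that pattern, with a hypothetical `gcd` standing in for the 6502 state machine: the compiler runs the tests via static_assert, and the very same function body serves runtime callers.

```cpp
// One constexpr definition serves both compile-time tests and runtime use.
constexpr int gcd(int a, int b) {
    while (b != 0) {
        int t = a % b;
        a = b;
        b = t;
    }
    return a;
}

// Embedded unit tests: if this translation unit compiles, these all hold.
static_assert(gcd(12, 18) == 6);
static_assert(gcd(7, 13) == 1);
static_assert(gcd(0, 5) == 5);
```

At runtime, `gcd` is an ordinary function that can be called on values read from files or user input; the static_asserts cost nothing in the shipped binary.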
@@ThePC007 That was implied in the latter part of the comment: "if it compiles, it is correct". So instead of running compile && test, you just run compile.
@@AnttiBrax But then, what do you gain from doing that? Since unit tests are compiled once and then run once, there shouldn't really be much of a performance difference. I guess you might benefit from your IDE showing the error right inside your code, though, assuming it can actually catch such bugs.
It's also worth mentioning that C23, which ironically should have been submitted for final publication today, as a matter of fact, is adding constexpr to C as well.
Nope, just constexpr variables, not constexpr functions. So it's not that useful. Also, C doesn't have `if constexpr` like C++ for making compile-time code-path decisions.
Mostly used constexpr to generate lookup tables at compile time. I remember writing a cloth physics simulation that could use a lookup table of size 256 to avoid multiple multiplications and divisions. And as the table is small it will most likely be stored in a cache somewhere so it was really fast. Got an A+ on that assignment. :)
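A hedged sketch of compile-time table generation in that spirit (a squared-byte table as a stand-in, not the cloth-physics table itself):

```cpp
#include <array>
#include <cstddef>

// Build a 256-entry lookup table at compile time; at runtime a
// multiplication becomes one indexed load from a small, cache-friendly array.
constexpr std::array<int, 256> makeSquareTable() {
    std::array<int, 256> table{};
    for (std::size_t i = 0; i < table.size(); ++i)
        table[i] = static_cast<int>(i * i);
    return table;
}

constexpr auto kSquares = makeSquareTable();
static_assert(kSquares[16] == 256);
static_assert(kSquares[255] == 65025);
```

Because `kSquares` is a constexpr object, the table lands in the binary's read-only data and no initialization runs at program start.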
My favorite application of constexpr is the compile-time regular expression library: it evaluates your regex at compile time, so you don't have to worry about the runtime cost of constructing one. Another pretty cool one is in the fmt library, where it checks whether the format string you provided is valid. By the way, that library partially made it into C++20 and was mostly completed in C++23, and I love it. Man, people are so creative.
To me, the biggest benefit of constexpr is that it significantly simplifies some stuff that previously would have to be written with template metaprogramming. Thanks to constexpr, just about anything that could be a regular operation on values can now be done in mostly ordinary-looking C++ instead of template stuff, which makes it much more likely to be done at all -- nobody wants to maintain template metaprograms.
Well, I wouldn't necessarily say that. I created my own metaprogramming templates before they became part of the standard. It was quite fascinating at the time and a very productive learning experience. The introduction of constexpr was an extension of what some people were already doing with templates; it just allows you to write them in a more easily understood manner.
@@johnshaw6702 It's true, templates were already Turing-complete, so there is nothing you can do now that you couldn't do before. But template metaprogramming is very hard to teach, and the syntax makes the code rather difficult to understand and debug. It's also purely functional (excepting the stateful template instantiations thing, which IMO should not be used in serious code), which is a completely different paradigm from normal C++. So constexpr stuff, while not giving any strictly "new" capabilities, makes the existing capabilities easy enough to use that I can convince my team to use them, for example, and be confident that people will "get it" and continue to get it over time. I don't shy away from template metaprogramming myself, and there are some great tricks you can still only do with it, but the more that can be done using familiar syntax and patterns, the better.
Hi Dave, I've been trying constexpr and static constexpr lately in a calculator for room heating ratings. Instead of having an array load at runtime, I made the array static constexpr; lo and behold, the lookups were instantaneous, and it was 10x faster than the version that didn't do that. On a side note, a C++ guru showed me how it works in that case: all the array elements are coded as a lookup table right in the binary, making it 100x quicker. Great video as usual, keep up the content. 😃
We had a C++ logging library that ran on embedded hardware; it was used to convert numerical debug-level values into strings. There was a heap of functions that converted different log message types into readable strings. A few developers on the customer side complained that this lib alone could push the CPU to 100% when debugging was enabled slightly above the default. We made those conversions constexpr, switched them to literal types, and dropped the load quite significantly. What has to be mentioned is that constexpr functions can consume more stack space and programs will be larger, so you have to be careful with that. Sometimes it can be worth it when you're fighting for kilobytes.
constexpr is so powerful that if you go far enough in theory you can have your binary contain every possible state that your program can be in (calculated at compile-time) and then just "jump" to that program state based on your inputs that are only known at runtime. Of course, this would mean the binary could grow to an unimaginable size but in theory it would run in a split second. For example, Jason Turner (a big proponent of constexpr in the C++ community) suggested a game engine where all the states of the game are calculated at compile time using constexpr functions and then the whole game would be a sequence of pre-drawn frames that appear on the screen based on user input. And in this theoretical engine the FPS could reach the CPU frequency.
Man, let me tell you, this exact screen at 1:49 is one of the reasons I love this channel. The code neatly on the left, the face not too big, not too small, with rounded soft edges on the right. THIS is quality!
During the era of the Commodore 64, we relied on precalculated paths utilizing sine and cosine functions, which were too slow for real-time usage in demos and games. Consequently, when you observe sprites gracefully traversing the screen, it's all precalculated. Today, you can adopt a similar approach, such as with a website, where you generate as much content as possible beforehand and serve the already existing pages.
Even on a 486 (with hardware FPU support for SIN/COS), which is some order of magnitude faster than the C=64, there was a lot of speed to gain by using lookup tables. But at least we could do LUT + interpolation instead of fully pre-calculated paths :P
This makes me chuckle a bit, thinking about how people used to look up the trig tables in the back of their math book to evaluate something. I grew up thankful I never really had to beyond an exercise, because we have computers to evaluate those for any value for us... only to now realize that in some cases the fastest way for a computer to do so is to make a table ahead of time that it can look up to get the answer (or part of the answer, at least)! 😅
Two things on this: 1. The [[assume(expr)]] attribute in C++ does something similar (it indicates to the compiler that it can assume expr holds). 2. Using C++20's requires clauses for templates, you can achieve a similar effect, but have the code detect how deep it has recursed and switch to a runtime implementation.
Nice! I regard C++20's Concepts more as *declarative* type constraints (so I prefer `constexpr` for this case). In essence, you are suggesting something similar to Rust's procedural macros, if I'm not mistaken.
"Recursion: a sexy computer science concept that's rarely practical in real life" Ain't that the truth. Furthermore, recursive solutions are also typically write-only. Ex colleague of mine came up with a SQL Server recursive algo to write check amounts in words, very clever I thought, although it was left to me to debug it when it crashed the production server one day... it'd eaten up a couple of 100GB of RAM. Turns out he hadn't written it at all, it'd been copy and pasted from Stack Overflow.
Aren't they always? I've been doing this for 10+ years and every single time I think of something clever, somebody has usually already made it (to one degree or another).
You're missing the most important part of recursion: it's FUN! Nah, but seriously, I often solve problems with recursion because it is easier for me to think that way. The benefit of writing something recursively is that, through smart use of parameters, you can easily control the data in every context, that is, in every call of the recursion. I understand, however, that recursive code is more complicated to debug for others who aren't used to thinking recursively. Therefore it is, as always, extremely important to properly test your recursive functions for all possible inputs that can come through the public interface, and never, ever have them read or write data not in the input parameters.
@@NinjaRunningWild No, that's not why. Recursive solutions in fact tend to be much easier to understand and debug than iterative solutions. The problem is that they don't fail gracefully: try to solve too big a problem and you'll get a literal stack overflow, whereas an iterative solution would typically run just fine.
"I bet it's constexpr." If I'm remembering correctly, constexpr had some other very interesting effects. It may be in the book C++ High Performance by Andrist and Sehr.
Last thing I expected: Dave using a Breaking Bad meme to introduce his video... Being able to talk to our compilers directly is one of the reasons I love systems languages! Have you explored Rust very much, Dave? It has some of the most impressive metaprogramming features I've seen - including full on injection into its AST to create custom syntax.
Once I got a task to implement loading a lookup table that was supposed to come from a CSV file. There was an equation, and the argument values were a discrete finite set with even intervals between values. I threw out the CSV file and just implemented a constexpr function that would calculate the lookup table at compile time, put all the values inside a vector, and then just read them. Awesome stuff.
I loved your coverage of this. Many years ago I fell in love with templates, especially meta expressions. The introduction of constexpr just simplified what I was already doing. If you understand templates, then you see the limitations of the C++ language up close and personal, as well as the power. I believe your examples could be done using templates alone; it just won't be as clean. I created template code that doesn't directly know the integral type being used. When an 'int' was 16 bits and a 'long' was 32 bits, and you wanted the minimum of a 32-bit integer, it would result in a 'long' integer. But when a compiler came with a standard library that said an 'int' was 32 bits, that was chosen instead. I really didn't like depending on the standard library to determine the choice, but it's the only way I knew how to do it. Letting the compiler determine something at compile time is a great way to reduce runtime overhead.
I concur! I've wanted to learn C++ for ages. I started to try to grok the low-level details of systems programming years ago as a teen, but modern OS UIs abstract away so much of what is really happening that any program seems like a miracle to me.
@@theograice8080 if you actually want to learn C++, use learncpp. It is a good tutorial that is suitable to teach you about software development in C++ from zero prior knowledge about programming. If you already know some programming languages you can still use it and just skip a few of the first chapters.
Well, that Fibonacci implementation is horrible. A simple iterative one gets you O(n), a smarter one O(log n), and an approximate one (using the closed-form formula for the nth Fibonacci number) runs in O(1). Also, constexpr is not strictly needed: if you run the compiler with optimizations enabled (-O3), it may well do this by itself.
Yes, but you can get O(1) both for runtime and compile time by calculating the value once on the side and putting it in manually in the code. So if it actually never changes, you'll have paid for the computation only once, and not at every compilation.
@@Obscurai It sounds like a cheat only if you concentrate on the drag-race example. The benefits of constexpr were sold really badly in the vid, I think. There are a lot of better examples of great real-world applications in the comments.
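For reference, the iterative O(n) version mentioned in this thread is itself trivially constexpr, so you keep readable code, avoid magic numbers, and still get compile-time evaluation:

```cpp
#include <cstdint>

// Iterative O(n) Fibonacci; works both at compile time and at runtime.
constexpr std::uint64_t fibIter(int n) {
    std::uint64_t prev = 0, curr = 1;
    for (int i = 0; i < n; ++i) {
        std::uint64_t next = prev + curr;
        prev = curr;
        curr = next;
    }
    return prev;
}

static_assert(fibIter(10) == 55);
static_assert(fibIter(35) == 9227465);
```

Unlike the naive recursive form, this compiles in negligible time even for large n, and changing the argument automatically recomputes the constant.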
Constexpr is really amazing from a compiler tech perspective, but it's still hard to think of circumstances where it enables something genuinely new. These examples are effectively using constexpr to bake a lookup table into the exe. It's really cool that this is possible, but you could achieve this in C. It's obviously more tedious. Either an executable as part of your build script that code gens the lookup table as a pre-process if you want it bundled into the executable - or just loading in an external file at runtime with the lookup table. It's still pretty neat. Definitely ergonomic.
If you can throw constexpr in front of a function, you turn it, potentially, into a lookup table without having to find and replace all the code where the function is called.
@@strehlow how do you propose letting each instance of the function know where to look for the lookup table? Genuinely curious. I can only think of static class members being able to automatically access the same memory.
@@mgancarzjr The function definition has the lookup table. The calling code doesn't know or care how the function produces the result. It doesn't need direct access to the lookup table. This is no different than refactoring a function with a more efficient algorithm, or any other optimizations. As long as it still produces the same mapping of input to output, why would the calling code need any changes?
Good to understand and remember. I'm much more used to writing static code than programming the compiler to build static code, but this promises a level of machine-level optimization that I've too often neglected. *Thank you* Dave from @DavesGarage for this valuable lesson. In any case, I would never calculate functions such as Fibonacci or factorial by top-down recursion in production, of course, not even with memoization. That silly approach only serves to demonstrate concepts.
As goofy as it may sound, I dream of a day where the LLMs get so proficient at ASM for the X86 architecture that we can go back and optimize everything and show ourselves we don't need to continually increase chip horsepower.
constexpr in C++20 is amazing; you're also allowed to do dynamic allocations, provided you don't leak the memory out of the constexpr context. For example, you can use std::vector, and as long as you copy the data into an array or similar, it's totally allowed.
If you do some tricky stuff with arena allocators, even that's not necessarily impossible either. You can absolutely have a constexpr function returning pointers into an rodata segment of the compiled binary.
Unless I misunderstand, what you are basically saying is that Program A, a.k.a. the compiler, does the heavy lifting of computing and pre-resolving the sieve, which is pushed into the binary of Program B, a.k.a. the actual competition submission. Although this is a "clever" niche solution, it's no more useful than having the sieve solution injected from a JSON file that was pre-calculated by another program. I am no C++ guru, and I love that I have learned something, but it seems this is no different from compile-time environment-variable logic. As for the sieve, I totally get why it would not be included in the competition. Big fan. Awesome stuff. I should have started with that. Ciao!
I thought the same: if you want fib(35) hardcoded, why bother even using a constexpr "function"? Just put the value in directly. You are just as likely to mistype the functional code as you would the exact number from Google or a calculator. The only upside would be if, at the time of writing the main program, you are unsure which Fibonacci number you want, making it easy to change to, say, fib(22).
@@bretthunter2828 What you're experiencing is called a failure of imagination. I also thought like you, but when I saw what kind of sorcery people came up with, I was shocked. For example, I saw a compile-time regex parser library: if you know regex, it can construct the state machine at compile time, which is an incredible speedup. And it was simple to use!
To give a concrete example of what M-FeD was saying: if you have a string and you need to know which of several possible values it contains, you can switch on the runtime-calculated hash of the string and have case statements with compile-time-calculated hashes of the possible strings to compare against. The alternative would be a ton of if...else if... statements and a bazillion string compares. E.g.:

switch( hash( str )) {
    case hash( "colour" ): break;
    case hash( "size" ):   break;
    // etc...
}

All you need is a couple of hash functions that can hash strings or string literals, and you're good to go.
@@bretthunter2828 No, that creates magic numbers, and maintenance becomes a problem. Imagine that you call the Fibonacci function in different places, but with different values (constants) that do NOT depend on user input. Why on earth would you hardcode the result returned from each call, since the function will be evaluated at compile time anyway? When you change the value of the argument passed to it, the result is naturally recalculated without you doing it manually; remember, nobody deserves code with "magic numbers". I often see constexpr used to build lookup tables based on arguments and/or template parameters; all the memory is stored in .rdata and is totally read-only. Just modify one argument to create a totally different lookup table without doing it by hand.
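A runnable sketch of the hash-switch idiom discussed in this thread, using FNV-1a (the function names are illustrative, and since hash collisions are theoretically possible, a production version should verify the matched string):

```cpp
#include <cstdint>
#include <string_view>

// FNV-1a: constexpr, so it can appear in case labels (compile time)
// and still hash runtime strings for the switch argument.
constexpr std::uint64_t fnv1a(std::string_view s) {
    std::uint64_t h = 14695981039346656037ull;
    for (char c : s) {
        h ^= static_cast<unsigned char>(c);
        h *= 1099511628211ull;
    }
    return h;
}

int classify(std::string_view key) {
    switch (fnv1a(key)) {
        case fnv1a("colour"): return 1;
        case fnv1a("size"):   return 2;
        default:              return 0;
    }
}
```

The case labels are constant expressions, so the compiler folds them to plain integers; only the one hash of the runtime string is computed per call.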
Awesome video. I'm a C99 grognard but had always wondered what this keyword meant. Your presentation is information dense but easy to understand with lots of laughs too. Thanks!
So, I can stop writing code generators for LUTs? Very cool. I was already vaguely aware of how this worked, but seeing a concrete analysis is very helpful. Thanks for sharing.
Very cool. So many small things make a huge difference. It's a shame most people don't actually learn assembly/compilers anymore. I myself am no expert, but trying to look at building a recompiler for emulation purposes really makes things like this important. It's crazy how results differ between compilers like MSVC, gcc, and clang (hint, clang seems to win most of the time on Windows). "The more you know, the more you realize you don't know"
Great video! The most insane thing I have seen was a median filter function done entirely in the preprocessor. The syntax itself looked like voodoo to me. (It was in GLSL, but the concept is the same.)
My favourite video on this topic is "Rich code for tiny computers" where Jason Turner writes very complex C++ functions that generate all sorts of tables for the c64.
Another great thing about constexpr is that you don't need to pass only constant arguments in (unlike non-type template parameters or consteval functions), but if you have even basic optimizations turned on, then when you do pass in constant arguments, the compiler should evaluate the result and substitute it in. As an example, I'm writing a calculator, and I want to use machine integers for speed but switch to GMP when they get too large. To do so, I needed to implement the logic to check whether there would be an overflow/underflow, so I made three functions, willOverflowAddition, willOverflowSubtraction, and willOverflowMultiplication, which are templated constexpr functions over two signed integers. In practice they won't be used in constant expressions, but it's helpful to afford the optimizer any opportunity you can to simplify the code.
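A hedged sketch of what one such predicate could look like (the name mirrors the comment, but the body is my assumption, not the commenter's code):

```cpp
#include <limits>

// True if a + b would overflow or underflow the signed integer type T.
// Written without performing the addition, so no UB is ever invoked.
template <typename T>
constexpr bool willOverflowAddition(T a, T b) {
    if (b > 0) return a > std::numeric_limits<T>::max() - b;
    if (b < 0) return a < std::numeric_limits<T>::min() - b;
    return false;
}

static_assert(willOverflowAddition(std::numeric_limits<int>::max(), 1));
static_assert(!willOverflowAddition(1, 2));
```

Because the body is branch-simple and constexpr, an optimizer can fold calls with constant arguments away entirely, which is exactly the opportunity the comment describes.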
Hi Dave! I work with Qt C++ right now and your videos are helping me make the quality of my work better! Thanks again! Do you think you'll do any more videos on compile-time execution, i.e. template arguments and properly using constexpr in class definitions?
Nice video about an important topic. Thanks Dave! constexpr is really a super power! People have implemented constexpr ray tracers! Just one side note: There is a closed form formula for Fibonacci numbers (no recursion required), which can be evaluated in microseconds as well, without any computational limitations on the number. It is based on linear algebra, eigenvectors and the eigenvalues (1+sqrt(5))/2 and (1-sqrt(5))/2.
In other words, phi and 1-phi, where phi is defined by phi = 1 + 1/phi (or phi^2 = phi + 1) and is known as the golden ratio; 1-phi is known as the golden ratio conjugate (1-phi = -1/phi). phi can be declared as a constant, phi = 1.6180339887498948482045868343656, whereas 1-phi = -0.6180339887498948482045868343656. Due to how floating-point numbers work, it's best to instead define apc = 0.6180339887498948482045868343656 (absolute phi conjugate) and then use 1.0 + apc (instead of phi) and -apc (instead of the conjugate).
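Putting these two comments together, a sketch of the closed-form route (the constants are truncated to double precision; since phi^n outgrows a double's mantissa, this is only exact up to roughly fib(70)):

```cpp
#include <cstdint>

constexpr double kPhi = 1.6180339887498948482;  // golden ratio
constexpr double kRt5 = 2.2360679774997896964;  // sqrt(5)

// Simple compile-time integer power by repeated multiplication.
constexpr double cpow(double base, int exp) {
    double r = 1.0;
    for (int i = 0; i < exp; ++i) r *= base;
    return r;
}

// Binet: fib(n) = round(phi^n / sqrt(5)); the psi^n term is < 0.5 for n >= 0,
// so adding 0.5 and truncating rounds to the exact integer.
constexpr std::uint64_t fibBinet(int n) {
    return static_cast<std::uint64_t>(cpow(kPhi, n) / kRt5 + 0.5);
}

static_assert(fibBinet(10) == 55);
static_assert(fibBinet(35) == 9227465);
```

Constant time instead of recursion, and still usable in constant expressions.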
Exactly my thought. Why doesn't he just hardcode the result of the function? That way we don't even need to bother the compiler. "Hey folks, I optimized the program so it runs 1000000 times faster when you call factorial(1e7). How? I hardcoded the constant with the result."
@@diogeneslaertius3365 Because then you have hardcoded magic values in your code. If that magic value is incorrect how will you know? If it's done with code you can fix errors more easily.
@@AelfricLake Not every possible result, only hardcoded results for specific constant inputs given at compile time. It's of limited usefulness because most inputs won't be known at compile time.
I envy your memory and general cognitive ability, Dave. You are fortunate to have such a beautiful mind. In a world where everything seems to be a race against time, you have seen this field change over the years, and better yet, you were able to keep up with it.
"if constexpr" is super useful for making templated functions that act mostly the same except for a small snippet. For example I've successfully used it for a container that needed both a Peek and a Read function, that did pretty much the same thing with the difference of either advancing an internal variable or not
No, constant expressions are evaluated at compile time. The point is that the expression is constant and thus won't need to be calculated during runtime, as it was already evaluated and baked in during compilation.
Great video! I just wrote a Fibonacci program in C++ as I'm learning, and this will be great to test against your constexpr code! Love your videos, keep them coming.
Using tail recursion massively speeds things up. I wonder if I can paste code:

__int64 Fibtail(__int64 n, __int64 p = 0, __int64 c = 1)
{
    if (n == 0) return p;
    if (n == 1) return c;
    return Fibtail(n - 1, c, p + c);
}
For the sieve prime problem, it shouldn't be necessary to separately calculate the square root of the upper bound before starting the algorithm. A more efficient approach might be: as we're marking off the multiples of primes, keep track of the multiplier in a local variable incremented on each pass, and if the multiplier is less than or equal to the prime itself after we've passed the upper bound, then we know we are at or past the square root of the upper bound and can quit marking off more multiples of primes.

The cost of this approach is two new operations: incrementing the multiplier in the inner loop (while crossing off each multiple of the prime), and a comparison at the end of each prime loop to check whether the multiplier is less than or equal to the prime itself (and if so, break from the outer loop). The second operation substitutes for the comparison against the pre-calculated square root, so it isn't really an additional operation; it is just placed after rather than before. So, unless the upper bound is the square of a prime, we will have to run one additional prime to know we have reached this point, as compared to the pre-calculated square root, which tells us before we run that prime.

The time saved would depend partly on the efficiency of the dedicated square-root algorithm (I'm skeptical of the efficiency of the one shown...) and on the relative cost of the increment tracking the multiplier. My instinct is that since the multiplier is a local variable on the stack, whereas crossing out the prime multiples is an array/memory operation presumably on the heap, the increment will likely be done concurrently by the CPU while waiting for the memory-write operation to complete, and thus the cost will be nonexistent in practice on a modern CPU. I could be wrong... in which case the additional cost of this incrementing would grow O(n^2) with n = the upper bound...
but then what is the complexity of the recursive square root??? Plus there's no chance of optimizing that with concurrent memory operations that need to be done anyways... hmmmmmm?????
I use constexpr with a hash method for strings. That way in a switch/case statement strings can be used as a selector. This makes code more readable, I love that.
Another main use for constexpr is to allow template instantiation with more complicated values. E.g. you can use something like std::array&lt;int, iexp(3, 10)&gt; where iexp(3, 10) computes 3 to the power of 10.
You can also use the entirety of &lt;cmath&gt; (C++23) and allocate at compile time (C++20) to perform arbitrary computations at compile time. Constexpr alone should be reason enough for many C bois to consider using C++ (other compelling features being RAII, templates, and type safety).
@DavesGarage If you draw the recursion tree (as you did in the video), you see that it is not a perfect binary tree since all leaves are not at the same level. This causes the time complexity to be less than 2^n. More accurately, the time it takes to compute f(n) is equal to the time it takes to compute f(n-1) and f(n-2) plus O(1). This is very similar to the recursive formula for the Fibonacci sequence itself! So the runtime is a*((1-sqrt(5))/2)^n + b*((1+sqrt(5))/2)^n for constants a and b. Since (1-sqrt(5))/2 is between -1 and 0, the first term's absolute value gets smaller as you increase n. Therefore, the second term is the dominant term, which gives us O(((1+sqrt(5))/2)^n).
One is the complexity of the computational effort of the recursive implementation (size of a binary tree required for the recursion), the other is the order of magnitude of the Fibonacci numbers themselves, since the closed form solution is based on powers of (1+sqrt(5))/2 and (1-sqrt(5))/2.
Used to program C++ with MFC Windows back in the '90's (Petzold et al). However preferred C (Kernighan & Ritchie) because structured programs often hid issues that were difficult to debug.
Neat! I loved the Breaking Bad intro, and I hadn't heard of constexpr, so I googled when it was added. It was the throat shot that I hadn't heard of this - I must be somewhat ancient, stuck back at the original C++ versions. Yeah, I guess the downside is that the build time is broken due to some intern doing a fib(2^100) in their code. Love the video.
I was once asked in a job interview to write conversion to Roman numerals, and about midway I realized it could be a constexpr and showed it to the interviewer; he was blown away.
I pre upvoted due to this (a) being Dave's Garage which I respect, (b) C++, and (c) constant expressions, which I love. const expressions are easier to understand than some crazy template metaprogramming to generate primes :)
Thoughts on using this: 1) It could be used to precompute the twiddle factors for an FFT if you are working to make the FFT really fast. It is an alternative to writing a program to compute them and then including the text into your C++ source. 2) In a hardware thing I needed to know the best combination of what was loaded into two registers of a device. One register selected between 7 clock sources and the other was the (integer) factor to divide by. The goal was to hit an exact frequency as best I could. I solved this again by writing code that figured it out and spit it out as text to include.
This is cool. And wasn't there when I learned C++ (which I know a bit of, but don't use super often for the kinds of work I do), so... I'd missed the memo! Thanks for sharing!
btw - a couple of notes for the video:
* constexpr was added in C++11 (not C++14)
* the same constexpr function can also be used with runtime variable values - but then of course evaluated at runtime. Same function - 2 uses
* constexpr can not have undefined behavior. UB will be detected at compile time
* C++20 adds consteval - which enforces compile-time-only evaluation
I'm being brought onto a project with an associate professor at my university, he needs me to write a GUI and optimize his python code. I'll be re-writing it in C++, and there are a lot of physics calculations so I'm certain this will be helpful for setting constants needed in those calculations.
1975 or thereabouts I was a student programmer, and we were set a problem to solve recursively. There was some I/O processing required, and I went to the college library to read up on Burroughs Fortran. I discovered that Fortran, on the B5500, supported recursive access to files, and so that's what I did. The lecturer was quite surprised by my approach. On reflection, I think he was likely ND; he was undoubtedly gifted, had a peculiar view on things, and was physically awkward. Not like Chloe Hayden, but similarly lacking full control.
C++20 is even more powerful, as you can allocate memory; the only limit is that this memory can't leave the `constexpr` context. One use I find for `constexpr` is populating "decoding tables" used in parsers, like `if (decoding[nextChar].isChar)` instead of `if ('A' <= nextChar && nextChar <= 'Z' || ...)`.
You at the end of the video: "This is all probably fairly new to you" Me: Yep, this super new technique is amazing! Then I go do some research, look at your github and follow the trail to the post about the constexpr square root function. Aaaaannnnddd it's a post that will turn 10 years old next month?! wtf? How have I never seen this? Forget that it has to be run at compile time and the long-ish time it takes to compile. It's still very very cool. Thanks for this video!
Small addition: constexpr functions do NOT guarantee execution at runtime, they enable it. constexpr functions can be invoked with runtime values for which they will compute the result during runtime. This is why the C++20 standard introduced the "consteval" keyword, which actually forces the compiler to never execute a function during runtime.
Haha what!? I didn't realize this was a Dave's garage video, I thought it was just a short edit. I was like "whoever did the voice dubbing sounds like Dave" 😂
If you're wondering (and on Windows), the main reason this took 38 microseconds was std::endl. It has an implicit os.flush(), which can take a while on most platforms, especially Windows. You can just output a "\n", which shouldn't flush. It won't change the outcome much (though it's loads faster); constexpr is amazing, and this video is great. If you're looking at very time-sensitive things, it's probably better to sample the clock before/after the calculation and then print it all in one big go, rather than measure the print too! Sorry for nitpicking!
Well, in practice the performance improvements could also have been achieved otherwise, but constexpr allowed for much clearer code and fewer magic numbers or handcrafted LUTs, thus fewer errors. However, if you work on a big project you want to keep compile times reasonably small, so you should be careful with constexpr functions that take very long to compile (or have a compile toggle for the expensive ones).
Working for John Deere has taught me one thing: Some of us C++ devs. think about optimizing code for performance; but many other C++ devs. write unmaintainable horribly slow code which only works because modern embedded systems are rather fast compared to 20 years ago. I seriously wish there were more C++ devs. who'd focus on coding quality and architecture, rather than getting some more or less relevant performance improvement which may or may not translate to any measurable outcome.
7:31 for the constexpr time you just measured how long std::cout takes. You should have recorded the time before any printing. 40ish microseconds is actually impressive compared to how slow Console.WriteLine is in C#, especially when it has to scroll the console too. I think the correct value is closer to 0.4ns (assuming 1 instruction at 2GHz). There. Made the code 100,000 times faster on top of the original 1000 😊.
IIRC, on some new fastest VAX back in the day, VMS Pascal generated a divide check when fed a trivial program and a listing was requested with the /SHOW:STATISTICS switch; because it would compile in less than a clock quantum and the Lines Per Second stat divided by zero, LOL. So I feel your pain.
One very important use of constexpr not discussed here is ensuring that code contains no undefined behaviour. A constexpr expression cannot compile if it contains undefined behaviour. Extremely useful for safety critical systems. For another nontrivial example of constexpr see Hana Dusikova’s work on compile time regex.
I don't code or know how to. However, I often find myself opening up code to look for specific values, keywords, true/false settings, etc... I do so especially while I'm troubleshooting mods for games but always end up staying much longer than I need to because it's like a fun puzzle. Fun because I need to learn what I don't know to keep going deeper. I don't know if I should just learn some more math and commit to learning a coding language. Channels like this one would allow for easy and extensive understanding.
The first thing that made me pause and say "hmm" for a second was you defining the base cases for the fibonacci sequence as 0 and 1, when I have always heard and thought of it as 1 and 1 (probably from the rabbits example). No particular insights but it's interesting
second thing making me say hmm is how much I dislike the youtube autocomplete on comments, and how much less present it is on replying to my own comments?
One interesting approach is to use this technique in Python or Node by writing a text file to disk containing precomputed outputs, and later claim O(n) complexity to look up the answer in benchmarks.
Thanks to everyone for the kind words on the intro... I was a little worried about using it, but had fun making it, so went out on a limb!
Entertaining and informative, thank you.
Loved it
I laughed SO HARD when constexpr popped up. I haven't laughed that hard in awhile. Thank you for that.
Omg, my sides 😂, that intro was so good
YouTube kept recommending this to me and I was starting to get annoyed until I finally noticed it was a Dave's Garage video.
😂
I was wondering about that, but sticking my head in the thumbnail seemed weird... maybe I should at first?
@@DavesGarage Yeah, the lack of your mug is why I didn’t immediately click on the video.
@@DavesGaragequick work!
@DavesGarage it's much better than sticking your head in other places.
Please note: constexpr just declares that it is POSSIBLE to calculate at compile time. It does not actually guarantee it unless the context demands it (as a template argument for example). The compiler will do its best but it might just call the function. c++20 added consteval which forces compile time evaluation.
Thank you for pointing that out! I'm sure many are aware of this, yet just as many aren't, and for those who are new to it this is really good information to be aware of.

As a side note, this could also potentially be compiler dependent (implementation defined). I'm not 100% sure on this based on the literature of the standard, so don't quote me on it. However, MSVC vs Clang vs GCC vs Intel vs MinGW, etc., may provide the same results if they support and implement this feature of the language, which by now I'm sure all major compilers do. Yet their methods of doing so can and will vary, which also depends on the various compiler flags being used during compilation. It all depends on how each compiler parses, tokenizes, and analyzes the source code it is given, as well as how the programmer instructs or hints the compiler to do its job.

For simple expressions I would assume that almost all compilers would generate the same assembly and binary results. However, for more complex expressions, the results can vary. Again, this is a really good and underrated comment. It is truly good advice to give!
It is guaranteed if you assign the result to a constexpr variable. That is guaranteed to be calculated at compile time or be a compile error.
@@ColinBroderickMaths Sure, when the assignment operator is involved for a specific type. Now, as for the evaluation of an expression, mileage may vary.
Is this really you, cpt. Obvious?
They should give it some other name, it's confusing. Something like fconstexpr would be better.
We use constexpr a lot for FIR/DCT/DFT (Fourier transforms, wavelets) constant generation; this removes literally thousands of magic float numbers in the code, which are so error prone. Miss a number, flip a digit, mix up digit order. No code reviewer would ever notice if the 17th digit of PI is 3 or 2. The code generation only uses sqrt/sin/cos etc., for which we have constexpr implementations. Initialising a constexpr variable with the result guarantees that the value is never ever calculated at runtime, and that's the real beauty of it.
thanks for sharing a practical example
this is not an optimization, this is meta programming. Your algorithm doesn't become faster because of the constexpr. It runs just as slow, but during compilation.
@@sergeykolesnik1171That was my takeaway
@@sergeykolesnik1171Precomputation absolutely is an optimization. You don't tend to compile as often as you run.
@@sergeykolesnik1171 Algorithms don't have "speed", only complexity. "Fast" algos merely trade time complexity in exchange for space complexity and shit themselves the moment you start to hit RAM limits. I.e. a typical hash table has assumptions about the maximum size of the key storage and upon hitting this limit in best case scenario crashes, in the worst case rears its ugly side effects.
I had a VERILOG class where a question on the exam was: create a component that calculates the factorial of 32 (or some number, I don’t remember). I hardcoded the result and returned the value :) got an A
😂 Doesn't speak highly of grading in classes.
Yeah, typical unclear requirements... It should have been more generic. Something like: given an 8-bit input N, a handshake start/result_rdy interface, clk, reset, and a 31-bit output... calculate N! when start is asserted and then assert result_rdy when the correct value is done.
🤣🤣@@Divine_Evil
don’t ask stupid questions if you don’t want stupid answers
@@pilotashish Actually, there was nothing stupid in this question. The question was to compute a factorial of 32, not to make a function that computes a factorial for any given integer number (32 in particular case), and the program did exactly that.
That opening was fkn hilarious
Thumbnail was pretty good as well.
Agreed!!!
It's from "Better Call Saul" if you didn't know. Good show!
@@DM-qm5scso, Steven Ogg (I think I got “Trevor”’s real name right) had a guest bit?
My wife used to watch Saul, but I only sampled a few episodes.
I think it would be more accurate if we said that the `constexpr` specifier indicates that the expression *may* be evaluated at compile time, in a constant expression context, but not necessarily. And a `constexpr` function specifier may return such a value. If we want to force the function to produce a compile time constant expression (under some constraints), we must use `consteval`.
Why would you ever not want a function that can be evaluated at runtime or compile time? Just because the size of the binary?
@@Malaka1802 I'm not sure I understand your question. Compile-time evaluation is completely different than run-time: you must know every value beforehand. The size of the binary is not the main concern here, the goal is to pre-compute as much as you can before the runtime stage. Think of it as a _kind of_ "memoisation".
@@Malaka1802 Yes-ish, you generally want to compute everything at compile time if you can. Sometimes you just can't know the values until runtime (like user input or data from a file, etc.), or the function exists for its side effects (the point isn't the result, it's what it's doing, like `main()` or `println()`).
@@SforSamPlays Thanks for the reply, so the compiler accepts the expression being constant but checks whether it is altered during runtime and then makes it a variable?
So basically "make it constant if possible" ?
Aside from performance gains, the best use of constexpr is unit testing. First, it's important to note that, unlike inlined internal functions, the constexpr functions themselves _are_ included in the program itself, unless stripped by a later step (such as `--whole-program` in g++), so you can have a program that uses constexpr to automatically return any values below some threshold and fall back to runtime evaluation _of the same code_ if in excess of that value.
This means you can write fairly complicated state machines (first case I saw was a complete 6502 emulator) with a complete set of unit tests using `constexpr`. Then, they'll read the "real" data at runtime. By embedding the unit tests in the main compilation unit, you get the neat property that if it compiles, it's correct (so long as you don't encounter a compiler bug). You can even give it simple integration tests (like 10 steps of Conway's game of life on the virtual 6502).
Nice
Doesn't that just mean that compiling now takes just as long as running the actual test would?
very cool and pretty well explained, thanks!
@@ThePC007That was implied in the latter part of the comment: "if it compiles, it is correct". So instead of running compile && test you just run compile.
@@AnttiBrax But then, what do you gain from doing that? Since unit tests are compiled once, and the run once, there shouldn’t really be much of a performance differnce. I guess you might benefit from your IDE showing the error right inside your code, though, assuming it can actually catch such bugs.
It's also worth mentioning that C23, which ironically should have just been submitted for final publication... today, as a matter of fact... is adding 'constexpr' to C as well.
Nope, just constexpr variables not expression/function. So it's not that useful. Also C doesn't have "if constexpr" like C++ to make compile time code path decisions.
@@aniketbisht2823 Nope what? It would have taken you much less time to go to google and type "c23 constexpr".
@@aniketbisht2823 macro magic can do this in c
Mostly used constexpr to generate lookup tables at compile time. I remember writing a cloth physics simulation that could use a lookup table of size 256 to avoid multiple multiplications and divisions. And as the table is small it will most likely be stored in a cache somewhere so it was really fast. Got an A+ on that assignment. :)
That's a great example of a use for that, feel like I just understood the practicality a lot better now. Thanks!
That's not a simulation then
My favorite application of constexpr is compile time regular expression library, it evaluates your regex at compile time so you don’t have to worry about runtime cost of creating a regex. Also pretty cool one is in the fmt library where it checks whether the format string you provided is valid, btw this library partially made it into c++20 and was mostly completed in c++23 and I love it. Man, people are so creative.
To me, the biggest benefit of constexpr is that it significantly simplifies some stuff that previously would have to be written with template metaprogramming. Thanks to constexpr, just about anything that could be a regular operation on values can now be done in mostly ordinary-looking C++ instead of template stuff, which makes it much more likely to be done at all -- nobody wants to maintain template metaprograms.
Well, I wouldn't necessarily say that. I created my own meta templates before they became part of the standard. It was quite fascinating at the time and a very productive learning experience. The introduction of 'constexpr' was an extension of what some people were doing with templates already. It just allows you to write them in a more easily understood manner.
@@johnshaw6702 It's true, templates were already Turing-complete so there was nothing you couldn't do before that you can now. But template metaprogramming is very hard to teach, and the syntax makes the code rather difficult to understand and debug. It's also purely functional (excepting the stateful template instantations thing which IMO should not be used in serious code), which is a completely different paradigm to normal C++.
So constexpr stuff, while not giving any strictly "new" capabilities, makes the existing capabilities easy enough to use that I can convince my team to use them, for example, and be confident that people will "get it" and continue to get it over time. I don't shy away from template metaprogramming myself and there's some great tricks you can still only do using it, but the more that can be done using familiar syntax and patterns, the better.
Hi Dave, I've been trying constexpr and static constexpr lately, in a calculator for room heating ratings.
Instead of having an array load in at runtime, I made the array static constexpr; lo and behold, the lookups were instantaneous and it was 10x faster than the version that didn't do that.
On a side note, a C++ Guru showed how it works in that case and all the array elements are coded as a lookup table in the code, making it 100x quicker.
Great video as usual, keep up the content. 😃
We had a C++ logging library that ran on embedded hardware, it was used to convert numerical debug level values into strings. There was a heap of functions that converted different log message types into readable strings.
A few developers from the customer side complained that this lib alone could use the CPU to 100% when debugging was enabled slightly above default. We made those conversions constexpr and switched them to literal types and dropped the load quite significantly.
What has to be mentioned is that constexpr functions can consume more stack space and programs will be larger, so you have to be careful with that. Sometimes it can be worth it when you're fighting for kilobytes.
Probably one of my favorite scenes in Better Call Saul. Steven Ogg is awesome. I also love your mashup here with it!
Big shots in that show
You are one of the reasons I started learning C/C++. I am extremely grateful for this channel, I love your content! Programming is just great.
constexpr is so powerful that if you go far enough in theory you can have your binary contain every possible state that your program can be in (calculated at compile-time) and then just "jump" to that program state based on your inputs that are only known at runtime. Of course, this would mean the binary could grow to an unimaginable size but in theory it would run in a split second.
For example, Jason Turner (a big proponent of constexpr in the C++ community) suggested a game engine where all the states of the game are calculated at compile time using constexpr functions and then the whole game would be a sequence of pre-drawn frames that appear on the screen based on user input. And in this theoretical engine the FPS could reach the CPU frequency.
You can hard code the results in any language of course but if you want to enjoy some time off while compiling it's perfect.
constexpr can never be as powerful as a python script that generates some code called in your makefile. It's dirty but it works.
@@attilatorok5767 no
Man, let me tell you, this exact screen 1:49 is one of the reasons I love this Channel. The code neatly on the left the face not too big not too small and rounded soft edges on the right, THIS is quality!
During the era of the Commodore 64, we relied on precalculated paths utilizing sine and cosine functions, which were too slow for real-time usage in demos and games. Consequently, when you observe sprites gracefully traversing the screen, it's all precalculated. Today, you can adopt a similar approach, such as with a website, where you generate as much content as possible beforehand and serve the already existing pages.
Even on a 486 (with hardware FPU support for SIN/COS), which is some order of magnitude faster than the C=64, there was a lot of speed to gain by using lookup tables. But at least we could do LUT + interpolation instead of fully pre-calculated paths :P
This makes me chuckle a bit, thinking about how people used to look up the trig tables in the back of their math book to evaluate something, and I grew up thankful I never really had to beyond an exercise, because we have computers to evaluate those for any value for us... only to know realize that in some cases the fastest way for a computer to do so is sometimes to make a table ahead of time it can look up to get the answer! (or part of the answer at least) 😅
The old way was to use tables, especially logarithm tables combined with slide rules for computations.
two things on this:
1. the [[assume(expr)]] attribute in C++ does something similar (it indicates to the compiler that it can assume expr holds).
2. using C++20's requires for templates, you can achieve a similar effect, but have the code detect how deep it has recursed and switch to a runtime implementation
Nice! I regard C++20's Concepts more as *declarative* type constraints (so I prefer `constexpr` for this case). In essence, you are suggesting something similar to Rust's procedural macros, if I'm not mistaken.
"Recursion: a sexy computer science concept that's rarely practical in real life"
Ain't that the truth. Furthermore, recursive solutions are also typically write-only.
Ex colleague of mine came up with a SQL Server recursive algo to write check amounts in words, very clever I thought, although it was left to me to debug it when it crashed the production server one day... it'd eaten up a couple of 100GB of RAM. Turns out he hadn't written it at all, it'd been copy and pasted from Stack Overflow.
Aren't they always?
I've been doing this for 10+ years and every single time I think of something clever, somebody has usually already made it (to one degree or another).
There's a reason NASA doesn't allow it in their coding though. Error-prone & hard to debug.
You're missing the most important part of recursion: They are FUN!
Nah, but seriously I often solve problems by recursion because it is easier for me to think that way. The benefit when you write something recursively is that you through smart use of parameters can easily control the data in every context, that is in every call in the recursion. I understand, however, that they are more complicated to debug for others who aren't used to thinking recursively. Therefore it is, as always, extremely important to properly test your recursive functions for all possible inputs that can come through the public interface, and never, ever, have them read or write data not in the input parameters.
@@NinjaRunningWild No, that's not why. Recursive solutions in fact tend to be much easier to understand and debug than iterative solutions. The problem is that they don't fail gracefully: try to solve too big a problem and you'll get a literal stack overflow, whereas an iterative solution would typically run just fine.
Recursive solutions generally use up a lot more memory than iterative ones due to each function call adding memory to the stack
"I bet it's constexpr."
If I'm remembering correctly, constexpr had some other very interesting effects. It may be in the book C++ High Performance by Andrist and Sehr.
there's constexpr's younger brother too, consteval.
Last thing I expected: Dave using a Breaking Bad meme to introduce his video...
Being able to talk to our compilers directly is one of the reasons I love systems languages! Have you explored Rust very much, Dave? It has some of the most impressive metaprogramming features I've seen - including full on injection into its AST to create custom syntax.
Once I got a task to implement loading some lookup table that was supposed to be loaded from a CSV file. There was an equation, and the argument values were a discrete finite set with even intervals between values. I threw out the CSV file and just implemented a constexpr function that would calculate the lookup table at compile time, put all the values inside a vector, and then just read them. Awesome stuff.
I loved your coverage of this. Many years ago I fell in love with templates, especially meta expressions. The introduction of 'constexpr' just simplified what I already was doing. If you understand templates, then you see the limitations of the C++ language up close and personal, as well as the power. I believe your examples could be done using templates alone, it just wont be as clean.
I created template code that doesn't directly know the integral type being used. When an 'int' was 16 bits and a 'long' was 32 bits, and you wanted the minimum 32-bit integer, it would result in a 'long' integer. But when a compiler came with a standard library that said an 'int' was 32 bits, that was chosen instead. I really didn't like depending on the standard library to determine the choice, but it's the only way I knew how to do it. Letting the compiler determine something at compile time is a great way to reduce runtime overhead.
If you made a c++ tutorial I would buy it.
I concur! I've wanted to learn C++ for ages. I started to try to grok the low-level details of systems programming years ago as a teen, but modern OS UIs abstract away so much of what is really happening that any program seems like a miracle to me.
@@theograice8080Try game programming & write your own engine in DirectX, OpenGL, or SDL. You'll learn all you need to know along the way.
@@theograice8080, get the book Beginning C++23 by Ivor Horton and start there.
@@theograice8080 if you actually want to learn C++, use learncpp. It is a good tutorial that is suitable to teach you about software development in C++ from zero prior knowledge about programming. If you already know some programming languages you can still use it and just skip a few of the first chapters.
So the compile time becomes O(2^n)?
Only if your compile environment has the requisite resources 😂
Yep, it's a cheat. Trading compile time for run time.
Well, that Fibonacci implementation is horrible. A simple iterative one gets you O(n), a smarter one O(log n), and an approximate one (using the closed-form formula for the nth Fibonacci number) runs in O(1). Also, constexpr is not really needed; if you run the compiler with optimizations enabled (-O3) it will probably do this by itself.
Yes, but you can get O(1) both for runtime and compile time by calculating the value once on the side and putting it in manually in the code. So if it actually never changes, you'll have paid for the computation only once, and not at every compilation.
@@Obscurai It sounds like a cheat only if you concentrate on the drag-race example. The benefits of constexpr were sold really badly in the vid, I think. There are a lot of better examples of great real-world applications in the comments.
Constexpr is really amazing from a compiler tech perspective, but it's still hard to think of circumstances where it enables something genuinely new. These examples are effectively using constexpr to bake a lookup table into the exe. It's really cool that this is possible, but you could achieve this in C. It's obviously more tedious. Either an executable as part of your build script that code gens the lookup table as a pre-process if you want it bundled into the executable - or just loading in an external file at runtime with the lookup table. It's still pretty neat. Definitely ergonomic.
constexpr is very useful for code obfuscation, string encryption etc. enjoy doing that in C
If you can throw constexpr in front of a function, you turn it, potentially, into a lookup table without having to find and replace all the code where the function is called.
@@mgancarzjr or just re-code the function to do the lookup rather than compute the result. The calling code wouldn't need any changes.
@@strehlow how do you propose letting each instance of the function know where to look for the lookup table? Genuinely curious. I can only think of static class members being able to automatically access the same memory.
@@mgancarzjr The function definition has the lookup table. The calling code doesn't know or care how the function produces the result. It doesn't need direct access to the lookup table.
This is no different than refactoring a function with a more efficient algorithm, or any other optimizations. As long as it still produces the same mapping of input to output, why would the calling code need any changes?
Good to understand and remember.
I'm much more used to writing static code than programming the final compiler to build static code, but this promises a level of machine-level optimization that I've too often neglected.
*Thank you* Dave from @DavesGarage for this valuable lesson.
In any case, I would never calculate functions such as Fibonacci or factorial by top-down recursion in production, of course, not even with memoization. This silly approach only serves to demonstrate concepts.
As goofy as it may sound, I dream of a day where the LLMs get so proficient at ASM for the X86 architecture that we can go back and optimize everything and show ourselves we don't need to continually increase chip horsepower.
That opening.....man this is just one of the great bonuses to this channel! Here to learn and got a great laugh as a bonus right off the bat!
Constexpr in C++20 is amazing; you're also allowed to do dynamic allocations provided you don't leak the memory out of the constexpr context.
For example, you can use std::vector, and so long as you copy the data into an array or similar, it's totally allowed.
If you do some tricky stuff with arena allocators, even that's not necessarily impossible either. You can absolutely have a constexpr function returning pointers into an rodata segment of the compiled binary.
Unless I misunderstand, what you are basically saying is that Program A, aka the compiler, does the heavy lifting of computing and pre-resolving the sieve, which is pushed into the binaries of Program B, aka the actual competition submission. Although this is a "clever" niche solution, it's no more useful than having the sieve solution injected from a JSON file that was pre-calculated by another program beforehand.
I am no C++ guru, and I love that I have learned something, but it seems this is no different from a compiler's environment-variable logic.
As for the sieve, I totally get why it would not be included in the competition.
Big fan. Awesome stuff. I should have started with that.
Ciao!
I thought the same - if you want fib(35) hard coded, why bother even using a constexpr "function"? Just put the value in directly - you are just as likely to mistype the functional code as you are to mistype the exact number from Google or a calculator. The only upside would be if you are unsure, at the time of writing the main program, which Fibonacci number you want - and hence it's easy to change to, say, fib(22).
@@bretthunter2828 It's a lot easier to review code for errors than a magic number.
@@bretthunter2828 What you're experiencing is what's called a failure of imagination. I also thought like you, but when I saw what kind of sorcery people came up with, I was shocked. For example, I saw a compile-time regex parser library; if you know regex, it can construct the state machine at compile time, which is an incredible speed-up. And it was simple to use!
To give a concrete example of what M-FeD was saying: if you have a string and you need to know which of several possible values it contains, you can switch on the runtime-calculated hash of the string and have case statements with compile-time-calculated hashes of the possible strings to compare against. The alternative would be a ton of if...else if... statements and a bazillion string compares.
E.g.:
switch( hash( str ))
{
case hash( "colour" ):
break;
case hash( "size" ):
break;
//etc...
}
All you need is a couple of hash functions that can hash strings or string literals, and you're good to go.
@@bretthunter2828 No, it creates magic numbers and maintenance becomes a problem.
Imagine that you call the fibonacci function in different places, but with different values (constants) that do NOT depend on user input.
Why on earth would you hardcode the result returned from each function call, since the function will be evaluated at compile-time?
When you change the value of the argument passed to it, the value is naturally recalculated without the need for you to do it manually, remember, nobody deserves code with “magic numbers”.
I often see constexpr used to build lookup tables based on arguments and/or templates; all the memory is stored in .rdata and is totally read-only. Just modify one argument to create a totally different lookup table without the need to do it manually.
Awesome video. I'm a C99 grognard but had always wondered what this keyword meant. Your presentation is information dense but easy to understand with lots of laughs too. Thanks!
I went from learning about tidbits of early windows programs to learning about relevant code features. Thank you sir!
Great video Dave! Would love to see more videos about algorithms
Wait, so I can make my hello world 1000x faster?
print(“hi”)
@@feel65printf("hi"); 😡💢
@@HassanIQ777 std::println("hi"); (≧▽≦)
@@HassanIQ777 write(STDOUT_FILENO, "hi", 3); 🤬
So, I can stop writing code generators for LUTs? Very cool. I was already vaguely aware of how this worked, but seeing a concrete analysis is very helpful. Thanks for sharing.
Very cool. So many small things make a huge difference. It's a shame most people don't actually learn assembly/compilers anymore. I myself am no expert, but trying to look at building a recompiler for emulation purposes really makes things like this important. It's crazy how results differ between compilers like MSVC, gcc, and clang (hint, clang seems to win most of the time on Windows). "The more you know, the more you realize you don't know"
Great video! The most insane thing I have seen was a median filter function done in the pre-compiler. The syntax itself looked like voodoo to me. (it was however in GLSL, but the concept is the same)
My favourite video on this topic is "Rich code for tiny computers" where Jason Turner writes very complex C++ functions that generate all sorts of tables for the c64.
This is awesome Dave thanks, I'm a c# software engineer, and I am learning c++ atm. This is very good timing.
Another great thing about constexpr is that you don't need to pass only constant arguments in (unlike non-type template parameters or consteval functions), but if you have even basic optimizations turned on, when you do pass in constant arguments, the compiler should evaluate the result and substitute it in.
As an example, I'm writing a calculator, but I want to be able to use machine integers for speed, but switch to GMP when they get too large. To do so, I needed to implement the logic to check if there would be an overflow/underflow, so I made 3 functions, willOverflowAddition, willOverflowSubtraction, and willOverflowMultiplication, which are template constexpr functions for two signed integers. In practice they won't be being used in constant expressions, but for the optimizer it's helpful to afford it any opportunities to simplify the code you can.
Hi Dave! I work with Qt C++ right now and your videos are helping me make the quality of my work better! Thanks again! Do you think you'll do any more videos on compile-time execution, i.e. template arguments and properly using constexpr in class definitions?
I am not coding anymore, but this is something I would have tried during my college days for coding competitions. Good content 🎉
The starting scene was magnetic! It taught so much by baiting us all.
Nice video about an important topic. Thanks Dave! constexpr is really a super power! People have implemented constexpr ray tracers! Just one side note: There is a closed form formula for Fibonacci numbers (no recursion required), which can be evaluated in microseconds as well, without any computational limitations on the number. It is based on linear algebra, eigenvectors and the eigenvalues (1+sqrt(5))/2 and (1-sqrt(5))/2.
in other words, phi and 1-phi
where phi is defined as phi = 1+1/phi (or phi^2 = phi+1) and is known as golden ratio
and 1-phi is known as a golden ratio conjugate (1-phi = -1/phi)
phi can be declared as a constant phi = 1.6180339887498948482045868343656
whereas 1-phi = -0.6180339887498948482045868343656
so due to how floating point numbers work, it's best to instead define
apc = 0.6180339887498948482045868343656 (absolute phi conjugate)
and then apply that by doing
1.0+apc (instead of phi) and -apc (instead of conjugate)
Most of this went way over my head!
Thanks, Dave.
I look forward to a video on the ct_sqrt function mentioned.
This is just shifting the algorithm runtime from program run time to compile time.
Exactly my thought. Why doesn't he just hard-code the result of the function, this way we don't even need to bother the compiler.
"Hey folks, I optimized the program so it runs 1000000 times faster when you call factorial(1e7). How? I hard-coded the constant with the result."
@@diogeneslaertius3365 Because then you have hardcoded magic values in your code. If that magic value is incorrect how will you know? If it's done with code you can fix errors more easily.
Yeah in a complex program you'll just hold every possible result in memory? What...
@@AelfricLake Not every possible result, only hardcoded results for specific constant inputs given at compile time. It's of limited usefulness because most inputs won't be known at compile time.
Sure, but it's still a useful tool no?
i envy your memory and general cognitive ability Dave. You are a fortunate person to have such a beautiful mind. in a world where everything seems to be a race against time, you have seen this over the years and better yet you were able to keep up with it.
You can take the Dave out of Microsoft, but you cant take the Developer Developer Developer out of Dave. Great Job on modern C++ !
"if constexpr" is super useful for making templated functions that act mostly the same except for a small snippet. For example I've successfully used it for a container that needed both a Peek and a Read function, that did pretty much the same thing with the difference of either advancing an internal variable or not
So constexpr just compiles my code before I hit the compile button? noice!
Thanks for this banger
no
constant expressions are evaluated at compile time
the point is that the expression is constant and thus won't need to be calculated during runtime as it was already evaluated and baked in during compilation
Great video! I just wrote a Fibonacci program in C++ as I am learning, and this will be great to test against your constexpr code. Love your videos, keep them coming!
Using tail recursion massively speeds things up. I wonder if I can paste code
__int64 Fibtail(__int64 n, __int64 p = 0, __int64 c =1)
{
if (n == 0)
return p;
else if (n == 1)
return c;
else
return Fibtail(n - 1, c, p+c);
}
For the sieve prime problem, it shouldn't be necessary to separately calculate the square root of the upper bound before starting the algorithm. It seems like a more efficient algorithm might be ... As we're marking off the multiples of primes, keep track of the multiplier in a local variable incremented at each pass, and if the multiplier is less than or equal to prime itself after we've passed the upper bound, then we know we are at/past the square root of the upper bound and can call it quits on marking off more multiples of primes.
The cost of this approach is two new operations -- incrementing the multiplier in the inner loop (while crossing off each multiple of the prime) and a comparison at the end of each prime loop to check whether this multiplier is less than or equal to the prime itself (and if so, then break from the outer loop). The second operation is a substitute for comparison against the pre-calculated square root, so isn't really an additional operation, it is just placed after rather than before. So, unless the upper bound is the square of a prime, we will have to run one additional prime to know we have reached this point, as compared to the pre-calculated square root which will tell us before we run that prime.
The time-saving of this approach would depend partially on the efficiency of the dedicated square root algorithm (I'm skeptical of the efficiency of the one shown...) and on the relative cost of the increment operation that's tracking the multiplier. My instinct is that since the multiplier is a local variable on the stack, whereas crossing out the prime multiples is an array/memory operation, it's likely the increment will be done concurrently by the CPU while waiting for the memory-write operation to complete, and thus the cost will be nonexistent in practice on a modern CPU. I could be wrong... in which case the additional cost of this incrementing would grow O(n^2) with n = the upper bound... but then what is the complexity of the recursive square root??? Plus there's no chance of optimizing that with concurrent memory operations that need to be done anyway... hmmmmmm?????
In a sense, it's a small step away from just working out the answer in advance, and then just outputting the answer at "Run" (sic) time.
I can't believe i just watched trevor try to get finger to write him some C. truly one of the moments of all time
I use constexpr with a hash method for strings. That way in a switch/case statement strings can be used as a selector. This makes code more readable, I love that.
Another main use for constexpr is to allow template instantiation with more complicated values. E.g. you can use something like std::array<int, iexp(3, 10)>, where iexp(3, 10) computes 3 to the power of 10 at compile time.
You can also use the entirety of (C++23) and allocate at compile time (C++20) to perform arbitrary computations at compile time.
Constexpr alone should be reason enough for many C bois to consider using C++ (other compelling features being RAII, templates, and type safety).
The coolest compile time practical code I've seen in use in the wild is Zig's regex compiler
Great video. The time complexity of the naïve Fibonacci implementation is actually O(((1 + sqrt(5)) / 2)^n), which is approx. O(1.62^n).
Every reference I could find said 2^n and ChatGPT did as well, so you'd have to explain that one in more detail!
@DavesGarage If you draw the recursion tree (as you did in the video), you see that it is not a perfect binary tree since all leaves are not at the same level. This causes the time complexity to be less than 2^n.
More accurately, the time it takes to compute f(n) is equal to the time it takes to compute f(n-1) and f(n-2) plus O(1). This is very similar to the recursive formula for the Fibonacci sequence itself! So the runtime has the following value for constants a and b:
a(((1-sqrt(5))/2)^n) + b(((1+sqrt(5))/2)^n)
(1-sqrt(5))/2 is between -1 and 0, so the first term's absolute value gets smaller as you increase n. Therefore, the second term is the dominant term, which gives us O(((1+sqrt(5))/2)^n).
One is the complexity of the computational effort of the recursive implementation (size of a binary tree required for the recursion), the other is the order of magnitude of the Fibonacci numbers themselves, since the closed form solution is based on powers of (1+sqrt(5))/2 and (1-sqrt(5))/2.
You're awesome man! Like C++ itself!
Used to program C++ with MFC Windows back in the '90's (Petzold et al). However preferred C (Kernighan & Ritchie) because structured programs often hid issues that were difficult to debug.
Neat! I loved the breaking bad intro and hadn’t heard of constexpr, so googled when it was the throat shot
I hadn’t heard of this - I must be somewhat ancient, stuck back at the original C++ versions.
Yeah, I guess the downside is that the build gets broken by some intern doing a fib(2^100) in their code.
Love the video
Lol, my first intuition was something like:
arr[] = {0,1};
for(i
A practical use-case: My ui engine uses constexpr crc32 to make string comparison of keywords in the css evaluator a LOT faster.
I was once asked in a job interview to write a conversion to roman numerals, and about midway through I realized it could be constexpr and showed it to the interviewer; he was blown away.
I never thought a C++ joke(!) would be so gd funny! Gonna re-read Petzold for the gags. :)
I pre upvoted due to this (a) being Dave's Garage which I respect, (b) C++, and (c) constant expressions, which I love. const expressions are easier to understand than some crazy template metaprogramming to generate primes :)
Thoughts on using this:
1) It could be used to precompute the twiddle factors for an FFT if you are working to make the FFT really fast. It is an alternative to writing a program to compute them and then including the text into your C++ source.
2) In a hardware thing I needed to know the best combination of values to load into two registers of a device. One register could select between 7 clock sources, and the other was the (integer) factor to divide by. The goal was to hit an exact frequency as closely as I could. I solved this, again, by writing code that figured it out and spat it out as text to include.
Instructions Unclear; was playing Tool - Lateralus at 105 dB on repeat.
This is cool. And wasn't there when I learned C++ (which I know a bit of, but don't use super often for the kinds of work I do), so... I'd missed the memo! Thanks for sharing!
btw - a couple of notes for the video:
* constexpr was added in C++11 (not C++14)
* the same constexpr function can also be used with runtime variable values - but then of course evaluated at runtime. Same function - 2 uses
* constexpr evaluation cannot contain undefined behavior; UB is detected at compile time
* C++20 adds consteval - which enforces compile time only evaluation
I'm being brought onto a project with an associate professor at my university, he needs me to write a GUI and optimize his python code. I'll be re-writing it in C++, and there are a lot of physics calculations so I'm certain this will be helpful for setting constants needed in those calculations.
1975 or thereabouts I was a student programmer, and we were set a problem to solve recursively. There was some I/O processing required, and I went to the college library to read up on Burroughs Fortran. I discovered that Fortran, on the B5500, supported recursive access to files, and so that's what I did.
The lecturer was quite surprised by my approach.
On reflection, I think he was likely ND, he was undoubtedly gifted, had a peculiar view on things, and was physically awkward. Not like Chloe Hayden, but similarly lacking full control.
Kinda funny you mention constexpr; a colleague used it not two weeks ago, and as soon as he mentioned it, I knew exactly what he did.
C++20 is even more powerful, as you can allocate memory; the only limit is that this memory can't leave the `constexpr` context.
One use I find for `constexpr` is for populating "decoding tables" used in parsers like `if (decoding[nextChar].isChar)` instead of `if ('A'
You at the end of the video: "This is all probably fairly new to you"
Me: Yep this super new technique is amazing!
Then I went and did some research, looked at your GitHub, and followed the trail to the post about the constexpr square root function. Aaaaannnnddd it's a post that will turn 10 years old next month?! wtf? How have I never seen this? Never mind that it has to be run at compile time, and the long-ish time it takes to compile. It's still very, very cool.
Thanks for this video!
Can you please bring back the outro with the chairs and the couch?
Small addition: constexpr functions do NOT guarantee execution at compile time, they merely enable it. constexpr functions can be invoked with runtime values, for which they will compute the result at runtime. This is why the C++20 standard introduced the "consteval" keyword, which actually forces the compiler to always evaluate a function at compile time.
Haha what!? I didn't realize this was a Dave's garage video, I thought it was just a short edit. I was like "whoever did the voice dubbing sounds like Dave" 😂
If you're wondering (and on Windows), the main reason this took 38 microseconds was std::endl. It has an implicit os.flush(), which can take a while on most platforms, especially Windows. You can just output a "\n" instead, which shouldn't flush. It won't change the outcome much (though it's loads faster); constexpr is amazing, and this video is great. If you're looking at very time-sensitive things, it's probably better to sample the clock before/after the calculation and then print it all in one big go, rather than measure the print too! Sorry for nitpicking!
Hahaha love the intro, and the video. Will definitely share it with my colleagues!
Well, in practice the performance improvements could also have been achieved otherwise, but constexpr allows for much clearer code and fewer magic numbers or handcrafted LUTs, and thus fewer errors.
However, if you work on a big project you want to keep compile times reasonably small, so be careful with constexpr functions that take very long to compile (or have a compile toggle to bypass the expensive ones).
Working for John Deere has taught me one thing: Some of us C++ devs. think about optimizing code for performance; but many other C++ devs. write unmaintainable horribly slow code which only works because modern embedded systems are rather fast compared to 20 years ago. I seriously wish there were more C++ devs. who'd focus on coding quality and architecture, rather than getting some more or less relevant performance improvement which may or may not translate to any measurable outcome.
I love this channel, I am always learning amazing things
7:31 For the constexpr time you just measured how long std::cout takes. You should have recorded the time before any printing. 40-ish microseconds is actually impressive compared to how slow Console.WriteLine is in C#, especially when it has to scroll the console too.
I think the correct value is closer to 0.4ns (assuming 1 instruction at 2GHz). There. Made the code 100,000 times faster on top of the original 1000 😊.
Beginning to doubt my own sanity, because I understood MOST of that.
This is cool! Or another way to think of it is that constexpr is more or less converting your code into an executable script.
yeah it's like evaluating the code, then pasting the output back as constants lol
IIRC, on some new fastest VAX back in the day, VMS Pascal generated a divide check when fed a trivial program and a listing was requested with the /SHOW:STATISTICS switch; because it would compile in less than a clock quantum and the Lines Per Second stat divided by zero, LOL. So I feel your pain.
I guess the limitation is that the input must also be something that can be determined deterministically beforehand.
It is intimidating how well you understand code.
One very important use of constexpr not discussed here is ensuring that code contains no undefined behaviour. A constexpr expression cannot compile if it contains undefined behaviour. Extremely useful for safety critical systems.
For another nontrivial example of constexpr see Hana Dusikova’s work on compile time regex.
I don't code or know how to.
However, I often find myself opening up code to look for specific values, keywords, true/false settings, etc...
I do so especially while I'm troubleshooting mods for games but always end up staying much longer than I need to because it's like a fun puzzle. Fun because I need to learn what I don't know to keep going deeper.
I don't know if I should just learn some more math and commit to learning a coding language. Channels like this one would allow for easy and extensive understanding.
The first thing that made me pause and say "hmm" for a second was you defining the base cases for the Fibonacci sequence as 0 and 1, when I have always heard of and thought of it as 1 and 1 (probably from the rabbits example). No particular insights, but it's interesting.
second thing making me say hmm is how much I dislike the youtube autocomplete on comments, and how much less present it is on replying to my own comments?
5:30
Actually... it is Theta(golden ratio ^ n)... but if it is that... then it is also bounded above by 2^n... so yes.
One interesting approach is to use this technique in Python or Node by writing a text file to disk containing precomputed outputs, and later claim O(n) complexity to look up the answer in benchmarks.