C++ vs Rust: which is faster?

  • Published: 22 Dec 2024

Comments • 959

  • @fasterthanlime
    @fasterthanlime 1 year ago +301

    Rust & C++ code is here: gist.github.com/fasterthanlime/b2e261c3d1492171d6a46edf620a0728
    If someone's watching this going "I /know/ I can beat that by tweaking the C++ code", you should do it! I think there's a lot of good that can happen with some friendly competition between the C++ and Rust camps, as long as we both treat it from a "huh, neat that we can get compilers to do that!" perspective and not just shout at each other from opposite sides of a virtual stadium.
    I'll try to make a full write-up that goes a little deeper (maybe comparing emitted IR between clang and rustc), but first I have more research to do.

    • @larikkin
      @larikkin 1 year ago +10

      You might want to pin this comment, so it doesn't get lost. At least until someone makes an optimized C++ version.

    • @decky1990
      @decky1990 1 year ago +18

      Definitely need to get someone who knows relatively-modern C++ to refactor this more idiomatically (might even try it myself). So many macros and C-style code.
      And immediately dereferencing begin() is not undefined behaviour, it’s programmer error - the programme did exactly what it was told to do.

    • @tsunekakou1275
      @tsunekakou1275 1 year ago +21

      @@decky1990 I don't consider anyone who writes using namespace std; a C++ programmer, because if they had access to any decent C++ teaching materials they would have known that using namespace std; is REALLY REALLY lame. If they have heard they shouldn't use using namespace std; and keep using it, then they aren't serious enough and are treating C++ as a toy language for lame competitions. Is this gatekeeping? Yes. But is it unreasonable to expect this? No. Maybe because there are so many bad YouTube shorts and TikTok videos about C++ out there, it has become the norm. I couldn't believe that @fasterthanlime thinks this is how people write C++ and uses this kind of code to compare with Rust (5:00 section).

    • @cmxpotato
      @cmxpotato 1 year ago +15

      There's also a lot of unnecessary by-value copying in the C++ implementation. Quick examples I've noticed are in the for-each loop and when populating an object and then passing it to a vector's push_back(). Though I'm not sure how much of that impacts performance.
      There's performance left on the table from using cout and endl, since those two are notorious for slow IO without some configuring.

    • @decky1990
      @decky1990 1 year ago +5

      @@tsunekakou1275 aye, the namespace directive is a bit of a faux pas. I’ve heard Stroustrup kills a kitten every time someone uses it. I suppose this was my cry for more idiomatic writing. I don’t want to beat the guy up for posting interesting content, but maybe get someone who specialises in the language instead of Jack-of-all-trading them, such that they’re all average implementations. Would be a really interesting comparison.

  • @orbital1337
    @orbital1337 1 year ago +1240

    As soon as I saw the C++ solution I knew that it wasn't written by someone who knows C++ all that well so I decided to attempt my own solution. My solution ended up being around 4000x faster. C++ and Rust are close enough that the deciding factor for any non-trivial task is almost always going to be the algorithms used, not the language. So study your algorithms, folks. :P

    • @arthurpenndragon6434
      @arthurpenndragon6434 1 year ago +143

      truly insane how many orders of magnitude are hidden to be optimized even in millisecond programs.

    • @crackasaurus_rox9740
      @crackasaurus_rox9740 1 year ago +69

      ​@@arthurpenndragon6434Once you get used to looking at the disassembly, it's pretty easy to make 3 order improvements consistently, on other people's software anyway...

    • @SuperSpeed52
      @SuperSpeed52 1 year ago +12

      Algorithms rule supreme

    • @bluedark7724
      @bluedark7724 1 year ago +18

      Fair. What you are is a specialist. The average C++ programmer wouldn't go into your level of detail... but I must ask, are you still using C++ if you are optimising the compiler?

    • @SonicMastr500s
      @SonicMastr500s 1 year ago

      @@bluedark7724 Knowing algorithms has nothing to do with optimizing the compiler

  • @willsterjohnson
    @willsterjohnson 1 year ago +1282

    I know **just** enough low level computer magic to understand this, but man was it work. This was very well presented, I've seen much simpler things explained with far less clarity.

    • @fasterthanlime
      @fasterthanlime 1 year ago +230

      This is perhaps the best compliment I've received all year. I know the year is young, but still! I'm really happy that the video is understandable at some level - it's really hard to strike a good balance between "obvious to a lot of regulars" and "desperately impenetrable".

    • @spoofilybeloved6729
      @spoofilybeloved6729 1 year ago +30

      @@fasterthanlime a big part is your delivery, which takes you through the emotional journey (like saying “let’s not talk about that, and hopefully never again” about an especially complicated topic) and makes it a lot more engaging than other comparatively similar videos.

    • @kamee211
      @kamee211 1 year ago +2

      I don't understand it, my head has gone blank 😭 Do I need to study more? Can you tell me where I can learn about it?

    • @willsterjohnson
      @willsterjohnson 1 year ago +5

      @@spoofilybeloved6729 absolutely, their approach is much more like a teacher who wants to ensure you can follow along and feel included. A lot of presenters will pride themselves on knowing something and make you feel small for not understanding.

    • @wickeddubz
      @wickeddubz 1 year ago +1

      I failed to understand the SIMD section, but I understood the general principle and the difference between the 2 cases. It means that your explanation is really great. And I’m not even a developer or a programmer.

  • @SWinxyTheCat
    @SWinxyTheCat 1 year ago +144

    The conclusions that compiler engineers need hugs and cats need petting are both true.

    • @fasterthanlime
      @fasterthanlime 1 year ago +27

      Sometimes it's even the other way around! ...if you know each other well, of course.

  • @vanweapon
    @vanweapon 1 year ago +386

    "This has been an ongoing fight for years according to sources who are very tired"
    I felt this in my soul

    • @ChinCo1
      @ChinCo1 1 year ago +2

      I remembered Go.

  • @iilugs
    @iilugs 1 year ago +294

    This video is really great! I also did Advent of Code with Rust and it really helped.
    It's really refreshing to listen about Assembly in a clear, non-scary way.

    • @fasterthanlime
      @fasterthanlime 1 year ago +43

      I'm glad you felt that way! As I rewatched along during the premiere, I was convinced everyone would drop off as soon as the assembly section started, but maybe it's because I've heard those sections a hundred times while editing already!

    • @andrewdunbar828
      @andrewdunbar828 1 year ago +14

      @@fasterthanlime No way! More assembly plz

    • @tomfahey2823
      @tomfahey2823 1 year ago

      ​@@fasterthanlimeThe graphics were excellent, I doubt I would have followed along otherwise.

  • @MKUSQ
    @MKUSQ 1 year ago +885

    Love the video! Absolute madness at the end. LLVM is probably such a mess at this point because of the optimization priorities that it takes a team of compiler engineers a couple of days to even trace a problem lol, but they are the real heroes

    • @jaysistar2711
      @jaysistar2711 1 year ago +46

      The Static Single Assignment (SSA) form that LLVM IR uses simplifies dataflow dramatically. In Rust, it's a bit more SSA-like due to borrow checker rules limiting aliasing, while C++ aliases at will, which can cause some optimizers to be disabled since the dataflow (and whether it could have changed) is unprovable at compile time.
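
The aliasing point can be illustrated with a minimal sketch. The function name `add_twice` is hypothetical and not taken from the video's code. Because a `&mut u64` and a `&u64` cannot refer to the same value in safe Rust, the compiler is allowed to assume `*step` does not change between the two additions. Whether a given rustc/LLVM version actually exploits this is not something the snippet shows.

```rust
/// Adds `*step` to `*total` twice.
///
/// `total` is an exclusive borrow, so it cannot alias `step`. The optimizer
/// may therefore read `*step` once and reuse it (an assumption about what
/// the backend is permitted to do, not a guarantee about emitted code).
fn add_twice(total: &mut u64, step: &u64) {
    *total += *step;
    *total += *step;
}
```

An equivalent C++ function taking two pointers or references has to allow for both arguments naming the same object unless the programmer rules it out by other means.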

    • @andrewdunbar828
      @andrewdunbar828 1 year ago +21

      This is the kind of thing that every optimizing compiler has to deal with.

    • @darshanbhat9457
      @darshanbhat9457 1 year ago +1

      Ok sir, tell me a non mess compiler infra then

    • @user-wu3vd7dd2r
      @user-wu3vd7dd2r 1 year ago +6

      Maybe you doubt that I, not a Rust expert, can write code slower than the C++ one? I can, trust me. Another question is the proficiency level of whoever wrote the C++ solutions that were ported.

    • @antonliakhovitch8306
      @antonliakhovitch8306 1 year ago +10

      ​@@darshanbhat9457I think you misunderstood. They aren't dissing on LLVM at all; it's a mess by necessity

  • @BarronKane
    @BarronKane 1 year ago +96

    I spent weeks of my life shoving SIMD into an Unreal Engine module to leverage new CPU architectures, and you managed to sum up basically 3 weeks of pain into 20 minutes, and I still understood it better than all the formal sources I bled my eyes at. Compiler engineers deserve all of the love in the world and they are criminally underappreciated.

  • @FunkyMoneyMan
    @FunkyMoneyMan 1 year ago +402

    I do think the difference in knowledge of the two languages, and in what counts as “easier”, is interesting. I know C++ very well, I would say, and I’ve never even touched Rust, but he went over the operator overload functions and mentioned how C++ was difficult and Rust was “just implementing a trait like any other”. This made me audibly laugh, since for me C++ overloading is super easy and clean; when I saw the Rust code at 5:56 it looked insanely bloated and had characters that seemed absolutely useless.
    I’m not trying to bash Rust or Faster's knowledge, just pointing out how coding in a language with little knowledge of it creates MASSIVE bias and situations like this, where something is “simple” just because you know it.

    • @RottenFishbone
      @RottenFishbone 1 year ago +63

      I completely agree with what you're saying, but I think the point he was getting at is the Rust way is a typical trait implementation. I think he meant that if you know even a minimal amount of Rust then you're going to know how to use traits whereas operator overloading is extra knowledge in C++. Also, he did simplify it down to simply writing #[derive(PartialOrd)] right after :P
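
For readers unfamiliar with the derive mentioned above, here is a minimal sketch of what it provides. The `Cube` struct and its field names are assumptions for illustration, and `smallest` is a hypothetical helper, not code from the gist. Deriving `PartialOrd` and `Ord` compares fields in declaration order, which gives a lexicographic ordering.

```rust
// Hypothetical stand-in for the video's `Cube` type; field names are assumed.
// The derived ordering compares `x` first, then `y`, then `z`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Cube {
    x: i32,
    y: i32,
    z: i32,
}

/// Returns the smallest cube under the derived lexicographic ordering,
/// or `None` if the slice is empty.
fn smallest(cubes: &[Cube]) -> Option<Cube> {
    cubes.iter().copied().min()
}
```

With this in place, `Cube { x: 1, y: 9, z: 9 } < Cube { x: 2, y: 0, z: 0 }` holds because the first differing field decides the comparison.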

    • @FunkyMoneyMan
      @FunkyMoneyMan 1 year ago +40

      @@RottenFishbone even the simplified went right over my head👀

    • @ric8248
      @ric8248 1 year ago +35

      exactly my thoughts! overloading couldn't be easier in C++

    • @tsunekakou1275
      @tsunekakou1275 1 year ago +4

      "hard to remember"... lmao.

    • @tsunekakou1275
      @tsunekakou1275 1 year ago +19

      @@RottenFishbone I wouldn't say "overloading is extra knowledge in C++"; it's not template-level difficulty, and you can learn overloading after a week or two. Sure, overloading has a lot going on, but at least it's in your face instead of hiding. You could argue Rust's PartialOrd is newbie-friendly, but I kind of disagree with even that.

  • @andersama2215
    @andersama2215 1 year ago +154

    Just a note, if you're using the same "backend" eg: llvm for rust and c++ the results should be similar, the reality is though while llvm is a generalized backend for multiple languages, it's history is primarily to support the clang frontend. Essentially you're comparing how well integrated one frontend is with another so don't be too surprised if rust falls behind c++.
    The add_examples issue is due in part to padding. Your struct with 3 members is likely being handed off to the compiler and it sees a struct which isn't perfectly aligned to a power of 2. So while you're thinking you're dealing with 3 64 bit registers the compiler's going to treat that struct as 4 64 bit registers. While you're right that you're just adding 6 64 bit integers together, what you've likely done is trip up an optimization stage because the compiler sees 8 64 bit integers (where the 4th and 8th are essentially discarded) and that likely is enough for simd optimizations to kick in (where adding 6 may not be worth doing) and then confuse later optimization stages.
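
The padding premise is easy to check directly. The sketch below uses a hypothetical `Resources` struct with three `u64` fields (the struct in the gist may differ) and reports its size and alignment. With `repr(C)`, three 8-byte fields occupy 24 bytes, so the in-memory layout itself is not rounded up to 32. Any widening to four lanes would have to come from a later code-generation choice, which this snippet does not examine.

```rust
use std::mem::{align_of, size_of};

// Hypothetical three-field struct; `repr(C)` pins field order and layout.
#[repr(C)]
#[derive(Clone, Copy)]
struct Resources {
    ore: u64,
    clay: u64,
    obsidian: u64,
}

/// Returns `(size, alignment)` of `Resources` in bytes.
fn resources_layout() -> (usize, usize) {
    (size_of::<Resources>(), align_of::<Resources>())
}
```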

    • @igorordecha
      @igorordecha 1 year ago +20

      How is this LLVM's and not Rust's fault? The only stable backend for Rust is LLVM. It was their (the Rust creators') decision to use LLVM. They could've written their own backend (which would've been a disaster) but they didn't.
      I think it's completely fair to compare Rust and C performance by how well they work with LLVM

    • @mvuksano
      @mvuksano 1 year ago +7

      @@igorordecha I get your point but I disagree. It could easily happen that a future release of clang or gcc outperforms Rust just by tweaking how it interacts with the compiler. Or someone could write a transpiler from C++ to Rust and then compile code that way. I have to say that I don't think this video demonstrates that one language is superior or inferior to the other.

    • @igorordecha
      @igorordecha 1 year ago +9

      @@mvuksano Rust might be the better language (and it is) as far as the spec and syntax go, BUT with current tooling it IS slower* than C. It doesn't have to be that way in the future, but it is slower* now. And that is what this video shows. It isn't a "not fair comparison", because at the end of the day the binary built with Rust is worse*. Yes, it's not Rust's fault. It's the tooling's fault, but it directly makes the language worse (for now)
      * yes, as shown in this video, you can make the binary faster by tweaking the number of arguments but in the real world you're not changing the API because the compiler has a bug.

    • @jimatperfromix2759
      @jimatperfromix2759 1 year ago +4

      This point (and all its replies) is very good - as is the whole video and many more of the reply comments. I'm just about to embark on learning Rust (and Go at the same time), and although this is definitely not my first computer language rodeo (I used to be somewhat of a collector (some might say hoarder) of computer languages, but the number of languages available has now gotten somewhat ahead of my rate of collecting them), I'm glad I saw this video first.
      What I've learned here is that if you want to write performant Rust code (on an X86 target CPU, say), you're gonna have to learn how to outsmart the combination of the Rust compiler plus LLVM plus all the cumulative stupidities of the 50-year process of patching the X86 instruction architecture plus especially the well-motivated but often badly implemented SIMD instructions. One problem is that the code generation process does not have sufficient knowledge of the approximate relative performance of various instructional/architectural approaches to ASM-coding the given Rust code. If it did, it could at least loop over the several approaches and reject using those SIMD instructions when the extra wasted execution time means that using SIMD slows down the execution. There probably needs to be finer-grained compiler control over its use of SIMD (not sure what's available at this point). You don't want to tell it not to use SIMD at all just to un-screw-up its compilation of a function for X = Y + Z, when later in the code you have a whole huge section that definitely needs the SIMD instructions.
      Also, it's clear to me that (assuming I'm targeting X86 again, for sake of argument) I'm gonna have to learn the ugly X86 instruction set in order to do performance tuning in Rust. Darn, I thought I could just use Rust, but at the same time not only avoid the inanities of C++, but also say to the option of "learn X86 ASM" been-there-done-that-earlier-in-my-career with a much better computer architecture so really don't want to go down that road again only this time with a horrible X86 instruction-set architecture.
      One wonders whether, in spite of the fact that Rust is not an interpreted language and thus does not require a formal byte-code intermediate language technically speaking, it might not be a great idea to define a (better than Java and perfectly defined for a hypothetical Rust virtual machine that matches well against current X86 and Arm architectures) Rust byte-code architecture, and then compile Rust to that Rust VM, then send that on to LLVM for final compilation to target machine. You could maybe make that first to-byte-code translation smarter, say smart enough that the Rust coder doesn't have to stand on his/her head just to outsmart the current combo of Rust + LLVM.

    • @JorgetePanete
      @JorgetePanete 1 year ago +1

      its*

  • @carloscarral8870
    @carloscarral8870 1 year ago +56

    Your way of describing complex SIMD instructions was superb, congrats Amos!

  • @valizeth4073
    @valizeth4073 1 year ago +59

    Just glancing over the C++ code it could definitely be improved by:
    Removing the globals
    Removing the horrific macros (for endl specifically)
    The constants could be made into constexpr constants (i.e real constants, not text replacement)
    Removing the using namespace std
    Removing the `typedef` struct (this has never been a required thing in C++, this is pure C)
    Replacing the remaining typedefs with using
    Adding defaulted comparison operators for the structs that would essentially all be one liners without any user defined bodies (structs are just classes, they can have member functions, even though the operators would be friended and not technically members).
    Not much about performance itself which could probably also be improved, just cleaning up the code immensely by following the c++ core guidelines.

    • @fasterthanlime
      @fasterthanlime 1 year ago +9

      Yeah I mentioned in the Day 18 write-up that these solutions felt a lot like "C/C++", not "proper C++", but like you said I don't think any of these actually would impact performance. I would really hate if the original author would get negative feedback from all that though 😬
      Thanks for the C++ refresher!

    • @Wunkolo
      @Wunkolo 1 year ago +17

      @@fasterthanlime There are lots of needless deep copies in the C++ code that would certainly affect performance though...

    • @osamaalbahrani
      @osamaalbahrani 1 year ago +1

      Interesting points, I didn't know that the typedef wasn't needed in C++.

    • @chillst3p
      @chillst3p 1 year ago +15

      @@Wunkolo Yeah, when doing a for-each the code should iterate by reference, i.e. for ( BluePrint& bp : bps ) {
      The copies are certainly making it slower.

    • @PhyllMpse
      @PhyllMpse 4 months ago +1

      I'm a C++ programmer, but about these things:
      "C++ would have been fixed if they removed the preprocessor,
      removing the globals,
      the constants could be made into constexpr constants, removing the typedef struct,
      replacing the remaining typedefs with using"
      C++ can't remove these features, because if it did, it would destroy backward compatibility with C and with past versions.
      So yeah, it would improve C++, but these features can't be removed, and they won't be for a long time. If you want to write fast, good code, just don't use the features you mentioned.

  • @BriceFernandes
    @BriceFernandes 1 year ago +31

    "If you see a compiler engineer in the wild, ask if they need a hug." 😂 Very interesting to see the differences in compilation between C++ and Rust, and the effects of Stack vs register allocation illustrated so well. Great video, tons of information packed in a very short space. Thank you.

  • @frydac
    @frydac 1 year ago +226

    Matt Godbolt has a podcast (two's complement), and on the latest episode he quoted 'the first rule of profiling is that you're wrong' by which he means (I'm assuming) it is virtually impossible to guess the performance of a piece of code, or know which version will be faster as you illustrated here nicely.
    So I'm guessing the answer to the question in the title is 'it depends' or just 'they are similar'

    • @fasterthanlime
      @fasterthanlime 1 year ago +64

      Like most titles in the form of a question, that one's not really meant to be answered - but I agree, they both performed pretty close to each other (and I still want to have a C++ person look at it - which I'm hoping this video will achieve).
      Being wrong is something I'm trying to get better at, since every time you're wrong is an opportunity to learn something, and I love to learn. Here my first two go-to tools (callgrind & not-perf) revealed themselves to be not so useful and I was forced to stare at assembly for a long while, which I think is a happy outcome for everyone!

    • @AdvancedSoul
      @AdvancedSoul 1 year ago +13

      The title is very clickbaity of course. It doesn't make sense to compare languages for performance; rather, we compare compilers' codegen, which is what the video demonstrates.

    • @not_ever
      @not_ever 1 year ago +22

      @@fasterthanlime If you would like C++ opinions, I suggest you post this on r/cpp. You might not get a warm reception but you will get opinions.

    • @montyoso
      @montyoso 1 year ago +2

      Thank you for the recommendation of the Matt Godbolt podcast. Thanks to you I started to listen to it and I am enjoying it so far.

    • @salia2897
      @salia2897 1 year ago +4

      @@AdvancedSoul well, you can compare programming language features by performance, because some can be implemented faster than others. Rust and C++ are basically the same in this regard though. With a theoretical advantage for Rust because it often knows more about borrowing, but that is currently not exploited by the LLVM backend.
      So yeah, between Rust and C++ it comes down to what the compiler does, and if you use LLVM for C++ it will come down to some very random implementation details.

  • @OffbrandDrPhil
    @OffbrandDrPhil 1 year ago +2

    "I felt pretty stupid while reading it, and I wish you the exact same- that just means you found something you can learn about." 2:57
    That's a great way of putting it, and it makes me a lot more comfortable learning that I'm not the only "stupid" one when it comes to working with and learning to code. Thanks for that!

  • @4mb127
    @4mb127 1 year ago +19

    Love your blog posts. Keep doing what you're doing. It's great. 👍

    • @fasterthanlime
      @fasterthanlime 1 year ago +2

      Thanks!! I'm glad I'm finally able to release videos on par with my posts, there was a big gap for a while and the audiences for both didn't intersect much.

  • @AhmedEssam_eramax
    @AhmedEssam_eramax 1 year ago

    Thanks!

  • @kintrix007
    @kintrix007 1 year ago +11

    This is the first video I have seen from you, and I have to say, it made me sure I want to see more from this channel. Such well present, such cat, such content.
    You have a really nice style, keep it up.

  • @KaidenBird
    @KaidenBird 1 year ago +2

    "What's faster? C++ or Rust? Trick question! It's x86 Assembly! get ready to get comfortable with some goddamn registers!" on a lighter note WHY IS SIMD SUCH A BLACK BOX

  • @johnnywezel3399
    @johnnywezel3399 1 year ago +13

    Pro tip: only write C++ if you know how to use it.

    • @Hacking-Kitten
      @Hacking-Kitten 3 months ago +1

      So how do you get to that point?

    • @johnnywezel3399
      @johnnywezel3399 3 months ago +2

      @@Hacking-Kitten Same way as any other language: use it in real life projects for at least 10 years.

  • @m3nthalone
    @m3nthalone 1 year ago +39

    Well, that escalated quickly. Came for Rust, got to Assembler. I wish I learned this in the university, it is so fascinating. Makes me think how many decisions were taken for us in higher level languages. Thank you for the deep dive and clear explanation. I feel complete… Turing complete now 😅

    • @runed0s86
      @runed0s86 1 year ago +1

      Rust's type system, by ITSELF, is turing complete.

    • @itsmeagain1415
      @itsmeagain1415 10 months ago

      ​@@runed0s86so are C++ templates hehe

  • @Dygear
    @Dygear 1 year ago +11

    I was rewarded not just with knowledge, but also with a cat video at the end. Obviously this is now the best video ever.

  • @bytefu
    @bytefu 1 year ago +87

    Great video. I love the low-level stuff, and compiler writers definitely deserve many hugs. I have my own toy compiler written in Rust, and even implementing the dumbest register allocator possible was already worth at least two hugs per day, because debugging even the tiniest of changes means reading screens of assembly code, interpreting that in your head and trying to keep up with tons of info, such as which variable should sit in which register, what registers are spilled at a particular point, etc. That's hard even given that I "cheated" and decided to generate code for RISC-V, which is much simpler than x86 or amd64, and I'm just playing around. I don't know if I am even able to write a more sophisticated one correctly with my ADHD and average intellectual abilities. How hard is what serious compiler devs do? Probably insanely hard.
    The good thing is I have a cat too (btw she has the same pattern on her forehead as your cat). Whenever I feel overwhelmed with coding, she's always there for me. By that I mean she is sleeping nearby, not caring even a little bit about my struggle with my own inability to think clearly. But occasionally, she jumps on my lap and gently reminds me of things more important than code, such as scratching her cheeks 😁

    • @cyrilemeka6987
      @cyrilemeka6987 11 months ago

      What language are you creating the compiler for?

  • @aleksandermirowsky7988
    @aleksandermirowsky7988 1 year ago +108

    The content was really great, super informative. It was like reading one of your blog posts but in video form.
    I watched the day 18 stream and learned a ton.
    Some of the assembly stuff went over my head but that's just because I'm not too familiar with it. But even so, it was explained well. I'll definitely be coming back to watch it again once I know more about assembly.

    • @MaxAbramson3
      @MaxAbramson3 10 months ago

      MIPS Assembly is the easiest to read and understand.

  • @cad97
    @cad97 1 year ago +14

    I believe the reason that 3×u64x2 uses registers where 2×u64x3 uses stack is as simple as the Rust ABI having a ScalarPair mode to pass scalar pairs (like slice references, or other structs which are just two ints) as two separate arguments. So before LLVM sees it, the function taking 3×u64x2 gets turned into a function taking 6×u64. The 2×u64x3 is given to LLVM as a function taking two pointers.
    IIRC LLVM does actually have the ability to pass more complex types than just scalars in function arguments (i.e. virtual registers), but rustc never uses this functionality because the semantics of compound types in LLVM don't line up well with Rust's semantics.
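
The two function shapes being compared can be written down as follows. This is a sketch with hypothetical names (`Pair`, `Triple`, `sum_pairs`, `sum_triples`), not the video's code. Both functions receive the same six `u64` values; only the grouping differs. Whether the arguments travel in registers or through memory depends on the rustc version and target, so the snippet only fixes the shapes for inspecting the emitted assembly.

```rust
// Hypothetical types mirroring the two shapes discussed: u64x2 and u64x3.
#[derive(Clone, Copy)]
struct Pair(u64, u64);

#[derive(Clone, Copy)]
struct Triple(u64, u64, u64);

// Three two-field arguments: six u64 values in total.
// `inline(never)` keeps a real call boundary so the ABI is observable.
#[inline(never)]
fn sum_pairs(a: Pair, b: Pair, c: Pair) -> u64 {
    a.0 + a.1 + b.0 + b.1 + c.0 + c.1
}

// Two three-field arguments: the same six values, grouped differently.
#[inline(never)]
fn sum_triples(a: Triple, b: Triple) -> u64 {
    a.0 + a.1 + a.2 + b.0 + b.1 + b.2
}
```

Compiling both with optimizations and comparing the generated code for each function is one way to see the difference described above on a particular toolchain.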

    • @fasterthanlime
      @fasterthanlime 1 year ago +6

      Interesting! Can you expand on "the semantics of compound types in LLVM don't line up well with Rust's semantics"? Preferably in blog post form, 2 pages minimum, no maximum, you have 4 hours ago, ready set go (just kidding, but I would love to read more about this if you feel like posting a link to something!)

    • @cad97
      @cad97 1 year ago +2

      @@fasterthanlime for simple and "full" types like u64x3 here, I don't think there's any difference. I honestly know very little, just that rustc only uses LLVM's compound types for field offset indexing and lowers padding to explicit fields. I also think compound types in LLVM may also carry TBAA implications, though that wouldn't matter if only used for passing compound types as function arguments.
      The TL;DR version is quite literally just that doing pass-by-reference for all compound types is simple, sufficient, obviously correct, and doesn't impede inlining optimization.

    • @ZephrymWOW
      @ZephrymWOW 1 year ago +3

      Those are words

    • @irrelevant_noob
      @irrelevant_noob 1 year ago

      heh, "you have 4 hours ago"... gg on making us feel way too familiar within this environment! xD

  • @pcost
    @pcost 1 year ago +3

    Incredible video. I am going to share it with everyone in the company I run! Keep up the ultra-nerdy-geeky subjects & explanations, you are REALLY good at it and you make this kind of high-level stuff very accessible to whoever is listening.

  • @SwordQuake2
    @SwordQuake2 2 months ago +2

    5:50 operator overloading is quite simple, the OP just had no idea how to do it. And the spaceship operator in C++20 helps immensely in comparisons. It's basically the same as your cmp function.

  • @kered13
    @kered13 1 year ago +16

    5:05 Replacing whatever C++ map you were using with absl::flat_hash_map or some other high performance map implementation would probably have significantly improved the C++ performance. But you're right you need a certain level of experience in C++ to know that the standard library map types are kind of shit for performance.
    5:50 In C++20 you can just do `auto operator<=>(const Cube&, const Cube&) = default` and the compiler will automatically implement a lexicographic comparator, like the Rust derived comparator. While this is a fairly recent feature, I would consider it something every C++ programmer should know, just like every Rust programmer should know about derive, because it is so useful.
    On one final note, having recently been comparing some C++ and Rust code for performance, I'll say that one Rust feature that can often give it an advantage over C++ is destructive moves, which allow the Rust compiler to make some good optimizations when passing objects by move that would not be possible in C++.
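
The destructive-move behaviour mentioned in the last paragraph can be sketched as below. `Buffer`, `consume`, `move_and_count`, and the `DROPS` counter are all hypothetical names introduced for illustration. After `consume(buf)`, the binding `buf` is statically unusable and no moved-from object remains, so the destructor runs exactly once, inside `consume`. The snippet shows the semantics only and makes no claim about which optimizations a given compiler derives from them.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts how many times a `Buffer` destructor has run (illustration only).
static DROPS: AtomicUsize = AtomicUsize::new(0);

struct Buffer {
    data: Vec<u64>,
}

impl Drop for Buffer {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

/// Takes ownership of `buf`; it is dropped when this function returns.
fn consume(buf: Buffer) -> u64 {
    buf.data.iter().sum()
}

/// Moves a buffer into `consume` and returns `(sum, destructor runs)`.
fn move_and_count() -> (u64, usize) {
    let before = DROPS.load(Ordering::SeqCst);
    let buf = Buffer { data: vec![1, 2, 3] };
    let total = consume(buf);
    // `buf` cannot be used here: the move left nothing behind to destroy.
    (total, DROPS.load(Ordering::SeqCst) - before)
}
```

In C++, by contrast, a moved-from object still exists and its destructor still runs at the end of its scope.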

  • @flTobi
    @flTobi 1 year ago +123

    I don't comment very often, but I have to say I really enjoyed this deep dive on why C++ was faster.
    Keep it up!

    • @plasticstuff69
      @plasticstuff69 1 year ago +8

      /s? 😂

    • @69k_gold
      @69k_gold 1 year ago +17

      @@plasticstuff69 Typical Crab

    • @matthewmurrian
      @matthewmurrian 1 year ago +12

      It took a Rust expert digging into assembly to beat a naive C++ implementation (sometimes). So, yeah. Agreed.

    • @doublekamui
      @doublekamui 1 year ago

      If both use LLVM as the backend, they are similar. If C++ uses GCC instead of LLVM they can differ, but Rust can use GCC too, so they are still similar. Now take a look at the features: Rust offers more of them and is designed to be 100% memory safe, preventing the memory leaks that can happen in C++ when building a big project with many people, who may forget to delete a variable once it is unused.

    • @dattien3453
      @dattien3453 11 months ago +5

      @@doublekamui exactly what smart pointers are made for

  • @bobsalita3417
    @bobsalita3417 1 year ago +23

    There's still more layers of complexity. Code speed differs depending on whether the instructions/data hit the on-chip cache which in turn depends on the level and size of the cache. Also, optimal alignment of data can make an utterly huge speed difference (address boundary of 8 vs 16 vs 32 vs 64 vs 128). Ultimately, all these factors will be optimized by compiler backends which have been trained by machine learning. I'm an old compiler guy who is spending his retirement on doing just this. I'm betting we can dump LLVM for a machine-instruction-machine-learning-neural-network.

    • @fasterthanlime
      @fasterthanlime 1 year ago +6

      That is true (and way beyond the scope of this video). I remember seeing some stuff re: machine-learning driven register allocation in LLVM, but I have no idea if it's actually mainstream enough to, say, be used by default in rustc. Do you happen to know? I'm curious!

    • @HansLemurson
      @HansLemurson 1 year ago +7

      Turn compilers into an even bigger black box! It would be scary if a malicious AI corrupted compilers so that all programs would contain silently executing code to do...something.

    • @StanleyPinchak
      @StanleyPinchak Год назад

      @@HansLemurson nevermind those extra skynet instructions

    • @HansLemurson
      @HansLemurson Год назад +4

      @@StanleyPinchak Just a few socket interrupts and some speculative execution...nothing to see here...

  • @finallyfedupwithpeople
    @finallyfedupwithpeople Год назад +12

    For the C++ version, I'd suggest making the operator functions static and turning bps into a local variable in main.

  • @d0x2f
    @d0x2f Год назад +1

    How did it take the algorithm this long to show me your videos. I've read all your articles as I see them show up on Reddit. Happy to see your videos are just as interesting.

  • @ahmadalghooneh2105
    @ahmadalghooneh2105 Год назад +8

    Most of the C++ optimization happens in the flags when you are compiling, and as you said, you had trouble even with linking. So I could say somebody who's experienced in C++ can definitely beat Rust. Moreover, guys, please don't get into these nonsense comparisons; depending on your application, you should choose the language.

    • @alexgorodecky1661
      @alexgorodecky1661 Год назад +3

      Potentially, Rust and C++ can both be "lowered" to pure C by the programmer's will 😄 It's true: a performance comparison between Rust and C++ is nonsense.

  •  Год назад +8

    Decades ago at this point I had our compiler team rant about register optimization. To this day, I'm convinced it's all black magic.

  • @HugoCardozaAguirre
    @HugoCardozaAguirre Год назад

    Thanks!

  • @sanderbos4243
    @sanderbos4243 Год назад +14

    Extremely high-quality explanation, I'll make sure to send this to whoever wants an introduction to the world of assembly and the wild west of compiler optimizations!

  • @MattGodbolt
    @MattGodbolt Год назад

    Thanks so much for the very kind shout out!

  • @damonpalovaara4211
    @damonpalovaara4211 Год назад +26

    I'd open an issue on that or reach out to one of the core devs. That seems like an oversight from the compiler team. Has to be some issue with the IR code that's generated not playing nice with LLVM

    • @fasterthanlime
      @fasterthanlime  Год назад +34

      So a friend of mine who's a rustc contributor looked into it some. This is the closest report I could find: github.com/rust-lang/rust/issues/26494#issuecomment-619506345 - according to them, it's an extremely common report with no easy solution, most PRs to "fix it" introduce regressions 🙃

    • @KaneYork
      @KaneYork Год назад +6

      the ScalarPair ABI in rustc might be slightly suspect here too - it makes 2-element structs special

    • @fasterthanlime
      @fasterthanlime  Год назад +7

      @@KaneYork that seems likely - it's probably what my compiler friend tried to explain to me and it flew right over my head.

    • @allenwebb273
      @allenwebb273 Год назад

      @@fasterthanlime solving this seems like it is related to the bin packing problem only a little bit more complicated because of things like alignment requirements.

  • @jaysistar2711
    @jaysistar2711 Год назад +2

    11:00 You actually can't have "more complex expressions". Both the number of registers in the expression and the specific registers that can be used are fixed. Each full expression is a different addressing mode, which is (usually) a completely different instruction opcode, and not an operand change. x86_64 registers are much more "general purpose" than x86 registers are, but there are still some instructions that only operate on certain registers.

    • @fasterthanlime
      @fasterthanlime  Год назад +4

      From a machine code perspective, sure! The specifics matter less when writing (or reading) assembly, and I haven't even shown multiplications in there. That said, nit picking is very much in the spirit of this channel, so thank you!

  • @semicharmedkindofguy3088
    @semicharmedkindofguy3088 Год назад +4

    Good job with the video! I'm not too comfortable with assembly but the way you explained with visualisations helped a lot.

    • @fasterthanlime
      @fasterthanlime  Год назад +3

      I actually used pshufd to move half cat parts to both halves of an XMM register. Beware: if you get the `order` operand wrong, things get weird. Real weird.

  • @CallousCoder
    @CallousCoder Год назад +1

    I love your Advent of code articles! I already picked up some new Rust knowledge!
    And your writing style is really nice!

  • @Voltra_
    @Voltra_ Год назад +3

    This is what language comparison videos should always be like

  • @k22kk22k
    @k22kk22k 6 месяцев назад

    One thing to add: if you use registers of SIMD instructions, the program takes more time to resume during kernel’s context switching because we need to save/load data for extra registers.
    That is one of the reasons why utilizing extra registers (like -march=native) results in slower code.

  • @i.8530
    @i.8530 Год назад +3

    been reading your articles for a while, never knew you had a youtube channel. loving all the content, keep it up!

    • @fasterthanlime
      @fasterthanlime  Год назад +1

      Thanks! I do link to my latest video at the bottom of every article, but seeing how long my articles are, maybe I should put them higher up 🥲

  • @ciberman
    @ciberman Год назад

    15:47 "Aren't you glad you clicked this video?" turned out to be the nerdiest rhetorical question I've been asked this week.

  • @blablabla7796
    @blablabla7796 Год назад +10

    The thing about C++ is that it is very easy to make suboptimal choices in writing the code. A person doing C++ for 10 years is nowhere near an expert but even then you could easily make a single mistake and have your code run 2000x slower than it needs to be. But the good thing about C++ is: if you’re allowed to make “non-faithful” changes to code, you could theoretically turn your problem into an exercise to be done by the compiler and at run time it does the minimum amount of work. Potentially have the answer of the question be a value literal. C++ will always be in a weird place where it has to maintain some weird compatibility with C, which allows it to piggyback off of the work of really smart people that work on C, but also inadvertently carrying all the baggage and weird crap from C. Rust fortunately doesn’t have this problem because it was designed from the ground up pretty recently.

  • @oxey_
    @oxey_ Год назад +3

    was very impressed with how knowledgeable you seemed and thought to myself why I hadn't seen you before but then I saw the channel name and everything clicked haha

    • @fasterthanlime
      @fasterthanlime  Год назад +3

      That's hilarious. Now you know what I sound and look like!

  • @ontheballcity71
    @ontheballcity71 10 месяцев назад +1

    Best sponsored segment ever!

  • @cheaterman49
    @cheaterman49 Год назад +3

    I love the way you explain function calls and argument passing, it's like a crash course of what the compiler does in "higher level" languages, explained to an audience of 50 year old 6502 programmers hahaha! Really a great way to go about it IMO, well explained but not dumbed down ELI5 style :-)

  • @tricky778
    @tricky778 Год назад +2

    To include binary data in C++ on Linux, create the binary file and use ld with the right input file type flag to generate a .o file. That object file will have two symbols for the start and end of the data, which you can access by declaring arrays with the right names (no length values required), something like extern unsigned char symbol_start[]; where you replace "symbol" according to the symbol's name.

  • @gakman
    @gakman Год назад +5

    Awesome video. Really enjoyed it. Keep up the excellent work! Cute sponsor.

    • @fasterthanlime
      @fasterthanlime  Год назад +2

      Your feedback has been relayed to the sponsor in question, whose only comment was: mrrraw.

  • @robervaldo4633
    @robervaldo4633 7 месяцев назад

    this is the most underrated youtube video title I've seen for some time... thanks!

  • @mzg147
    @mzg147 Год назад +3

    Top tier content. Yearning for more!

    • @fasterthanlime
      @fasterthanlime  Год назад

      I have a few other good ones, but be warned that as you go back in time, you'll see myself get worse and worse at script writing, performing, shooting, and editing. Thanks for the kind words!

    • @mzg147
      @mzg147 Год назад

      @@fasterthanlime This format, with its fluid and beautiful code presentation while diving deep into interesting concepts of computing, is surely the best match for my tastes, but I still really enjoyed some of your other videos and they are all really good, so... thank you for your hard work!

  • @JBBell
    @JBBell Год назад

    “It’s been an ongoing fight, according to sources who are very tired.”
    Literally almost spit out my drink. That is some quality wit, characteristic of this whole, excellent video.

  • @bimoverbohm6837
    @bimoverbohm6837 Год назад +12

    Die-hard C++ developer here. I've been thinking for a long time that I should give Rust a try. Some really cool concepts, fast and very expressive. To be fair C++ has been progressing a lot in the last years and people are usually complaining about ancient C++03 while they should be using C++17 or 20 by now...

    • @heyhoe168
      @heyhoe168 Год назад +2

      C++20 concepts are so odd, it feels like a different language now.

  • @johnbcodes
    @johnbcodes Год назад +2

    Loved it, but I got confused at 16:03: "But as soon as we switched over to two structs with three fields each, it stopped using registers and started passing them on the stack!" but the code shown is what was discussed in the prior section as SIMD/128-bit wide registers and I didn't hear the stack mentioned then. I guess what I'm saying is I understood the difference to be using 64-bit vs 128-bit registers instead of registers vs the stack. What did I miss?
    Other feedback: I knew less assembly than discussed here but it was explained well so don't shy away from it in the future. If anything it might encourage me to learn more.

    • @YuFanLou
      @YuFanLou Год назад +2

      The difference at 16:03 is the absence and then presence of *ptr*. Passing parameter by register means no need to mov data from stack into register before adding, since all data are in registers already. The SIMD here is mostly to reduce number of ptr lines.

    • @fasterthanlime
      @fasterthanlime  Год назад +1

      Yeah that's basically it. From add_numbers' disassembly you can only tell it's reading from main memory (because square brackets), but one side effect is that now two quadwords can be moved from memory to an XMM register in a single instruction, so that lets LLVM's auto-vectorization kick in. Moving them from two separate 64-bit registers to a 128-bit XMM register would probably take more instructions and might not be worth it (the compiler certainly seems to think so, looking at the codegen).

  • @mytech6779
    @mytech6779 Год назад +3

    C++ leaves certain behaviors undefined because it allows the compiler more freedom and tricks for optimization. Rust being a bit more strict for "safety", can in some cases constrict what contortions the compiler is allowed to do.

    • @fasterthanlime
      @fasterthanlime  Год назад +4

      Although that's true, 1) I don't believe there's any actual UB here, at least not any that lets the compiler optimize more, 2) it goes the other way too! See news.ycombinator.com/item?id=25624538 (afaik noalias has been enabled for a few stable versions - until another bug is found, it's had a comical track record)

    • @lawrencedoliveiro9104
      @lawrencedoliveiro9104 Год назад +6

      That “.begin()” thing -- you have to compare it to “.end()” first, and it’s only safe to dereference if they are not equal.
      I know very little C++, but I do remember that.
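For comparison, here is a minimal Rust sketch of the same situation. It is not taken from the video's code; the function name is made up. Asking for the first element of a possibly empty collection yields an `Option`, so the "is it empty?" check cannot be skipped by accident:

```rust
/// Returns the first element, or `None` when the slice is empty.
/// (Hypothetical helper, for illustration only.)
fn first_value(values: &[i32]) -> Option<i32> {
    // `next()` on an exhausted iterator returns `None` instead of
    // handing back something that must not be dereferenced.
    values.iter().next().copied()
}

fn main() {
    let empty: Vec<i32> = Vec::new();
    let full = vec![3, 1, 4];

    for values in [&empty, &full] {
        match first_value(values) {
            Some(v) => println!("first element: {}", v),
            None => println!("no elements"),
        }
    }
}
```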

    • @tsunekakou1275
      @tsunekakou1275 Год назад +3

      @@lawrencedoliveiro9104 Hats off to you, seems like not everyone was fooled. (not sarcasm)

  • @chigozie123
    @chigozie123 Год назад +1

    "If you see a compiler engineer in the wild, ask if they need a hug"
    Lol 😆 🤣

  • @shreyasjejurkar1233
    @shreyasjejurkar1233 Год назад +3

    This is an absolutely fantastic video! Loved the way you described SIMD instructions! Respect! 🙌🙌🙌🙌
    I wish to see an x86 programming tutorial course from you!

  • @ThatNateGuy
    @ThatNateGuy Год назад

    This was really educational, for the parts that didn't go over my head. Nice chiptunes, btw!

  • @juliavdkris
    @juliavdkris Год назад +3

    Amazing video, thank you so much! You're great at clearly explaining these sorts of concepts in a calm and constructive way

    • @fasterthanlime
      @fasterthanlime  Год назад +5

      Ah crap, that's right! This is why you don't go off-script!

    • @juliavdkris
      @juliavdkris Год назад +1

      @@fasterthanlime But you clarifying it as "the function calling the other function" clears up any potential confusion it could've caused, so it's all good
      Also I wrote that comment before finishing the video, and I really didn't think I could love it even more. But that ending with "if you ever meet a compiler engineer in the wild, ask if they need a hug" and the sponsored by cat segment are fucking amazing

    • @nota2938
      @nota2938 16 дней назад

      I was puzzled by this lol.
      Thanks to you both.

  • @TheHronar
    @TheHronar Год назад +5

    I think that it's important to consider the development time of the application, when talking about speed. Which is why Javascript is so popular, despite being slower than C++.
    Rust seems to be an amazing middle ground with speeds that compete with well written C++ code! Thanks for the video.

    • @fasterthanlime
      @fasterthanlime  Год назад +12

      Oh definitely. You should also take into account maintenance costs, otherwise you end up with Go 🙃

    • @Cornyfisch
      @Cornyfisch Год назад +1

      @@fasterthanlime I used to write some small scale programs in Go 2-3 years ago, switched to Rust since. Can you elaborate how Go programs grow unmaintainable that‘s specific to the language?
      I am just curious.

  • @multiHappyHacker
    @multiHappyHacker Год назад +1

    does Rust even have constexpr/consteval stuff? we can usually precompute some subset of the problem at compile time.
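For what it's worth, the closest Rust counterpart is `const fn` together with `const` items. A minimal sketch follows; the factorial is only a stand-in for "some subset of the problem" and has nothing to do with the video's puzzle:

```rust
/// A `const fn` may be called at compile time as well as at run time.
const fn factorial(n: u64) -> u64 {
    let mut acc = 1;
    let mut i = 2;
    // `while` loops are allowed in const fns; `for` loops are not (yet).
    while i <= n {
        acc *= i;
        i += 1;
    }
    acc
}

/// Initialising a `const` item forces the call to be evaluated by the compiler.
const FACT_10: u64 = factorial(10);

fn main() {
    // The value is already baked into the binary here.
    println!("10! = {}", FACT_10);
    // The same function still works with run-time inputs.
    let n = std::env::args().count() as u64;
    println!("{}! = {}", n, factorial(n));
}
```

Const evaluation in Rust is more restricted than `constexpr` in recent C++ standards (no heap allocation and no trait methods in stable const fns, as far as I know), so this covers simple tables and constants rather than arbitrary precomputation.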

  • @johndisandonato
    @johndisandonato Год назад +9

    Great video, as usual! A bit disappointed that I can't use a discount code with cat, but the information more than makes up for it. Wanted to ask, could alignment have something to do with the compiler's choice of using SSE registers, or may that be just because the size is bigger than 16 bytes and the compiler just treats smaller structs as a special case? Though otoh smaller structs could also mean say [u8, u16, u8, u32, u32]...

    • @fasterthanlime
      @fasterthanlime  Год назад +4

      I think it has more to do with the latter: heuristics based on the maximum argument size the compiler is willing to pass by registers. But I don't feel comfortable making these claims without doing a bunch more research.
      Re SIMD, note that the compiler is using movdqu, the unaligned version. I don't think there's any guarantees the struct will be 16-byte-aligned, but I may be talking out of my butt.
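The alignment guess can be checked directly. The sketch below assumes a 64-bit target and uses two made-up structs: `Triple` stands in for a three-field struct like the one discussed in the video, and `Mixed` is the `[u8, u16, u8, u32, u32]` layout from the question, with `repr(C)` so the field order is fixed:

```rust
use std::mem::{align_of, size_of};

/// Hypothetical stand-in for a struct with three 64-bit fields.
#[allow(dead_code)]
struct Triple {
    a: u64,
    b: u64,
    c: u64,
}

/// The mixed-width example from the question. `repr(C)` keeps the fields in
/// declaration order; the default Rust layout is free to reorder them.
#[allow(dead_code)]
#[repr(C)]
struct Mixed {
    a: u8,
    b: u16,
    c: u8,
    d: u32,
    e: u32,
}

fn main() {
    // On x86-64 the alignment of `Triple` is that of `u64`, i.e. 8, not 16,
    // which is consistent with the compiler picking an unaligned load.
    println!("Triple: size {}, align {}", size_of::<Triple>(), align_of::<Triple>());
    // Padding after `a` and `c` brings this to 16 bytes with 4-byte alignment.
    println!("Mixed:  size {}, align {}", size_of::<Mixed>(), align_of::<Mixed>());
}
```

So a struct is only as aligned as its most-aligned field unless it asks for more, which fits the `movdqu` observation, though it says nothing about how the calling convention decides between registers and memory.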

  • @vladbintintan7659
    @vladbintintan7659 6 месяцев назад

    A comment one year later (the C++ syntax I'm talking about was introduced in C++20, so it would still have worked): in C++ you can also let the compiler implement operator<=> automatically, which also automatically implements all the classic comparison operators.

  • @comonad6229
    @comonad6229 Год назад +3

    Thanks for the great video! I'd like to see how some more complicated language features (such as fat pointer/vtable, or enum vs tagged union) may affect the compiling result and the performance

    • @fasterthanlime
      @fasterthanlime  Год назад +1

      That would be interesting! My guess is that, like here, they're fairly close. One area where I think we'd see a real difference is noalias, but I need to figure out how to showcase it.

  • @polpol2739
    @polpol2739 Месяц назад

    Sounds like "compiler science" can still be improved, but doesn't it mainly affect the compilation process rather than the execution, which ends up in the same form (as in 1010101010)? Or do the 1s and 0s end up different because of the compiler's optimization process?

  • @matthewmurrian
    @matthewmurrian Год назад +5

    Rust expert/C++ novice puts in a lot of effort to make Rust implementations faster than naive C++ implementations. Finds that those Rust implementations are sometimes faster.

  • @steffahn
    @steffahn Год назад +2

    I was curious about one thing you really didn’t address. Calling conventions don’t really matter when functions are inlined (I believe), and functions that are called so often that their calling convention has significant performance impact better be inlined, so why didn’t that happen here? Answering my own question with a guess: You mentioned *recursive* functions and recursive calls cannot really be inlined (otherwise, where would you stop?) so am I right that this specific instance of performance-relevant calling-convention-details-weirdness was mostly only a thing in the first place due to the fact that we had a heavily recursive (and thus almost necessarily more inefficient than necessary) implementation to begin with?

    • @fasterthanlime
      @fasterthanlime  Год назад

      I believe so, yes. We could try to make the solution non-recursive and see how it performs in both languages then!

    • @alexgorodecky1661
      @alexgorodecky1661 Год назад

      GCC can inline recursive functions but LLVM cannot (tested on 11). Sometimes it gains 20-30%. Also, non-tail-call recursion is a bad deal for secure high-perf code.

  • @zemlidrakona2915
    @zemlidrakona2915 Год назад +3

    Modern compiled languages should be reasonably comparable in performance, and in any case compiler writers will leapfrog each other as they improve compilers. I program in C++ even though it's the most hacked up language ever. At one point I considered going to Rust, but the first thing I needed was placement new, which the Rust guys told me it didn't have. In addition I got advised to redesign my code if I can't directly port it from C++ to Rust. At that point I gave up. In general I find these days the way your data is organized is far more important than the kind of stuff you're talking about here, and C++ lets you do close to anything in that department. But again it's an annoying hacked up language and it's getting more hacked up each year.

    • @marcossidoruk8033
      @marcossidoruk8033 Год назад

      Just do C-style C++ and ignore everything in the standard library, and that's it: no stupidly hacked code, unreadable templates and OOP nonsense.
      If you care about performance and you are already thinking in terms of data layout you are probably already doing this, but just in case I had to say it.
      Also you could consider using one of the new C replacement languages like Odin, I've heard they are really nice, Rust is indeed a stupid language to use if you don't care about hardcore Security, the kind of restrictions Rust imposes on you are pointless in that case.

  • @calder-ty
    @calder-ty Год назад +1

    Great video and explanation. Love reading your articles, so watching the video was fun.

  • @whoopsimsorry2546
    @whoopsimsorry2546 Год назад +13

    Decent video. Obviously, everything done in one language can be done in the other one as well. When it comes to optimizing stuff like this, it's mostly "luck" as to which language's libraries are more suited for the job at hand. I don't think "which is faster" could ever be an actual question. There's many other far better questions: "which is safer?", "which is easier to learn?", "which one allows for better expressiveness?", "which one is the better option for embedded?", and so on.

    • @fasterthanlime
      @fasterthanlime  Год назад +11

      We're roughly in agreement but let me pick two nits: there's definitely things you can do in C++ that you can't do in safe Rust, simply because the borrow checker doesn't understand them. And I say that as a pretty vocal Rust advocate but that's just... the deal we made with the compiler. (And there's still research/improvements ongoing, but, yeah.)
      Re safety/learning curve/expressiveness: I covered that /briefly/ when comparing the two codebases, but it probably warrants more in-depth coverage. I would of course need to do a bunch more research about C++ first. This is all the aftermath of some casual stream and I thought it could be an interesting watch for folks who are like "I hear Rust is safe and that sounds good but that also means it _has_ to be slower, right?". Thanks for watching!

    • @squelchedotter
      @squelchedotter Год назад +3

      @@fasterthanlime On the other hand, I find there are many things that you can do in safe Rust that you can't do in C++, not because of limits in the language but because of limits on your own sanity

    • @fasterthanlime
      @fasterthanlime  Год назад +1

      @@squelchedotter That's very true but it's also a harder sell for folks who aren't /already/ into Rust. I sound like a madman writing "it's great, just try it" all year round: to most it just sounds like I'm in a cult, and that explains a lot about the dynamics of the interactions between the Rust community & the greater programming community.

    • @tsunekakou1275
      @tsunekakou1275 Год назад

      There are no "better" questions. It depends on the project requirements and the resources at hand. I don't need safety, so I don't need Rust. Rust is just as hard as C++, so there is no win there for either of them. "Which one allows for better expressiveness?" Expressiveness is not at the top of my list; I write my C++ quite expressively and I'm happy with that level of "expressiveness" for my projects. Rust's abuse of abbreviations is a minus for me (it gives off a C vibe): what is Box, what is Rc (yes, I know it's RefCount), Arc (yes, I know it's Atomic RefCount)? "Which one is the better option for embedded?" No idea, and I mostly don't care. Which is faster? They are similar, or at least the performance margin is not one I would care about. Which compiles faster? Rust would likely lose on that; I have never seen a benchmark where Rust won on compile time. Which is more restrictive? Rust is, and for me that's a loss; for others it might be a win. IMO, there aren't a whole lot of reasons to switch from C++ to Rust besides safety, and even that is a stretch: things that need safety require a good spec, and Rust's spec is like "what am I?".

    • @whoopsimsorry2546
      @whoopsimsorry2546 Год назад

      ​@@tsunekakou1275 That's what I was trying to point out. The question on speed is mostly subjective as is mostly any other question. If we were trying to compare the 2 languages, there are far more important things that play a role into choosing between rust and C++.
      I personally also do C++, people using rust probably got their thing too. Just hoping it's not because it's "faster" in this one picked out case.

  • @eternalnight9453
    @eternalnight9453 Год назад

    What a great video! And most of all, thanks for sponsoring the Cat. 😍

  • @strega-nil
    @strega-nil Год назад +6

    For what it's worth, in C++20 you no longer need to implement comparison operators if they're doing normal lexical comparison:
    auto operator<=>(const Type&) const = default;
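The Rust side of that comparison is a `derive`. A minimal sketch with a made-up `Version` type (not a type from the video's code) shows that derived ordering compares fields in declaration order, just like a defaulted comparison in C++20:

```rust
/// Hypothetical type. Deriving `PartialOrd` and `Ord` compares the fields
/// lexicographically, in the order they are declared.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Version {
    major: u32,
    minor: u32,
    patch: u32,
}

fn main() {
    let a = Version { major: 1, minor: 2, patch: 9 };
    let b = Version { major: 1, minor: 10, patch: 0 };

    // `major` ties, so `minor` decides: 2 < 10.
    println!("{:?} < {:?}: {}", a, b, a < b);

    // `Ord` is also what `sort` and `max` rely on.
    let mut versions = vec![b, a];
    versions.sort();
    println!("sorted: {:?}", versions);
}
```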

    • @tsunekakou1275
      @tsunekakou1275 Год назад +3

      you would think he should have known that when he use that C++ vs Rust title.

  • @chyldstudios
    @chyldstudios Год назад +1

    Just discovered your channel. Pure gold.

  • @orestes_io
    @orestes_io Год назад +5

    Awesome format and in-depth knowledge. Super useful! Makes me want to pick up Rust now :)

    • @fasterthanlime
      @fasterthanlime  Год назад +4

      What luck! I know someone who has a website full of articles focused on learning Rust! (the Advent of Code 2022 series is a good start - there's probably enough content on fasterthanli.me that it's disorienting)

  • @chrisminnoy3637
    @chrisminnoy3637 Год назад +1

    Registers don't exist anymore in practice on out-of-order processors. The registers you SEE in x86 assembly are for humans to read, but the CPU does register renaming. So the boundary between register and stack is very obscure.

    • @CouchPotator
      @CouchPotator Год назад +1

      That's not what register renaming results in. Registers still exist and are still very much different than the memory (stack or heap). What register renaming does is separate physical registers from the publicly available registers. This is completely unrelated to whether something is in a register or on the stack. Register renaming is also independent of out-of-order-execution. A cpu can have either or both or neither.

  • @KohuGaly
    @KohuGaly Год назад +63

    Ah yes, the "zero-cost abstractions" being "roll-D20-cost abstractions" as always 😀

    • @fasterthanlime
      @fasterthanlime  Год назад +18

      Weeeeell in general I would agree with you, as in I'm always skeptical that the compiler would optimize things properly - but this one surprised me, and to me, is a proper bug. I might look into it some more - it's possible to compare clang and rustc generated IR, so troubleshooting is not as inaccessible as it would appear.

    • @d3line
      @d3line Год назад +12

      Huh? It's just a compiler optimization that wasn't implemented, and the only "abstraction" here is a struct

    • @k04jg02
      @k04jg02 Год назад

      😂👏👏👏

  • @chrzan9608
    @chrzan9608 Год назад

    Very informative vid and furthermore you killed me at the end, that's funny ^_^

  • @piotrarturklos
    @piotrarturklos Год назад +6

    Nice breakdown. C++ can be used to write optimal code, there isn't a "faster language", excluding specific use cases like neural networks, for which special tools can often produce a better machine code than a regular compiler. Rust can be used for the same kind of optimal code writing in general, except for when one needs special tools and there isn't a Rust interface but there is a C++ one. Examples of such tools include Halide and GPU languages like CUDA.

  • @colinmaharaj
    @colinmaharaj Год назад

    5:30 yes I've done that in the past to embed data in my apps. Generate a header version of the data.

  • @wallyw3409
    @wallyw3409 Год назад +3

    C++ in binary is faster than everything for math but ya who knows how to do that.

  • @nachiketkanore
    @nachiketkanore Год назад

    This quickly went from easy-to-understand to god-level over-the-head stuff!

  • @savagesarethebest7251
    @savagesarethebest7251 Год назад +3

    I have not done x86 assembler since I was like 12 years old, but I still remember that LEA is "load effective address", and some opcodes in hexadecimal. Like nop is 90h, and ret is C3h.

  • @levimogford3202
    @levimogford3202 Год назад

    that assembly review was so juicy ty
    i want to learn Rust, and later Assembly
    and ive always wanted to learn whats going on under the hood
    ty, this is exactly what ive always wanted

  • @sergeiromanoff
    @sergeiromanoff Год назад +9

    Why C++ programmers don't try to prove their language is faster than something else? They don't need to

    • @doublekamui
      @doublekamui Год назад +1

      If both use LLVM as the backend, they are similar. If C++ uses GCC instead of LLVM they can differ, but Rust can use GCC too, so they are similar. Now look at the features: Rust offers more features and its design is 100% memory safe, preventing memory leaks that can happen in C++ when building a big project with many people, who may forget to delete a variable once it is unused.

    • @MrChelovek68
      @MrChelovek68 5 месяцев назад +1

      Rust doesn't give anything. And this is very funny.

    • @MrChelovek68
      @MrChelovek68 5 месяцев назад

      And when you need to write something low-level, you have to use the unsafe modifier, or whatever it's called in that piece of crap. And as a bonus, you can use leaks or valgrind, which give you full info about memory leaks. The fact that you didn't know about them tells everyone your level of understanding of this topic.

  • @RafalFilms
    @RafalFilms Год назад +1

    Wow, great video. Both content-wise and presentation-wise

  • @Bluesourboy
    @Bluesourboy Год назад +3

    The traps you kept running into that made the C version of the app faster are bound to be the same traps any new developer would run into as well. This is problematic for Rust.
    The other problem for Rust is that there are many more C experts, and until there are more Rust experts, C will continue to be the fastest.

  • @alex_s168_p
    @alex_s168_p Год назад

    This video helped me understand SIMD a lot!
    Could you maybe make some sort of x86 explanation series, where you also explain some more SIMD instructions?

  • @ruadeil_zabelin
    @ruadeil_zabelin Год назад +3

    4:15 So you're comparing buffered C++ to unbuffered Rust. Why didn't you write an unbuffered C++ version? This is my main pet peeve when people compare languages. Compare the same thing. Implement the same thing.

    • @fasterthanlime
      @fasterthanlime  Год назад +3

      I didn't write the C++ version. I ported someone else's C++ to Rust, so it would make sense to get the Rust port as close as possible to the original at first, and then experiment with how to make it faster.

  • @AriosFireFeathers
    @AriosFireFeathers Год назад +2

    I'm sure it's nothing much to compiler engineers, but using address instructions for basic arithmetic is nothing short of genius in my eyes! I fumbled around with MMIX and RISC-V in university for a bit, but I couldn't even imagine coming up with that on my own.
    Am I right in assuming that the bracket syntax and lea instruction is specific to x86, with it being CISC and all that? If so, I wonder what kind of further optimizations the CPU decoder frontend might apply to it. And now I'm curious if there's a way to profile u-op translations...
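To try this at home, the following sketch is the kind of function one can paste into Compiler Explorer. The function name is made up, and whether a given compiler version actually emits a single `lea` for it is an assumption worth checking in the disassembly, since instruction selection is up to the backend:

```rust
/// Computes `base + index * 4 + 8`.
///
/// The shape "register + register * {1, 2, 4, 8} + constant" matches an
/// x86-64 addressing mode, so an optimizing build may turn the whole body
/// into one `lea`. Wrapping arithmetic is used so that debug and release
/// builds agree on overflow behaviour.
#[inline(never)]
pub fn scaled_offset(base: u64, index: u64) -> u64 {
    base.wrapping_add(index.wrapping_mul(4)).wrapping_add(8)
}

fn main() {
    println!("{}", scaled_offset(100, 3));
}
```

Building with `rustc -O --emit asm` and searching the output for `scaled_offset` shows what was actually generated on your machine.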

    • @fasterthanlime
      @fasterthanlime  Год назад +1

      I think you're correct about the CISCiness of it. It has to do with addressing modes, cf. all the answers in stackoverflow.com/questions/1658294/whats-the-purpose-of-the-lea-instruction

    • @TerjeMathisen
      @TerjeMathisen Год назад +1

      LEA exists in some form in pretty much all instruction sets, the main semi-modern exception was probably Itanium where addressing was supposed to happen with unrolled code and updates after use. The original ARM had even more complicated versions which have since been removed in the 64-bit architecture.

  • @error200http
    @error200http Год назад +3

  • @Spookyhoobster
    @Spookyhoobster Год назад

    Awesome video! Gonna check out that livestream VOD.

  • @mario7501
    @mario7501 Год назад +1

    I recently started learning rust and so far I really like it. Coming from c++, a few things made it hard initially, like type casting being non-trivial. Say you have a trait with several implementations and a function that takes a trait object, checks which specific implementation it is, and then calls functions specific to that implementation. In c++ you can use dynamic_cast. In rust the only solution I could come up with after some research was to implement an as_any function and then use variablename.as_ref().as_any().downcast_ref::<T>().unwrap(). Seems kind of hacky, but it led me to an alternative: an enum Type whose variants hold the specific implementations as fields. That way you can do the same thing without dynamic dispatch, which is kind of awesome!
    So yes, rust makes some things really hard and annoying but it usually has a better alternative.
    The other thing I slightly miss is that function overloading isn't possible. I'd also like to see more flexible operator overloading, but otherwise I really like the language. The fact that moving is the default behavior is awesome. Variables are const by default - amazing. No more const std::vector<T>& when passing a parameter.
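The two approaches described in the comment above can be sketched roughly like this. All names here (`Shape`, `Circle`, `Square`, `AnyShape`, `radius_of`, `radius_of_enum`) are hypothetical and only illustrate the pattern. The only library API used is `std::any::Any` with its `downcast_ref` method.

```rust
use std::any::Any;

// Approach 1: a trait object plus an `as_any` escape hatch for downcasting.
trait Shape {
    fn area(&self) -> f64;
    fn as_any(&self) -> &dyn Any;
}

struct Circle {
    radius: f64,
}

struct Square {
    side: f64,
}

impl Shape for Circle {
    fn area(&self) -> f64 {
        std::f64::consts::PI * self.radius * self.radius
    }
    fn as_any(&self) -> &dyn Any {
        self
    }
}

impl Shape for Square {
    fn area(&self) -> f64 {
        self.side * self.side
    }
    fn as_any(&self) -> &dyn Any {
        self
    }
}

/// Roughly what `dynamic_cast<Circle*>` does in C++: returns `Some` only
/// when the trait object really is a `Circle`.
fn radius_of(shape: &dyn Shape) -> Option<f64> {
    shape.as_any().downcast_ref::<Circle>().map(|c| c.radius)
}

// Approach 2: a closed set of variants in an enum, no dynamic dispatch.
enum AnyShape {
    Circle(Circle),
    Square(Square),
}

/// The compiler checks that every variant is handled in the `match`.
fn radius_of_enum(shape: &AnyShape) -> Option<f64> {
    match shape {
        AnyShape::Circle(c) => Some(c.radius),
        AnyShape::Square(_) => None,
    }
}
```

Note that this sketch returns an `Option` instead of calling `unwrap()`, so a failed downcast is handled by the caller rather than panicking. The enum version only works when the set of implementations is known up front, which is the trade-off against trait objects.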

    • @az-kalaak6215
      @az-kalaak6215 1 year ago

      You should honestly never use dynamic_cast in c++ haha
      static_cast should be the way to go when possible, and reinterpret_cast when you know what you are doing (and use dynamic_cast only in specific cases)
      C-style casts should be a big no-no, as you do not know which cast will be performed (especially when using templates)
      const_cast should be avoided except for very specific usages (or when using a C API)
      I am no rust developer, so I don't see why move by default (instead of copy) is good (I hardly ever need to move anything, I always either pass references or copy)
      Same goes for const by default: as the const keyword is a no-op in c++, I don't really see its benefits. Could you explain them?

    • @mario7501
      @mario7501 1 year ago +2

      @@az-kalaak6215 yeah I know dynamic cast isn't optimal, but in the specific scenario of downcasting in a class hierarchy I think it's the better choice because you get a runtime check. But of course you need to check if the cast failed. Static cast is the more dangerous one in this case, as it'll lead to undefined behavior at runtime if the object that you are downcasting is actually different from the type T in static_cast<T>. Anyway, the example I mentioned came up in testing code, so I wanted to make sure the derived type coming in was actually of the expected type, and dynamic cast was the best choice to do that check.
      Regarding your question, I don't think it's an advantage for devs who know c++ very well, but move by default forces you to think harder about whether you want to use that object in the future or not. Moves are destructive in rust, and the compiler prevents you from using a moved variable unless you reinitialize it with valid data. The fact that const is the default again forces you to think hard about whether a value needs to be mutable, whether a function needs to take a non-const reference, etc. These things help the compiler optimize the generated code.
      Of course you could arrive at the same result using c++, but it would be the less concise and non-default option. So it's the one that is probably not always used when it really should be.
      That's more or less all I'm saying. Rust gives you defaults that help the compiler optimize code better, which makes it more likely that sloppily written rust code will outperform sloppily written c++ code. In a perfect world, there wouldn't be any sloppy code, but yeah, that's not realistic haha
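A small sketch of the defaults described above, with made-up function names (`consume`, `sum`, `demo`). The commented-out lines are the ones the Rust compiler would reject, and they stay commented out so the snippet compiles.

```rust
/// Takes ownership of the vector (a move) and returns its length.
fn consume(v: Vec<i32>) -> usize {
    v.len()
}

/// Borrows the data immutably, similar in spirit to taking
/// `const std::vector<int>&` in C++.
fn sum(v: &[i32]) -> i32 {
    v.iter().sum()
}

fn demo() -> (i32, usize) {
    let v = vec![1, 2, 3]; // immutable by default, no `mut`
    // v.push(4);          // would not compile: `v` is not declared `mut`

    let total = sum(&v);   // borrow: `v` is still usable afterwards
    let len = consume(v);  // move: ownership goes to `consume`
    // sum(&v);            // would not compile: use of moved value `v`

    (total, len)
}
```

Here `demo()` returns `(6, 3)`. In C++ the closest equivalent of the move would need an explicit `std::move`, and using the moved-from vector afterwards would still compile.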

    • @az-kalaak6215
      @az-kalaak6215 1 year ago

      @@mario7501 ok ok, I understand :o
      If I'm correct, it's more a guidance to write clean code rather than a magical formula then

    • @mario7501
      @mario7501 1 year ago +1

      @@az-kalaak6215 yeah more or less. And you can be certain that if your code compiles, it won't have any undefined behavior. Unless you use unsafe, but that's a whole different story.

  • @iiTzLamar
    @iiTzLamar 1 year ago +3

    Uhhh... the benchmarks game shows C++ is faster than Rust though. And the few cases where Rust was faster than C++ were because Rust was using a faster method for the computations. Also, Rust uses LLVM and the benchmarks game C++ programs use gcc, which IMO generates slower code compared to LLVM

    • @fasterthanlime
      @fasterthanlime  1 year ago +1

      Here's the page for the n-body program specifically: benchmarksgame-team.pages.debian.net/benchmarksgame/performance/nbody.html
      I'm sure we can find one of the challenges where the fastest solution is C++. I'll find one now. There! benchmarksgame-team.pages.debian.net/benchmarksgame/performance/fannkuchredux.html
      You will note that in the context in which this particular benchmark is mentioned, the point being made is that Node.js isn't as slow as one would think, compared to the fastest solution at this current date. There's no mention of C++ yet at this point.

  • @ikomatteo3177
    @ikomatteo3177 1 year ago

    just pointing out -- on the programming language benchmark game website, you have to scroll all the way down on the toy program page (n-body page for example) to find the unsafe solutions, where you will find even faster submissions. For example, at the time of writing this comment, the fastest is a C submission which is twice as fast.

  • @Proferk
    @Proferk 1 year ago +3

    c++ best

  • @bruterasta
    @bruterasta 1 year ago

    5:43 Operator overloading is not hard to remember. If the original code comes from AoC, then the presence of the link probably comes from the fact that the person was learning a new language. And I say that because the presented one is not entirely correct.

  • @teranyan
    @teranyan 1 year ago +14

    Unless you are an expert in both languages you have no business making videos like "c++ vs rust: which is faster", that's all I'm gonna say.

  • @frroossst4267
    @frroossst4267 1 year ago +1

    Now, this is the tech content we crave!