Investigating Nvidia's Raytracing Performance

  • Published: 1 Feb 2025

Comments • 1.1K

  • @robbie_
    @robbie_ 2 года назад +391

    I can explain part of it, as a software dev who does 3d modelling. Games are dynamic in that you're updating the spatial partition frequently with new asset shapes and positions. Benchmarks for things like Octane, iRay and so on won't include the time it takes to do this as they're almost always based on time taken to render a single frame.
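
    A minimal Python sketch of the cost difference described above, with made-up millisecond figures; "BVH" is used here as the usual name for the spatial partition a game refits every frame.

    ```python
    # Illustrative only: hypothetical per-frame costs in milliseconds.
    def offline_benchmark_ms(trace_ms):
        # Offline renderers (Octane, iRay, ...) typically time a single frame
        # after the acceleration structure has already been built.
        return trace_ms

    def game_frame_ms(trace_ms, bvh_update_ms, raster_ms):
        # A game refits/rebuilds its spatial partition (BVH) every frame for
        # moving assets, and still has the rest of the frame to render.
        return bvh_update_ms + trace_ms + raster_ms

    print(offline_benchmark_ms(8.0))      # 8.0 ms of "pure" RT work
    print(game_frame_ms(8.0, 2.5, 6.0))   # 16.5 ms once per-frame costs are included
    ```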

    • @Mopantsu
      @Mopantsu 2 года назад +30

      Indeed. Even if you are rendering for realtime performance it's a set path. Unlike 'on-the-fly' camera movement in a game engine.

    • @cosmic_gate476
      @cosmic_gate476 2 года назад +6

      "Spatial partition"
      Please elaborate

    • @alouisschafer7212
      @alouisschafer7212 2 года назад +9

      @@cosmic_gate476 u look left u look right in a 3d environment is that what he means idk

    • @cosmic_gate476
      @cosmic_gate476 2 года назад +1

      @@alouisschafer7212 for his overall message, yes that's what it means but I'm asking about a specific term he used

    • @Real_MisterSir
      @Real_MisterSir 2 года назад +9

      @@Mopantsu Not quite true. Most (if not all?) 3D render software have a form of interactive render setting where you actually do render in real time as you move your in-scene camera around like you would move a character in a videogame. Of course you never get a fully rendered frame when you keep moving around because it's way too demanding for today's hardware with full photon tracing and other elements like caustics etc. that aren't used in games because it'd be way too demanding.
      But you most certainly can do "on-the-fly" camera movement in a 3D program while rendering (or attempting to render).

  • @TrueThanny
    @TrueThanny 2 года назад +261

    10:54 You should be using a geometric mean, not an arithmetic average, when computing an average frame rate difference across multiple games. It's GEOMEAN() in a typical spreadsheet. It also doesn't matter whether you do it on relative performance figures or the original frame rates. That is, doing a geomean on the frame rates and comparing those provides the same result as comparing each individual game's frame rates and then doing a geomean on the comparisons.
    When you do that for the TechPowerUp numbers, you get an increase of about 6% in the ability of RTRT mode to retain performance from generation to generation. Overall, the 4090 is 12.6% better at maintaining its frame rate with RTRT effects enabled than the 2080 Ti. Which isn't a whole lot for two generations of updates. All of the real increases are coming from raster capacity.
    Which is probably the conclusion you're going to reach after I unpause the video.
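
    For anyone who wants to check this, here is a minimal Python sketch of the scale-invariance described above; the per-game FPS numbers are made up, not taken from the video.

    ```python
    from statistics import geometric_mean  # Python 3.8+

    # Hypothetical per-game frame rates for two configurations (e.g. RT off vs RT on).
    fps_a = [120, 90, 60, 144]
    fps_b = [80, 55, 35, 95]

    # Ratio of the geometric means...
    ratio_of_means = geometric_mean(fps_b) / geometric_mean(fps_a)

    # ...equals the geometric mean of the per-game ratios.
    mean_of_ratios = geometric_mean([b / a for a, b in zip(fps_a, fps_b)])

    print(round(ratio_of_means, 4), round(mean_of_ratios, 4))  # identical values
    ```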

    • @floridaman3823
      @floridaman3823 2 года назад +6

      Then you could do a Δ-of-Δ calc to see how far they've come in that variable.

    • @pleaserespond3984
      @pleaserespond3984 2 года назад +3

      Why is the geometric mean more appropriate in this case?

    • @kontoname
      @kontoname 2 года назад

      @@pleaserespond3984 en.wikipedia.org/wiki/Geometric_mean

    • @whirlwind872
      @whirlwind872 2 года назад +8

      @@kontoname Wikipedia is literally the worst place to learn about math unless you already know a lot about it to begin with. That article will be completely incomprehensible to most people

    • @ar993094
      @ar993094 2 года назад +1

      The geometric mean shouldn't be used to compute the average here either. What should be used in this case is the mean absolute deviation. The geometric mean is good for trends over time, since it accounts for different periods in a series, e.g. Year 1, Year 2, Year 3; there it's great to use because it smooths out the differences between periods. Using the average deviation when testing different titles would make much more sense, as you would be able to identify the actual trend and obtain the variances of the numbers.

  • @retractingblinds
    @retractingblinds 2 года назад +47

    Jim, the 5870 wasn't 13 years ago...
    Oh shit.

    • @TheVanillatech
      @TheVanillatech 2 года назад +6

      I dreamed of owning that card! I was running a C2D e6300 with a 4770 at the time, and STALKER wasn't playing ball at my native resolution.
      Less than a year later, a guy brought his kid's PC into my friend's shop saying "it's dead" and ordered a brand new gaming PC. Turned out the machine had a dicky HD 5970, which constantly reset the PC every 20-30 minutes. We baked it in his wife's oven that night, while we were drunk and after reading guides online. Lots of tin foil wraps. It worked, the card lived, and we benchmarked it all weekend. On the Monday he rang me and said "come get your card". He was happy enough with his GTX 570 (lifelong Nvidia fan, my friend is), and it was the best day ever!
      Was kinda silly, the card was as long as my mainboard and the e6300 couldn't do it justice, but it felt good owning a top-end card for the first time since the original Voodoo. Essentially TWO 5870s in tandem. Another friend inherited the 4770 (coming from a 9800GT). Everyone won. Except the poor bastard who had to spend £1100 in the shop to replace his son's PC.

  • @tsumurireallll
    @tsumurireallll 2 года назад +64

    After three generations of cards with little RT progress between them, maybe it's worth considering that the problem isn't so much in the cards as it is in the games themselves. You touched on this near the end, but as you said, ray tracing isn't "free" from a development standpoint. I think it's entirely possible that there's a heavy RT bottleneck on the software side, at least in gaming.

    • @Winnetou17
      @Winnetou17 2 года назад +3

      I was thinking the same when he showed the massive gains that the production applications had.

    • @ErgonomicChair
      @ErgonomicChair 2 года назад

      Currently it's almost all hardware, but regardless, it's a very heavy workload and the way it's handled is fully hardware right now... otherwise AMD would actually be competing in raytracing.

    • @TheReferrer72
      @TheReferrer72 2 года назад +4

      Ray Tracing is actually easier to implement for engine devs and creators because you don't have to do all the silly pre-baking of lights and shadows etc.
      Raster has always been about using tricks like LOD, billboarding, shadows... Ray tracing will eliminate that.

    • @tsumurireallll
      @tsumurireallll 2 года назад +8

      @@TheReferrer72 Technically that's true. In my limited experience with graphics programming, ray tracing is a much simpler task than traditional raster work. But I think the difference here is that for decades now, traditional rasterization is all we've been doing, and so the algorithms that the engines use have matured quite a bit. Ray tracing in video games is still quite new, and it hasn't had the time to mature in engines yet.

    • @TheReferrer72
      @TheReferrer72 2 года назад +1

      @@tsumurireallll Yes, that's true. There is technical debt in software from maintaining multiple rendering pipelines, and raster is still king.

  • @anepicotter4595
    @anepicotter4595 2 года назад +131

    While this is an interesting analysis, I think it’s a confused line of thought. The goal with advancements in graphics tech has never been to make it so a feature has no performance hit. The goal is to make it so GPUs are good enough that games use the new technology as a standard. An exaggerated example is the move from 2D to 3D rendering. The performance hit between those is enormous but GPU performance is high enough that nobody would ever opt for 2D even if they could. GPU performance has a long way to go before devs stop using Raster entirely. We’re probably looking at at least 2 more generations until people stop caring about the performance hit of RT and 3 generations after that until the devs even start to consider not implementing raster with their next titles but eventually Rasterization will be a technology of a bygone era that just isn’t worth the effort to implement.

    • @paulssnfuture2752
      @paulssnfuture2752 2 года назад +23

      3 gens? That's still kinda too optimistic IMO

    • @sebastianrosenheim6196
      @sebastianrosenheim6196 2 года назад +4

      Lol, Lmao even

    • @killersberg1
      @killersberg1 2 года назад +19

      The difference between RT and raster is so little that reviewers sometimes don't even know if it is on when the toggle is buggy. I think there will be at least five generations or more before everyone can use RT. Don't forget, most people don't have an RTX 4080

    • @NeovanGoth
      @NeovanGoth 2 года назад +9

      Remember when GPUs like the S3 Virge 3D were actually _slower_ than software renderers, although providing more modern features like 16 bit color, bilinear filtering and alpha effects? When people complained about the 3DFX Voodoo not supporting 2D graphics? It didn't take many generations after that until 3D acceleration became mandatory. The same will happen with RT, I'm pretty sure.

    • @AROAH
      @AROAH 2 года назад +13

      It would help if RT was actually used for anything other than making lighting more difficult/look worse and pretty reflections. It would be cool to see RT used for something like AI vision tracing. The fact that it’s a more computationally complex way of accomplishing what is already being done doesn’t really sell me on the utility of it in the same way as 3D acceleration. For workstation usage, I’d imagine it’s a godsend. And the quality increase for lighting is impressive. I just don’t see why you’d want to completely dump rasterization for it when that method is already extremely effective for its purpose and using RT in this way would require a TON of new development in the software space. Honestly, I’d rather pay less for a card without RT functionality.
      I think that machine learning acceleration has far more promise of being an absolutely required feature of future cards. The potential for ML-accelerated image upscaling, frame interpolation, speech synthesis, and game AI is far more exciting than RT, from what I've experienced.

  • @arnoldgaarde7066
    @arnoldgaarde7066 2 года назад +368

    You have no idea how much I have missed your excellent analysis videos; they really make you think differently. You made me choose an RX 480 8GB over a 1060 6GB; until then I had only used Nvidia, except for the first 3D card I got, which was a Permedia. I know some idiots made you lose the will to do videos. But I want you to know you are making a difference, and there are many of us who listen and learn. We sadly forget to do one thing, though, and that is to say "thank you, we appreciate the work you do"

    • @mauricedalaimo2127
      @mauricedalaimo2127 2 года назад +1

      You chose a 480 over a 1060??? Why? Bad advice unless you mean price, especially when PhysX was much more important to have on the card, cuz CPUs had fewer cores back then

    • @Quick_Sa_Fugim
      @Quick_Sa_Fugim 2 года назад +67

      @@mauricedalaimo2127 Move 3, 4 years down the road and the 480 competed with the 1070 instead of the 1060. Smart purchase.

    • @youcrew
      @youcrew 2 года назад +8

      Yep. He has always really laid out how to make the best decision for yourself.

    • @TheHighborn
      @TheHighborn 2 года назад +3

      seconded

    • @Unreatxplaya
      @Unreatxplaya 2 года назад +15

      @@mauricedalaimo2127 the VRAM alone was enough reason for me when I got my 470. It likely wouldn’t have played well with my ultrawide.

  • @Freddie1980
    @Freddie1980 2 года назад +41

    NerdTechGasm touched on this issue in his last video. From what he said and what you spoke about in your video, the issue is resource allocation. In V-Ray, Lovelace can deploy near enough 100% of its RT cores and use all the available cache to trace the rays. In games, Lovelace is being asked to do a combination of raster and RT at the same time, but the hardware that draws them has to share the cache pool, which limits the available performance.

    • @qlum
      @qlum 2 года назад +12

      I miss that guy's videos however rare they may be.

    • @ghoulbuster1
      @ghoulbuster1 2 года назад +9

      Interesting that cache is the weakest link for Nvidia, considering AMD fixed that problem with infinity cache and soon to be revealed 3D cache.

    • @w04h
      @w04h 2 года назад +1

      @@ghoulbuster1 Except that it is absolutely not the same cache we are talking about. We are talking about L1 cache, which is 128kB per SM for both Lovelace and Navi and is used for calculating pixel color at intersections. L4 cache or Infinity Cache is used for caching texture data to increase throughput, so AMD can use a 256-bit bus and cheap-ass VRAM on $1000 cards.

    • @ghoulbuster1
      @ghoulbuster1 2 года назад

      @@w04h well if it's that cheap why don't Nvidia use it and lower the price of their cards?

  • @bslay4r
    @bslay4r 2 года назад +61

    Turing L1 cache/SM: 96 kB
    Ampere L1 cache/SM: 96 kB
    Lovelace L1 cache/SM: 128 kB
    This small cache bottlenecks RT perf because it's too small for shader _and_ RT (+ tensor) data combined no matter how fast the BVH data is generated.
    So what we're seeing is basically improved raster perf from gen to gen and relatively stagnant RT perf from gen to gen.
    Jensen should increase the size of this cache but AD102 is already a huge die and there are 144 SMs in Lovelace...
    Lovelace is much faster in non-gaming RT workloads because there you don't stress these caches with graphics shader data.
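
    A rough back-of-the-envelope tally of the point above, multiplying the per-SM L1 figures quoted in this comment by public SM counts for the shipping flagships; treat every number as approximate.

    ```python
    # Per-SM L1 sizes as quoted above; SM counts for the 2080 Ti, 3090 Ti and 4090.
    gpus = {
        "2080 Ti (TU102)": (68, 96),
        "3090 Ti (GA102)": (84, 96),
        "4090 (AD102)":    (128, 128),
    }

    for name, (sms, l1_kb_per_sm) in gpus.items():
        total_mb = sms * l1_kb_per_sm / 1024
        print(f"{name}: {l1_kb_per_sm} kB per SM, ~{total_mb:.1f} MB total L1 "
              f"shared by shader, RT and tensor data")
    ```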

    • @viktortheslickster5824
      @viktortheslickster5824 2 года назад +12

      Fully agree! I just read your comment, which was exactly the same as mine with regard to cache bottlenecks. I was hoping Lovelace would have 192kB of cache to let the RT units breathe, but it didn't happen.

    • @bslay4r
      @bslay4r 2 года назад +2

      @@viktortheslickster5824 I don't know. If the solution were this simple they should know about this. Maybe they don't intend to make the cards faster? For what reason?

    • @Plajerity
      @Plajerity 2 года назад +3

      @@bslay4r More memory means a higher delay. But I don't think it's the case here - we still should see a significant improvement. Some other comments said it's about raster bottleneck - there is too little RT to process in current games to make use of better RT cores. Only games that heavily rely on RT show better improvements. Control, Cyberpunk are Nvidia sponsored titles, so devs get money to show tech, but it's likely that those games wouldn't look worse if better optimized.

    • @hellsacolyte
      @hellsacolyte 2 года назад +1

      @@bslay4r It seems so obvious on the surface as a solution, that it makes you wonder what other wall they hit. I'd be pretty surprised if anything in this video was a surprise to the engineers tasked with this. Maybe the strategy of just "throw m0ar" isn't panning out? 😁

    • @brunogm
      @brunogm 2 года назад +2

      @@hellsacolyte This puts into perspective why RDNA3 increased L0/1 cache for "50%" gain in RT;

  • @viktortheslickster5824
    @viktortheslickster5824 2 года назад +61

    Jim, the problem is not related to memory bandwidth, it's related to the operations within the SM. The graphics demand in gaming produces a lot of different instructions which are sent to the shaders, with data stored in the L1 cache. Essentially, while the raw horsepower of the RT cores has increased, the cache is crowded by all the divergent types of instructions, which means this increase in RT compute doesn't translate fully into game performance. You should watch a video by NerdTechGasm who explains this; I am just relaying his explanation. Nvidia tried to tackle this issue with 'shader execution reordering' to make the RT pipeline more efficient (but this needs to be implemented in game code), and I expect performance would also increase with an increase in the size of the L1 cache. But this is expensive in terms of area.

    • @valikmora
      @valikmora 2 года назад +4

      Game dev companies don't usually like to keep good software architecture devs; just look at EA with 2042. Nowadays game companies are facing a software crisis.

    • @kevinerbs2778
      @kevinerbs2778 Год назад

      Then why not go back to unlinked shader clocks and separate ray tracing core/tensor core clocks too? I'm guessing you'll need to make separate schedulers for each then, right? Seems more doable on RDNA4 than on Nvidia's current design.

  • @XD-cr3du
    @XD-cr3du 2 года назад +179

    It's astounding really when you think about how much we talk about raytracing these days when in reality it still matters so little in modern games. Nvidia's marketing machine really works wonders.
    I've seen one game so far where I was actually impressed with the result, Cyberpunk, which was built from the ground up to include raytracing. In most other games I turned it on in, it was really a coin flip whether I liked the non-raytraced or the raytraced result better.

    • @dante19890
      @dante19890 2 года назад +9

      Unreal Engine 5 changes that.

    • @XD-cr3du
      @XD-cr3du 2 года назад +60

      @@dante19890 An engine alone doesn't change anything if developers don't take the time to build in proper raytracing features.

    • @paul1979uk2000
      @paul1979uk2000 2 года назад +11

      Truth is, it's usually the hardware makers, or review sites that might be paid off by the hardware makers, that make a big deal about it. Most gamers don't care; in fact, console gamers seem to prefer not to have ray tracing if they can use that performance to bump up the visuals in other ways.
      The song and dance around ray tracing is there to sell expensive hardware. Now don't get me wrong, ray tracing is the future of gaming, but we are still years off from that; raster performance is still far more important.

    • @mrtimharrington
      @mrtimharrington 2 года назад +11

      The best implementation is Quake 2 RTX, which really shows off what it is capable of.
      Thing is, we won't see full RT in proper AAA games for a decade I'd imagine.

    • @TJPavey
      @TJPavey 2 года назад +27

      Everyone remember Hairworks? Nvidia is great at marketing.
      Nvidia created these proprietary forms of tech like G-Sync and sells them as the better alternative to the open solution. Because they aren't dealing with a trade group they can get to market quicker and look to have a tech advantage. See DLSS vs FSR etc. It's like Beta vs VHS, or Blu-ray vs HD DVD, except in this case the fanboys keep the proprietary version afloat.

  • @qlum
    @qlum 2 года назад +37

    If I remember correctly, heavy rasterization severely impacts Nvidia's RT performance, more so than AMD's. So Nvidia may do well in games that replace a large part of the rendering tech with RT at a lower quality, like Metro Exodus, versus games that add higher quality RT effects to part of the game, like most games that implement it.
    The problem for Nvidia in this case is that consoles lack the RT performance to rely on Nvidia's preferred method, so that will drag them down.
    This is also why they always pushed for things like Quake / Minecraft RTX and their new modding tools that replace the lighting model with ray tracing in older games. This is, relatively speaking, what their architecture is good at.

  • @tqrules01
    @tqrules01 2 года назад +48

    This seems like an issue between the RT cores and the raster/shading units. Perhaps it's time for them to start reverse engineering that Infinity Fabric....

    • @samlebon9884
      @samlebon9884 2 года назад +5

      Good luck.

    • @DavidFregoli
      @DavidFregoli 2 года назад +1

      I think you have no fucking clue what you're talking about

    • @ulasht1
      @ulasht1 2 года назад +8

      Honestly I feel like this will be the height of Nvidia, as AMD is showing consistent price-to-performance improvements over them in terms of just pure raster power, and Intel, when you can get the bloody software to work with the cards, is showing solid raster performance and is only hindered by software and drivers. Really, rather than relying on Microsoft to fix their legacy DirectX issues, they should come up with a translation layer that converts legacy DirectX to Vulkan, which their hardware seems to be better at using overall.

    • @tqrules01
      @tqrules01 2 года назад +2

      @@DavidFregoli I think you're taking this waaaay too seriously. You do know this is YouTube, not some science journal, right? Besides, it actually may be a problem where the bandwidth between the RT cores / traditional shading is preventing the tensor cores from using the bandwidth, so even as a joke it would still actually make sense... Guess that was an elbow from the sky on your face... Ah well, sorry dude.

    • @heyhoe168
      @heyhoe168 2 года назад

      it is fabrication tech. They can just buy a license for infinity fabric.

  • @Mopantsu
    @Mopantsu 2 года назад +109

    The problem with the RT benchmarks is they do not take into account the rasterisation uplift. So they say it's x times faster RT performance, but that number compared to the 30 series includes the rasterisation uplift. Take that percentage away and you get the correct RT uplift. EDIT: I see this is what you addressed. EDIT2: 4090 vs 2080 Ti: 262% faster in raster, 284% in RT. BUT! Take away the raster uplift and you get only a 22% uplift in RT on average! EDIT3: Unreal 5 and other engines offer an alternative to RT in Lumen. I wonder if hardware acceleration and AI for Lumen in UE5 would be time better spent than RT gigarays. EDIT4: AMD could introduce a dedicated chiplet for RT etc.
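
    A quick sketch of the arithmetic in EDIT2 above, showing the same gap expressed both as a difference in percentage points and as a ratio once the raster uplift is factored out.

    ```python
    # Figures as quoted above: the 4090 vs the 2080 Ti.
    raster_gain = 2.62              # "262% faster" in raster -> 3.62x
    rt_gain = 2.84                  # "284% faster" with RT on -> 3.84x

    point_difference = (rt_gain - raster_gain) * 100          # 22 percentage points
    relative_uplift = (1 + rt_gain) / (1 + raster_gain) - 1   # ~6% once raster is factored out

    print(f"{point_difference:.0f} percentage points, {relative_uplift:.1%} relative RT uplift")
    ```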

    • @ferdievanschalkwyk1669
      @ferdievanschalkwyk1669 2 года назад +8

      I think this whole story will be determined by which software platforms see mass adoption first, RT or Lumen-type systems. The one that is easiest for developers to implement and most widely available is the most likely, I think.

    • @alexg9155
      @alexg9155 2 года назад +7

      Yes, RT ops act as a bottleneck to the raster performance, and it will stay like this for many years to come, until raster hits physical limits.

    • @korvish111
      @korvish111 2 года назад +1

      I'm not sure it works this way. If RT is a bottleneck, then increasing raster performance won't do much for RT.
      If Nvidia re-released a 3060 Ti but with the number of RT cores of a 4090… how different would the RT performance be?
      It begs the question whether a much cheaper graphics card could be made with equivalent RT performance, if that's all people care about.

    • @ladrok97
      @ladrok97 2 года назад +3

      I like your dedication to the edits

    • @Real_MisterSir
      @Real_MisterSir 2 года назад

      @@ferdievanschalkwyk1669 The thing with Lumen for UE5 (and similar systems for other applications) is that it is only effective on static meshes, so while you can move a light source around freely at very little expense to performance, you can't effectively move the scene around, because the light interaction is mapped to the physical objects that exist in the world, and if you change the world object positions then the light map has to be recalculated. So for less dynamic games and scenes it works absolutely great, but it falls short in very dynamic scenes (like scenes with lots of particle simulations and motion with characters or non-static objects).
      So naturally there is a disadvantage here in the applications that are available for Lumen-like technologies, where calculated real-time photon tracing (ray tracing) doesn't have any limitations of that nature, but are instead simply capped by hardware and software performance/optimization.

  • @AlmightyGTR
    @AlmightyGTR 2 года назад +53

    I think this analysis would be true if RT operated without raster. What we are comparing here is a product of raster calculation time + RT calculation time, so we are comparing not RT improvement, but RT+raster improvement. If we would like to conduct an RT-performance-only analysis, then we need to dissect RT from raster. The way to do that, I would suggest, would be to compare the time taken for RT calculations by the 2080 Ti, vs the time taken for RT calculations by the 3090 Ti, vs the time taken for RT calculations by the 4090.
    Now this is what true RT improvement comparison would be.
    The way to do this would be:
    Raster Calculation time = 1/FPS in Raster
    RT Calculation time= (1/FPS in RT) - (1/FPS in Raster)
    Now this is where we truly understand how much time did the GPU spend on calculating RT. Then we can compare it across generations.
    If we wish to be truly scientific, then we need to isolate what we are measuring, before we start analysing.
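
    A minimal Python sketch of the isolation proposed above; the FPS values are hypothetical, and the calculation keeps the comment's assumption that RT time is purely additive on top of the raster frame.

    ```python
    def rt_cost_ms(fps_raster, fps_rt):
        raster_ms = 1000 / fps_raster        # Raster calculation time = 1 / FPS in raster
        return 1000 / fps_rt - raster_ms     # RT calculation time = 1/FPS_RT - 1/FPS_raster

    # Hypothetical numbers for the same game and settings on each card.
    for card, fps_raster, fps_rt in [("2080 Ti", 90, 55), ("3090 Ti", 130, 85), ("4090", 190, 135)]:
        print(f"{card}: {rt_cost_ms(fps_raster, fps_rt):.2f} ms spent on RT per frame")
    ```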

    • @AlmightyGTR
      @AlmightyGTR 2 года назад +4

      @@GVCC1 You don't need to own any cards, you already have all the data published by TPU. Just use the formulas I put above and get what you need out of the numbers.

    • @panjak323
      @panjak323 2 года назад

      Yup we need to analyze extra time needed for RT.

    • @defeqel6537
      @defeqel6537 2 года назад +3

      And to really isolate what we are measuring, the raster performance cannot include features that are disabled when using RT for the same thing. e.g. games don't do screen space reflections when RT reflections are enabled, so the raster performance needs to be measured with SSR disabled

    • @jmxtoob
      @jmxtoob 2 года назад +3

      @@defeqel6537 exactly what I came to say. Same for shadows etc. And whilst it doesn't isolate the raster vs raytracing performance, it implies it would be even worse for RT performance uplift because the "raster only" pipeline is doing more work than raster in the raster+RT figures

    • @BeyondFunction1
      @BeyondFunction1 2 года назад +3

      The analysis *is* true. Because at the end of the day what matters is the outcome, FPS. That doesn't change just because you don't like the data being used.
      He *does* mention, though, that it would be more accurate to measure frame times rather than FPS, but that data couldn't be found. Doesn't really matter anyway, though, because 99.999% of users look at GPU performance in terms of frame rates.
      I take issue a bit with the "relative RT performance" angle. It's not entirely baseless, but it's a bit. . .I dunno, overwrought, maybe? At least, it sort of comes across that way, as though to paper over what is clearly a very large improvement in RT frames from one gen to the next. If it were me, I would maybe state things a bit differently.
      Regardless, it *does* raise an eyebrow that the rate of improvement in RT doesn't quite seem to have kept pace with rasterization improvement, despite the supposedly massive increase in RT resources.

  • @EvanOfTheDarkness
    @EvanOfTheDarkness 2 года назад +3

    While some of your analysis is a bit wrong, you are right in that there is a discrepancy. First off, sparse raytracing (what RTX can do in realtime) is only usable for reflections and global illumination, so most of the scene is still traditionally rendered by the GPU, and only the lighting information is coming from raytraced textures, as opposed to other, screen space approaches (like SSAO or HBAO).
    This means that the raytraced performance of a game *cannot* ever be higher than the "rasterized" one (actually both are rasterized, but only one is raytraced), since the raytraced version must do both. So the "rasterized" framerate usually acts as a hard cap for the raytraced one. (In theory, since you have raytracing, you _could_ gain a few FPS from AO and reflections being handled async by the raytracing and not in post process, but since raytracing also requires some post processing, and many game engines still use AO to supplement raytracing, in practice this doesn't matter.)
    The interesting part is that, since RT framerates are so low, we know that it's the RT pass that limits framerates, so with a supposed 6x improvement to the RT cores, we'd expect to see 4-6 times higher framerates (up to the hard limit of the rasterized framerate), but we only see a 2.8 times average improvement. So what's going on?
    Well, the problem is that the Raytracing pass is not _really_ async. Sure the RT cores can work separately, but they need all the 3D meshes and textures of the current frame, before they can start working. So the whole rasterized pipeline, up to the vertex shaders needs to be executed. Some games use simplified geometry for raytracing, that helps to create the raytracing geometry faster, but also adds extra overhead, lowering the hard limit of the non-raytraced part. Then raytracing can execute simultaneously with rasterization, and shading. After that, all post processing steps (including the actual use of raytraced lighting for AO and reflections) needs to be executed on the normal shader cores.
    So as of now, raytracing can only run in parallel with 2 parts of the rasterized pipeline: "rasterization" (a.k.a. converting the triangles to pixels), and direct lighting (where you calculate the lighting contributions from direct light sources).
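
    A toy model of the dependency chain described above, with invented timings, just to illustrate why a large speedup of the RT pass alone does not scale the whole frame.

    ```python
    def frame_ms(geometry_ms, raster_lighting_ms, rt_pass_ms, post_ms):
        # Geometry/vertex work must finish before the RT pass can start; the RT pass
        # then overlaps rasterization and direct lighting; post-processing (denoise,
        # applying the raytraced lighting) runs on the shader cores afterwards.
        return geometry_ms + max(raster_lighting_ms, rt_pass_ms) + post_ms

    baseline = frame_ms(3.0, 6.0, rt_pass_ms=12.0, post_ms=3.0)       # 18 ms
    faster_rt = frame_ms(3.0, 6.0, rt_pass_ms=12.0 / 6, post_ms=3.0)  # 12 ms
    print(f"{baseline / faster_rt:.2f}x frame rate from a 6x faster RT pass")   # ~1.5x
    ```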

  • @extremegraphicstech
    @extremegraphicstech 2 года назад +3

    You are wrong on one massive point.
    RT cores are improving, and that has nothing to do with raster.
    The RTX 3070 has 30% fewer RT cores than the 2080 Ti, and the 2080 Ti tends to be faster in raster than the 3070 by a small margin and also has more VRAM; however, when it comes to RT the 3070 is usually as good as, if not better than, the 2080 Ti.

  • @pvalpha
    @pvalpha 2 года назад +19

    Excellent analysis again. If I catch the drift of what you're getting at here, it's that raster drives a lot of what the RT elements can deliver as a performance value, therefore simply putting more into RT cores doesn't net more RT performance unless you also vastly increase raster processing. There's a bottleneck somewhere, perhaps, like you said, the memory bandwidth, because the hardware's performance in gaming (raster + raytracing) doesn't match its performance in task-specific raytracing applications, which wouldn't stress the raster component.

    • @messageAdoredTV
      @messageAdoredTV 2 года назад

      ⬆️Congrats you won a prize🎁🎉🎊

  • @dex6316
    @dex6316 2 года назад +14

    What I'm curious about is a breakdown of how much ray traced games actually utilize ray tracing. It's possible that there is a utilization problem at play, where there isn't enough ray tracing being used to fully utilize the RT units. This would explain why Cyberpunk (a game known to heavily utilize RT effects) would gain more performance than average. It would also explain the results Nvidia gets in their benchmarks, since those results are just RT and path tracing galore. It's possible that developers are skimping on RT effects knowing the poor performance of DX12 Ultimate compatible GPUs in ray tracing.
    You could test this hypothesis by comparing Cyberpunk in RT Overdrive -when it releases- to the current 30% uplift. If the uplift increases, then we'd have to pull out a crystal ball and look to tomorrow's games to see an uplift. If it doesn't, then the hypothesis that NVIDIA's raster uplift is nearly as high as the RT uplift is accurate.

    • @starkistuna
      @starkistuna 2 года назад +8

      Problem is every title is using its own implementation of RTX: some use shadows, others reflections, others lighting. Whatever one engine considers medium settings might be ultra on another because of the hit the cores take in their respective engines. Watching Jensen make claims of 2x and 4x is cringe when he knows he is using upscaling and frame generation, which for all we know is using internal 1080p resolutions for some of those claims.

    • @WayStedYou
      @WayStedYou 2 года назад +1

      Metro Exodus EE should be way further ahead, since it's entirely ray traced, if that were true.

  • @ever611
    @ever611 2 года назад +26

    I see the notification, I click it, I give you a like, I watch it!

    • @badasahog
      @badasahog 2 года назад +1

      If you like before watching the video the algorithm doesn't give it as much credence

    • @johnryan461
      @johnryan461 2 года назад

      Me too. Glad Adored is back. Love the man, and I'm sorry I couldn't be bothered with his lackeys.

  • @Kallan007
    @Kallan007 2 года назад +51

    I like pretty games as much as anyone, but I will never spend the kind of money Nvidia wants for their video cards. And my last three graphic cards have all been GTX. Seems I will be going AMD this time around. And for that, I will have more money and sense to show for it.

    • @starkistuna
      @starkistuna 2 года назад +4

      It's ridiculous what they want for the 4080, and people are paying 1500 for them lol.

    • @Shapar95
      @Shapar95 2 года назад +3

      ray tracing is the future. Nvidia prepared for it with Vision and actually pushed for that future. AMD didn't. They disappoint.

    • @elon6131
      @elon6131 2 года назад +2

      AMD’s 1000$ GPU you think is somehow any better? Come on.

    • @ishiddddd4783
      @ishiddddd4783 2 года назад +4

      @@Shapar95 Shill 101

    • @toddbair9347
      @toddbair9347 2 года назад +9

      @@Shapar95 Nvidia convinced RTX 20 buyers to pay a giant premium for features that are arguably just now becoming somewhat useful. To add insult to injury, of the handful of RTX titles that finally are starting to make good use of the tech, most RTX 20 cards struggle to play them at high resolutions and refresh rates. While it's commendable for Nvidia to lead the way on certain innovations, the fact they essentially force consumers to subsidize said features years before they're useful is pretty scummy. 🤷‍♂

  • @kevinbrown8226
    @kevinbrown8226 2 года назад +10

    Someone else said this another way, but you seem to assume there is no additional CPU load associated with turning RT on. A CPU bottleneck could be what lowers fps with RT on. Or maybe it has something to do with bandwidth, but the additional CPU load from RT should be checked for. I could be totally off base. Great to have you doing videos again!

    • @kevinbrown8226
      @kevinbrown8226 2 года назад

      Consider also the examples Nvidia used, like Portal path traced: an old game with lower CPU demand. It may also be leveraging the new RT hardware or something in Lovelace. Perhaps in a few months we will see game and driver updates that improve Lovelace RT performance? Definitely an underwhelming result at release given the claims made about increased RT throughput.

    • @jaju123456
      @jaju123456 2 года назад +9

      RT does drastically increase CPU load due to the creation of the BVH structure. As such, those Dying Light 2 and Far Cry 6 results etc. are simply CPU bottlenecked. It's obvious, since those were also the most limited titles at 1440p.

  • @bwcbiz
    @bwcbiz 2 года назад +31

    So bottom line is that all the new RT hardware is needed to just keep up with the improvements in the Raster hardware.

  • @TheMakiran
    @TheMakiran 2 года назад +11

    I think what explains these results is that the drawing of the frame does not consist of ONLY ray tracing. Let's say it takes 10% of the pipeline to draw RT in a frame. If you double your RT perf, you would reduce this 10% to 5%. The other 95% are raster related. I hope you get the point
    So the more RT you throw in a scene, the more difference we should see
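
    This is essentially Amdahl's law; a small sketch with made-up fractions shows how the size of the RT slice caps the overall gain.

    ```python
    def overall_speedup(rt_fraction, rt_speedup):
        # Amdahl's law: only the RT share of the frame benefits from faster RT cores.
        return 1 / ((1 - rt_fraction) + rt_fraction / rt_speedup)

    print(round(overall_speedup(rt_fraction=0.10, rt_speedup=2), 2))   # ~1.05x overall
    print(round(overall_speedup(rt_fraction=0.50, rt_speedup=2), 2))   # ~1.33x overall
    print(round(overall_speedup(rt_fraction=0.50, rt_speedup=6), 2))   # ~1.71x: more RT in the scene, bigger payoff
    ```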

    • @raypav
      @raypav 2 года назад

      I like that logic, but it seems to me that Ray tracing takes far more than 10%. The performance penalty for turning RT on is massive, slashing framerates by half. I wonder why the performance penalty is so big even with much faster RT cores? Is it the cache as other comments have mentioned (I.e. rasterization performance has improved but because of cache limitations the RT cores are bottlenecked and cant keep up), or does having ray tracing on require more rasterization somehow?

    • @senoctar
      @senoctar 2 года назад +1

      @@raypav It is definitely more than 10% but remember that raster also improves in each generation. Even if you double the RT (100% increase), if you also increase raster by 50%, the absolute FPS values in RT gen-over-gen will seem small.

    • @joshm60
      @joshm60 2 года назад +1

      This. Deathloop, Far Cry 6, and Resident Evil Village use very little ray tracing. By my calculations, when you exclude those games, the other games (which are also affected by raster performance) have a 32.7% better performance uplift from the 2080ti to the 4090 with RT on vs off.

  • @Psychx_
    @Psychx_ 2 года назад +38

    I am by far not a computer graphics expert, but raytracing, as a noisy algorithm where it's kinda random which pixels get how many samples in a given timeframe, is definitely not one of those things that scales linearly with an increase in compute throughput. Algos producing noisy/dithered output have been embraced by the industry in the past few years, and since they require a lot of denoising (blur, essentially), they are one of the main reasons why modern, interactive CG produces rather soft images.
    This is worsened by the trend that these algorithms are computationally expensive (again, because you need to yield enough samples that meet certain distribution criteria), introducing a need for image upscaling techniques that serve as another source of image softness/blur.
    I'd rather see more traditional rendering techniques being employed, and the power of modern hardware being used to allow 2x or 4x MSAA at native 4K for producing high quality, crisp, non-flickering images, instead of having the actual rendering resolution regress again…

    • @CharcharoExplorer
      @CharcharoExplorer 2 года назад +7

      MSAA in modern games = no go. TAA has issues but it's fine at 4K, 5K, and 8K. Traditional techniques are IMHO a bad idea. I believe the focus on RT is warranted long term.

    • @Danuxsy
      @Danuxsy 2 года назад

      or we use neural networks to generate the images and bypass the entire calculating rays thing.

    • @CharcharoExplorer
      @CharcharoExplorer 2 года назад

      @@Danuxsy ... lol

    • @Psychx_
      @Psychx_ 2 года назад

      @@Danuxsy It'll take a good while until that is technically feasible, if ever. Such an NN would be crazy huge, as you'd need to feed it scene geometry, material and texture data, as well as light source data.
      The NNs used nowadays in gaming are comparatively simple. You input an image and get a larger one out, or you input several ones and have the network interpolate object movement inbetween them (frame generation).

    • @divertiti
      @divertiti 2 года назад +2

      Absolutely not, lighting in real life makes images soft with gradients and nuances, not like hard shiny images produced by traditional rendering technique. Going back to traditional rendering would be the real regression

  • @ataksnajpera
    @ataksnajpera 2 года назад +16

    You should compare path-tracing only titles like Quake 2 RTX to estimate real uplift in RT.

    • @mrtimharrington
      @mrtimharrington 2 года назад +1

      Yep, said this elsewhere here. I am seeing around double at 1440p from a 3080 10GB to a 4090, and I expect it is quite the jump over the 3090 Ti too. I haven't tried on my 4K yet, but it seems to go from 50-55 to 80-85 from the 3090 Ti to the 4090, judging by YouTube videos.

    • @adoredtv
      @adoredtv  2 года назад +6

      Problem with that is it's not anywhere near indicative of today's gaming performance. It's why Nvidia chose to show Portal as well, a game from 2007. The bandwidth requirements on both games is basically nothing compared to today's games.

    • @وليدحسيناشتيوي
      @وليدحسيناشتيوي 2 года назад

      @@adoredtv The real question in Quake 2 RTX is: are the raster/shading units doing anything with the rendering of the game, or is it just the RT cores? (excuse my English)

    • @mrtimharrington
      @mrtimharrington 2 года назад +3

      @@adoredtv
      There are no 'proper' ray traced AAA titles though, so none of the examples you provided are particularly useful either, mainly because each implementation will be different - not 'complete' as we see with Q2RTX. We are probably a decade away from properly ray-traced AAA games.

  • @yasunakaikumi
    @yasunakaikumi 2 года назад +36

    As a VFX and 3DCG artist, I can say those OptiX and CUDA cores are definitely helping us create stunning, beautiful visuals in a matter of seconds, compared to AMD GPUs, which barely anyone takes seriously in that industry... Although in terms of gaming, I like how AMD is doing the RT shader stuff, increasing performance bit by bit instead of trying to beat Nvidia at it; even to this day there are barely any games actually using RT, and AMD is definitely doing it the right way.

    • @holthuizenoemoet591
      @holthuizenoemoet591 2 года назад +11

      Same in the AI and ML branches, which is a shame because I'm more of an AMD fan given the companies' ethics.
      The reason behind this trend is that CUDA is easier to work with when trying to write GPGPU-accelerated applications.

    • @veckgames
      @veckgames 2 года назад +8

      I echo this. On top of there being few games that benefit from RT, there's even fewer where it actually results in a significant visual uplift. To devote so many resources to the pursuit of RT performance over raster is not something I see as sustainable at this point.

    • @RobBCactive
      @RobBCactive 2 года назад +3

      An unusually reasonable post.
      AMD must focus on gaming because trying to duplicate a proprietary software stack is a losing proposition. CUDA won out a decade ago, first to market and they had the money to build a monopoly.
      The problem for gamers is that Nvidia uses RTX to raise the bar to market entry and push up prices.
      It's an unhealthy market governed by a quota system, which has perverse incentives: should an AIB sell out of a value AMD card, it needs to sell its Nvidia quota rather than more value cards.
      That's led to the GPU glut which has stalled new releases with a huge amount of stock, following on from managed artificial scarcity.

    • @DiscovererAlpha
      @DiscovererAlpha 2 года назад +1

      @@holthuizenoemoet591 Yeah I think it's one of AMD's biggest failures and a reason why even if they got a card that was outright better you couldn't really use it. Everything either directly depends on CUDA or at least has the cuda version be the only option for something accelerated. AMD on the other hand doesn't even really take care of the compute subset of Vulkan, arguably their biggest active contribution in anything.

    • @Mopantsu
      @Mopantsu 2 года назад +2

      This is because RT in content creation is very specific and optimized for those applications. Games are all over the place with different engines and techniques. Many devs are not even properly trained in the use of RT, often use off-the-shelf plugins, and rarely design a game engine from the ground up, preferring to port from older game engines they already used on different and older hardware platforms. They also have to take into account varying hardware capabilities, optimizing so that even a low-end system can run the game with reduced settings. Someone in V-Ray is not aiming for a steady 60 fps at 4K. They are aiming for maximum quality rendering as fast as possible. UE5 Lumen looks more promising than RT for gaming IMHO.

  • @ayyymd8860
    @ayyymd8860 2 года назад +12

    Maybe it's a scheduling issue? I've noticed that in all the RTX on/off benchmarks, CPU utilization appears to be higher with RTX on than off, so maybe it's their CPU-heavy hybrid scheduling overhead finally rearing its ugly head.
    Somebody needs to test CPU utilization with AMD RT on/off vs Nvidia RT on/off and see if there are any patterns showing up.
    And even if this doesn't end up being the main cause (perhaps it is just an insufficient amount of L1 cache per SM, as some others here have already speculated), it's still a useful reminder that Nvidia's scheduling issues are very real and getting worse with every generation.
    If they're not going to rework their drivers or go back to hardware scheduling, it won't be long until we have CPU bottlenecks even with midrange cards.

    • @messageAdoredTV
      @messageAdoredTV 2 года назад

      ⬆️Congrats you won a prize🎁🎉🎊

  • @gameguy301
    @gameguy301 2 года назад +21

    I think the discrepancy between Turing to Ampere vs Ampere to Lovelace can best be explained by the fact that many of Lovelace's new raytracing performance features are not supported in existing games. Shader Execution Reordering needs to be in the API and called in the game code, while Displaced Micro-Mesh and Opacity Micro Maps come down to content authoring; in other words, it's how the models and transparencies are being formatted, which is why you see Nvidia partnering with Simplygon and Adobe. Supposedly Cyberpunk will be receiving SER in the same patch that includes the Overdrive RT mode, and Portal RTX features both SER and OMM, but AFAIK not DMM.
    Lovelace will need new games in order to pull further away from Ampere, but in existing titles this is all we’re really going to get unless there are patches to these games.

    • @gameguy301
      @gameguy301 2 года назад +17

      @TheCloop123 The march of new features supported in new hardware is nothing new; it was only a few years ago that gamers were cheering on Mantle / DX12 / Vulkan for ushering in low level graphics APIs, render thread multithreading, Asynchronous Compute, and double rate FP16, most of which were features deemed necessary or introduced first by AMD.
      mesh shaders, tier 2 VRS, texture space shading, sampler feedback, are all commonly supported features since Turing / RDNA2 that are defined in DX12U / Vulkan 1.3 that we are still waiting on new games to utilize when the cross generational period with PS4 / Xbone ends.
      DirectStorage requires a GPU that supports shader model 6, and yields its best returns for Windows11 and newer faster storage interfaces.
      Not even sure where to fit ReBAR / Smart Access Memory but again same sort of situation.
      Furthermore, while DMM is accelerated on Lovelace, it still provides benefit to ALL cards including AMD; it is simply a more efficient way to author content in general, which is why Simplygon adopted the tech. Go read their blog post, the results are promising.
      There is a fake quote that is quite popular erroneously attributed to Henry Ford “if I had asked people what they wanted, they would have said faster horses”. The point is that working harder at the existing method vs introducing a new method entirely has always been a balancing act in GPU development and transistor budget is a precious thing when developing a chip so it’s not to be taken lightly.

    • @defeqel6537
      @defeqel6537 2 года назад

      Will be interesting to see how much SER actually affects things, nVidia claims 25% increase in RT game performance which can range from "insignificant" to "great" depending on which games get that increase. It's regrettable that once again this isn't a standard extension to be pushed forward, but a proprietary one, but I guess that's not too surprising.

    • @gameguy301
      @gameguy301 2 года назад +2

      @@defeqel6537 it exists as a proprietary extension now but Microsoft and Khronos are bringing it into DXR and Vulkan RT. Intel Arc also supports shader hit sorting so Intel will also be promoting such a feature inclusion because that’s what APIs do they standardize access to the features that the GPU vendors add to their hardware. And considering the sound reasoning for its inclusion it’s only a matter of time before AMD will likely require the same and they’ll be glad the support is already there waiting for them. Hell pre-release I suspected that RDNA3 would support it.
      GPUs operate under the assumption that pixels will require similar shader execution to their neighbors, but raytracing changes this, bounces could mean neighboring pixels sample entirely different secondary surfaces for global illumination or reflections this can lead to under saturation of your SIMD width and is very cache unfriendly, which can be resolved by sorting the rays after registering the hits but before executing the hit shaders.

    • @dex6316
      @dex6316 2 года назад +1

      @TheCloop123 it’s not a “new features” thing. It can be likened to a new instruction set for CPUs. You can use your existing instruction set, or upgrade to the new one. CPUs with the new instruction set just perform much better than the old one, and in some cases old CPUs can’t even run the software. Think of AVX-512 and how CPUs with it completely dominate in workloads that support the instruction set. If there is a new instruction set that accelerates workloads, then there is no reason not to adopt it. Intel used to have the AVX-512 lead because they supported it, and now AMD does because they support.
      It should be up to the GPU vendors to design an architecture with the best and latest instruction sets. If you are worried about Nvidia and AMD just regurgitating random instruction sets, software won’t adopt them unless they are good.

  • @YelovXD
    @YelovXD 2 года назад +20

    Dropping from 120fps to 60fps means going from 8ms to 16ms per frame, meaning ray tracing takes 8ms. Going from 60fps to 30fps means going from 16ms to 33ms, where ray tracing takes 16ms off per frame. The same 50% reduction in framerate, but double the reduction in frametime.
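
    Spelled out as a unit conversion, using the same numbers as in the comment above:

    ```python
    to_ms = lambda fps: 1000 / fps

    print(to_ms(60) - to_ms(120))   # ~8.3 ms of added frame time at high frame rates
    print(to_ms(30) - to_ms(60))    # ~16.7 ms of added frame time at low frame rates
    # Both are a "50% FPS drop", but the second case spends twice as long on RT per frame.
    ```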

    • @LambdaHDvideo
      @LambdaHDvideo 2 года назад +2

      We are looking at RT relative to raster performance. In your examples, the raster frametimes would also increase with the RT ones, so relatively, they would match up with the rate of improvement that the simple framerate provides.

    • @Psychx_
      @Psychx_ 2 года назад

      If Raytracing can start in parallel, it may even take more than just 8ms per frame in your example.
      I.e. if RT starts after 4ms of raster and overlaps with the other 4ms, you still get 16ms frametimes, but the amount of raytracing work done was actually 12ms.

    • @YelovXD
      @YelovXD 2 года назад +6

      ​@@LambdaHDvideo The point is that using framerates for measuring improvement doesn't work like people think it does. It's a similar thing to something like DLSS. It usually takes a fixed number of ms to process, meaning it has a greater relative impact for higher framerates, even though it takes the same amount of time to do.
      Chart at 10:46:
      4090 NON-RT: 6.5ms
      4090 RT: 10.3ms
      4090 lost 3.8ms
      3090-Ti NON-RT: 10.3ms
      3090-Ti RT: 16.6ms
      3090-Ti lost 6.3ms
      4090 managed raytracing in 60% of the time of 3090-Ti.
      ^^ this is literally the only thing that matters.
      Saying that the 4090 is no faster at raytracing than the 3090 Ti is just misleading. Or rather, the way you get to that point is wrong. Yes, the percentage drop is similar, but that's because the 4090 started with a higher framerate. If the 3090 Ti was magically getting 4090 levels of raster performance but had the same raytracing performance, it would drop more than the 4090. It would probably drop closer to the 6ms it lost, meaning it would go from 6.5ms to around 12ms, while the 4090 gets 10.3ms.

    • @LambdaHDvideo
      @LambdaHDvideo 2 года назад +2

      @@YelovXD you don't get it. you are just looking at normal performance improvement. this video is about relative rt improvement vs raster.
      of course the 4090 will do raytracing workloads faster, it has more SMs, of course it will improve in raytracing.
      the question here is whether or not the actual architecture improved, which can only be measured in RELATIVE terms, NOT ABSOLUTE terms (because even if no architectural improvements got made to the RT cores, just by having more SMs, the RT performance would increase)
      so if you want to look at it in frametime, sure, lets do that:
      4090 NON-RT: 6.5ms
      4090 RT: 10.3ms
      -> took 58% longer to render the frame with RT
      3090-Ti NON-RT: 10.3ms
      3090-Ti RT: 16.6ms
      -> took 61% longer to render the frame with RT
      meaning that relatively speaking, the rt efficiency is not improving, it is still as costly as it was 2 generations ago

    • @YelovXD
      @YelovXD 2 года назад +4

      @@LambdaHDvideo Ok, so your point is that Lovelace should've seen e.g. a 2x improvement in raster and a 4x improvement in raytracing, as opposed to similar scaling of both raster and RT.

  • @Frej84
    @Frej84 2 года назад +1

    I believe you are mixing up percentages and percentage points, for instance at 00:16:59. Here you say the improvement is 11%, but in reality an increase from 156 to 167 is an increase of 7.05% (at least if my math is correct), and a difference of 11 percentage points.
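
    The distinction, worked through with the same two numbers:

    ```python
    old, new = 156, 167
    print(new - old)                           # 11 percentage points
    print(round((new - old) / old * 100, 2))   # 7.05 -> the relative increase in percent
    ```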

  • @Rigardoful
    @Rigardoful 2 года назад +7

    By the way, I think the data would have been better presented as a ratio of improvement rather than a straight up difference! It is not obvious to everyone that 80 percentage points can be less than 30 percentage points due to the relative values, when talking about rates

  • @aggies11
    @aggies11 2 года назад +1

    As other posters have mentioned, the calculations may need to be done with frame times in mind. Relative performance metrics can actually be different compared to frame time. E.g. 45 fps is halfway between 30 and 60 fps, but 25ms (40 fps) is halfway between 16.6 and 33.3ms. Plus, if you consider an oversimplified view where a portion of the frame time is spent on raster and another portion on RT (with some overlap), then you can see that a doubling of RT performance (halving of that frame time portion) would not lead to a doubling of the overall frame rate, as it only reduces one portion of the overall frame time. So depending on the ratios of each, it can lead to significantly different amounts of overall relative performance change. To get accurate numbers you'd probably have to be able to look inside the game engine to see the breakdown.
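
    The midpoint example from the comment above, worked through as plain arithmetic:

    ```python
    fps_midpoint = (30 + 60) / 2                          # 45 fps
    frametime_midpoint_ms = (1000 / 30 + 1000 / 60) / 2   # 25 ms
    print(fps_midpoint, round(1000 / frametime_midpoint_ms, 1))   # 45.0 fps vs 40.0 fps
    ```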

  • @Tritiumfusion
    @Tritiumfusion 2 года назад +16

    I love the amount of videos you've created, it makes my day *every time*
    Thank you.

    • @messageAdoredTV
      @messageAdoredTV 2 года назад

      ⬆️Congrats you won a prize🎁🎉🎊

  • @mechkg
    @mechkg 2 года назад +1

    I think the problem with trying to evaluate RT performance like this is that raytracing is not just some generic feature that you turn on or off and that does roughly the same thing in all games like, for example, anti-aliasing or SSAO. Each game or game engine has its own way of using raytracing-based algorithms and comparing fully path-traced games and games that only do raytracing for a specific effect on top of a rasterizing renderer isn't that useful, each game and each RT effect has to be evaluated and analyzed on its own to get some understanding of where the bottlenecks are.

  • @anthbobo2783
    @anthbobo2783 2 года назад +17

    I think you should talk about how Unreal Engine 5 is basically using Lumen instead of hardware raytracing; given how much better a performance increase Lumen is going to get, it might actually make hardware raytracing obsolete. If Lumen is as good as they say it is, why the hell are people going to pay insane prices for raytracing?

    • @tankerock
      @tankerock 2 года назад

      This ^
      RT is still a gimmick 2 years after its inception. UE5 is the future.

    • @christophermullins7163
      @christophermullins7163 2 года назад +1

      Few games are noticeably better with RT. Good raster shading can be just as good if done properly. And like he said... RT still requires a large amount of work. Personally, RT is WAY overhyped and just a way to get people to upgrade GPUs.

    • @Unreatxplaya
      @Unreatxplaya 2 года назад +2

      @@tankerock it’s a good implementation but it only works on static objects since it’s all pre-baked.

    • @anthbobo2783
      @anthbobo2783 2 года назад +1

      @@tankerock Nvidia needs to be put in their place, what a slimy company.

    • @gameguy301
      @gameguy301 2 года назад +6

      Lumen has a hardware and a software raytracing mode; the Matrix demo uses Lumen's hardware raytracing. Hardware mode is much more accurate for all effects, especially reflections and GI, and importantly can capture non-static objects in a way the software mode cannot.
      Hell, AMD's own GPUOpen GI-1.0 takes Lumen's software mode and alters the world space representation in a way that makes it compatible with hardware acceleration, yielding both better results AND better performance than the current software Lumen implementation.

  • @fintux
    @fintux 2 года назад +2

    At around 18:00, you're mixing up relative increases (rt over raster) to absolute increases (in rt itself), which explains some of the discrepancy. But of course not all of it.

  • @necrozim
    @necrozim 2 года назад +7

    Probably the most niche of use cases, but I LOVE the ray tracing for map baking in Marmoset 4. It uses the RT cores and brings baking times from at least 10+ minutes down to about 1 to 3 seconds. This is huge for me: it goes from having to labour to find mistakes, guess the fixes and make sure they're all blindly fixed before baking, to being able to shrug off a mistake, fix errors as I find them, and then hit re-bake and get an almost instant result. Granted, I don't have a 40-series, only a 2070 Super, and a 3080 Ti in my other work machine, both being completely adequate to bake 80M tris or so near instantly. RTX has been a game changer for me as an artist and I love it haha.

  • @Real_MisterSir
    @Real_MisterSir 2 года назад +2

    Things to note: Not all raytracing is equal. Games will have various presets of raytrace quality and features - some games split these things up in multiple sub elements (like how CP77 has RT shadows, RT reflections, that can be toggled individually etc) - and this may cause differences in testing that mean games can't be compared directly. If one game has an extreme load of RT features (like CP), and another game has only an essential minimum of features, then the RT relative performance seen in the two titles will be vastly different. Some RT aspects can be more easily handled by non-RT optimized cores.
    On this topic, I also want to question why the 3090Ti and 2080Ti specifically are highlighted? None of these cards share the same product evolution line with the 4090.
    The 4090 should be compared with the 3090, and the 3090 can't truly be compared with any 20xx card, as the 2080Ti compared directly with the 3080Ti, not the 3090Ti.
    Overall, this undermines the numbers used in this napkin-math analysis, and there are lots of factors that go untested or unaccounted for.

  • @r4dios1lence92
    @r4dios1lence92 2 года назад +6

    Thanks for making the numbers more clear and so simple.
    Frankly, this is the reason why I came to hate the GPU wars: it's a full-blown cherry-picked advertising war. There are no standard tests that are completely transferable to each technology, so makers (mainly NVIDIA) decided that the way to take the market is to create metrics or pull BS like PhysX.
    CPUs? It's clock to clock, number of cores, and cache sizes. Clock and core count for computational power (clock over core count for applications that can't be heavily parallelized), and sometimes cache size for very few applications. In general, you can draw up a few different programs that show performance on a few applications, and call it a day in terms of benchmarking. Easy.
    GPUs? "Oh, it's application based. Change game to game and count framerate. Oh, framerate above limits of current displays don't matter? We don't care, it's about the size of the number. Then, pick the games that show best results. If you can't do that, create a need for a technology, and market it hard while oversaturating the need for it. If everything else fails, keep close ties to developers, give them tools to optimize for your GPU, maybe throw another PhysX, and call it a day." There are so many different bottlenecks besides clocks and memory now that each game deals and are developed around, that performance between games can be day and night. It's the perfect environment for the kind of shitty marketing you're exposing.
    Worst part is that to call out this kind of BS you need huge spreadsheets with more data than the average consumer will ever bother looking through.

  • @Psychx_
    @Psychx_ 2 года назад +20

    As for Raytracing performance being the most important criteria for some people now: The smallest common denominators are still the game consoles and many games offer no RT at all, or the choice between RT off w/ high framerate and "eye candy mode" w/ a lower framerate. So no, I don't think that RT performance is that important, since the ability to choose how the games are rendered still translates over to PC ports, often with much more granularity even.

    • @Mopantsu
      @Mopantsu 2 года назад +1

      When I play a game (especially on a console) I am interested in fluidity and latency. Eye candy is nice but when you are actually playing how often do you stop to admire reflections let alone shadows.

    • @Psychx_
      @Psychx_ 2 года назад +6

      @@Mopantsu Personal preferences can vary. The type of game is also a factor (i.e. frametimes matter much more in fast-paced or competitive multiplayer games, while a slow, story-driven single-player game may benefit from an increased fx budget in terms of atmosphere or conveying moods).
      IMO it's funny that your focus when playing on console is set on latency, esp. considering that many TVs are just horrible in that regard… PCs can reach much lower input and frametime latencies than consoles if specced appropriately or the in-game settings allow for some manual optimization.

    • @baronvonlimbourgh1716
      @baronvonlimbourgh1716 2 года назад +10

      When the Radeon 290 and Fury came out, suddenly energy use was the most important thing. Having a card use that much power was the end of the world because your room would become hot and whatnot.
      Now we get these little space heaters from Nvidia and it no longer matters, and everybody is happy because it's Nvidia.
      It's just how fandom works, always has, always will.

    • @MadBlazer89
      @MadBlazer89 2 года назад +8

      @@baronvonlimbourgh1716 Exactly, Nvidia keeps beating the drum on ray tracing because they know AMD (for now) doesn't perform as well as them. As soon as AMD catches up on ray tracing performance you won't hear about it anymore from Nvidia. They will come up with the next "best thing" that will "revolutionize" the gaming industry for God knows how many times, it will be a gAmE cHaNgEr. And ray tracing will join the list with PhysX, Game werks, pubic hair werks and the rest of the "must have gaming features" that you don't hear about anymore.

    • @baronvonlimbourgh1716
      @baronvonlimbourgh1716 2 года назад

      @@MadBlazer89 Nvidia has the propaganda machine locked down. They know what they are doing in that regard.

  • @imglidinhere
    @imglidinhere 2 года назад +12

    Yeah this is 100% more Nvidia marketing BS. Love the content dude, you're pointing out the same crap they tried to pull last time when you analyzed Ampere and Nvidia's "most efficient GPU architecture" claim or whatnot. Stuff like this makes me angry and angrier when I see others blindly follow behind Nvidia "because muh X, Y and Z".
    Nvidia didn't improve raytracing performance at all. It's relative based on the raster performance, AGAIN, *for the third time now!* The inability to fully max out games with RT with my 6700XT is kinda disappointing, but at the same time any game where I'd enable the feature kills framerates so much that I'd have to use FSR or DLSS to make up the difference. I'd have to reduce quality to make up for the performance lost of increasing visual quality! xD It's absurd. The only time when RT will become relevant is when there is no discernable impact to performance when it's enabled. Until then I don't care about it...
    ...not that it matters anyway, Nvidia have priced me out of their GPUs. So while I dislike saying I'm stuck paying for AMD's hardware, I'm stuck paying for AMD or buying second-hand Nvidia parts. Genuinely dumb.

  • @sinephase
    @sinephase 2 года назад +1

    I'm confused by this -- RT is a coprocessor, why would its relative performance to raster increase matter at all? RT increase is relative to older tech, not relative to new raster tech...

  • @N0N0111
    @N0N0111 2 года назад +3

    As Jim knows best of any techtuber here on RUclips:
    Doubling, tripling and even quadrupling transistors has diminishing returns on performance now.
    It's now more about larger and faster cache layers bringing these giant jumps in performance.
    It's like having a larger and larger bottle while the neck of the bottle only minimally increases in size.
    You won't be able to pour more out just because there's a lot more in the bottle.
    AKA: the bottleneck is all the layers of cache now.

  • @mapesdhs597
    @mapesdhs597 2 года назад +1

    This is the most Marmite analysis video I've seen in a fair while, comments here and elsewhere suggest people seem to either love it or hate it. :D TGOG was rather critical, on various grounds, but I don't know enough about RT technology, 3D gfx dev issues, etc. to infer how right he is (let's face it, who typically does?).
    However, a thought comes to mind. A decade ago, at the end of the Fermi 2.0 era, NVIDIA redesigned the CUDA core in various ways for a number of reasons, especially power consumption and efficiency. The result was a new shader design (Kepler) which clocked almost 50% higher, making it much faster for gaming, while using 20% less power (just based on the TDP numbers for the GTX 580 vs. GTX 680). However, the tradeoff was twofold: the design needed a lot more cores to achieve the same performance (at least twice as many, though this didn't matter because of the power savings) and it sucked for CUDA-compute. I could include links but YT hates that, I keep finding my posts auto-deleted if I include web references, so just look up reviews of the GTX 680/780 on toms or AT at the time, find the compute results. Note that NVIDIA also took the opportunity to cripple FP64 on consumer cards, something for which the GTX 580 was rather good (the FP ratio was changed from 1:8 to 1:24, though the Titan allowed one to alter this behaviour in exchange for a drop in core clock, a feature removed after the TItan Black).
    Way back I was talking to Chris Angelini about this because I couldn't figure out why the CUDA results in Kepler reviews were so bad compared to Fermi. I was doing CUDA performance testing for After Effects (AE) using an X79 mbd with 4x GTX 580 3GB. The 600 cards were faster for gaming, but for GPU acceleration in apps like AE they were horrible (there's a long thread on Creativecow about AE performance with hundreds of relevant posts). This difference persisted into the 700 series with the 780 and 780 Ti, shader counts were much higher, though clocks were back down again, yet the TDP had gone back up to a level higher than the GTX 580. End result, the 780/Ti were great for gaming, and finally Kepler could also consistently beat Fermi for CUDA-compute (varied by application; the 780 could beat the 580 in AE, but not for other compute tasks, while at least for AE the 780 Ti is about 2x faster than a 580).
    Chris asked NVIDIA about the differences and received a number of explanations, but anyway it occurs to me, with the rise of RT and other factors, perhaps NVIDIA needs to undergo this redesign process again. They need a *better* core, from the ground up, rather than just an iterative meddling with existing tech. What's being used atm has its origins in a design put together long before RT was a thing. Quite what they could do I don't know, it's not my field, but the changing demands of 3D (largely of their own making re RT and DLSS) just remind me of what they did after Fermi 2.0. It might also give them a chance to come up with something far more efficient. Atm the enormous number of required cores must surely impose diminishing returns, and make it harder to produce dies without defects. If they had a better core and thus required fewer of them to provide the same performance, while using less power, then the dies would not need to be so large and could thus also be cheaper, assuming they stuck with monolothics.
    Btw, Jim, have you noticed AMD has stopped using CRay for its CPU PR? I wonder if Zen4 doesn't shine quite so well due to how it scales, combined with how the Ecores doubtless benefit Intel for this test. :D

  • @maxstepaniuk4355
    @maxstepaniuk4355 2 года назад +9

    If you don't care for rays - you are a rastaman

    • @Mopantsu
      @Mopantsu 2 года назад

      Not so much Bomber Man more Bomba clart.

    • @SirBrucie
      @SirBrucie 2 года назад

      Bombocloth Ray's

  • @Vinci480
    @Vinci480 2 года назад +2

    My biggest issue with the whole NVIDIA ray tracing spiel is that almost everywhere they push hard for ray tracing, rasterization, shading and lighting overall get a huge quality decrease.
    You can really spot it in games like F1 or Cyberpunk: if you don't have RT on and look at stuff like puddles or glass, the reflections look like stuff you saw 5-10 years ago, compared to what we are already capable of.
    Reflection quality that we had already figured out, with good and quick solutions, is now black, brown and grey patches on the floor.
    Other good examples are how so many RT "focused" games become mirror worlds, or locations experiencing the highest rainfall numbers on earth.
    There are so many ways RT could be implemented in a more sophisticated way for shading and other lighting effects, rather than just pure reflections, which would also probably help a lot with RT not being a huge performance hole.
    But they focus on trying to basically say "Hey look, we can render the world 2, 3, 4, 5 times at once with precision."
    There are some really good showcases in UE5, and some videos on RUclips, of what path/ray tracing can be used for, and I find the reflection stuff so basic compared to how global illumination, shadows, light bouncing, etc. can change a scene.
    I know ray tracing is basically the future and I want to love it, but when it comes at the cost of already-achieved quality in rasterization, that just pushes more people against it, because games suddenly look worse even though "more fps, better quality, lower frametimes" etc.

  • @CharcharoExplorer
    @CharcharoExplorer 2 года назад +3

    Ray tracing has a CPU cost. So the 4090 is likely bottlenecked even at 4K with RT on.
    I would love it if someone did 8K RT vs raster, but alas no one does those.
    Also - Jensen says that part of the 4090's RT uplift requires specific game optimizations. This is not optimal, I 100% agree, but it means we still don't know exactly how fast it is in games.

    • @korcommander
      @korcommander 2 года назад

      Is it really that dramatic of a cost, or are the cards maybe bottlenecked by the ray accelerators, possibly using less power like the 20 series?

    • @Tippotipo
      @Tippotipo 2 года назад

      ​@@korcommander Real-time ray tracing is heavily taxing for current hardware, including the Nvidia RTX 4090. The technique is hardly worth using when upscaling is required to take advantage of it, to the detriment of quality. Raster was made to address the shortcomings of using ray tracing in real time. Nvidia was clever to use their tensor cores in their hardware and their software.
      A skilled artist can produce excellent visuals without the need for upscaled graphics with real-time ray tracing, where the difference is minimal at best compared to its rasterized counterpart.

  • @ramanmono
    @ramanmono 2 года назад +2

    Relative to the raster... but absolutely it is way better, so why even compare gen to gen relative to raster increase.

    • @adoredtv
      @adoredtv  2 года назад +1

      Because it's an investigation of how far ray-tracing perf has come, not raster perf.

    • @ramanmono
      @ramanmono 2 года назад +2

      @@adoredtv yeah, but it makes it seem like the product is somewhat bad, or bad compared to the previous gen. Which is not the case. The only bad thing here is the price...
      And its size.

    • @adoredtv
      @adoredtv  2 года назад

      @@ramanmono It's just data, I didn't say anything about the 4090 or the arch in general, the video is only about RT uplift gen-gen.

  • @Bayonet1809
    @Bayonet1809 2 года назад +10

    Tell me I'm wrong, but I thought that, say, a 50% improvement in raster framerates and a 60% improvement in RT framerates actually shows that the performance improvement of RT has kept pace with raster, which is unimpressive, but not as bad as you suggest when you say RT is improving by only 10% every gen.

    • @adoredtv
      @adoredtv  2 года назад +11

      One way to look at it is, if you swapped the 4090 and 3090Ti's RT hardware, the 4090 would still be much faster in raster and RT games. That's because it's the raster performance that is providing the vast majority of the fps increase, not the RT.

  • @Ph42oN
    @Ph42oN Год назад +1

    Maybe it's getting to the point where games can't properly utilize those extra RT cores. What about something like Quake 2 RTX or Portal RTX - would the difference be bigger in those because they use ray tracing so heavily?

  • @wilhemfaust250
    @wilhemfaust250 2 года назад +4

    Great video here. Quite interesting too. I hope we'll see a similar video for the RX 7900 XTX. Thank you for the time you spent to inform us.

  • @munfurai8083
    @munfurai8083 2 года назад

    Thank you for pretty much answering my question I had about the RT angle on your last video!

  • @ChrisM541
    @ChrisM541 2 года назад +5

    Another excellent analysis, cheers.
    RT tech, clearly, does have a heck of a long way to go - especially if all it's currently good for appears to be the now standard 'shiny chrome and puddles'. I'm thinking some chiplet innovation can come to the rescue here, in scaling and speeding up this feature...a definite future feature.

    • @starkistuna
      @starkistuna 2 года назад +2

      Go watch the Two Minute Papers channel. The breakthrough has to happen in software first; full RT is never going to happen in the next 10 years without some artificial cutting of corners like DLSS 3 and techniques yet to be worked on. The hardware isn't even close to being there yet.

    • @jaromor8808
      @jaromor8808 2 года назад

      @@starkistuna 10y? lol, way sooner

    • @starkistuna
      @starkistuna 2 года назад +1

      @@jaromor8808 Clearly you don't realize how long a single frame of pure ray tracing takes to render. For example, Big Hero 6 (2014): almost half a year (~172 days) was spent rendering the film on a 55,000-core supercomputer, for a quoted total of 190 million render hours. The listed running time of 108 minutes * 60 seconds * 24 frames per second = 155,520 frames in the film, giving an average of about 1,221 compute hours per frame, or a per-core rendering speed of roughly 2.27e-7 FPS. Which means that, if Moore's Law continues to hold, in 26.7 years or so we'll have a supercomputer that could render this film in real time at 24 FPS - and that was only at 2K, with stylized graphics (think Overwatch or TF2) that weren't even trying to emulate reality.
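
      The arithmetic behind that estimate, as a quick sketch (the input figures are the publicly quoted ones from the comment above, not independently verified, and the "one doubling per year" assumption is the commenter's):

      ```python
      import math

      total_core_hours = 190e6              # quoted total render hours for the film
      runtime_min, fps_target = 108, 24
      frames = runtime_min * 60 * fps_target            # 155,520 frames

      core_hours_per_frame = total_core_hours / frames  # ~1,221 core-hours per frame
      per_core_fps = 1 / (core_hours_per_frame * 3600)  # ~2.3e-7 frames per core-second

      # Doublings needed (Moore's law, assumed one doubling per year) to reach 24 fps per core.
      doublings = math.log2(fps_target / per_core_fps)
      print(f"{frames} frames, {core_hours_per_frame:.0f} core-hours/frame, "
            f"{per_core_fps:.2e} fps, ~{doublings:.1f} doublings to real time")
      ```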

  • @edh615
    @edh615 2 года назад +2

    In blender open data, a 2080 super gains 700 points with hardware ray tracing, the 3080 gains 2000.

    • @jaromor8808
      @jaromor8808 2 года назад

      Would you make a wild guess, by extrapolating from the 6950 XT, where a reference model 7900 XTX could end up in the ranking when HW-RT support for AMD arrives with Blender 3.5? (fingers crossed)

    • @edh615
      @edh615 2 года назад +1

      @@jaromor8808 If it's double it would be like a 3080.

  • @alun1038
    @alun1038 2 года назад +3

    It genuinely is baffling that a humongous $1600 4090 can't even run Cyberpunk at 4K RT Ultra over 30 fps without upscaling. I think the current RT algorithms in games are still far from polished, it's only been introduced since 2018, might take us several more years to actually come up with something acceptable.

  • @Shane-Phillips
    @Shane-Phillips 2 года назад +2

    Pretty interesting analysis, I don't miss a lot tech these days but I'll admit I hadn't considered the role that raster improvements made in raytracing improvements, I hope you continue down this rabbit hole, as it would be interesting to know if it's an nVidia/AMD problem or just game engines not being able to use the capability of the hardware well enough. At the moment I'm still not all that blown away by raytracing so it would be interesting to see where its future might lie.

  • @Eternalduoae
    @Eternalduoae 2 года назад +3

    @10:47... Just FYI, Jim. TechPowerUp has a serious flaw in their testing - they tested all non-40 series Nvidia GPUs on old drivers - all of them from before the huge DX12 boost in 522.25. In fact, you can even see the gains the 20 and 30 series get in their driver analysis of 522.25... their current 4090 and 4080 reviews are all essentially worthless for inter-generational comparison right now.

    • @Eternalduoae
      @Eternalduoae 2 года назад +1

      @12:25 I also believe that your analysis here is a bit flawed too. One pitfall when using percentages is that you need to be careful about the scaling factor of small to large numbers.
      So, you appear to be pointing out that the 3090 Ti had a bigger performance gain over the 2080Ti compared to the 4090 vs 3090 Ti at 4K resolution.
      The problem here is that performance percentages do not scale from small numbers to big numbers - the percentages appear smaller, even if the absolute number differences are the same or bigger.
      i.e. 1 fps increase to 2 fps is a 100% gain. But 2 fps to 3 fps is only a 50% gain. The amount of improvement is the same but it gets much harder to scale the percentage over the prior number. A person reading the percentage gets the wrong impression that the gain is much worse, gen-on-gen for the third card.
      However, at larger numbers this does not hold true for our perception.
      An increase of 1 fps from 60 to 61 and from 61 to 62 gives you an absolute gain of 1 fps - the same, but the percentage gain is 1.6-1.7% for both.
      Okay, applying this to the real world situation:
      In TPU's review, Control @ 4K with RT enabled the results were:
      2080Ti - 21.5 fps
      3090 Ti - 40.3 fps
      4090 - 68.1 fps.
      That's a gen on gen increase of 87% for the 3090Ti and *only* a 69% increase for the 4090. But the thing is, the absolute increase is 18.8 fps for the 3090 Ti but a big 27.8 fps increase for the 4090. The difference between the 4090 vs 3090 Ti is 50% better than the 3090 Ti vs 2080 Ti.
      The 4090 is the more impressive jump in performance, hands-down, but putting the context into percentages changes the interpretation of the person consuming the information because it biases towards smaller jumps at smaller number ranges.
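
      A small sketch of that point, using the Control 4K RT figures quoted above:

      ```python
      # Control @ 4K with RT enabled, fps figures as quoted from TechPowerUp's review.
      results = [("2080 Ti", 21.5), ("3090 Ti", 40.3), ("RTX 4090", 68.1)]

      for (prev_name, prev_fps), (name, fps) in zip(results, results[1:]):
          rel_gain = fps / prev_fps - 1     # relative (percentage) gain
          abs_gain = fps - prev_fps         # absolute fps gain
          print(f"{name} vs {prev_name}: +{rel_gain:.0%} relative, +{abs_gain:.1f} fps absolute")
      ```

      The relative gain shrinks (87% -> 69%) even though the absolute fps added per generation grows (18.8 -> 27.8).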

  • @AthanImmortal
    @AthanImmortal 2 года назад +2

    Omg, 1 minute ago, I can't wait to get through this. Always enjoy a Jim investi-rant video.

  • @tootsy5238
    @tootsy5238 2 года назад +3

    ur the hero we don't deserve, thank you for continued content

  • @zodwraith5745
    @zodwraith5745 2 года назад +2

    The amount of die space wasted on RT is laughable. It looks to me that RT is being held back by mediocre raster performance. Nvidia is double and tripling down on RT and DLSS and leaving raster as an afterthought. The 4090 is no slowpoke, but the sheer number of transistors on the GPU tells you it should be closer to 200% performance, not 150%. All for raytracing that most people are still not sold on. Sure it looks pretty, but it's still not worth the massive performance hit. People are buying Nvidia for DLSS, not ray tracing.
    The 4090 should have been much faster, or much cheaper. I don't like paying for something I'm never going to use.

  • @celdur4635
    @celdur4635 2 года назад +3

    Imagine the MCM approach allows AMD to "glue together" a Raster GPU and an RT GPU. Both at 300mm or something.

    • @rattlehead999
      @rattlehead999 2 года назад

      Imagine going back to the conventional shading, considering it's more than good enough and has no drawbacks.

    • @sungjin-woo3894
      @sungjin-woo3894 Год назад

      ​@@rattlehead999 Imagine actually making a breakthrough discovery and actually create something similar to what the guy above says and have fully raytraced games at the same performance as rasterized games.

    • @rattlehead999
      @rattlehead999 Год назад

      @@sungjin-woo3894 It would be possible with enough Ray Tracing cores/engines. If they made dedicated ray tracing cards with a few thousand RT cores, that would be reality even right now, the problem is that nobody would buy them. And ray tracing might actually be good if you had dedicated ray tracing cards.
      Ray tracing is a gimmick so far.

  • @ssimone1973
    @ssimone1973 2 года назад +3

    An interesting thing to look at with RT is this: every GPU right now can perform rasterization, but only a small number can perform RT. The reason we have RT is that Nvidia needed some kind of real-world test/demonstration of the AI performance of their tensor cores, hence RT, which is real-time AI calculation of how light reflects off and illuminates objects. Essentially all RTX owners were used to test and provide a real-world proof of concept for Nvidia's tensor cores. When performance was poor, along came another piece of AI software, DLSS, to improve framerates by upscaling images in real time, showing potential AI purchasers the advantage of buying Nvidia's AI servers.
    Now, Nvidia's proprietary RTX API was designed to make game developers' lives easy by removing a lot of the work in lighting a scene. The RTX API takes care of the illumination of the scene, including shadows and reflections, without the need to bake it into the game engine's lighting. But since only a small number of GPUs can perform RT, game developers still have to do the work to illuminate the scene the old way for those who don't have ray tracing GPUs. A quick look at the Steam hardware survey shows almost 50% of Nvidia GPUs used for gaming are older non-RTX cards; include the early 20 series RTX cards and it jumps up to about 60% or so. To me it does not appear worth it for game developers to include ray tracing in games when most cards still used today don't support it. 40 series cards are marketed to those who have 20 series cards, but so many still use 10 and 16 series cards. Hell, a lot of people are still using the old 700 and 900 series cards to play games, and not much seems to be moving the needle for people using older Nvidia cards.
    So is it really worth paying a huge amount of money for a performance-killing API that in reality you don't even pay attention to in game? Even if you claim you can see the difference between RTX on and off, when playing you tune out all those visual bells and whistles. Not to mention that a lot of RTX reflections, in my opinion, are overdone to enhance the RTX visuals and look nothing like the real world, when the goal is to create realistic visuals. City-skyline reflections on a water surface making the water look like a mirror aren't that realistic compared to real life. Shadows are where RTX really works best, but who the hell pays any attention to shadows? RTX was introduced so Nvidia could market AI and AI servers to potential customers. Now we're stuck with this API, which is useless when a large portion of GPU users don't have RTX-capable GPUs.

  • @paul1979uk2000
    @paul1979uk2000 2 года назад +1

    Ray tracing is the future of gaming, no doubt about it but, raster performance is still far more important and by having overall better raster performance, it's going to drag up ray tracing performance kicking and screaming.
    It will be interesting to see with the new AMD cards, because raster performance is getting a big boost compared to ray tracing performance, and even though I don't expect it to beat Nvidia's best, the raster performance could produce some unusual results in games that lean heavily on raster.
    Honestly, I think that until the next gen of consoles are released like the PS6, it's hard to get excited about ray tracing, yes it looks good in some areas but it feels like it's just slapped onto games that were designed around baked lighting and sometimes the results can look a bit off, ray tracing is going to be really interesting once games are designed from the ground up around that with no fallback option to baked lighting, that's only going to happen once consoles can do it easy enough and even dirt cheap PC gpu's can do it like APU's, so we are still talking many years before ray tracing becomes a game changer whereas for now, it's a nice to have feature but not that big of a deal, raster performance is far more important still.

  • @gvd-l3o
    @gvd-l3o 2 года назад +9

    Another nice analysis, the pure numbers.
    Keep up the good work Jim, the channel is reviving again!
    People nowadays often buy because it costs a lot and it became a prestige thing.
    If its worth it ok but if the numbers say something else, better think twice...

  • @HeyImGaminOverHere
    @HeyImGaminOverHere 2 года назад

    Oh how I have missed your analysis videos Jim… I know you have been back for a little now but welcome back!

  • @1ch4lM
    @1ch4lM 2 года назад +6

    Using the same numbers (hopefully I didn't make a typo) and comparing the performance loss with RT on vs. RT off, on average the 2080 Ti gives up 40.29%, the 3090 Ti 37.15%, and the 4090 33.61%. That means the 3090 Ti is an 8% improvement vs. the 2080 Ti, and the 4090 is a 10% improvement vs. the 3090 Ti.
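
    One way to reproduce this kind of comparison; the raster/RT fps pairs below are placeholders chosen only to roughly match the loss percentages above, and note that there are two ways to slice it (the shrink in the RT performance loss, or the ratio of the retained fractions):

    ```python
    # Hypothetical (raster fps, RT-on fps) pairs per card; placeholder numbers only.
    data = {
        "2080 Ti": (100.0, 59.7),
        "3090 Ti": (160.0, 100.6),
        "RTX 4090": (260.0, 172.6),
    }

    retained = {card: rt / raster for card, (raster, rt) in data.items()}
    for card, r in retained.items():
        print(f"{card}: retains {r:.1%} of raster fps with RT on (loses {1 - r:.1%})")

    cards = list(retained)
    for prev, cur in zip(cards, cards[1:]):
        loss_shrink = 1 - (1 - retained[cur]) / (1 - retained[prev])
        retention_gain = retained[cur] / retained[prev] - 1
        print(f"{cur} vs {prev}: loss shrinks {loss_shrink:.1%}, retention improves {retention_gain:.1%}")
    ```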

    • @realgamer1998
      @realgamer1998 2 года назад +2

      Yup, and that is actually the problem. Despite 3 generations and 4 years, ray tracing is still uncommon in games. There's not enough generational improvement compared to raster (no matter what the marketing BS says); it takes up important die space but still doesn't give back enough value for that die space compared to raster.
      The only place RT currently shows those 2-3x improvements is synthetic benchmarks and professional graphics work.

  • @eugkra33
    @eugkra33 2 года назад +1

    AMD is in the same boat. They are reporting a 50-70% raster increase, but more like a 46-60% RT increase when RT and FSR2 are on.

  • @Khang-kw6od
    @Khang-kw6od 2 года назад +15

    Idk why people still care so much about raytracing. Linus made a video comparing games with raytracing on/off and people half of the time don't even notice a difference. For a 50% performance dip, you get a barely noticeable increase in graphics. It is way beyond the law of diminishing returns at this point.

    • @TomatePasFraiche
      @TomatePasFraiche 2 года назад +7

      It’s crazy how many times we have to remind people of this.

    • @Khang-kw6od
      @Khang-kw6od 2 года назад +9

      @@TomatePasFraiche people just eat up Nvidia marketing without any thought. It's pretty sad to see these days.

    • @AVerySillySausage
      @AVerySillySausage 2 года назад +4

      It has quite a big effect on my immersion in game, depending on the game and the effect. It's rarely worth enabling all the RT settings, and shadows are never worth it. At least don't knock it until you have actually tried it. It's the future and the most demanding, high-tech games are going to be using it, so it matters a lot. AMD being a generation behind is always going to push me away from them. But I'm not buying Nvidia either at current prices, so I wait. All that said, it matters, but still not more than raster performance. If you give me a choice of an AMD GPU and an Nvidia GPU with the same raster performance but the Nvidia has much faster RT and a price premium, I will pay the premium (within reason). What I won't do is pay more for a card that is actually slower in raster just for RT performance (the 4080).

    • @fraserlamb5787
      @fraserlamb5787 2 года назад +1

      It makes a huge difference if you look at Kelly's RT for Minecraft; it makes the world seem alive.

    • @Mopantsu
      @Mopantsu 2 года назад +2

      @@fraserlamb5787 minecraft and Portal lol. I want realistic graphics not shiny blocks from year 2000 era.

  • @lilyounggamer
    @lilyounggamer 2 года назад +9

    Hard pass on the RTX 40 series. They need to get on MCM chiplets like AMD; Nvidia uses way too much power, not to mention the melting cables and PSUs that the Nvidia subreddit censors. Just go with last gen, AMD RDNA 3, or Intel. Good video Jim.

  • @agentoranj5858
    @agentoranj5858 2 года назад +1

    When did Jim come back? Good to hear your voice.

  • @Psychx_
    @Psychx_ 2 года назад +8

    4090 being bottlenecked may be a result of Nvidia still using the driver/software scheduling to distribute work between the SMs. More ALU clusters -> more overhead.

    • @shepardpolska
      @shepardpolska 2 года назад +2

      With GPU outpacing CPU improvements, AMDs approach of hardware scheduling is leaving them better off in raster in the near future, assuming GPU advancement doesn't slow down and that CPUs won't catch up. Could be trouble for Nvidia, switching from software to hardware scheduler could be troublesome and cause issues, you pretty much need a completely different hardware architecture and a new driver too.

    • @Psychx_
      @Psychx_ 2 года назад

      @@shepardpolska I wouldn't say that GPU development is outpacing CPUs. Both classes of hardware are just getting broader and are starting to implement special accelerators.
      Since this GPU generation, going broader is starting to get less and less viable aside from pure compute workloads though, as IPC had to regress in order to make that happen, while the clockspeeds had to be bumped up, just to compensate for that - this could be Nvidia's Bulldozer moment, although RDNA3 also didn't benefit that well from the increase in ALUs, but could at least maintain IPC and significantly increase efficiency.
      When looking at CPUs, these are making steady progress in width (128 x86 cores per socket in 2023, aswell as inclusion of special purpose accelerators), IPC and clockspeed.
      Additionally, they are a bit further along when it comes to new manufacturing techniques like chiplets and stacked dies.

    • @tilburg8683
      @tilburg8683 2 года назад

      I think mostly CPUs are still outclassing GPUs. But so far the best I've had is a 3070ti, although on lower resolutions my cpu can get 200fps in pretty much any game. (Besides some terrible ones that I haven't tried maybe like cyberpunk 2077).
      It's still as fast as the 13th gen in single core so about as fast as it gets for gaming.

    • @GreenJalapenjo
      @GreenJalapenjo 2 года назад

      I don't know if that's a factor, but it's not the whole story. Say the game needs 8ms to do physics, and 10ms to render a frame at 4k; you spend 10ms per frame for an FPS of 100, and the CPU spends 2ms idle waiting for the GPU every frame. You turn down the resolution to 1440p, which has 0.44x as many pixels. The game still needs 8ms to do physics, but renders the frame in 4.4ms; you now spend 8ms per frame for an FPS of 125, and the GPU spends 3.66ms idle waiting for the CPU every frame. So because you went from being GPU-limited to being CPU-limited, your GPU draws frames 2.25x faster, but your FPS only got 1.25x faster.
      FWIW, this is (part of) why Jim said frame times is a better measure than FPS for this, since that doesn't count the time the GPU spends idle waiting for the CPU to give it work to do.
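
      The same toy example in code, assuming (as above) a simple model where CPU and GPU work overlap and the slower side sets the frame time:

      ```python
      def fps(cpu_ms, gpu_ms):
          # Simplified pipelined model: frame time is set by whichever side is slower.
          return 1000.0 / max(cpu_ms, gpu_ms)

      cpu_ms = 8.0                          # hypothetical per-frame CPU (physics/game logic) cost
      fps_4k = fps(cpu_ms, gpu_ms=10.0)     # GPU-limited: 100 fps
      fps_1440p = fps(cpu_ms, gpu_ms=4.44)  # CPU-limited: 125 fps

      print(f"4K: {fps_4k:.0f} fps, 1440p: {fps_1440p:.0f} fps")
      print(f"GPU renders {10.0 / 4.44:.2f}x faster, but fps only rises {fps_1440p / fps_4k:.2f}x "
            f"because the CPU becomes the limit")
      ```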

    • @shepardpolska
      @shepardpolska 2 года назад

      @@Psychx_ I just meant that we have a GPU that gets CPU bottlenecked at 4k, that is literally a first. What I mean is CPUs can't seem to feed those new GPUs fast enough. If games become more CPU bottlenecked, I would say that GPU progress outpaced CPUs, in gaming that is.

  • @s7robe297
    @s7robe297 2 года назад +1

    I love your vids man. Idk what else to say, just really appreciate your attention to detail in your content

  • @defeqel6537
    @defeqel6537 2 года назад +3

    On the TechPowerUp results, I wouldn't include 4090 results for Far Cry 6, as that is almost certainly partially CPU limited, even at 4K, same for Watch Dogs Legions, with 1080p, 1440p and 4K raster results of: 125, 122, 114 and 111, 110, 105 respectively. Not sure how that affects your analysis. (edit: I see you acknowledge this in the video, teaches me to watch until the end before commenting)

  • @19vangogh94
    @19vangogh94 2 года назад +1

    IMO the simplest way to test the RT improvement is to measure the performance loss with RT enabled across generations. Made-up numbers example: if the 20 series has a 0.6x performance loss (so 100 fps becomes 40) and the 40 series has a 0.45x loss, then the loss has shrunk by a factor of 0.6/0.45 = 1.33, i.e. roughly a 33% improvement.

  • @Psychx_
    @Psychx_ 2 года назад +3

    Nvidia's GPU architecture is basically modern-day GCN now: it draws a lot of power and only compute tasks see scaling that is somewhat in line with the increase in execution units. Also, it seems like (gaming) IPC dropped significantly… Deeper pipelining and added machine latency, along with scheduling overhead, maybe?

    • @shepardpolska
      @shepardpolska 2 года назад

      Yeah, I noticed that too. While I wasn't looking, AMDs and Nvidias approach to GPUs switched places. It's now the AMD with more efficient, smaller dies while Nvidia pushed die size and power as far as they can.

  • @MrNova39X
    @MrNova39X 2 года назад

    Love you man, keep doing what you do, you are needed and missed. Stay healthy and happy 😘

  • @brendanbutkus2392
    @brendanbutkus2392 2 года назад +3

    Me personally, all I care about is native rasterization performance and nothing else... then focus on price to performance... so lately I've been sticking with AMD... and I'm sticking with it after seeing the announcement of the 7900 XTX, when you see the performance of the 4080 at its price... if Nvidia got their heads out of their asses they'd have a bigger customer base... I know I'm not the only one who feels the way I do about price to performance and the "value" of Nvidia products, especially right now.

  • @karathkasun
    @karathkasun 2 года назад

    From my rudimentary understanding of how RT works for things like reflections...
    Every bounced ray for a reflection then incurs a cost in rasterization performance because you now have to render whatever is bounced. Which includes independent triangle setup, texturing, and shading from the "viewport" of the surface that bounced the ray (at least for off screen entities). You can see the tricks used to reduce this load in Cyberpunk, EVERYTHING reflected is taken from a much lower LOD and detailed objects are culled if more than something like 20ft from the reflective surface and not in the primary viewport.
    Shadowing is similar: you can't just cull occluded geometry behind the camera anymore. You have to set up and transform that geometry at least relatively often to have it occlude rays from off screen. RT at its core amplifies demand for raster performance unless you are doing PURE RT/PT rendering.

  • @Phil_529
    @Phil_529 2 года назад +3

    Typical Jim being disingenuous. Shader Execution Reordering alone is offering up to 20-50% performance uplifts for RT workloads and it's not in any games yet. It's a few lines of code and it's simple for developers to implement. Displaced Micro-Mesh and Opacity Micro-Maps will also optimize RT further but requires integration into the engine/game. Let's wait for Cyberpunk's overdrive patch before you want to be taken even remotely serious about what Ada ray tracing brings to the table.

    • @adoredtv
      @adoredtv  2 года назад +5

      Right and what happened with the supposed huge RT increase from Turing to Ampere that wasn't seen either? For example with concurrent RT & shading, which Nvidia claimed would be a >2x speed improvement? I think I'll believe it when I see it, which will be never.

    • @Phil_529
      @Phil_529 2 года назад

      @@adoredtv The RT improvements scaled with the architecture. Ampere is undoubtedly more capable than Turing at RT which is obvious to anyone who used both. Ada is going to increase that gap but you predict it's slowing down. It's like you live in an alternate reality and are completely out of your depth when it comes to analyzing this situation. You really think you know better than NVIDIA engineers?
      I'll be happy to come back when we can actually see the new hardware and software stack working in tandem. But of course you chose to completely ignore that information and pander to your AMD crowd and pump your Patreon. Best of luck to you.

    • @adoredtv
      @adoredtv  2 года назад

      @@Phil_529 I never claimed to know better than Nvidia's engineers. I do however know Nvidia's marketing a lot better than you seem to. They pump these huge gains every gen, and it's never seen.
      While you claim Nvidia will see a huge RT increase "in future", the fact you're ignoring today is that they are seeing a reduction relative to raster in some games, and overall RT performance has gone nowhere.

  • @JusticeGamingChannel
    @JusticeGamingChannel 2 года назад

    One thing that is not being taken into account is Shader Execution Reordering, which currently is not being used for any game. Shader Execution Reordering on Ada Lovelace is a method by which could potentially improve Ray Tracing performance vastly. When NVIDIA talks about RT performance on Ada Lovelace, it is my opinion they are taking SER into account as part of the performance increase with the new architecture. However, it isn't being utilized yet, no games use it. SER needs to be accessed and utilized by the game developer, and perhaps as that feature is used in the future in specific games with Ray Tracing, perhaps that is when Ada Lovelace will see a bigger RT performance benefit from the previous generation. This is something to keep in mind at any rate, SER is an important and useful technology, and it's a bag that can be used, if developers latch on and use it. So Ada has more in its bag of tricks, it's just not being fully utilized in that regard.

  • @pewpew518
    @pewpew518 2 года назад +3

    far cry 6 has a RT setting but it looks like total crap and tanks performance by 40%. I guess there are just bad implementations.

    • @Phil_529
      @Phil_529 2 года назад

      Yeah because it's an AMD sponsored game and it had to run on the 6800XT/6900XT at reasonable levels. The only AMD sponsored game that has good ray tracing is RE: Village's Global Illumination.

  • @Accuaro
    @Accuaro 2 года назад

    I was gonna sleep, but I always have time for your vids

  • @lrmcatspaw1
    @lrmcatspaw1 2 года назад

    Jim, I know you once said that you were preaching to the choir, but its nice to have you back.
    I have a suspicion regarding ray tracing: Because the current ray tracing is an approximation of the real thing, the closer we try to move the needle to the real thing, the harder it gets to run it fast (probably even worse than overclocking vs power consumption).
    I will however say that I am intrigued about their claim that older games can be "remastered" with ray tracing. (or something like that, I dont really know what it is).
    Playing old games like elder scrolls oblivion, mass effect 2, starcraft broodwar, STALKER, witcher 3... Sounds really good (mayb too good?).

  • @Xerpadon
    @Xerpadon 2 года назад +2

    Oh please... Raster performance means everything, no one really cares about ray tracing.
    Anyone who plays games for the sake of ranking or performance, just to get that top edge, always plays on the same low settings with everything off.
    Of the entire gaming market, only about ~8% care about 30 fps 8K high-quality ray tracing for a cinematic approach to games.
    It's nice to have... but it's not even a real feature, it's a gimmick just like Nvidia HairWorks and all the other Nvidia features from the last 20 years... it's a marketing ploy.
    And no, I'm not an AMD fanboy, I own an RTX 3080, but I'm not blind or stupid enough to make up excuses for Nvidia. They have done horrible things to consumers in the last 8 years, and AMD getting on top only makes them more competitive again, which is just good for all of us.

  • @exklimexklim
    @exklimexklim 2 года назад

    This is a bit wrong.
    Resolution scaling is mostly bound by VRAM speed.
    The higher the resolution, the more likely you are to hit the VRAM bandwidth cap.

  • @asynesthesickid
    @asynesthesickid 2 года назад +1

    What on earth are you going on about, comparing the delta between rasterization and ray tracing uplifts at 8:30? This is such an apples-and-oranges comparison I can't understand what you are getting at. Are you saying it's bad that there are cases where the ray tracing uplift is less than the rasterization uplift? Or that the uplift has a lot of variance compared to rasterization across games? I really don't understand what point you are trying to make.

  • @Ren04700
    @Ren04700 2 года назад +2

    Love the analysis, mate. You truly make a difference in how I perceive all of this. Learning how to be and stay critical. Much appreciated - keep on keeping on!🙏

  • @keybraker
    @keybraker 2 года назад +1

    I like how living in Sweden has made you call Jensen "Yensen".

  • @tastethesoup
    @tastethesoup 2 года назад +1

    The thing with most of the gaming benchmarks you used in your comparison is that even with RT on, rasterization is still doing most of the work when generating each frame. I’d be curious to see what the difference would be for fully path traced games like Minecraft RTX or Quake RTX, I’d imagine the improvement would be a lot more apparent, right?
    Btw have you thought about doing your own benchmarks? Obviously it wouldn’t be cheap but seeing just how deeply you analyze benchmarks from reviewers who typically only evaluate them at surface level, I feel like you’d be capable of finding much more interesting results on your own.

  • @piotrj333
    @piotrj333 2 года назад +2

    You cannot compare ray tracing like that, because games feature different levels of ray tracing utilization. Watch Dogs Legion doesn't have sophisticated ray tracing and the difference doesn't exist - the game is raster bound on Nvidia. But in the two most impressive ray tracing titles, Cyberpunk and Metro Exodus, the difference is bigger (around 15% relatively), and those titles feature global illumination - the most demanding form of ray tracing (outside of Quake 2 RTX and Minecraft, which use full path tracing). Also, ray tracing in most games is rasterization + ray tracing, meaning that if rasterization takes a lot of time, it will be the majority of the frame time. People forget that even if the RT units on Nvidia were instant, I wouldn't expect a bigger speedup than 30%.
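
    That last point is essentially Amdahl's law applied to the frame time: if RT only occupies a fraction of the frame, even infinitely fast RT hardware caps the overall speedup. A tiny sketch with an assumed split:

    ```python
    # Assume RT work is 25% of the total frame time (a hypothetical split, not measured).
    rt_fraction = 0.25

    def overall_speedup(rt_speedup, rt_fraction):
        # Amdahl's law: only the RT portion of the frame gets faster.
        return 1 / ((1 - rt_fraction) + rt_fraction / rt_speedup)

    for s in (2, 4, float("inf")):
        print(f"RT units {s}x faster -> frame renders {overall_speedup(s, rt_fraction):.2f}x faster")
    ```

    Even "instant" RT would only be ~1.33x faster with this split, which lines up with the ~30% ceiling mentioned above.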

  • @GraveUypo
    @GraveUypo 2 года назад

    13:38 I'd like to correct you on that. It wasn't 22%, it was only 8.4%. You have to normalize the percentages before doing that calculation: divide both by 2.62 so the raster improvement is the baseline 100%, and you get that ray tracing is just 108.4%. Ray tracing improved 8.4% more than raster did over two generations. It basically didn't improve at all; it's still the same crippling performance penalty it was back then.

  • @nicestguyinhouse6112
    @nicestguyinhouse6112 2 года назад

    Thank god you are doing more videos again, happy days

  • @sauce777
    @sauce777 2 года назад

    Love your videos, it was such a down time when you went away.

  • @briggsg
    @briggsg 2 года назад +1

    Why not test RT with Quake 2, where there is literally only RT for those cards to calculate, and it is open source, with different types of RT...

    • @jaromor8808
      @jaromor8808 2 года назад

      why not google such benchmark?
      "RTX 4090 review: Spend at least $1,599 for Nvidia’s biggest bargain in years"

  • @Think666_
    @Think666_ 2 года назад

    Great to watch your fantastic content again!

  • @manoelBneto
    @manoelBneto 2 года назад

    I think this shows that ray-triangle intersection is no longer the bottleneck: the work that needs to be done *after* a ray hits something uses the same HW as raster (shader cores, caches, etc) so adding more RT cores is giving diminishing returns.

  • @utubie24
    @utubie24 2 года назад +1

    Adored I have a serious question 🙋🏻‍♂️. Do you actually play games? And if so I would love to see your gamer profile.

  • @Thundermonk99
    @Thundermonk99 2 года назад +1

    Looking at the sample of games you investigated for RT performance, isn't it pretty clear what's going on here? The games that include a higher number of RT effects and push their RT effects harder (Cyberpunk, Metro Exodus, Control) show a much larger relative improvement gen-on-gen relative to AMD sponsored titles that include a single, token RT effect (RE Village, Far Cry 6) that offers very marginal improvement to IQ (but at a lower perf cost). Calling RE Village or Far Cry 6 a "ray-tracing game" is a misnomer when 95% of its rendering is traditional rasterization. Over the coming years, we will probably look back at these first few "RT" generations and laugh at what used to "count" as raytracing. Nvidia knows this, but they needed to start somewhere. The first step was selling consumers on the idea that RT is something they want in their games. Now that consumers have been sold on RT as the future of graphics and are demanding RT from developers, Nvidia has ensured that future generations of games will leverage RT more heavily in their rendering pipeline, benefiting Nvidia.

  • @spoots1234
    @spoots1234 2 года назад

    From what Jensen was alluding to, RT doesn't scale linearly like Rasterization with cores, clocks and cache like how modern GPUs are currently scaling. It's not a problem solved by manufacturing advancements alone but a lot of optimization to the architecture when new bottlenecks arise from jumps in hardware performance, similar to the early 2000s with rasterization optimizations. AMD's ray accelerators look good in theory, scaling linearly in quantity with core counts, but they are definitely seeing the same issues with RT performance not scaling linearly with hardware. Their own slides show a 70% uplift in raster and a 60% uplift in RT. Once again though, different solutions will come with different ceilings and the AMD solution may have a ceiling far lower than the Nvidia's standalone solution when ray tracing moves to path tracing, ie tracing each ray for longer - as seen in path tracing benchmarks that tank the 6900xt's performance.

  • @xXAngelmlXx
    @xXAngelmlXx 2 года назад

    Thanks for another great video Jimbo!