NVidia Turing Architecture Technical Deep-Dive: SM Rework & Big TU102

  • Published: Sep 12, 2024
  • Correction on the core count: Should be 4352, not 4532 (typo). This architectural deep dive of nVidia's Turing GPUs talks about TU102.
    Ad: Buy Thermal Grizzly Kryonaut on Amazon (geni.us/gntgkryo) or Conductonaut Liquid Metal on Amazon (geni.us/gntgcon...)
    We'd strongly encourage reading the full article: www.gamersnexu...
    This content delves into the RTX SDK and its viability in gaming, RTX cards (like the RTX 2080 Ti and RTX 2080), specifications, and performance characteristics. We show block diagrams for TU102, which is larger than the RTX 2080 Ti initially presents, and offer our own diagrams on GPC, TPC, and SM layout.
    We have a new GN store: store.gamersne...
    Like our content? Please consider becoming our Patron to support us: / gamersnexus
    ** Please like, comment, and subscribe for more! **
    Follow us in these locations for more gaming and hardware updates:
    t: / gamersnexus
    f: / gamersnexus
    w: www.gamersnexus...
    Host, Editorial: Steve Burke
    Video: Andrew Coleman
    Links to Amazon and Newegg are typically monetized on our channel (affiliate links) and may return a commission of sales to us from the retailer. This is unrelated to the product manufacturer. Any advertisements or sponsorships are disclosed within the video ("this video is brought to you by") and above the fold in the description. We do not ever produce paid content or "sponsored content" (meaning that the content is our idea and is not funded externally aside from whatever ad placement is in the beginning) and we do not ever charge manufacturers for coverage.

Comments • 676

  • @GamersNexus 6 years ago +100

    *Correction: Should be 4352 FPUs, not 4532. Typo!* This video is REALLY dense. If you need the information written-out for later review instead, we've done that in our article! www.gamersnexus.net/guides/3364-nvidia-turing-architecture-technical-deep-dive
    Grab some merch on the GN store to support our efforts directly: store.gamersnexus.net
    Watch our RTX discussion with Jay here: ruclips.net/video/2TdOYKWPnd8/видео.html

    • @theeskimo9875 6 years ago +1

      yes, this is so dense, damn

    • @andljoy 6 years ago +1

      So all that "oh, the 2080 Ti is the Titan now, that's why it's expensive" shill crap is bullshit then... right? As I said, what a load of crap.

    • @buzzman4860 6 years ago +1

      People who own 2K and 4K monitors can't do ray tracing? All the RT demos were at 1080p and 60 fps.

    • @buzzman4860 6 years ago +1

      Re: RT can help with sound direction. I am warming up to it now, because vertical sound (3D sound) in games needs help.

    • @firefox5926 6 years ago

      0:29 this time on Gamers Nexus: acronyms and you

  • @Zamm1n 6 years ago +65

    NVIDIA: "Don't disassemble our cards!"
    Steve: *Tosses the backplate over his shoulder*
    Mah man.

  • @totinospizzarolls4737 6 years ago +353

    While everyone else is doing unboxings, our man Steve is making interesting content that we actually care about. 👍

    • @RaspyEdits 6 years ago +19

      How many people tried scratching this guy's picture off their screen?

    • @WayStedYou 6 years ago +1

      both of the steves doing teardowns and technical specs :D

    • @aikatheshibainu3994 6 years ago +2

      @@RaspyEdits i did

    • @shmubogdan9624 6 years ago

      you know not everyone cares about deep technical stuff and just wants to play....

    • @Atilolzz 6 years ago +1

      Raspy Edits, it really got me. I'm a smartphone user, hahah

  • @_Agosto_ 6 years ago +368

    6:19 *BUT*

    • @iTw3ak 6 years ago +11

      BUT WHAT. BUT
      WHAT

    • @vanteal 6 years ago +22

      "To make it sound greater than it really is"....BUT....It's not........I love how he just let that one hang, because he knew we were all thinking it. So he didn't even have to say it.

    • @tovenz5949 6 years ago +5

      But 1200 is stupid

    • @EliteProductions3129 6 years ago +11

      I died

    • @iTw3ak 6 years ago +1

      we can still have hope lol

  • @GurtTarctor 6 years ago +248

    6:06 I love you.
    27:31 (one frame) I love you more.

    • @garretmcr 6 years ago +6

      JESUS?

    • @euga5653 6 years ago +1

      Tech jesus has real life, real time ray-tracing

    • @piers389 6 years ago +6

      For anyone who can't see it/can't get the video to pause at exactly the right frame, here's a screenshot (sorry for the 4K): imgit.org/im/eBI

    • @reverendaero 6 years ago

      aw fuck I just died laughing

    • @Kepe 6 years ago +7

      Use "," and "." (comma and period) to move forward and backward frame by frame.

  • @Ninia103 6 years ago +67

    Just wanted to say I really appreciate what you do and I hope you will never burn out. Please take some time off if needed! Thanks for being the best tech-tuber for those of us who enjoy these very informative and technical videos.

  • @Keyvanizator 6 years ago +63

    6:22 was amazing, thank you for your hard work

  • @zenairzulu1378 6 years ago +54

    OK, for those who got lost in the tech speak, I have the translation. Based on the fact that his hair is swept to the side where the card is, and the fact that he moved the bottle of alcohol closer to the card at 24:00, he favors the card's increased performance of 24-25%. However, he rotated the bottle of alcohol as well, so he thinks the price-to-performance ratio is not good enough. At 24:36 this all changes. You see, I was dead wrong about everything. By first showing the fans {Fanboys} and pointing at the camera, then revealing the brand name of the alcohol = these 1st gen cards are a hard pass. Also, I think NVidia is holding the shop cat hostage until the end of the embargo because of the two fake cats in the background. You are welcome.

  • @thecappy 6 years ago +221

    So much better than the RTX unboxing garbage of the other tech-tubers.

    • @Nosterex 6 years ago +2

      Agreed.

    • @MasterOfInfinity 6 years ago +3

      He’s the best

    • @thecappy 6 years ago +5

      @@MasterOfInfinity definitely the first stop on release day info.

    • @bitscorpion4687 6 years ago +9

      @@bcp6524 they do, they are just trying to appeal to a major demographic (who don't care about in-depth dives)
      + There's a whitepaper out there with all the tech info by Nvidia.

    • @Dizzinator2114 6 years ago +1

      I figure he does this because he isn't under embargo, or he asked them ahead of time whether he could do this. Others don't get this deep into this stuff beyond FPS.

  • @jaim1555 6 years ago +10

    Great piece on Turing guys! Another excellent example of quality Tech Journalism by the Gamers Nexus Team of which many Tech news outlets should be taking notes.

  • @billschannel1116 6 years ago +4

    As a person with a strong computer hardware engineering background I have no problem with their ops calculations. They provide a convenient, well defined mechanism for describing the hardware configuration. Think of the alternative, a table of various values that require a lot of interpretation. You can always ignore it or generate your own, but for me there is value.

  • @djquantize 6 years ago +57

    Who cares about performance, did the box open from the top or the side Steve?

  • @LongFacedBastard 6 years ago +5

    Correction on floats vs ints: Floats are not more "accurate" because they have decimal places. Floats are not used because they are fully accurate, but because of their ability to approximate the value of a calculation. Integers, like you said, are "fully accurate": they are simply the number they say they are, whereas a floating point number may say to the user that it is 24 while being 24.000000003, etc.
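The point about floats approximating values can be checked with a minimal sketch. It assumes nothing beyond Python's standard `struct` module; the helper name `to_fp32` is made up for this illustration, and the specific values are chosen only because they sit at the edge of what a 32-bit float can hold. Note that in FP32 the rounding usually goes the other way round from the comment's example: the extra digits are dropped rather than kept.

```python
import struct

def to_fp32(value: float) -> float:
    """Round a Python float (64-bit) to the nearest 32-bit float and back."""
    return struct.unpack("<f", struct.pack("<f", value))[0]

# An integer type stores every whole number in its range exactly.
exact_int = 16_777_217  # 2**24 + 1

# FP32 has a 24-bit significand, so above 2**24 it can no longer
# represent every whole number; this one rounds to its neighbour.
as_fp32 = to_fp32(float(exact_int))

# Digits below FP32's resolution are silently dropped.
nearly_24 = to_fp32(24.000000003)

# Most decimal fractions are only approximated.
tenth_fp32 = to_fp32(0.1)

print(exact_int)          # 16777217
print(as_fp32)            # 16777216.0
print(nearly_24)          # 24.0
print(tenth_fp32 == 0.1)  # False
```

This says nothing about speed on any GPU; it only shows the representational difference the comment describes.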

  • @zywypl 6 years ago +132

    ROTFL @27:31, nice easter egg xD

    • @randomnoobpt 6 years ago +3

      OMFG I'm laughing hard 😂😂😂
      Now why aren't there any effects to make Steve look shinier? 🤔

    • @robertmoorhead3731 6 years ago +1

      Perfect!!! Lol!

    • @ralanham76 6 years ago +2

      .25 speed is your friend. On my phone it's most of the way through 27:32.

    • @coreycarpenter2489 6 years ago +1

      That took me about 5 minutes of going back and listening to Steve say highly selective at 0.25x speed like 100 times.

    • @ToTheGAMES 6 years ago +3

      Corey, on pc you can frame-step with < and >. Pause video and use it :)

  • @donotworried 6 years ago +49

    @27:31 The real memes are on Gamers Nexus...

    • @Norbea 6 years ago

      Ray tracing Jesus! 😂

    • @thomascooley2749 6 years ago

      if he was after a bunch of tech jesus comments it worked lol

  • @amp888 6 years ago +20

    I'm glad to see that while everyone else is dropping useless unboxing videos, GN is providing something worth watching.

  • @antiMatterDynamit 6 years ago +63

    wait 4532 cuda cores on the 2080ti?
    this contradicts every other source on the specs of the card including the official page ...
    pretty sure you mean 4352 cuda cores...

    • @GamersNexus 6 years ago +52

      It is. Typo in the script. Thank you!

    • @maxim099 6 years ago +4

      @Gamers Nexus, You talked about the Int32 operations in the GPU in combination with games, I can assure you shaders almost don't use Integer operations in the shader code (running on gpu) because there is simply no need for it, outside some edge cases, such as step counts for certain effects and so on, int operations are almost never done. int operations that are in a shader are normally just handled by the FP32 threads on a gpu, and are very trivial at that.
      I don't expect any real performance uplift for games, both now and future.
      Yours, a game developer.

    • @GamersNexus 6 years ago +6

      Figured that would mostly be the case. Thanks for posting your experience! Unreal Engine claims roughly 30% INT operations through its engine, but we are not clear on whether those are pushed to the CPU or GPU. Any idea, Maxim?

    • @AdamBoozer 6 years ago

      @@GamersNexus Maxim Munnig Schmidt
      I don't think this is really even relevant to game development. It's more of a programming question to me. I mean, GN specifically stated they were unsure which typically handles that job, and I don't see them stating it's a performance boost at all in the video. Their source did at 18:32, though.
      It's the CPU that would likely handle those, btw. The GPU is typically slower and takes a performance loss when not handling parallel or packed operations, i.e. scalar operations mean a performance loss for GPU-based INT32 operations.
      stackoverflow.com/questions/4364879/integer-calculations-on-gpu
      On scalar integers:
      stackoverflow.com/questions/36048502/sse-instruction-movsd-extended-floating-point-scalar-vector-operations-on-x8#36121889

    • @AdamBoozer 6 years ago

      Oh I see. He wasn't referring to you. He's talking about your source.
      18:42
      He claimed the GPU could or might handle those better. That's not likely. FP and int, sure. Unless they changed things with the RTX cards. Int and FP are equal, but unless they changed something, scalar operations like a boolean aren't equal performance-wise. Right now, at least.
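The core-count correction that opens this thread can be sanity-checked with simple arithmetic. The sketch below is not from the video or the comments: the SM counts and the 64 FP32 lanes per SM are assumptions taken from NVIDIA's publicly listed Turing specifications, and the constant names are invented for this example.

```python
# Assumed figures (not stated on this page), per NVIDIA's public Turing specs.
FP32_CORES_PER_SM = 64  # FP32 lanes in one Turing SM
SMS_RTX_2080_TI = 68    # SMs enabled on the RTX 2080 Ti
SMS_FULL_TU102 = 72     # SMs on an uncut TU102 die

cores_2080_ti = SMS_RTX_2080_TI * FP32_CORES_PER_SM
cores_full_tu102 = SMS_FULL_TU102 * FP32_CORES_PER_SM

print(cores_2080_ti)     # 4352
print(cores_full_tu102)  # 4608
```

Under those assumptions the corrected figure of 4352 falls out directly, while 4532 is not a multiple of 64 at all, which is consistent with it being a transposition typo.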

  • @Klefth 6 years ago +57

    Oh hey, this isn't an unboxing. Huh. Neat.

  • @mashygreen6974 6 years ago +1

    I'm a programmer in academia doing computational fluid mechanics. Just thought I'd throw a bit of light onto int vs FP outside the scope of gaming (I have no idea how to program games, sorry!).
    A floating point is, as you described, a number with a decimal. These numbers require more memory (bits) to represent them, and the precision is the number of bits (so FP32 uses 32 bits of memory to represent the number). These can be used for keeping track of values of things like a location in 3D (points on the x, y, z axes). Integers are 'whole' numbers, as said, and are useful for different things like counting objects, but also booleans, which is the key aspect of where having asynchronous compute between Int and FP is helpful, as far as I understand. The reason for this is that when you encounter a boolean statement like an 'if else', the whole pipe needs to wait until all boolean statements have been evaluated over the data in the warp before the program can execute the algorithm of each branch further. From my understanding, the asynchronous compute between Int and FP allows the independent thread scheduling for branches in if / else statements to be interleaved in time (devblogs.nvidia.com/inside-volta/), decreasing wait time between the thread executions.

  • @MunkeyChips 6 years ago +32

    I'll be buying this card... years later... used. I'll be "looking back" at past technology for the novelty.

    • @__aceofspades 6 years ago +10

      Why bother? These cards barely do ray tracing today, as admitted by developers of Tomb Raider and BF5, and will be obsolete in a year. The 1080 ti is the better value.

    • @MunkeyChips 6 years ago +2

      ac3ofspades878 Why? Think, PhilsComputerLab but without making a video. :)

    • @Blustride 6 years ago +1

      ac3ofspades878 I've been looking at getting a Titan Z, or Titan Black, or a Fury X or something. There's some novelty in owning what was the best of a generation, but is now almost worthless.

  • @PixelPipes 6 years ago

    Very technical and heavy piece, but greatly informative. I used to love this stuff back when AnandTech would do it, so I'm glad someone out there with quality content is taking up the mantle.

  • @anasevi9456 6 years ago +3

    GN is the english language Torchbearer of the highest standard of tech journalism. Amazing video!

  • @euga5653 6 years ago

    This was actually really useful compared to videos where people just opened the cards and gave their "gut feel" about it :D You guys make me feel way more technologically educated. Keep up the fantastic work. Really clear and interesting information!!

  • @Lazronaut 6 years ago

    I’ve been watching a lot of tech-channels the last couple of months while planning my new build. GN manages to combine rigorous & honest reviews, concise explanations of technical details, and avoid the iterative flashy nonsense that plagues most channels. Not only that but they manage to be legitimately entertaining as well, and because of all that I give the info I find here a lot more weight than from other sources.
    Thank you tech-jesus and team for all the hard work and providing the kind of journalism I wish we would get from national news networks.

  • @halgari 6 years ago +4

    The RTX ops is a bit weird, but it's a tad like the introduction of GPUs. "What, your GPU only runs at 100 MHz? My computer runs at 600 MHz!" That's when they moved to "triangles per second" and eventually to "texture fill rate". But now we have a ton of new dimensions to measure: speed of CUDA cores, number of CUDA cores, query rate of the BVH silicon, and the speed of the Tensor cores. So yeah, the RTX thing is a bit hand-wavey, but how else are you going to quantify all these values into a single scalar? Better to just read the different dimensions, though, as each of the values (CUDA core speed, core count, etc.) gives you a more in-depth view.

    • @peterhermina656 6 years ago +1

      How about publishing the actual individual numbers? Like FLOPs, Giga Rays, and memory bandwidth.
      When you buy a car, you don't just get horsepower. You get max hp & torque @rpm, fuel economy, etc.

    • @halgari 6 years ago

      That would be optimal, yes. But seeing all the people saying "I just want more FPS!!!" means they'd probably not get very far. But in-depth dives like the one done in the OP are a great start. There are dozens of variables that go into all this, so it's really hard to compare numbers at all.

  • @gameguy301 6 years ago +15

    18:00 a quick peek into 2020 as team red, team green, and team blue battle it out for GPU supremacy in space.

    • @Slizzo82 6 years ago

      This was actually a really great tool that a Reddit user posted to both the r/nvidia and r/amd subreddits.

    • @PainterVierax 6 years ago

      Gratuitous Space Battle 3: The tool.

  • @k4RtInk 6 years ago

    I love the website's articles that you make concurrently with each video. They are a perfect compromise between a technical deep dive and accessibility. tl;dr great journalism guys!

  • @MojojoJenkins 6 years ago +1

    Great explanation! Thanks for continually examining things with a microscope when others will barely hold more than a broken, frosted mirror to them.

  • @QuickshotGaming 6 years ago

    I have notifications on for two tech channels, turns out the only two that did more than just unbox the card(s). Appreciate it even if you kinda were told not to tear it down last minute. Gigantic companies who can't even figure out how they will allow coverage of their products.

  • @henrymek 6 years ago +10

    27:31 HAVE A FANTASTIC IMMAGE

  • @ColonelRPG 6 years ago +4

    The background shelves are looking sweeeeeeeeeet!

    • @1invag 6 years ago

      Just looks like more annoying stuff to clean to me haha

  • @Bugattiboy912 6 years ago

    Great video Steve. Really appreciate these deep dives. However, Blinn's Law comes to mind whenever someone talks about saving time in rendering.

  • @elr77 6 years ago

    As usual, very informative analysis. I had no clue about a lot of the SMs and GPCs and TPCs and stuff you were talking about, but your analysis definitely got me even more excited about the future of graphics tech. These GPUs are a work of art!! Love that you also showed at 16:25 how close Turing is to the Volta cards design-wise, something I have felt has been overlooked in all the noise. So a lot of the issues are, I guess, price and usefulness for pre-existing games.
    I also like that you gave AMD a little feather for their cap re: the command processor at 14:20.

  • @swayingGrass 6 years ago

    I think the "box in box" way of finding a triangle is like what's often used when making custom user skins, to figure out where a certain part is in the dds/tga file. You color the file with a template that has 4 (for simplicity) different colors, one in each quarter > go to the game and check what color the part is, say purple > paste the template on the purple quarter > check again, and repeat until you find the part.
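The repeated-quartering search in the skinning analogy above can be written out as a short sketch. It assumes a square texture whose side is a power of two, and the function `locate` and its quadrant numbering are invented for this example. A real BVH wraps bounding boxes around the scene geometry rather than cutting a uniform grid, so this only mirrors the analogy, not the hardware traversal.

```python
def locate(target, size):
    """Find the cell holding `target` in a size x size grid by repeated quartering.

    Returns the quadrant picked at each step (0 = top-left, 1 = top-right,
    2 = bottom-left, 3 = bottom-right) and the final cell's corner.
    Assumes `size` is a power of two.
    """
    x0, y0 = 0, 0  # top-left corner of the current box
    path = []
    while size > 1:
        half = size // 2
        right = target[0] >= x0 + half
        lower = target[1] >= y0 + half
        path.append(int(right) + 2 * int(lower))
        # Shrink the box to the quadrant that contains the target.
        if right:
            x0 += half
        if lower:
            y0 += half
        size = half
    return path, (x0, y0)

path, cell = locate((5, 2), 8)
print(path)  # [1, 2, 1]
print(cell)  # (5, 2)
```

Three checks are enough to single out one of 64 cells, which is why this kind of nested search scales so much better than testing every cell in turn.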

  • @MarcoGiordanoTD 6 years ago

    @GamerNexus, amazing video, as usual. Just a minor correction: ints are used a lot even in regular shading. In regular pixel shaders maybe not so much directly by the user, but every kind of addressing computation involves indices, address offsets, etc. Also, on the other side, games use more and more compute shaders nowadays, which are a cluster fest of address computation, and all of that is integer math.

  • @WinterCharmVT 6 years ago +1

    This is Gamers Nexus at its absolute best. I love architecture deep dives

  • @Vaxtin 6 years ago

    The very first thing that I noticed; The plush feline friend in the background.
    - Lol.. Love that one frame comparison of RTX on vs RTX off.

  • @mr_beezlebub3985 6 years ago +17

    So glad it's not another unboxing video. I saw every other tech YouTuber upload that kind of video, and decided not to watch any of them

  • @wilhitman1 6 years ago

    Awesome - Thanks Steve - "Toss over the shoulder of Back plate" Priceless.

  • @EldaLuna 6 years ago

    just seeing that back plate go flying really made my day at the end of this. it sure shows how i feel about these cards lmao.

  • @mateuscampello 6 years ago

    That's why I love Gamers Nexus, no simple unboxing, but rather a deep technical explanation of the architecture to keep me from doing what I was supposed to do.

  • @danieldudas9026 6 years ago

    Never thought that it will be so easy to fall asleep over the Graphic processing clusters bit.

  • @Gravstein 6 years ago +9

    Calling it now! Titan T 2999$ MSRP

    • @notahuman369 6 years ago +1

      The Titan V was/is the same price, so you're probably not too far off.

  • @OscarCastillo1 6 years ago

    This is the best explanation of an RTX I've ever seen so far.

  • @Adrian-yn4qg 6 years ago

    Damn bro. Great video, this is why you're the gold standard for what real, relevant, "non-shill" tech content should be.

  • @jubuttib 6 years ago

    STEVE! I friggin' love you! You're one of the only persons left on this world who can use "six times faster" correctly to mean 7x speed! I'm gonna buy all the beer glasses just for that!

  • @spork8655 6 years ago

    The fact that this is not yet another skippable unboxing is exactly why I'm a patron.

  • @lukew.9110 6 years ago

    Anyone else notice the Tech Jesus RTX frame inserted at 27:31 Nice one Steve.

  • @willis936 6 years ago

    I love this stuff. I've taken classes and I know the balancing acts that modern processors have to make. That won't stop me from using this video to fall asleep.

  • @emichal1986 6 years ago

    27:32 Have not done so much ROTFL in years! Tech Jesus, glad to have you here, dear Lord of internets ;)

  • @Thor_1872 6 years ago

    lol the subtitles at 30:52 "you weren't allowed to do retroactive Lee"

  • @axe693axe 6 years ago +5

    *30:04 LIKE A BOSS*

  • @Raattis 6 years ago

    About this 19:38
    int32 and fp32 both have exactly 32 bits of precision (they can represent 2^32 different bit patterns), but the operations you can do with them are different. You can't, for example, index an array with a decimal number. To a computer, 1.0 and 1 are two entirely different things.
    floating point 1.0 in base-10 is 00111111100000000000000000000000 in 32-bit binary (IEEE-754)
    integer 1 in base-10 is 00000000000000000000000000000001 in 32-bit binary
    So arr[1] would access the second element of an array, and arr[1.0] would *try* to access the 1065353217th element of an array and likely crash the program. Coercing a floating point number into an integer is not very fast either, which is why having integer operations available directly on the GPU can accelerate indexing data structures (such as octrees) required for ray tracing, but also other stuff. Integers are not just for counting stuff.
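The two bit patterns quoted above can be reproduced with a minimal sketch using only Python's standard `struct` module. The helper name `fp32_bits` is made up for this example; it simply reinterprets the four bytes of an IEEE-754 single as an unsigned integer.

```python
import struct

def fp32_bits(value: float) -> int:
    """Return the raw 32-bit IEEE-754 pattern of `value` as an unsigned int."""
    return struct.unpack("<I", struct.pack("<f", value))[0]

float_one = fp32_bits(1.0)
int_one = 1

print(format(float_one, "032b"))  # 00111111100000000000000000000000
print(format(int_one, "032b"))    # 00000000000000000000000000000001
print(float_one)                  # 1065353216
```

Read as an integer, the pattern for 1.0 is 1065353216, so using it as a zero-based index would land on the 1065353217th element, matching the figure in the comment.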

  • @MrMoxes 6 years ago +3

    5:40 It might be still, very, very early to tell but this math is based on a lot of assumptions in workloads. I am really looking forward to seeing what the GN team can come up with for a testing method to prove this metric.

    • @totalermist 6 years ago

      That'd be impossible to prove or disprove - the metric relies entirely on assumptions (e.g. "20% DNN processing", which is simply useless for games that don't use DLSS or whatever that's called).
      The metric is basically as useless as fuel consumption and range figures from auto manufacturers...

    • @MrMoxes 6 years ago

      @@totalermist I'm not trying to be pessimistic, just stating that it will be interesting when they can test what it does & how effective it is. I suppose it is usually if you can't put it to use.

  • @Serpher1 6 years ago

    Also I love what Steve and the editor did there 30:02 xD

  • @Atilolzz 6 years ago +63

    Oh wow, of course their $1300 RTX 2080Ti is a CUT DOWN VERSION of the highest end GPU
    Wow, great job, nGreedia!
    Remember when you paid $600 for the highest end Nvidia card, the GTX 580? Then they made the Titan and cut down the GTX 680, then they made the GTX 780Ti to cut the GTX 780 further down.
    And now they couldn't cut it down further, so they increased the price.
    Steve, for the love of God, make a video about this Bullsh!t please!

    • @oreolamp5676 6 years ago +1

      Atilolzz Manufacturing cost is actually a concern too. They did the same with pascal, the 1080ti was not the biggest they could make pascal afaik. I could be wrong on that though.

    • @Blustride 6 years ago +11

      The RTX 2080 Ti is a cut down version of the TU102.
      Just like the GTX 1080 Ti is a cut down version of the GP102.
      And the GTX 980 Ti was a cut down version of the GM200.

    • @backupplan6058 6 years ago

      inteli722, I must have missed when the cut-down 1080 Ti cost the same as a Titan.

    • @Blustride 6 years ago +7

      A A I guess you also missed when the first Titan X Pascal was a cut down GP102 for $1200.

    • @Bellinkx1 6 years ago +7

      AdoredTV does.

  • @AinurEru 6 years ago +1

    Nice coverage, though it has some inaccuracies: I work at Weta Digital, and we've had Turing hardware to play around with for some time now (as Jensen pointed out in his keynote). RTX is NOT an SDK(!) It's not some API that developers have to use. It's just an umbrella term for marketing. APIs are things like CUDA and OptiX; those are what devs have to code against. In fact, OptiX sits on top of CUDA and is an implementation of ray-tracing machinery that has existed for a few years now, so ray tracing on GPUs is really nothing new, but it's not real-time. The new API for real-time ray tracing in games is DXR, and it is a Microsoft spec, NOT an NVIDIA tech. It's vendor-agnostic. So talking about RTX as some NVIDIA-specific SDK that devs need to code specifically for is misleading. It's just a marketing term that covers the hardware capabilities dedicated to accelerating ray-tracing computations, used by both OptiX (NVIDIA SDK) and DXR (Microsoft SDK). Game developers are not using OptiX and are never going to. They will be using DXR, so it's NOT going to be NVIDIA-specific code. Once AMD comes out with driver support for DXR, the same games with ray-tracing capabilities will be able to run on AMD hardware as-is, with no code changes. I don't understand why nobody is talking about that...
    It's only true NOW that you'd only be able to run those on Turing GPUs, but it's not, as it's presented everywhere, an NVIDIA-specific tech. That is just plain false. RTX is just NVIDIA's specific way of implementing hardware support for Microsoft DXR. AMD will have its own implementation, named something else.

    • @AinurEru 6 years ago

      All "RTX Games" can run on AMD hardware - because they are NOT really RTX-specific, but instead are DXR specific: ruclips.net/video/iYLsTS6WfH8/видео.html
      RTX is NOT an SDK(!) So media should stop saying it is...

  • @pegasusted2504 6 years ago +1

    I think they have made incredible progress and major leaps in capability. I really don't understand why anybody questions it.

  • @danielmonsanto8286 6 years ago +6

    #AskGN Is it possible that a completed form of AMD's HSA implementation could have done the same thing that Turing is supposedly able to do, with the 3D object lookup and integer operations, and do you think that this is/was what AMD was planning all along?

    • @notahuman369 6 years ago +2

      that's a good question, i hope they see it and try to find an answer!!

    • @S.ASmith 6 years ago +1

      Lets hope Navi brings HSA back.
      Given AMD are staying quiet on this and watching Nvidia (presumably), I would say...watch AMD's space..they're not going to leave this unanswered that's for sure.

  • @azoey 6 years ago

    When will the embargo be lifted? Also, I love how you don't do unboxing and actually post useful information when the rest only do unboxing. Great work dude, keep it up

  • @JoonasD6 6 years ago +10

    3:38 "So does that mean it's 600 % better? 'cause it's 6 times higher?"
    Math doesn't add up here. 600 % better would be a sevenfold improvement. (Comparison: if you add 100 % to something, you have doubled it; adding 200 % is tripling etc.)

    • @jubuttib 6 years ago +3

      Yeah it does, 78/11.3=6.902655 which is ~7. I actually came down here specifically to commend Steve on his correct use of "600% better" and "6 times faster" to mean 7x the performance. "6 times faster" = x + 6x, not x*6, that would be "6 times as fast". Too many people use "6 times faster" to mean "6 times as fast". =)

    • @JoonasD6 6 years ago +1

      I'm also in the group of people who acknowledge and spread forth the linguistic details of "times faster" (also in many languages; plus I went around it by using doubling and tripling :)), but with only the quote, it was impossible to know what was meant, to the extent that it sounded more reasonable that it was a casual figure of speech with no rigour... which would be totally fine as an everyday YouTube phrase. When I teach mathematics, I explicitly avoid using that "times" and find other ways to talk unambiguously, and I also have to touch upon this topic every time because people have problems with bad intuition concerning the percentages, not just the phrasing in question. Not to mention that although I do teach my students to notice the semantics, I _would not_ correct them, as at this point I wouldn't say there's anything to correct: the meaning is most often clear, and the only problem there would be me making a problem out of it. :)

    • @Dizzinator2114 6 years ago +2

      I mean this with the utmost respect... You two are some dweebs 😂

    • @jubuttib 6 years ago +1

      Mr. G Thank you, I take it as a compliment. =)
      I work a lot with numbers and orders of magnitude, and it's kinda vital to be accurate with those. The problem with 600% performance being referred to as "6 times faster" is that if you go down the ladder with it, 100% performance (i.e. original) is "1 times faster", and that should ring some alarm bells. =)

    • @jubuttib
      @jubuttib 6 лет назад +1

      Joonas Mäkinen Yeah, I agree with you. Steve has been good about it in the past, and has also used more unambiguous language, but then people complained about that sounding clunky. Steve seems like the kinda guy that would try to use the correct terminology and would only mess up by accident. =)
      And it's good that you teach mathematics, it's damn important in people's lives, even if they often don't realize it right away. It would have been nice if school had shown practical examples of how quickly e.g. "compound interest" payments skyrocket with small differences; there would be fewer people having trouble with their finances. =)

  • @michaelbergman1708
    @michaelbergman1708 6 years ago

    BTW: I really appreciate the in-depth coverage here. Parallelizing the integer math is a really intriguing idea, and if it is implemented as well as on CPUs then it could be pretty useful for other things than raytracing. As a developer (not in gaming), I am very much interested in these new features.

  • @TechnologistAtWork
    @TechnologistAtWork 6 years ago

    The most knowledgeable tech guy on YouTube.

  • @e4r281
    @e4r281 6 years ago

    This is the best RTX review I've seen yet. Thanks GN.

  • @Nobody-vr5nl
    @Nobody-vr5nl 6 years ago

    Thank you for being the only person that stopped doing unboxing embargoes after saying how stupid they are.

  • @0b1ivion
    @0b1ivion 6 years ago

    Best Turing video today. Tons of info, no box teasing bs.

  • @WyFoster
    @WyFoster 6 years ago

    I hope you guys are surviving the hurricane! I'm in NC here with you!

  • @papaalphaoscar5537
    @papaalphaoscar5537 6 years ago +2

    Hardware Unboxed has an actual teardown video!

  • @mr_jarble
    @mr_jarble 6 years ago

    Top marks my man, I have been dying for a real deep dive on this and as always you have delivered. On a speculative note, I wonder what the chances of a "Titan"-class card are for this generation, as it feels like they really should have bumped the VRAM from the 10X to 20X cards.

  • @willkern6
    @willkern6 6 years ago +6

    Was about to comment that it sounds like Nvidia is just pulling numbers out of thin air to make the new stuff look better... then I heard @6:06 :D

  • @Luredreier
    @Luredreier 6 years ago +2

    20:18
    Actually, you got those things mixed up.
    Ints are *more* precise than floats.
    However, floats are able to deal with a bigger number space than ints, take up less memory space when you start dealing with big numbers, and there are a lot of tricks you can use to make doing math with floats in hardware easier.
    Essentially an int is something like 200+0.3 while a float would be something like (2*10^2)+(3*10^-1); it's the exact same values being expressed, but with two different ways of expressing those values.
    200 vs 2*10^2 becomes more interesting when dealing with a really big number like 2 000 000 vs 2*10^6; both numbers represent 2 million.
    But if you were to try to represent 2 000 003 that way you'd get an issue, as the 2 in the beginning of 2*10^6 just isn't a big enough number to represent that "3" at the end of two million.
    The "2" just doesn't have a high enough precision to represent that value as a part of that scientific notation.
    Of course with floats and ints we're dealing with base 2 and not base 10 like scientific notation uses.
    But you get the idea.
    A float 32 has one bit defining whether the value is positive or negative, 8 bits representing the exponent (the "million" in two million, for instance) and 23 bits representing the fraction (the "2" in the two million number).
    en.wikipedia.org/wiki/Single-precision_floating-point_format
    For a float value that can be represented in both float 32 and int 32 the precision is the exact same.
    However, when doing math with two float values you'll sooner or later encounter values that can't be represented with a float, so you essentially end up having to round, which introduces an error.
    Think of the 1/3 and then 0.3333... * 3 on a calculator.
    You know that adding up that fraction three times leaves you with a whole number.
    But you just can't express the infinite value of 1/3 as a decimal value in base 10 so you end up having to do a rounding somewhere and that introduces an error.
    For an int value we know that those errors will always involve values that exceed the range we can represent as an int.
    With a float those values that can't be represented are located all over the number line, not just as infinite fractions but also the whole numbers in our own day to day math.
    You might want to read this stuff:
    www.toves.org/books/float/
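
To make the rounding behaviour described above concrete, here is a minimal sketch that pushes a few values through a 32-bit float and back. It assumes only the Python standard library; the helper name `to_float32` is made up for illustration. Note that with a 24-bit significand, float32 holds every whole number up to 16 777 216 exactly, so the base-10 "2 000 003" analogy only starts to bite above that.

```python
import struct


def to_float32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float and back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# Whole numbers up to 2**24 survive the round trip unchanged.
small_ok = to_float32(2_000_003.0)
limit_ok = to_float32(float(2 ** 24))

# One past 2**24 no longer fits in the 24-bit significand and gets rounded.
limit_plus_one = to_float32(float(2 ** 24 + 1))

# The largest int32 value is also not exactly representable in float32.
int32_max = to_float32(float(2 ** 31 - 1))

# Decimal fractions such as 0.1 have no exact base-2 form, even in 64-bit floats.
sum_is_exact = (0.1 + 0.2 == 0.3)

print(small_ok, limit_ok, limit_plus_one, int32_max, sum_is_exact)
```

So a 32-bit int and a 32-bit float agree exactly only on a subset of values; outside it, one of the two has to round or overflow.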

    • @Xilefian
      @Xilefian 6 years ago

      I see what you're saying and I'm inclined to agree with your definitions, but integer by definition is "whole number", so you can't have 200+0.3 (with integers in computer science that would generally equate to 200).
      Your description of the floating point rounding error is good, so is your description of the binary equivalent of the 1/3 = 0.3 problem, but those are a given in scientific notation in any given numeric base, it's not related to the concept of "precision".
      We say floating point has more precision as to us humans it's like we're able to address numbers between integers. Sure, the rounding error appears if you get too precise and you can't have 1/5th or 1/10th in a binary floating point, but representing 1/5th in floating point is still much more accurate than trying to represent it with a single integer number (even a fixed point integer).
      A floating point number can have integer values represented fine (up to the precision margin, as you described), but an integer can't represent floating point values, it doesn't have that level of granularity/precision.
      In computer science, you will see floating point described as scientific "precision" types - so the video gets the definition the right way around from a computer science point of view and is more 'correct'. Definitions in general are a tricky thing, though, as you have put forward the argument that floating point's rounding error and the 1/5 problem show that it isn't as 'perfect' as integer in representing numbers, but in the computer science world the precision is seen as how fine of a number can you represent. Aliasing 0.5 to an integer is not precise in the slightest as it will equate to zero.

    • @Luredreier
      @Luredreier 6 years ago

      +Xilefian
      Thing is, though, that you can write ints representing any and all of those decimal values too.
      You'll have to use software to keep track of where in a number chain the int values actually belong, but they're every bit as precise.
      In fact they're more precise within the ranges that you can represent with ints in the sense that the errors are more predictable/intuitive for a human mind.
      You can just use two int values, one for everything before the comma and one after and then use the leftover after any math and add that to the before comma numbers if you're adding two of them together.
      Doing the same with floating point values might lead to unexpected side effects within the ranges that can be represented there.
      Add another couple of ints and you can have scientific notation too and you'll know that your math will be correct both with the big numbers and the small ones between the whole numbers...
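
The "two int values, one before the comma and one after" idea can be sketched as follows. This is only an illustration under stated assumptions: the `SCALE` of 100 (two decimal digits), the tuple layout and the function name `add_decimal` are all hypothetical choices, and only non-negative values are handled.

```python
SCALE = 100  # assumed: two decimal digits after the comma


def add_decimal(a, b):
    """Add two (whole, frac) pairs, where frac counts units of 1/SCALE."""
    whole = a[0] + b[0]
    frac = a[1] + b[1]
    # Move any overflow of the fractional part across the comma.
    carry, frac = divmod(frac, SCALE)
    return (whole + carry, frac)


# 0.10 + 0.20 is exactly 0.30 here, unlike binary floating point.
exact_sum = add_decimal((0, 10), (0, 20))

# 1.75 + 2.50 carries one unit into the whole part: 4.25.
carried_sum = add_decimal((1, 75), (2, 50))

print(exact_sum, carried_sum)
```

The trade-off is the one discussed in the replies: the errors are predictable, but the range and the number of digits are fixed by whoever picks `SCALE`.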

    • @Luredreier
      @Luredreier 6 years ago

      +Xilefian
      I have no idea where your new comment has gone.
      And ok, I'll accept that calling it "scientific notation" was perhaps a mistake of mine.
      But 340282346638528859811704183484516925440 can be represented as one 32 bit float, or four 32 bit integers.
      Demonstrating why floats are popular.
      It's also interpreted as 340282350000000000000000000000000000000 by the IEEE-754 floating point standard.
      And sure, for things like games using ints would generally be anything but helpful.
      But you *can* use them both there (in GPUs) and in CPUs (both AMD and Nvidia have int calculation capabilities).
      And all CPUs do as well.
      All you end up doing to represent the comma in your math is to put the numbers in different registers and use a separate calculation to calculate the remainder when doing division and using a flag for any numbers moved up or down across the comma.
      I don't have a formal computer science background.
      I'm more of the hobbyist homebrew CPU kind of guy.
      Sure, you can keep in mind how floats behave and do reasonable math with floats without implementing things like decimals yourself.
      It's much easier to get inaccuracies, anything but "precise" numbers with floats.
      If you implement decimals etc yourself using ints you can decide yourself what you do or do not need to keep of the information.
      You *can* then save space by writing down how many decimal places those numbers are shifted in one way or other in base 2 so you get roughly the same number back when you expand it again.
      But unlike the fixed floating point standard *you* are the master of how many digits are in the exponent and how many are in the mantissa then.
      Or even if you *do* have a mantissa or not.
      When doing precise math I certainly prefer ints despite the disadvantages.
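
The "one 32 bit float, or four 32 bit integers" point can be checked with a short sketch. It assumes the number in the comment is the largest finite float32 (bit pattern `0x7f7fffff`) and uses only the Python standard library; the variable names are illustrative.

```python
import struct

# Largest finite 32-bit float: exponent field 0xFE, all 23 fraction bits set.
flt_max = struct.unpack(">f", bytes.fromhex("7f7fffff"))[0]

# Python ints are arbitrary precision, so this conversion is exact.
as_int = int(flt_max)

# Split the exact integer into 32-bit words, least significant word first.
words = [(as_int >> (32 * i)) & 0xFFFFFFFF for i in range(4)]

print(as_int)               # the 39-digit value
print(as_int.bit_length())  # how many bits a plain integer needs
print([hex(w) for w in words])
```

Only 24 of those bits carry information; the rest are zeros implied by the exponent, which is where the float saves its space.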

  • @raveutcars
    @raveutcars 6 years ago +1

    LOL at the 1 frame tech jesus photo. Took a lot of double clicking the start pause to see as couldn't be bothered to capture and go through frame by frame :D

    • @coreycarpenter2489
      @coreycarpenter2489 6 years ago

      I just found out that on the PC you can pause and skip to next frame with < > or the .

  • @invertexyz
    @invertexyz 6 years ago

    There's a lot of shader calculations that could benefit from faster integer/bool ops as well. Calculating pixel offsets or array indices, branching, writing shaders for pixel restricted games where you don't need floating precision, as a few examples.
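
As a rough illustration of the kind of integer and boolean work meant here, the sketch below does pixel-offset and array-index math without any floating point. It is plain Python standing in for shader code, and the function names are made up for the example.

```python
def pixel_index(x, y, width):
    """Linear buffer index of pixel (x, y) in a row-major image."""
    return y * width + x


def pixel_coords(index, width):
    """Inverse of pixel_index: recover (x, y) from a linear index."""
    y, x = divmod(index, width)
    return x, y


def checker(x, y):
    """1 on 'odd' pixels of a checkerboard pattern, 0 otherwise (pure bit ops)."""
    return (x ^ y) & 1


idx = pixel_index(3, 2, 8)
coords = pixel_coords(idx, 8)
pattern = [checker(x, 0) for x in range(4)]
print(idx, coords, pattern)
```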

  • @AaronMatlock
    @AaronMatlock 6 years ago

    Really informative video! You need some more research on int32 to make the comparisons to float you are expressing. To this day, a floating point core is a luxury in DSPs when power and size are important. They get around this limitation when fractional numbers are required by tracking the base 2 (binary) radix point and even using two int32 values for example to increase range and/or precision if needed. Yeah, this is my life for the past 20+ years.
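
Tracking the binary radix point by hand usually looks something like the sketch below. The Q16.16 layout (16 integer bits, 16 fraction bits) is an assumed example format, not something stated in the comment, and the helper names are hypothetical.

```python
FRAC_BITS = 16            # assumed Q16.16: 16 bits after the binary point
ONE = 1 << FRAC_BITS      # the fixed-point encoding of 1.0


def to_fixed(x):
    """Encode a real number as a Q16.16 integer."""
    return int(round(x * ONE))


def to_float(q):
    """Decode a Q16.16 integer back to a Python float."""
    return q / ONE


def fixed_mul(a, b):
    """Multiply two Q16.16 values.

    The raw product has 32 fraction bits, so shift right to get back to 16.
    On real int32 hardware the intermediate product needs a 64-bit register.
    """
    return (a * b) >> FRAC_BITS


a = to_fixed(1.5)
b = to_fixed(2.25)
product = to_float(fixed_mul(a, b))
total = to_float(a + b)   # addition needs no adjustment at all
print(product, total)
```

The programmer, not the hardware, is responsible for knowing where the radix point sits after every operation, which is the bookkeeping the comment describes.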

  • @FadNad0731
    @FadNad0731 6 years ago

    I love this content from you, so detailed and highly appreciated, thank you!

  • @Xilefian
    @Xilefian 6 years ago

    21:00 one example for "forward-plus" rendering could be having 3D integer buffers in a pre-pass that counts (and indexes) light affectors on a scene in camera-space before preparing that information for forward rendered lighting.
    That would be awesome as it would be a good step away from the current, rather standard, deferred rendering model which allows thousands of light sources, but requires heckin butt-load of memory for GBuffers, is bandwidth intense, isn't very friendly with multi-sampling (memory and bandwidth usage multiplies massively for each sample) and sucks for virtual-reality games that need to render as fast as possible and shouldn't waste time with pre-passes and rendering individual GBuffers.
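
A heavily simplified sketch of the light-counting pre-pass described above: it bins lights into 2D screen tiles rather than the 3D buffers the comment mentions, treats each light as a screen-space square, and runs on the CPU in plain Python. The function name `bin_lights`, the tile size and the light tuples are all assumptions for illustration.

```python
def bin_lights(lights, width, height, tile):
    """Count and index the lights overlapping each screen tile.

    lights: list of (center_x, center_y, radius) in integer pixels.
    Returns (counts, indices), both indexed by tile_y * tiles_x + tile_x.
    """
    tiles_x = (width + tile - 1) // tile
    tiles_y = (height + tile - 1) // tile
    counts = [0] * (tiles_x * tiles_y)
    indices = [[] for _ in counts]

    for light_id, (cx, cy, radius) in enumerate(lights):
        # Integer tile range covered by the light's bounding square.
        x0 = max((cx - radius) // tile, 0)
        x1 = min((cx + radius) // tile, tiles_x - 1)
        y0 = max((cy - radius) // tile, 0)
        y1 = min((cy + radius) // tile, tiles_y - 1)
        for ty in range(y0, y1 + 1):
            for tx in range(x0, x1 + 1):
                t = ty * tiles_x + tx
                counts[t] += 1
                indices[t].append(light_id)
    return counts, indices


counts, indices = bin_lights([(8, 8, 4), (32, 16, 10)], width=64, height=32, tile=16)
print(counts)
```

A forward shading pass would then loop only over `indices[t]` for the tile a pixel falls in, instead of over every light in the scene.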

  • @DNAReader
    @DNAReader 6 years ago

    Steve, about the CPU/GPU split: all the game calculations go through the CPU, and the integer units can do small machine learning models (through CUDA or their game API), so the GPU does rendering on the FPUs, RT rendering in the RT cores, and the Tensor Cores, integer units and mixed precision will do machine learning for rendering denoising or anti-aliasing, if I had to guess. Now the challenge is powering all that up... power hog GPU??

  • @TheStigma
    @TheStigma 6 years ago +3

    Hmmm, the sound-tracing aspect of this is actually pretty appealing in theory.
    It's all up to the implementation obviously, but this has been a lacking area in sound design for a long time where intersecting walls and objects either don't affect the sound at all - or it's just approximated extremely simply, resulting in a lot of edge cases that just don't make a lot of sense or sound wrong. I think you could get a lot of bang for the buck here on the hardware since I expect you don't need anywhere near the same precision to do a convincing sound-trace as you need to do a ray trace of a scene.
    That said you probably need good support in the game engine to do this properly. You probably need good support for "materials" so that a padded wall doesn't interact with sound the same as a stone wall. Luckily, materials have been a thing for a while for other reasons so the framework should be there to build on.

    • @S.ASmith
      @S.ASmith 6 years ago

      Well, given Nvidia seem to be adding more support to Vulkan with the Turing series of cards and ray tracing is being pushed out with Vulkan over DX12, it'd be good to see audio implementation from Nvidia and AMD too.
      Navi might bring back HSA and change things up a bit to implement some form of ray tracing on the AMD cards say next generation or the one after that. I don't think AMD will let this go unanswered & will probably be dumping their R&D budget into GPUs. Especially since Intel announced their plans to revive that discrete GPU department.

    • @TheStigma
      @TheStigma 6 years ago

      No, AMD will respond no doubt. I think ray-tracing is inevitable - I just don't think it will be worth it this gen for serious gaming (or maybe even next) but I am glad to see the first baby steps happening. Inevitably it will be the way to do things going forward.
      I just hope that all this new tech going forward will be cross-compatible (implemented via open standards is probably too much to hope for). Tired of seeing so many lock-in tactics being used. It's probably inevitable for raytracing now, but I hope it won't be a thing on soundtracing too. It's double-work for developers and extra headaches for the consumers. (just look to Gsync/freesync for a prime example. Nvidia is largely to blame on this one to be fair).

  • @maxfmfdm
    @maxfmfdm 6 years ago

    You don't need to bash to make a point. This was a really great breakdown of the tech and its inner workings. Saying "but" and then cutting implies that Nvidia's formula for performance is inaccurate, but it is likely a valid metric. What should be questioned is if the metric even matters. Will 1-2 ray traces per pixel in select parts of the screen with AI denoising and the requirement for developer support matter in a significant way to the consumer? This is where the real question lies and the real doubt of the technology lies. It's childish to attack it from the angle that you did in the way that you did. Otherwise keep up the great work. Very good breakdown.

  • @stevenarvizu3602
    @stevenarvizu3602 6 years ago

    God damn it Steve, I wrote out a fat ass paragraph asking if the benchmarks were allowed yet and you answered it before I could finish my paragraph lol, I'll start waiting til the end of the video to comment

  • @XFourty7
    @XFourty7 6 years ago

    What Steve manages to make a real 30 minute video on will be turned into 30 videos on other channels lol :X Thanks for the level of information density dude.

  • @michaelthompson7217
    @michaelthompson7217 6 years ago

    You can actually get way more precision and speed with integer fixed point math than floating point. Floating point is a convenience, fixed point is speed, power and precision. Both represent the same set of numbers.
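
One way to compare the two formats is to look at the gap between neighbouring representable values. The sketch below assumes Q16.16 as the fixed-point format (an arbitrary example choice) and measures float32 spacing by bumping the bit pattern by one; it only handles positive, finite inputs that are exactly representable in float32.

```python
import struct

FIXED_STEP = 2.0 ** -16   # assumed Q16.16: the same step everywhere in its range


def float32_spacing(x):
    """Gap between x and the next larger float32 (x positive, finite, exact)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    next_up = struct.unpack("<f", struct.pack("<I", bits + 1))[0]
    return next_up - x


near_one = float32_spacing(1.0)          # float32 is finer than Q16.16 here
at_crossover = float32_spacing(128.0)    # both formats have the same step here
near_thousand = float32_spacing(1024.0)  # Q16.16 is finer than float32 here

print(near_one, at_crossover, near_thousand, FIXED_STEP)
```

Under these assumptions the two formats do not cover exactly the same set of numbers: fixed point is uniformly spaced over a limited range, while float spacing grows with magnitude, so which one is "more precise" depends on where the values live.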

  • @josequintero8483
    @josequintero8483 6 years ago

    This is a better explanation than other top-subscriber channels.

  • @HeavenlyStrike1
    @HeavenlyStrike1 6 years ago

    The only worthwhile unboxing video.

  • @techfusionaz2496
    @techfusionaz2496 5 years ago

    Golly, now I completely understand why there are way more triple slot cooler designs for Turing, once games take advantage of all the cuda, tensor, and RT cores, that's a heck ton of power and heat.

  • @HaouasLeDocteur
    @HaouasLeDocteur 6 years ago

    The frame at 27:31 was dope

  • @Krommeniedijk
    @Krommeniedijk 6 years ago

    This video with lo-fi on in another tab is the fucking best.

  • @myownalias
    @myownalias 6 years ago

    Thank you Tech Jesus for not doing an unboxing, I watched one, and it was Jayz... Although most of what was said went over my head, I like Steve's presentation style, I love the no BS attitude, and the BUT... is comedy gold. No teardown, what are Nvidia hiding?

  • @malcolmsovani1275
    @malcolmsovani1275 6 years ago

    It's a little too much TECH/numbers/random 2-20 letters and numbers together, so it's over my head. Great video, I understand almost nothing, have a good day

  • @cliffs1965
    @cliffs1965 6 years ago

    Tech...
    OVERLOAD! :-)
    Good job GN!

  • @johndoeh6004
    @johndoeh6004 6 years ago

    @Gamers Nexus 27:31 "Highly Selective" , I watched that over and over a good ten times..

  • @jimmahT
    @jimmahT 6 years ago

    Good informative video.
    Wish we got to hear what you were about to say around the 6:18 mark.

  • @pioter229
    @pioter229 6 years ago +1

    One question:
    If a game is DX11, does it mean that only 1/3 of the GPU is used? No INT32, no FP16, and the part for RT is like off? So it uses a % of the power (in watts too) and has lower temps. So the first reviews mostly won't show full GPU usage, and temps/watts will be lower?

  • @zywypl
    @zywypl 6 years ago

    Turings are a great start for RT in gaming. People bitch about RTXs being capable of doing RT in "only" 1080p 60FPS, but think about it - before, we didn't even get RT at 600p 24FPS; they basically showed a Ferrari when people were adding a second horse to the chaise. Next-gen 7nm GPUs will bring RT to the mainstream, just imagine how many cores/tech NV will be able to put into them.

  • @OverlordActual
    @OverlordActual 6 years ago +5

    "BUT!......*next scene*". Wait lol.

  • @benitollan
    @benitollan 6 years ago

    30:03 lol that sounded like a movie sound effect xD

  • @elvandar
    @elvandar 6 years ago

    That pause... Orson Welles would be proud.

  • @skoopsro7656
    @skoopsro7656 6 years ago

    Already preordered on unveil day! SUPER EXCITED!!!!!!!!!

  • @ddmath
    @ddmath 6 years ago

    Great work, but apparently some of the tech tube folks didn't get the email and tore down their cards, as you can see in at least one video that I will not link. The person is in Australia though, and perhaps NVidia sent the email a tad late due to time zone confusion?

  • @sidewinder666666
    @sidewinder666666 6 years ago

    I think this is the first GN video I've watched where I haven't understood one damn thing, lol. Upvoted anyway.