Patrons can now vote for the next video! Thank you for your support. Also, some additional links from the video: ❤Patreon: www.patreon.com/simondevyt 😍Courses: simondev.io WolfFire Games Article: blog.wolfire.com/2010/10/Imposters
Another good video that talks about this is "Rasterization, Overshading, and the GBuffer" under the "An In-Depth look at Real-Time Rendering" Series on the Unreal Learning Centre.
Ok so not to be an ass but you used footage from Wolffire games website and it would be cool if you could link them in your description. They are a fantastic developer and deserve more recognition
@@lazygenie5616 Oops, I clearly wanted to credit everybody since I added their links in the video itself, and tried to include them all in the description. I've edited the pinned comment to include anything missing.
This is the stuff about gamedev that almost nobody talks about, and if they do, it's way too indepth. I love channels like this. You're doing a lot for making people understand how complex and smart Game Engines and Render Tech actually are.
Hey guys welcome to my game tutorial. We're going to make a complete game from scratch. 2 episodes. Last video was rendering a jpg to the screen. Last upload 3 years ago. Every time.
@@aeoliunyou should not need a tutorial to make a game from scratch I'm not the best or most knowledgeable coder but I am making my own game engine using c++ and I mainly use c# but I'm getting it done
@@aeoliunhonestly I kinda get them. Very few people will watch it in comparison to a much faster tutorial on an already built game engine. Imagine the dedication and free time required to make such a long thing... If doing it alone seems awfully long, imagine having to record AND edit that video (potentially)
@@youtubeliskHave fun knowing this :) Knowledge is its own end, imo. It's just fun to learn, when the education is properly executed. Might as well ask what people will do with their time playing video games- except this has a small chance of being slightly more practical
This also really helps explain how 4k gaming is possible on these GPUs. In terms of GPU usage efficiency, smaller pixels is effectively the same as having larger triangles. So while 4k screens have 4x as many pixels, you're also throwing away far less work that the GPU is doing, which helps regain some of the performance loss
im wondering if eventually stuff like nanite gets integrated on a software level for gpu drivers and the architecture architects just go yolo and increase 2x2 quads to 3x3 nines or 4x4 sixteens. or a mosaic pattern of pixel groups that mimics and has statistically highest coverage with most common triangles, so that less culling is in place.
Conversely it also explains why using technology like DLSS to render at ever smaller and smaller resolutions doesn't improve the performance nearly as much as you'd expect. Like, you'd think rendering at 480p should be orders of magnitude faster than rendering at 4k, since there's orders of magnitude less pixels, but it's only a little faster.
@@MustacheMerlin *"Like, you'd think rendering at 480p should be orders of magnitude faster than rendering at 4k, since there's orders of magnitude less pixels, but it's only a little faster."* Errr.... Rendering at 640 x 480 IS a buttload faster than rendering at 3840 x 2160. Just as one would expect. DLSS involves a whole lot of processing, just like AA or anything else. And so --- just as one would expect --- it doesn't deliver the same performance as actually rendering at a given nominal resolution.
Nanite rendering is absolutely destroying the performance in Remnant II, along with "smooth framerate" I know positively because using a mod to disable them increased performance so much I turned off upscalers, at 4k. So Ive been looking into nanite trying to understand why this tool for making and running high levels of details could also destroy performance, for little fidelity benefit
One LOD technique I like a lot especially in the mobile world where nanite-like LOD engines currently aren't feasible is Progressively Ordered Primitives/POP buffers. The core idea is to cluster vertices through quantization at different levels of precision and sort them such that lower-precision vertices are first in the vertex list and higher-precision ones are last. The end-result is you can change the LOD of a model just by changing the quantization level and how many vertices you choose to draw without storing any more data than original mesh used. The benefit is four-fold: - Artists only need to make one model with any arbitrary attributes - Cracks/seams can be handled perfectly - Can have dynamically adjustable LOD levels without popping from mesh swaps - Can stream in the extra vertices as they're needed or upload the whole vertex buffer once and instance multiple LOD levels in one draw call. The latter is useful as draw calls are still disproportionately expensive on mobile. Since you're sorting the vertex list anyway (which can be done in O(n) time) and vertex order within quantization clusters doesn't matter much, you can also sort them on a secondary level to maximize vertex fetch efficiency which is important for mobile because of binning (more-so than overdraw since tile-based deferred renderers often have near-perfect hidden surface removal). The neat thing is you can scale the quantization level according to how large quantized grid would make triangles appear on screen to help maximize quad occupancy while maintaining enough detail for it to look good. It does have some drawbacks, notably that it doesn't play well with vertex skinning and lower LOD levels tend to have a bit more triangles than hand-made LODs, but it's great in a mobile environment or for procedural mesh LODs. As a side note, optimizing for mobile tile-based deferred-rendering is a lot of fun and it feels so much more rewarding to make a mobile engine run fast. Most mobile developers just port PC games or graphics techniques to mobile as-is and call it a day while limiting gameplay to negate poor performance; however, with careful optimization you can achieve between Xbox 360 and Xbox One levels of performance on most modern (>5 year old) mobile hardware. I'm definitely biased though as I've found my niche in mobile optimization.
That sounds amazingly cool! Can you say what you've worked on? (Sorry for going of on a tangent here, I'm just musing 😅) I've always found the capability of these mobile chips to be bizarrely good, so seeing stuff like full Resident Evil running on an iphone wasn't *that* shocking, but I've found it weird and a bit disappointing that despite this the market doesn't seem to care about it at all. The Steam Deck and Switch's popularity implies there are absolutely potential buyers, but why aren't they biting? It's far cheaper to get a game controller attachment than these devices after all, so it shouldn't be input. Is it just *relative* to the Candy Crushes of the world that they don't show up? Poor app store presentation? Investors not willing to back it?
@@SimonBuchanNzThis is due to a couple of factors. 1: The mobile market has been dominated by F2P monetization models since the mid-2010s. Paying even $5 for a fully-fledged game was a thing of the past a very long time ago, especially when you consider regions that the mobile market is at its most popular in: the whale-hunting model is especially effective. 2: The Switch is a 7-year-old console powered by a 12-year-old chipset. High-end smartphones exceeded the Switch’s hardware capabilities years ago, especially Apple hardware. 3: The iOS port of RE8 is exclusive to the highest-end versions of Apple’s highest-end smartphone, limiting potential audience significantly. 4: Resident Evil 8 is built upon a game engine *designed* for modern scalability on top of being a game originally built for last-generation home consoles as-is, as opposed to now where the hardware difference between the average smartphone and a current-generation console is somewhere in the ballpark of 3000%. 5. Newer console hardware and game engines has allowed developers to spend less and less time needing to optimize their assets, but it’s still faster and easier to build your game from-the-ground-up around the hardware you’re targeting anyway, especially if expensive graphical effects are integral to gameplay systems and your game’s art direction.
I don't see how that would work. You could order the vertex buffer, sure, but you'd still need entirely separate index data for each optimization level.
I like that Epic's approach to finding out that subpixel polygons kill render times isn't to tell artists to avoid intricate geometric detail and to make more LoDs, it's to double down and make a system that lets the artist go absolutely wild with detail and not need to even think about LoDs.
There is one downside to it however, mostly from inexperienced devs working with Nanite, people are using Nanite as excuse to have extremely high detailed models in the game, even for things the player will never be close enough to notice. While Nanite alleviates many issues with the performance of such assets, it leads to far, far larger game sizes as many people just throw high detail photogrammetry scans into the game. Just because Nanite can handle automatic LOD doesn't mean no effort should go into optimizing the base mesh still, just that you don't need to author LODs manually, the base model still shouldn't be far more detailed than it needs to be, we already have 300GB games now, we don't need 500GB games because people throw in hundreds of 5-10GB models.
14:30 the "maximum area" triangulation improving performance so much is news to me! I usually create slices since it looks "nicer", but I should probably think more about the final result.
optimize when you know your targets and know you're going to need it. you can't "optimize" everything. it's wasteful. if you can do "worst case" example scene and figure out how well it runs on the hardware you're targeting, then you might be able to get some inkling early of how to balance things out and spend your texture/triangle budgets. For the art itself, most people suggest trying to get even, quad-only topology. They don't tend to worry about triangulation performance. Stuff like UV density and how textures stretch when animated tend to dominate. For example, I wouldn't even bother trying to act on stuff like that "maximum area" example vs triangle fans, etc. That was more a tech demonstration to exacerbate a problem, rather than sound art advice. That all said, it can be good to avoid long, thin stuff when possible and to prefer workarounds, especially at lower LODs. For example maybe it's best at low LOD to have a flat quad with an alpha cutout texture instead of a lot of polygons that form boards on a bridge. And when making LODs, it can be good to zoom an object out to its max distance it will appear at that LOD, look in wireframe, and see how big polygons are compared to pixels they cover. That can give you some good ideas of how to spread out detail, and where to reduce detail. Or, if you're using UE5, enable nanite. Then you don't have to care :D
The maximum area is probably also the worst looking. Many rendering techniques look way better when the density of triangles is uniform along a model surface. And this is more important than small optimization.
@@blarghblarghActually, for the “maximum area” triangulation I found a method in Blender that makes these triangulations super easy to construct. Step 0: Bind Checker Deselect to a key of your choice(here I’ll use Mouse5) Step 1: Select the circle loop Step 2: Alternate Mouse5 and F until you’ve run out of smaller triangulations/hit an edge Step 3: Select all the overlapping faces and hit Delete, then “Only Faces” Step 4: Select the edge loop, it’ll automatically select every edge in there Congratulations, you now have an optimally-triangulated circle.
@@thecat8411 I usually do the triangle fan out of habit, even on flat geometry or the bottoms of things. There are certainly cases where it'll look worse, but also cases where I did more work (usually extruding a default circular face and merging it to a point) for less than zero gain. Blender's default circle triangulation looks nearly the same as this "max area" algorithm.
Really enjoyed this! As an artist you get told what things to avoid - e.g. long thin triangles and extremely small triangles, but it's rare to get a good explainer on *why*.
Dev might want it to be optimal, but the company suits at the top just care about fast releases and pleasing shareholders, which puts optimization at the rock bottom on the priority list.
Ill confess, even up until about 10 minites in I has assumed, naively, that I did actually understand why billboards were so kuch better for performance, with thoughts along the line of "reading a precalculated viewing angle from a file/ram is MUCH cheaper than doing the matrix multiplication to calculate the physics perspective appearance of even a very low poly 3D mode". When you went through how the physically arcitexture of the GPU differently handles triangles with a size close to the pixel size, it was so mind-blowing. I haven't realised I was so starkly wrong about something in a while, such a great feeling!
That may have been the case a very long time ago. In 2016 my laptop still ran vertex shaders on the CPU. Intel GMA architecture - absolute garbage. Thankfully Intel stopped using that architecture and put real GPUs on their chips.
I am an absolute novice at game dev whos been toying around in Unity and now Godot for about two years, and I have to say each one of your videos feels like I should be paying you for this kind of info. The fact that this is your FREE content is insane, and I'm excited to see what your paid content looks like. You have a knack for beginning small, simple, and approachable, and then expanding to the point that I'm pausing and writing things down and yet still not feeling overwhelmed. I've read through documentation and white papers before for plenty of other coding subjects, but nothing has ever made me WANT to like your videos do.
MK11, and to a lesser extent MK "1" (12), have some of the most impressive and advanced rendering in realtime graphics. MK11 looks almost as good as some UE5 games today despite being made on UE3. They're aware...
I don’t think I’ve seen anything that talk about this stuff in this much detail, much respect for your career choice, I am truly amazed at the information dump and how accessible you’ve made it, thank you.
The moment I saw the 2x2 grid in the thumbnail and you mentioned LODs I knew exactly that this is gonna be about workflow scheduling for fragment shaders. Great video as always
@@Draganox25 I've seen some buggy videos, Maybe they fixed them then. My system is low end. Can it improve chunk loading speed if I keep the render distance pretty low?
Wow, this really explains why Nanite was such a breakthrough, of course it doesn't dig into the difficulty of implementation, but it does show how it eliminates the excessive tiny triangle issue.
@@lanchanoinguyen2914 we don't have a clear picture when that transition will happen, so a solution that solves problems now is still valuable, and understanding the limitations of those solutions is also valuable. it's like saying "eventually we'll have cheap consumer space travel, so who cares about light rail?". maybe not as extreme, since full scene tracing with no rasterization MIGHT be plausible within the next 10 years. but I kinda doubt it. not only does the hardware that supports such a thing have to come out, but everyone has to have bought it and phased out older hardware, too. unless you're saying current high end gaming hardware is already capable of doing zero rasterization and getting all the same level of effects. I am not sure I've heard anything that says that's true.
I doubt rasterization is going away any time soon, considering every GPU available today has a rasterization pipeline, and future GPUs have to keep a rasterization pipeline to maintain compatibility with - any web browser - any game on Steam (or GOG, or the Epic Games Store, etc.) - any commercial software Realtime raytracing is a neat addition to modern graphics cards, for sure, but it's not a silver bullet by any means.
Amazing. Not only does this explain why intuition fails, but you also back up the technical reasoning with real industry solutions. Watching your videos, I somehow always come out with more information than I was expecting. Well done!
Very well done. Even covered some stuff that never gets covered, like how billboards are used along with LOD and occlusion culling. They never talk about billboards. First time I made a model doing all this, I was floored at how well it worked. This stuff changed everything back in the day. You also said one of my favorite words, lol ... Automagically.
One iconic example of billboards is the infamous 1000 Heartless fight in Kingdom Hearts 2. Back then, it would've been impossible to have 1000 entities individually moving and acting all at the same time. Square Enix's work around to this was to have only have a handful of active enemies actually nearby. Meanwhile, the rest of the Heartless would be represented by these 'billboards'. While in combat, it's hard to notice this detail at first, but it's extremely obvious on repeat playthroughs. This work around is also present when there's a swarm of Rapid Thrusters, except it's a lot less noticeable since the enemies are flying above you and often spawn offscreen rather than right in front of the player.
It's similar with the cabin fight in the RE4 remake. You will see the ganado horde on the outside of the cabin, but they don't react to your shots and only a small amount of ganados is trying to enter the cabin at a time
Amazing video. Really like when I stumble on such high quality content. Would really like a part 2 covering Nanite and maybe alternatives other people developed
Creators like you kept me motivated and today I'm a 3D and tech artist in my team. I'm not hardcore into graphics programming (yet) but I'm learning! One baby step at a time.
You are an excellent teacher, the language and presentation was so easy to process even though I am only tangentially related to the topic, thanks for making this!
This is an amazing video. Really good information in a short and clear video. And this is information that is not found too often on RUclips. Much appreciated!
This video is an absolute gem, Really thank you Simon for putting this up and making this so straightforward. Really got some deep insights about how GPU's work.
In 2004, I was 4. Feels like I should've been a graphics engineer and bought a house, instead of being a little child. Shame on me. Seems that I've destroyed my opportunities with this one simple mistake.
Very informative. I'm not a game developer but I enjoy understanding some of the complexities. I didn't realize that small triangles were an issue. Not being familiar with how gpu's work at that level of detail I'd figured a triangle is a triangle.
You have by far the best game optimization content I’ve come across, can really tell that you know what you’re talking about and not just repeating words said by someone else. Really love the way you explain things with just the right amount of information for you to be able to understand these things fundamentally and really grasp the mechanics 🙏🙏
I am thankful to my brain for deciding to click on this video after looking at the thumbnail for 10 seconds. This is one of the most grateful ways I discovered a brilliant new channel. Amazing work!
I just found this through a reddit comment and I didn't knew how much I needed to know this. Thank so much for explaining. This is so valuable and I definetely will buy ad start with your game math course. Awesome stuff
In intro to 3D modelling we were told very sternly, "keep the triangle count to a minimum, and remove as many unnecessary triangles as possible. So this is nothing new to my ears... but it is fun to listen to anyways.
Great video! Very nicely explained, and truly an unexpectedly fascinating topic. It never even crossed my mind to ask whether there was more to why LODs work. Super interesting stuff.
These are the kinds of videos that inspire me to one day dive into the graphics side of computers. I've always wanted to touch shaders and 3D modeling, but it has always felt beyond my understanding. This helps.
You just explained what is, really, a constant, complex technical process, in a way pretty much anyone can follow and understand, and that's more than can be said for a lot of teachers/professors... I knew, for instance, a good amount of what was covered here, just accumulated knowledge over the years of gaming and satisfying my curiosity, and guessed cranking up sheer volume of triangles would tank FPS pretty quickly... But now, I actually understand *why*... Thanks for the insight 😍
because of you i found Lexx, been trying to find it for the past 15-20 years :D thank you, subscription deserved from all the info (and the extra one)!!!
Very interesting video, I’ve always wondered why more/smaller verticies were taxing on preformace and I’m glad that you made a video explaining it in a easy to understand manner
Amazing video as always man! I have heard about this on the surface in my career, never found something that covers this subject with so much ease before, thanks
Thanks for this deep dive. You should use frame time instead of FPS on your plots though. As FPS is a reciprocal and thus makes reading the plot much harder.
Yeah, I had to choose 1 of 2 ways. I try to err on the side of caution, and go with what I *think* more people will understand intuitively. Go with fps, that's broad, less technical people will understand this but at the annoyance of technical people. The technical people should still, in theory, understand just fine though. Go with frametime, you end up potentially confusing less technical people, but it's a more straightforward metric for technical people. Either way, you're wrong for a % of people, unless you spend extra video time explaining your choice.
@@simondev758 Funny that you mention Humus in the video, I read the FPS vs. framerate post only today. It annoys him so much in scientific settings but for the average person 60 FPS is more intuitive than 16.666 MS frame time. I'm trying to benchmark my GLSL shader library, in your experience Simon, any other pitfalls when benchmarking fragment shaders or presenting results in an academic setting? Love the videos by the way, you got me to recreate a OSRS demo in Three.js.
@@simondev758 Thanks, that’s a fair point, I may actually watch tutorial videos on shaders to see how they’re explained. I’m presenting to those with a broad knowledge of CS but not necessarily anything graphics related.
This voice is so recognizable that I only watched 1 video a year ago after stumbling on it while on my game dev journey Now, another video pops up suddenly and I click on it, just to recognize the voice immediately 😂
What a great video, I kinda always wanted to understand the performance optimization techniques better, but just didn't really had time to do the digging. Thank you for your effort!
Personally learn this naturally from playing Just Cause 2. Amazing game that use this extremely well. popping in/out is noticeable if you look for it but if casually playing it doesn't take you out of the moment.
I'm not a gamedev, but surprisingly I can follow and understand the explanation. Simon here really filters out what's important and deliver it in an easy to understand manner. on 0:36 mark, I know this will be a good one.
The Crew Motorfest uses imposters (or, as I got to know them many years ago, sprites) for trees when you're using the map and zoom out. It's a nice throwback to when I started PC gaming.
My takeaway. Please let me know if I'm wrong: small triangles are less efficient with their quads and thus are inherently more inefficient but the main reason they are inefficient is that the triangle assembly is a linear process that assums you will have less triangles than pixels and gets bogged down when it can't keep up with the assembly load. So the bottleneck is the assembly.
Great video, even tho I doubt I'll be working as a game dev. I am always happy to learn more about optimisation methods, who knows I might make my game as a small hobby passion project on the side
Absolutely fantastic video ! Now I finally understand understood how triangle size and shape affects performance ! Although in my journey in graphics , the post fx has always been more expensive than just triangles
thanks for sharing as a solo gamedev who do read scientific research papers on the nitty gritty details but done nothing impressive I really appreciate simpler explanation that comes from experienced developer. thx again
I'm also interested in learning about middlewares like sinplygon and unitys scriptable rendering pipeline. I.e visibility buffer, forward+ and bindless texture
Love this - this is not at all how I think of the gpu pipeline, so this way eye opening. It would be very interesting to hear about instancing and batches in the context of all this. For example, I presume that even though instancing saves a bunch of one kind of work, it does not save any work when it comes to this step. Or maybe I’m wrong! And I’ve always heard that batch count, or alternatively setpass call count, was the biggest bottleneck. So for something with a ton of different impostors or LODs, that might really reduce the triangle count on screen, but could double or quadruple the batch count. Obviously tuning is a series of tradeoffs, but I’m very curious on your take on this.
Instancing mostly saves you on that "driver" step, in that you don't have to make X calls to the API in order to draw X copies of an object. Performance for GPU's is complex, simple rules like watching your batch count and using LOD's work, but they're just that, simple rules that are meant to cover up the complexity.
Of course wasted draw calls affects both cpu and gpu performance.When cpu has performance issues,it's much worse than gpu because you can notice the dramatic fps drop.Draw calls are your enemy not tiny triangles because draw calls *start* the entire rendering pipeline not triangles,it's simple like that.Drawing one LOD at a time doesn't increase any draw call at all logically.
This sounds like H. Jon Benjamin is teaching me computer graphics, and it's amazing. Great explanation of 2x2 quads, friend! I'd never really understood how to optimize for them before and I'm already excited to apply some of these tips.
Don’t draw triangles smaller than one pixel. Who would have thought lol. I experienced this first hand by doing a crude test of a shader I was working on by setting my virtual screen resolution to 4x the real screen resolution. I was kind of surprised how much the frame rate tanked, it seemed disproportionate, but now I know why (maybe)
Back in the ‘90s, Marathon used “imposters” for all of the moving characters. They were all sprites. Animated sprites with versions from 8 separate angles.
as a aspiring game dev with multiple game devs under my leadership. I HIGHLY appreciate the information. ITS GOLDEN. If i ever get any funds to throw away your getting the first bit of it.
This is good info. Mindless gamers fall for the mythology of "GPU is a God and can do everything faster than a CPU. CPU graphics are for noobs and old people." And yet, a software rasterizer can be 3x faster than a GPU (21:42). And I've seen another example of a software rasterizer that gets over 500 FPS. My entire engine is software/CPU only (except the OS probably blits the final image using a GPU), without any optimizations so far, and it runs faster than a human can perceive the frame rate. I wish more devs would utilize 8 or 16 CPU cores and stop blindly worshipping the GPU God so much. Instead, we get stuck with the "moar corez doesn't matter, devs doesn't know how 2 prgram teh moer corz!" propaganda myth.
In this context, software doesn't mean running on the CPU, rather it means that Epic wrote their own rasteriser in a set of compute shaders that bypasses the GPUs built-in rasterisation hardware. It's still running on the GPU due to using compute shaders, but it's using software to do the heavy lifting rather than hardware.
I had a scene I was prepping to add into a vr game, and was unhappy with the default low ploy trees; it took a lot of time, but I finally found a solid imposter setup that detailed branches (from a few feet away anyways, if you gave it physical distance) which overlaps the image. I don't intend to have the branches move, so I figure I can just set them all to static if I can figure out the normal maps (it is horribly buggy at times even with regular alpha maps or the like), though there is this one technique I hope to use where you only have 1 draw call for same objects (as they have rotation differences, I figure it should still work as they are the same size).
Patrons can now vote for the next video! Thank you for your support. Also, some additional links from the video:
❤Patreon: www.patreon.com/simondevyt
😍Courses: simondev.io
WolfFire Games Article: blog.wolfire.com/2010/10/Imposters
Another good video that talks about this is "Rasterization, Overshading, and the GBuffer" under the "An In-Depth look at Real-Time Rendering" Series on the Unreal Learning Centre.
of course none of it matters now because we got Nanite :O
Ok so not to be an ass but you used footage from Wolffire games website and it would be cool if you could link them in your description. They are a fantastic developer and deserve more recognition
@@lazygenie5616 Oops, I clearly wanted to credit everybody since I added their links in the video itself, and tried to include them all in the description.
I've edited the pinned comment to include anything missing.
Should it be better if we had hexagonial displays instead of square pixels?
This is the stuff about gamedev that almost nobody talks about, and if they do, it's way too indepth. I love channels like this. You're doing a lot for making people understand how complex and smart Game Engines and Render Tech actually are.
Hey guys welcome to my game tutorial. We're going to make a complete game from scratch.
2 episodes. Last video was rendering a jpg to the screen. Last upload 3 years ago.
Every time.
@@aeoliunyou should not need a tutorial to make a game from scratch I'm not the best or most knowledgeable coder but I am making my own game engine using c++ and I mainly use c# but I'm getting it done
@@aeoliunhonestly I kinda get them. Very few people will watch it in comparison to a much faster tutorial on an already built game engine. Imagine the dedication and free time required to make such a long thing... If doing it alone seems awfully long, imagine having to record AND edit that video (potentially)
What do you do with this knowledge?
@@youtubeliskHave fun knowing this :)
Knowledge is its own end, imo. It's just fun to learn, when the education is properly executed.
Might as well ask what people will do with their time playing video games- except this has a small chance of being slightly more practical
This also really helps explain how 4k gaming is possible on these GPUs. In terms of GPU usage efficiency, smaller pixels is effectively the same as having larger triangles. So while 4k screens have 4x as many pixels, you're also throwing away far less work that the GPU is doing, which helps regain some of the performance loss
im wondering if eventually stuff like nanite gets integrated on a software level for gpu drivers and the architecture architects just go yolo and increase 2x2 quads to 3x3 nines or 4x4 sixteens. or a mosaic pattern of pixel groups that mimics and has statistically highest coverage with most common triangles, so that less culling is in place.
@@vanqy. You can't put that type of thing on the driver level, maybe the API level.
Conversely it also explains why using technology like DLSS to render at ever smaller and smaller resolutions doesn't improve the performance nearly as much as you'd expect. Like, you'd think rendering at 480p should be orders of magnitude faster than rendering at 4k, since there's orders of magnitude less pixels, but it's only a little faster.
@@MustacheMerlin *"Like, you'd think rendering at 480p should be orders of magnitude faster than rendering at 4k, since there's orders of magnitude less pixels, but it's only a little faster."*
Errr.... Rendering at 640 x 480 IS a buttload faster than rendering at 3840 x 2160. Just as one would expect.
DLSS involves a whole lot of processing, just like AA or anything else. And so --- just as one would expect --- it doesn't deliver the same performance as actually rendering at a given nominal resolution.
Nanite rendering is absolutely destroying the performance in Remnant II, along with "smooth framerate" I know positively because using a mod to disable them increased performance so much I turned off upscalers, at 4k. So Ive been looking into nanite trying to understand why this tool for making and running high levels of details could also destroy performance, for little fidelity benefit
thank you so much for saying "impostor" at least a dozen times while not making a single among us reference, im proud of you
I didn't even think of among us when watching the video, my brain must be healing.
@@agushernandezquiroga9064 agreed. I think i am healing too
I haven’t thought about among us for 1 years.
GET OUT OF MY HEAD GET OUT OF MY HEAD GET OUT OF MY HEAD 📮🔪
oh hi tux
"Mommy, Daddy, where do pixels come from?"
SimonDev: "Sit down son"
See, when tree vertices love each other very much...
Comedy gold 🔥🤣
One LOD technique I like a lot especially in the mobile world where nanite-like LOD engines currently aren't feasible is Progressively Ordered Primitives/POP buffers. The core idea is to cluster vertices through quantization at different levels of precision and sort them such that lower-precision vertices are first in the vertex list and higher-precision ones are last. The end-result is you can change the LOD of a model just by changing the quantization level and how many vertices you choose to draw without storing any more data than original mesh used.
The benefit is four-fold:
- Artists only need to make one model with any arbitrary attributes
- Cracks/seams can be handled perfectly
- Can have dynamically adjustable LOD levels without popping from mesh swaps
- Can stream in the extra vertices as they're needed or upload the whole vertex buffer once and instance multiple LOD levels in one draw call. The latter is useful as draw calls are still disproportionately expensive on mobile.
Since you're sorting the vertex list anyway (which can be done in O(n) time) and vertex order within quantization clusters doesn't matter much, you can also sort them on a secondary level to maximize vertex fetch efficiency which is important for mobile because of binning (more-so than overdraw since tile-based deferred renderers often have near-perfect hidden surface removal). The neat thing is you can scale the quantization level according to how large quantized grid would make triangles appear on screen to help maximize quad occupancy while maintaining enough detail for it to look good.
It does have some drawbacks, notably that it doesn't play well with vertex skinning and lower LOD levels tend to have a bit more triangles than hand-made LODs, but it's great in a mobile environment or for procedural mesh LODs.
As a side note, optimizing for mobile tile-based deferred-rendering is a lot of fun and it feels so much more rewarding to make a mobile engine run fast. Most mobile developers just port PC games or graphics techniques to mobile as-is and call it a day while limiting gameplay to negate poor performance; however, with careful optimization you can achieve between Xbox 360 and Xbox One levels of performance on most modern (>5 year old) mobile hardware. I'm definitely biased though as I've found my niche in mobile optimization.
That sounds amazingly cool! Can you say what you've worked on?
(Sorry for going of on a tangent here, I'm just musing 😅)
I've always found the capability of these mobile chips to be bizarrely good, so seeing stuff like full Resident Evil running on an iphone wasn't *that* shocking, but I've found it weird and a bit disappointing that despite this the market doesn't seem to care about it at all.
The Steam Deck and Switch's popularity implies there are absolutely potential buyers, but why aren't they biting? It's far cheaper to get a game controller attachment than these devices after all, so it shouldn't be input. Is it just *relative* to the Candy Crushes of the world that they don't show up? Poor app store presentation? Investors not willing to back it?
Awesome, I didn't know that. As someone who hopes to someday make a mobile game that isn't hot garbage this gives me hope.
@@SimonBuchanNzThis is due to a couple of factors.
1: The mobile market has been dominated by F2P monetization models since the mid-2010s. Paying even $5 for a fully-fledged game was a thing of the past a very long time ago, especially when you consider regions that the mobile market is at its most popular in: the whale-hunting model is especially effective.
2: The Switch is a 7-year-old console powered by a 12-year-old chipset. High-end smartphones exceeded the Switch’s hardware capabilities years ago, especially Apple hardware.
3: The iOS port of RE8 is exclusive to the highest-end versions of Apple’s highest-end smartphone, limiting potential audience significantly.
4: Resident Evil 8 is built upon a game engine *designed* for modern scalability on top of being a game originally built for last-generation home consoles as-is, as opposed to now where the hardware difference between the average smartphone and a current-generation console is somewhere in the ballpark of 3000%.
5. Newer console hardware and game engines has allowed developers to spend less and less time needing to optimize their assets, but it’s still faster and easier to build your game from-the-ground-up around the hardware you’re targeting anyway, especially if expensive graphical effects are integral to gameplay systems and your game’s art direction.
I like your funny words magic technology man
I don't see how that would work. You could order the vertex buffer, sure, but you'd still need entirely separate index data for each optimization level.
I like that Epic's approach to finding out that subpixel polygons kill render times isn't to tell artists to avoid intricate geometric detail and to make more LoDs, it's to double down and make a system that lets the artist go absolutely wild with detail and not need to even think about LoDs.
This was incredibly challenging and I think it was the end result of a decade of research.
@@djmips Decade to find bottle neck. Some month to fix one.
There is one downside to it however, mostly from inexperienced devs working with Nanite, people are using Nanite as excuse to have extremely high detailed models in the game, even for things the player will never be close enough to notice. While Nanite alleviates many issues with the performance of such assets, it leads to far, far larger game sizes as many people just throw high detail photogrammetry scans into the game. Just because Nanite can handle automatic LOD doesn't mean no effort should go into optimizing the base mesh still, just that you don't need to author LODs manually, the base model still shouldn't be far more detailed than it needs to be, we already have 300GB games now, we don't need 500GB games because people throw in hundreds of 5-10GB models.
@@DreadKyllerNanite also prebakes and quantizes meshes, so there's really no excuse as it's basically moving a slider.
@@DreadKyller GTA V is over 10 years old, and 100GB in size, games should be 1TB in size by now...too bad Storage tech hasn't kept up.
14:30 the "maximum area" triangulation improving performance so much is news to me! I usually create slices since it looks "nicer", but I should probably think more about the final result.
optimize when you know your targets and know you're going to need it. you can't "optimize" everything. it's wasteful.
if you can do "worst case" example scene and figure out how well it runs on the hardware you're targeting, then you might be able to get some inkling early of how to balance things out and spend your texture/triangle budgets.
For the art itself, most people suggest trying to get even, quad-only topology. They don't tend to worry about triangulation performance. Stuff like UV density and how textures stretch when animated tend to dominate.
For example, I wouldn't even bother trying to act on stuff like that "maximum area" example vs triangle fans, etc. That was more a tech demonstration to exacerbate a problem, rather than sound art advice. That all said, it can be good to avoid long, thin stuff when possible and to prefer workarounds, especially at lower LODs. For example maybe it's best at low LOD to have a flat quad with an alpha cutout texture instead of a lot of polygons that form boards on a bridge.
And when making LODs, it can be good to zoom an object out to its max distance it will appear at that LOD, look in wireframe, and see how big polygons are compared to pixels they cover. That can give you some good ideas of how to spread out detail, and where to reduce detail.
Or, if you're using UE5, enable nanite. Then you don't have to care :D
The maximum area is probably also the worst looking. Many rendering techniques look way better when the density of triangles is uniform along a model surface. And this is more important than small optimization.
@@blarghblarghActually, for the “maximum area” triangulation I found a method in Blender that makes these triangulations super easy to construct.
Step 0: Bind Checker Deselect to a key of your choice(here I’ll use Mouse5)
Step 1: Select the circle loop
Step 2: Alternate Mouse5 and F until you’ve run out of smaller triangulations/hit an edge
Step 3: Select all the overlapping faces and hit Delete, then “Only Faces”
Step 4: Select the edge loop, it’ll automatically select every edge in there
Congratulations, you now have an optimally-triangulated circle.
@@crimson-foxtwitch2581 nice one! thanks for the tip
@@thecat8411 I usually do the triangle fan out of habit, even on flat geometry or the bottoms of things. There are certainly cases where it'll look worse, but also cases where I did more work (usually extruding a default circular face and merging it to a point) for less than zero gain. Blender's default circle triangulation looks nearly the same as this "max area" algorithm.
Really enjoyed this! As an artist you get told what things to avoid - e.g. long thin triangles and extremely small triangles, but it's rare to get a good explainer on *why*.
“I’ve always been interested in optimizations” man I wish this was a requirement to work for any big game company 😭
It is, they optimize their expensive dev time spent so your PC can be utilized better.
Dev might want it to be optimal, but the company suits at the top just care about fast releases and pleasing shareholders, which puts optimization at the rock bottom on the priority list.
The problem is not only the developers, but the budget and the deadline
Actually they care more about protecting their positions than improvements.
If someone can improve everything, then these devs are out of a job.
Ill confess, even up until about 10 minites in I has assumed, naively, that I did actually understand why billboards were so kuch better for performance, with thoughts along the line of "reading a precalculated viewing angle from a file/ram is MUCH cheaper than doing the matrix multiplication to calculate the physics perspective appearance of even a very low poly 3D mode".
When you went through how the physically arcitexture of the GPU differently handles triangles with a size close to the pixel size, it was so mind-blowing. I haven't realised I was so starkly wrong about something in a while, such a great feeling!
That may have been the case a very long time ago. In 2016 my laptop still ran vertex shaders on the CPU. Intel GMA architecture - absolute garbage. Thankfully Intel stopped using that architecture and put real GPUs on their chips.
@@thewhitefalcon8539They stopped using that way before 2016
i dont understand
@@thewhitefalcon8539that's such a nice way of saying "You might not have been wrong, maybe you're just poor."
I am an absolute novice at game dev whos been toying around in Unity and now Godot for about two years, and I have to say each one of your videos feels like I should be paying you for this kind of info. The fact that this is your FREE content is insane, and I'm excited to see what your paid content looks like. You have a knack for beginning small, simple, and approachable, and then expanding to the point that I'm pausing and writing things down and yet still not feeling overwhelmed. I've read through documentation and white papers before for plenty of other coding subjects, but nothing has ever made me WANT to like your videos do.
If you want to get into paid resources worth their price(?) then I hear books on graphics programming are good.
@Techno tuna, but you are paying it, by watching it dude.
Nothing is stopping you from donating
He DOES have a Teachable if you want to buy full courses.
This is a very nice video! I have heard avoiding micro triangles is a big thing but now I finally know why
Nice profile pic
Now send this to Mortal Kombat
What do you mean? I don't get it
@@danifurka6790
I assume this relates to how much downgrade models had to receive for the Nintendo Switch port of MK1.
Would say to Cloud Imperium Games for Star Citizen First :D
MK11, and to a lesser extent MK "1" (12), have some of the most impressive and advanced rendering in realtime graphics. MK11 looks almost as good as some UE5 games today despite being made on UE3. They're aware...
Mortal kombat file sizes are insane
I don’t think I’ve seen anything that talk about this stuff in this much detail, much respect for your career choice, I am truly amazed at the information dump and how accessible you’ve made it, thank you.
From one 20+ year dev to another, your content is solid. And thanks for mentioning ATI!
The moment I saw the 2x2 grid in the thumbnail and you mentioned LODs I knew exactly that this is gonna be about workflow scheduling for fragment shaders. Great video as always
this channel is insanely underrated. The amount of knowledge you offer with each video is absolutely great. Pleas keep this up.
Minecraft doesn't have LODs and its performance is embarrassing for the pixel art level of graphics it has.
If you're on pc, you can get the distant horizons mod that adds lods
@@Draganox25 Yeah I'm aware of the mod but it seems like it's not polished yet.
@satana8157 idk why you think that I use it and it looks fine to me let's me have a render distance of like 300 but only render like 32 chunks
@@Draganox25 I've seen some buggy videos, Maybe they fixed them then.
My system is low end. Can it improve chunk loading speed if I keep the render distance pretty low?
Feel like getting the mod and trying it out can’t hurt
Wow, this really explains why Nanite was such a breakthrough, of course it doesn't dig into the difficulty of implementation, but it does show how it eliminates the excessive tiny triangle issue.
Raytracing will replace rasterization sometime and then nanite will have become obsolete.
@@lanchanoinguyen2914 we don't have a clear picture when that transition will happen, so a solution that solves problems now is still valuable, and understanding the limitations of those solutions is also valuable.
it's like saying "eventually we'll have cheap consumer space travel, so who cares about light rail?". maybe not as extreme, since full scene tracing with no rasterization MIGHT be plausible within the next 10 years. but I kinda doubt it. not only does the hardware that supports such a thing have to come out, but everyone has to have bought it and phased out older hardware, too. unless you're saying current high end gaming hardware is already capable of doing zero rasterization and getting all the same level of effects. I am not sure I've heard anything that says that's true.
I doubt rasterization is going away any time soon, considering every GPU available today has a rasterization pipeline, and future GPUs have to keep a rasterization pipeline to maintain compatibility with
- any web browser
- any game on Steam (or GOG, or the Epic Games Store, etc.)
- any commercial software
Realtime raytracing is a neat addition to modern graphics cards, for sure, but it's not a silver bullet by any means.
1) We're extremely far from that happening
2) It's not like raytracing is immune to triangle counts
@@lanchanoinguyen2914how come? What does one thing do with the other?
Finally! I no longer have to explain this every other day. Another brilliant coverage, well executed.
I mean, you'll still have to link to it every other day :D
Amazing. Not only does this explain why intuition fails, but you also back up the technical reasoning with real industry solutions.
Watching your videos, I somehow always come out with more information than I was expecting. Well done!
Crazy informative video! Big props
Very well done. Even covered some stuff that never gets covered, like how billboards are used along with LOD and occlusion culling. They never talk about billboards. First time I made a model doing all this, I was floored at how well it worked. This stuff changed everything back in the day. You also said one of my favorite words, lol ... Automagically.
One iconic example of billboards is the infamous 1000 Heartless fight in Kingdom Hearts 2. Back then, it would've been impossible to have 1000 entities individually moving and acting all at the same time. Square Enix's work around to this was to have only have a handful of active enemies actually nearby. Meanwhile, the rest of the Heartless would be represented by these 'billboards'. While in combat, it's hard to notice this detail at first, but it's extremely obvious on repeat playthroughs. This work around is also present when there's a swarm of Rapid Thrusters, except it's a lot less noticeable since the enemies are flying above you and often spawn offscreen rather than right in front of the player.
Just watched that, good example 👍
It's similar with the cabin fight in the RE4 remake. You will see the ganado horde on the outside of the cabin, but they don't react to your shots and only a small amount of ganados is trying to enter the cabin at a time
didn't serious sam support hundreds of active models back in 2001?
I think BOTW uses them a lot too, in bushes etc, in a very visible manner, presumably because the console hardware is extremely limited.
Amazing video. Really like when I stumble on such high quality content.
Would really like a part 2 covering Nanite and maybe alternatives other people developed
been hoping for content like this for many years, great job
Creators like you kept me motivated and today I'm a 3D and tech artist in my team. I'm not hardcore into graphics programming (yet) but I'm learning! One baby step at a time.
You are an excellent teacher, the language and presentation was so easy to process even though I am only tangentially related to the topic, thanks for making this!
This is an amazing video.
Really good information in a short and clear video. And this is information that is not found too often on RUclips.
Much appreciated!
This is fantastic, thank you Simon! Just bought both of your courses, looking forward to diving in.
Hope you get a lot out of them!
This is so incredibly in-depth, I learned so much from this video
This video is an absolute gem, Really thank you Simon for putting this up and making this so straightforward. Really got some deep insights about how GPU's work.
I wish game optimization was a thing again..
absolutely eye opening 😮
Thank you so much!
my adblocker works completely fine, but for your videos I always pause my adblockers so that I watch the ads and support your wonderful content ✨✨
Hah thanks so much! Very appreciated :)
In 2004, I was 4. Feels like I should've been a graphics engineer and bought a house, instead of being a little child. Shame on me. Seems that I've destroyed my opportunities with this one simple mistake.
Bet you won't make that mistake again.
This is a joy to watch! Glad I found this video 🙏🏼
Thank you a ton for the captions, really made it much easier to follow! Great video :)
Very informative. I'm not a game developer but I enjoy understanding some of the complexities. I didn't realize that small triangles were an issue. Not being familiar with how gpu's work at that level of detail I'd figured a triangle is a triangle.
You have by far the best game optimization content I’ve come across, can really tell that you know what you’re talking about and not just repeating words said by someone else. Really love the way you explain things with just the right amount of information for you to be able to understand these things fundamentally and really grasp the mechanics 🙏🙏
Simon! You are a saint! You even reference everything you mention. S tier content 👌
I am thankful to my brain for deciding to click on this video after looking at the thumbnail for 10 seconds.
This is one of the most grateful ways I discovered a brilliant new channel. Amazing work!
I just found this through a reddit comment and I didn't knew how much I needed to know this. Thank so much for explaining. This is so valuable and I definetely will buy ad start with your game math course. Awesome stuff
In intro to 3D modelling we were told very sternly, "keep the triangle count to a minimum, and remove as many unnecessary triangles as possible.
So this is nothing new to my ears... but it is fun to listen to anyways.
This is fascinating, and very helpful. Thank you!
Great video! Very nicely explained, and truly an unexpectedly fascinating topic. It never even crossed my mind to ask whether there was more to why LODs work. Super interesting stuff.
Great video :) This will be my go-to when people ask about mesh-related optimization!
It blows my mind... one more day that I realize I know nothing. Thank you very much!! Perfect explanation and quality like always.
These are the kinds of videos that inspire me to one day dive into the graphics side of computers. I've always wanted to touch shaders and 3D modeling, but it has always felt beyond my understanding. This helps.
You just explained what is, really, a constant, complex technical process, in a way pretty much anyone can follow and understand, and that's more than can be said for a lot of teachers/professors...
I knew, for instance, a good amount of what was covered here, just accumulated knowledge over the years of gaming and satisfying my curiosity, and guessed cranking up sheer volume of triangles would tank FPS pretty quickly... But now, I actually understand *why*...
Thanks for the insight 😍
This flipped a switch in my head; this all makes WAY more sense now!
Thank you!
Good luck on your ventures, non-binary black horned dragon dev! Take it ez! 😎 👍
delete this @@crackedemerald4930
Please have more content in depth like this, no one ever talk about this deep, Thank you for your clear explaination
03:34 You just made my day for putting in Kai from the best TV Show ever made!
You're amazing, glad to have found this channel. Thank you for this easy to understand and useful information!
Thanks, this was actually really interesting and helpful!
Dude, this video is awesome! You have such a straightforward way to express your knowledge that I, not a graphics programmer, get it. Kudos!
because of you i found Lexx, been trying to find it for the past 15-20 years :D thank you, subscription deserved from all the info (and the extra one)!!!
May his divine shadow fall upon you.
@@simondev758 More people need to know about Lexx!
Very interesting video, I’ve always wondered why more/smaller verticies were taxing on preformace and I’m glad that you made a video explaining it in a easy to understand manner
For reasons unclear to me, I find 2d "fake" trees being called impostors extremely funny.
sussy
I’m so sad to hear about your esophagus cancer 😢. You’re my favorite youtuber I hope you get better soon!
this is a really good video and also i really like your voice. thank you for making this 😊
Thank you so so so much for this, this is extremely helpful!
So interesting. Thanks for your work, I really appreciate it!
I will watch it several times with taking notes and watch other resources to create understanding
Amazing video as always man! I have heard about this on the surface in my career, never found something that covers this subject with so much ease before, thanks
Thanks for this deep dive. You should use frame time instead of FPS on your plots though. As FPS is a reciprocal and thus makes reading the plot much harder.
Yeah, I had to choose 1 of 2 ways. I try to err on the side of caution, and go with what I *think* more people will understand intuitively.
Go with fps, that's broad, less technical people will understand this but at the annoyance of technical people. The technical people should still, in theory, understand just fine though.
Go with frametime, you end up potentially confusing less technical people, but it's a more straightforward metric for technical people.
Either way, you're wrong for a % of people, unless you spend extra video time explaining your choice.
@@simondev758 Funny that you mention Humus in the video, I read the FPS vs. framerate post only today. It annoys him so much in scientific settings but for the average person 60 FPS is more intuitive than 16.666 MS frame time. I'm trying to benchmark my GLSL shader library, in your experience Simon, any other pitfalls when benchmarking fragment shaders or presenting results in an academic setting? Love the videos by the way, you got me to recreate a OSRS demo in Three.js.
No amazing advice off the top of my head, only to think about what audience you're presenting to.
@@simondev758 Thanks, that’s a fair point, I may actually watch tutorial videos on shaders to see how they’re explained. I’m presenting to those with a broad knowledge of CS but not necessarily anything graphics related.
Great video, very technical but clear witch is really great ! If you like it, keep up the good work
which
You have a wonderful Bobs Burgers deadpan delivery that just works.
This voice is so recognizable that I only watched 1 video a year ago after stumbling on it while on my game dev journey
Now, another video pops up suddenly and I click on it, just to recognize the voice immediately 😂
Hah
What a great video, I kinda always wanted to understand the performance optimization techniques better, but just didn't really had time to do the digging. Thank you for your effort!
Absolutely love the way you explain things.
Personally learn this naturally from playing Just Cause 2. Amazing game that use this extremely well. popping in/out is noticeable if you look for it but if casually playing it doesn't take you out of the moment.
Very easy explained, thanks!
I'm not a gamedev, but surprisingly I can follow and understand the explanation.
Simon here really filters out what's important and deliver it in an easy to understand manner.
on 0:36 mark, I know this will be a good one.
The best example i can call off from memory is GMOD’s Flattywood sign. I always thought it was a model from the distance it was. It isn’t.
The Crew Motorfest uses imposters (or, as I got to know them many years ago, sprites) for trees when you're using the map and zoom out. It's a nice throwback to when I started PC gaming.
My takeaway. Please let me know if I'm wrong: small triangles are less efficient with their quads and thus are inherently more inefficient but the main reason they are inefficient is that the triangle assembly is a linear process that assums you will have less triangles than pixels and gets bogged down when it can't keep up with the assembly load. So the bottleneck is the assembly.
Great video, even tho I doubt I'll be working as a game dev. I am always happy to learn more about optimisation methods, who knows I might make my game as a small hobby passion project on the side
Rare Lexx reference! Don't think some of us didn't spot that.
May his divine shadow fall upon you.
No mentioning of Lexx will go unnoticed ;D
Hell yeah was searching for this comment!
Top notch video, really helped me fill some gaps in understanding quad overdraw and nanite!
0:34 that prison game, where the woman is looking from the window, what game is that?
"Grand theft auto 6", its comming out next year
Darude Sandstorm
Bro's is living under rock i swear
dude it literally says the game name down roght corner are you blind
Grand theft woman
Just love the John Carmack quote on the article 😂
1:23 my guy was lowkey hitting that shit, look at him
Dude, amazing content. How interesting!
Thx for sharing your knowledge, cheers!
I also wanted to give some attention to Nanotech, a Unity version of Nanite nearing release.
Absolutely fantastic video ! Now I finally understand understood how triangle size and shape affects performance ! Although in my journey in graphics , the post fx has always been more expensive than just triangles
This is golden content.
thanks for sharing as a solo gamedev who do read scientific research papers on the nitty gritty details but done nothing impressive I really appreciate simpler explanation that comes from experienced developer. thx again
I'm also interested in learning about middlewares like sinplygon and unitys scriptable rendering pipeline.
I.e visibility buffer, forward+ and bindless texture
Love this - this is not at all how I think of the gpu pipeline, so this way eye opening. It would be very interesting to hear about instancing and batches in the context of all this.
For example, I presume that even though instancing saves a bunch of one kind of work, it does not save any work when it comes to this step. Or maybe I’m wrong!
And I’ve always heard that batch count, or alternatively setpass call count, was the biggest bottleneck.
So for something with a ton of different impostors or LODs, that might really reduce the triangle count on screen, but could double or quadruple the batch count. Obviously tuning is a series of tradeoffs, but I’m very curious on your take on this.
Instancing mostly saves you on that "driver" step, in that you don't have to make X calls to the API in order to draw X copies of an object.
Performance for GPU's is complex, simple rules like watching your batch count and using LOD's work, but they're just that, simple rules that are meant to cover up the complexity.
Of course wasted draw calls affects both cpu and gpu performance.When cpu has performance issues,it's much worse than gpu because you can notice the dramatic fps drop.Draw calls are your enemy not tiny triangles because draw calls *start* the entire rendering pipeline not triangles,it's simple like that.Drawing one LOD at a time doesn't increase any draw call at all logically.
Abbreviation:
Less LOD setting isn't mean less vertices for performance.
It's just less smaller triangles/ Bigger triangles.
I think, I got it.
This sounds like H. Jon Benjamin is teaching me computer graphics, and it's amazing.
Great explanation of 2x2 quads, friend! I'd never really understood how to optimize for them before and I'm already excited to apply some of these tips.
i was thinking the same thing lmfao
Don’t draw triangles smaller than one pixel. Who would have thought lol. I experienced this first hand by doing a crude test of a shader I was working on by setting my virtual screen resolution to 4x the real screen resolution. I was kind of surprised how much the frame rate tanked, it seemed disproportionate, but now I know why (maybe)
Thank you Bob from Bobs burgers for explaining graphics optimization
Back in the ‘90s, Marathon used “imposters” for all of the moving characters. They were all sprites. Animated sprites with versions from 8 separate angles.
as a aspiring game dev with multiple game devs under my leadership. I HIGHLY appreciate the information. ITS GOLDEN. If i ever get any funds to throw away your getting the first bit of it.
This is good info. Mindless gamers fall for the mythology of "GPU is a God and can do everything faster than a CPU. CPU graphics are for noobs and old people." And yet, a software rasterizer can be 3x faster than a GPU (21:42). And I've seen another example of a software rasterizer that gets over 500 FPS. My entire engine is software/CPU only (except the OS probably blits the final image using a GPU), without any optimizations so far, and it runs faster than a human can perceive the frame rate. I wish more devs would utilize 8 or 16 CPU cores and stop blindly worshipping the GPU God so much. Instead, we get stuck with the "moar corez doesn't matter, devs doesn't know how 2 prgram teh moer corz!" propaganda myth.
In this context, software doesn't mean running on the CPU, rather it means that Epic wrote their own rasteriser in a set of compute shaders that bypasses the GPUs built-in rasterisation hardware. It's still running on the GPU due to using compute shaders, but it's using software to do the heavy lifting rather than hardware.
I had a scene I was prepping to add into a vr game, and was unhappy with the default low ploy trees; it took a lot of time, but I finally found a solid imposter setup that detailed branches (from a few feet away anyways, if you gave it physical distance) which overlaps the image. I don't intend to have the branches move, so I figure I can just set them all to static if I can figure out the normal maps (it is horribly buggy at times even with regular alpha maps or the like), though there is this one technique I hope to use where you only have 1 draw call for same objects (as they have rotation differences, I figure it should still work as they are the same size).
Thank you, Bob's Burgers
Bane became an engineer? Nice ...
Great video man. For real.
Ahhh you think the gpu is your ally?
bob from bob's burgers if he was a game dev