I used to make levels on an editor called QOOLE - I wish I'd known then some of the stuff you've put in your videos as it explains why some things just wouldn't behave themselves.
Sometimes I wonder if stuff like this can be threaded to make it faster. Not that it really matters anymore, but as an academic experiment for those old dual and quad Pentium Pro setups.
The spans probably are better for avoiding memory cache misses as likely they are stored next to each other in memory...well...it's usually done that way anyhow, along widths rather than heights.
When I watch your videos on Quake I usually try to recognize which parts are relevant to modern graphics programming and which aren't. I suppose if you are taking advantage of the graphics card and modern graphics APIs then everything in this video after the "Back-face culling" chapter won't be particularly useful since you can have your graphics API automatically perform lots of tests to prevent overdraw.
Yes and no. Yes, because programming on the CPU, you don't need to worry about such. However, no because that's actually not far off what the GPU does, so understanding Quake's rasterization helps in understanding how a GPU works. While the algorithms might differ (sometimes significantly), the biggest difference (by far) is that Quake is single-threaded (it wasn't until after 64-bit became commonly available that multi-core CPUs were commonly available), while GPUs do the vast majority of the work in parallel.
Holy crap... I can see why Carmack & Romero mentioned that Doom was kind of the sweet spot where the average person can reasonably understand how the engine renderer works and how building out a custom level works, where Wolfenstein was too simple to create anything all that interesting and Quake is just that smidge too complex for most people to wrap their heads around.
Very nice video. While watching it, I was thinking about which modern tech would be equally impressive. Maybe it's Nanite in UE5. IMO the best presentation about it is called "Nanite | Inside Unreal". It's almost 3 hours long, but the material is not terribly complicated; the presenters simply share a lot of information: history, requirements, their reasons for various decisions, etc.
The book does describe the AEL / APL span algorithm: Check out Chapters 66 and 67. There are some differences compared with the final version in the GPL release, so refer to source for the true method. It was the best technique they found, in terms of maximizing the worst case performance, that is to say, still performing well on the hardest scenes.
Imagine going to all this trouble, only for OpenGL and Direct3D to arrive shortly after. I often felt sorry for the poor person tasked with building the software renderer that most people never used by the time 3D graphics cards were more commonplace.
Excellent as usual. If I might give some constructive criticism: consider red-green colour blindness in your visualisation, perhaps switching to blue and orange. Also check that your audio is normalised, it is very quiet in places.
What about Z-fighting? I find it interesting that the Z-fighting traces bleed across horizontal scans consistently and if the camera is kept still, the Z-fighting pattern does not update. You can get certain camera angles where one texture completely obscures the other while you can get other camera angles where there's a diagonal streak. Wouldn't Z-fighting be very costly rendering time wise with this algorithm?
I’m not sure if Z-fighting would really apply here given how the world geometry is generated from the CSG brushes. The span optimization is only done to the BSP geometry, which (if I understand correctly) already has hidden surface removal applied to it. Levels are paper-thin shells that have no exterior.
Quake was software rendered, development started before Doom was released, and they likely added minimal OpenGL support by replacing the end of the graphics pipeline to draw polygons using OpenGL. IdTech3 (Quake 3 Arena) was likely similar but more optimized to work with 3D hardware. IdTech4 (Doom 3) was an overhaul of the architecture. Sadly that failed because it used a patented algorithm and the engine was very dependent on those shadow volumes. IdTech3 was a very long-lived engine.
@@Biel7318 The Quake 2 engine is basically the same as Quake 1: a software renderer where the end of the rendering pipeline runs on the graphics card. It is easy to make an engine with zero overdraw, like raytracing the whole image. It is not necessarily a good idea to have zero overdraw, as it is faster to let the GPU do the work than to use the CPU to minimize overdraw.
@@gruntaxeman3740 Independent of its optimality when rendering, Carmack did make an attempt at zero overdraw with Quake2, but using the GPU instead of the CPU, unlike Quake1. I'm just asking how different that method is from Quake1's.
I don't know if it was actually done this way in Quake 2, but with a GPU it's better to cull at the level of BSP nodes and not worry about a little bit of overdraw where one node overlaps behind another.
Not quite: sprites use transparency (binary). However, yes, translucent water is a problem, and not just because of the overdraw elimination, but the palette too (though the water could be dithered for a probably rather ugly translucency, but only as a second pass, which is how the GL renderer does it (minus the dithering)).
@Bill Currie well that's Quake2 water, but Quake2 was designed with OpenGL acceleration in mind, so the different passes were expected. But Q1 is an entirely solid render. If anything, with transparent passes, the lines don't get discarded. Instead, they have a chance to get properly sorted and have correct transparency🤔
@@santitabnavascues8673 Quake had transparent water in OpenGL as a hack, but yeah, it might have come after Quake 2 was released. Certainly the software renderer was not designed with transparent water in mind. However, that's really after the bsp stage of the rendering pipeline. The bsp tree traversal itself is barely changed between software and OpenGL, and even my Vulkan renderer's bsp stage is very similar.
@Bill Currie the bsp is enough to render transparencies well, the way it is traversed sorts the polygons, Quake3 renders them just fine. Good luck with your renderer!🙂
@@santitabnavascues8673 Of course it is, I never said otherwise. The BSP tree is how the original legacy OpenGL renderer and both the modernish OpenGL and Vulkan renderers (both of which I wrote) support transparency, because the BSP tree supports not only depth sorting, but also texture sorting. It's the way spans are handled that (at least at first glance) makes transparency awkward. I don't feel like digging into getting software transparency working as I have other things I want to work on (eg, shadows, diegetic UI, ...).
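For readers following this exchange, here is a minimal sketch of why walking a BSP tree yields a depth-sorted face order, which is what makes blended transparency workable. It is an illustration only, not code from Quake or from any renderer mentioned in the thread: `BspNode`, `back_to_front`, and the face labels are hypothetical names, and leaves are represented simply as `None`.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class BspNode:
    """Hypothetical node: a splitting plane plus the faces lying on it."""
    normal: Vec3
    dist: float                      # plane equation: dot(normal, p) == dist
    faces: List[str] = field(default_factory=list)
    front: Optional["BspNode"] = None
    back: Optional["BspNode"] = None


def back_to_front(node: Optional[BspNode], eye: Vec3, out: List[str]) -> List[str]:
    """Append faces to `out` so that farther faces come before nearer ones."""
    if node is None:
        return out
    side = sum(n * e for n, e in zip(node.normal, eye)) - node.dist
    # The child on the eye's side of the plane is "near", the other is "far".
    near, far = (node.front, node.back) if side >= 0 else (node.back, node.front)
    back_to_front(far, eye, out)     # everything beyond the plane first
    out.extend(node.faces)           # then faces on the plane itself
    back_to_front(near, eye, out)    # finally the eye's own side
    return out


# Three parallel walls at x = 1, 2, 3, with the x = 2 plane as the root.
tree = BspNode((1.0, 0.0, 0.0), 2.0, ["wall_x2"],
               front=BspNode((1.0, 0.0, 0.0), 3.0, ["wall_x3"]),
               back=BspNode((1.0, 0.0, 0.0), 1.0, ["wall_x1"]))
```

Swapping the two recursive calls gives front-to-back order instead, which is the order an overdraw-avoiding renderer would want.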
I've been working with Unreal Engine 5 a lot and I love some of the modern optimisation methods possible with powerful GPUs and multi-core CPUs; it's hard to explain just how many advantages these have. But it gets frustrating when you see "new culling tech" being advertised when the problem was already solved 30 years ago on less powerful hardware, in a more efficient way. If I had the time to go through the Unreal Engine code and make some overhauls, I think it'd just end up using the same solutions found in Quake (though some problems can be solved better with modern hardware than they could be with the methods used in Quake). In short, Quake did optimisation better than multi-billion dollar companies 30 years before, and it's very frustrating.
Definitely a case of "yes, but no", as methods for transparency, in particular the lighting and shadows around them, have changed a lot and require greater efforts to prevent drawing errors. That said, having an expanded set of rendering options with a hierarchy of what other methods they rely on being parsable so the desired balance of older optimizations to realism could be achieved on demand would be a wet dream, especially if it could be injected into earlier titles. The specular mapping of the mid 2000's that carried on way too long is something I'd probably tweak in every title I could😅
@@Xeogin I completely agree, I love 3D graphics and things like ray tracing are pretty much a must-have IMO, but binary space partitioning gave such big performance improvements (in indoor spaces) that it's kinda stupid not to use it.
@@hughjanes4883 BSP is highly focused on static geometry, and it also requires a fair bit of precomputation. While it's still very effective on modern CPUs (see Ironwail), I can see why it fell out of favor.
@@hughjanes4883 BSP goes very bad very quickly with complexity. Cross cutting planes are a minor problem in Quake but they get drastically worse once you start doing intricate geometry. It was a suitable solution for the time, not for all times.
Yeah, but think of the kind of content we handle nowadays. Most of it is dynamic and changing, whereas Quake levels are fully static. Also, current content has almost literally millions of triangles, so an active edge list would become a large structure, and having ever smaller triangles (such as in the Nanite system) would make this structure even larger than an equivalent depth buffer. Sorting through them and referencing polygons would pose a computational complexity that would make their utility rather limited, putting the approach completely outside the scope of real-time rendering. But the core optimizations are still there; for example, many current engines do a depth prepass precisely to prevent the overdraw of complex pixel shaders.
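To make the depth prepass remark concrete, here is a small sketch that counts how often an expensive pixel shader would run with and without a prepass. It is a toy model, not how any particular engine or GPU works: fragments are just hypothetical `(pixel, depth)` pairs, and smaller depth means nearer to the camera.

```python
from typing import Dict, Hashable, List, Tuple

Fragment = Tuple[Hashable, float]   # (pixel id, depth), smaller depth = nearer


def shade_count_naive(fragments: List[Fragment]) -> int:
    """Single pass with a depth test: shade every fragment that passes."""
    depth: Dict[Hashable, float] = {}
    shaded = 0
    for pixel, z in fragments:
        if z < depth.get(pixel, float("inf")):
            depth[pixel] = z
            shaded += 1             # this work is wasted if overwritten later
    return shaded


def shade_count_prepass(fragments: List[Fragment]) -> int:
    """Pass 1 writes depth only; pass 2 shades only the nearest fragments."""
    depth: Dict[Hashable, float] = {}
    for pixel, z in fragments:
        depth[pixel] = min(z, depth.get(pixel, float("inf")))
    return sum(1 for pixel, z in fragments if z == depth[pixel])


# Worst case for the naive path: pixel 0 is submitted back to front.
frags = [(0, 3.0), (0, 2.0), (0, 1.0), (1, 5.0)]
```

With these fragments the naive path shades four times and the prepass path twice, once per covered pixel. The span-based approach discussed in the video reaches the same "shade each pixel once" goal on the CPU without a depth buffer read.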
I can't watch this video, for the same reason I am leaving this comment, but I want to commend you on the best flashing lights warning I have seen on YouTube c:
"It is hard to reverse engineer the design choices made during Quake's development since, as is often the case with highly optimised graphics code, John Carmack did it"
cool to see you here! :)
I love the Quake Software Renderer. I used to play QuakeWorld at 320x200 in software rendering well into the mid-2000s. I wish more people knew that QuakeWorld is still around and is an INCREDIBLE FPS. It's been updated through many open source projects. Just google it. It's a great throwback! :)
@@goqsane I didn't play that much Q1 or QW as at that time I was still bound to a dial-up internet connection. So I couldn't waste that much money ^^. Though I played and still play a lot of Q3 / Defrag and later QuakeLive. For me id Tech3 was another big milestone. A number of big games were made with that engine (RtCW, MOHAA, JediKnight2, CoD, ...).
@@goqsane which would you recommend? It’s a PvP fps, right? Which is most fun and most played?
That's wild.
I owned Abrash's Black Book as a 90s kid. It was inspiring and a fun read. It wasn't dry and boring, he spiced it up with a lot of philosophy and personal ramblings. I wish I'd never gotten rid of it.
One of the things I always praise about the book when talking about it to others is these extra chunks of wisdom that aren't tied to any technology. The tech is severely obsolete and teaching just about it wouldn't be as useful, but the wisdom he shares is just so valuable and you can get something useful out of the book instead of just reading about old tech for curiosity's sake.
@@StereoBucket I'm doing a lot of graphics programming these days, and the thing that surprises (and pleases) me is that I'm finding ways that these techniques *aren't* so obsolete as we suppose.
To be specific - yes, it's all obsolete IF you're rendering polygon meshes in the conventional ways, which have progressed so far since then that there's no relation to these old algorithms.
BUT, if you're doing something experimental, maybe using the GPU in ways that aren't conventional, maybe rendering things that aren't triangle meshes, then suddenly a lot of these old "tricks" begin to apply again, and what's old becomes new again. It's exciting.
I really hope this gets more views. Your skilled visual explanations are amazing.
When will Quake be on GBA?
7:43 The thick book with the Quad sound is way funnier than it has any right to be
I love these videos so much! Your visualizations and explanations are always nice and clean. I would love to see you go more in depth with the exceptions (like you mentioned doors and buttons in a comment), but sadly I understand the video has to end at one point...
it's even funnier when you consider the effects the book has on the reader
Cool video! It's still hard to wrap my head around BSP. As I understand it, an infinite plane is used to split the brushes into ever smaller parts. It's hard to imagine how this splitting is actually coded.
Thanks. Yes, the map compiler is certainly not trivial to describe and is worthy of its own video (probably many).
It always genuinely amazes me how much 3D game engines manage to process on each frame... and you do a fantastic job explaining it all in such an easily understandable way too with the visualisations. I'm glad I stumbled on your channel!
For every frame the CPU has to sort through 3 binary trees, then do the edge spanning, then do texture mapping, combining the texture and lightmap to determine the final color of a pixel.
I'm surprised the CPUs of the day managed 20+ fps.
Edit: and since the physics ran every frame instead of on a set tickrate, physics was run every frame too.
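As a rough illustration of the "combining the texture and lightmap" step on an 8-bit palettised display, the sketch below uses a precomputed shade table so that lighting a texel is a single lookup. Everything here is a toy assumption for illustration: the four-entry grey `PALETTE`, the four `LEVELS` of light, and the names `SHADE` and `lit_texel` are hypothetical, and the real engine's tables and surface caching are considerably more involved.

```python
LEVELS = 4                      # toy number of light levels (assumption)
PALETTE = [0, 85, 170, 255]     # toy grey palette: index -> intensity


def nearest_index(intensity: float) -> int:
    """Palette index whose intensity is closest to the requested one."""
    return min(range(len(PALETTE)), key=lambda i: abs(PALETTE[i] - intensity))


# SHADE[light][texel] -> palette index of the texel dimmed to that light level.
# Built once up front; light == LEVELS - 1 means fully lit.
SHADE = [
    [nearest_index(PALETTE[texel] * light / (LEVELS - 1))
     for texel in range(len(PALETTE))]
    for light in range(LEVELS)
]


def lit_texel(texel: int, light: int) -> int:
    """Combine a texture sample and a lightmap sample with one table lookup."""
    return SHADE[light][texel]
```

The point of the table is that the per-pixel cost is an index computation and a memory read, with no multiplication or colour matching in the inner loop.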
⚠ WARNING: This video contains flashing images.
There's a lot of details I omitted to keep things brief. For example, there's slightly different behavior if a sub-model (door, lift, button, etc) is being drawn. Also I haven't spoken about texture mapping yet --- I think that topic warrants its own video.
A great follow-up to the texture video would be to discuss what changes Carmack had to implement/omit to add hardware rendering for the 3Dfx and later GPUs.
Did you forget to pin this comment?
@@tiagotiagot Lost the pin due to edit.
@Creative Unreal (software rendering) did checkerboard dithering with the u,v coordinates in screen space.
The worldspace clipping of BSP entities against the world geometry is actually very relevant to this video, because it eliminates most z-fighting. Z-fighting can still occur between different BSP entities, as in one of the maps of the Contract Revoked mod.
Texture mapping, however, should indeed be in a different video.
Seeing how those spans were generated from the active edge list was truly awesome. I've never seen a behind the scenes of a software renderer before, but this is the kind of stuff I love.
Just discovered your amazing channel last night and already a new video. Keep up the awesome content! Also your visualization techniques are some of the best I’ve seen. Love the attention to detail.
I think strategically this kind of optimization does as much in the CPU memory space as possible to reduce the amount of writing into the frame buffer that needs to be done, which was pretty slow on an unaccelerated graphics card.
Oh it tends to be quite quick on basic PCI cards and many VLB cards. But that was not something to rely upon.
This series finally helped me understand how the BSP tree and PVS work together to make the culling tests less expensive. I had a rough idea of what the PVS was doing and what the BSP leaves were, but hadn’t wrapped my head around the recursive division of space into convex volumes of empty or solid space, how you can use the tree to quickly figure out what leaf you’re in, culling out most leaves that aren’t possible to see, testing the bounding box of a leaf against the camera frustum to completely stop tests against its child leaves. It’s all so brilliantly elegant.
The span solution of projecting lines into screenspace and testing whether they’re the left or right edge of a span is insanely smart too, it seems like the kind of overcomplicated hack solution I’d try to create when I was younger… except it’s smart and it actually works.
Fantastic video. I’d love to see what kind of messy trigonometry had to be done to generate the world geometry from the BSP tree in the future.
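The "use the tree to quickly figure out what leaf you're in" step from the comment above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the engine's code: `Node`, `Leaf`, `find_leaf`, and the `PVS` dictionary are hypothetical names, and a real PVS is stored as compressed bit vectors rather than Python sets.

```python
from dataclasses import dataclass
from typing import Dict, Set, Tuple, Union

Vec3 = Tuple[float, float, float]


@dataclass
class Leaf:
    name: str        # hypothetical label used only in this demo
    solid: bool      # True for solid space, False for empty space


@dataclass
class Node:
    normal: Vec3     # splitting plane normal
    dist: float      # plane equation: dot(normal, p) == dist
    front: Union["Node", Leaf]
    back: Union["Node", Leaf]


def find_leaf(root: Union[Node, Leaf], point: Vec3) -> Leaf:
    """Descend from the root, picking a side at each plane, until a leaf."""
    node = root
    while isinstance(node, Node):
        side = sum(n * p for n, p in zip(node.normal, point)) - node.dist
        node = node.front if side >= 0 else node.back
    return node


# Plane x = 0 splits space; the front half is split again by plane y = 0.
tree = Node((1.0, 0.0, 0.0), 0.0,
            front=Node((0.0, 1.0, 0.0), 0.0,
                       front=Leaf("A", False), back=Leaf("B", False)),
            back=Leaf("wall", True))

# Toy precomputed visibility: which leaves can possibly be seen from each leaf.
PVS: Dict[str, Set[str]] = {"A": {"A", "B"}, "B": {"A", "B"}}


def potentially_visible(point: Vec3) -> Set[str]:
    leaf = find_leaf(tree, point)
    return PVS.get(leaf.name, set())
```

The descent costs one dot product per tree level, after which the PVS lookup discards most of the map before any frustum or span work begins.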
I love this! This is great to learn about so thank you for sharing! :)
Hey Matt! Thank you so much for your deep dive videos on Quake. I have been digging into Quakespasm source code for a while now and these videos give me a good overview before I go into more details by myself.
Fantastic video, happy to see someone else finally talk about all the weird odds and ends of old games that I enjoy so much
Thanks, a very solid video. The span optimization is nifty, but your visualizations here are even niftier
John Carmack truly is a special person, I can't believe someone could be this much of a genius
A lot of this was done by Michael Abrash as well. And, there are many other games with equally impressive (or more so) tech!
What a wonderful explanation of the Active Edge and Polygon List techniques. Bravo!
I could never visualize what sorted spans were when reading about them in the black book. I guess I tripped myself over by trying to think of them as a vertical construct so it obviously couldn't click. I put off trying to understand them for a few years for when I'd eventually get around to visualizing the renderer myself, but never got around to it. Can't say I'm happy about being beaten to the punch, but I am enjoying this series, keep it up! I can finally see what spans meant.
yes that span thing was not explained really well by Abrash
The Black Book is one of my favorite things from childhood. It was always just out of reach but I still learned things that helped me in my career. Good stuff.
I'm such a big fan of your videos! I can only imagine how long it takes to understand, summarize, and visualize these very complex topics. As a long time Quake junkie, it's quite fascinating to take a peek behind the curtain. Keep them coming!
Michael Abrash wrote a wonderful article on scan-line rendering for Dr. Dobb's Journal, complete with a code project (not Quake ofc, but demonstrating scanline rendering)
Thanks for taking the time to make these videos. They’re fascinating.
Binged your channel yesterday and now I'm treated to another one, awesome
I barely understand what the main takeaway is the first time, but after a couple rewatches everything makes sense from the perspective given.
This is the best video on Quake's software renderer visibility I've ever seen. Congrats.
you can still see the Quake 2 software renderer in the source code (the last Quake to support software rendering)
it's genius code with a lot of assembly optimizations
I can only hope that there are many more jawdropping visuals being made for the next video in this amazing series.
Wow the span algorithm reminds me of an algorithm I made for my masters to rasterize convex shapes in 3d. Which ended up being used as a way to view frustum cull a spatial hash.
This series continues to explain the raison d'être of each of the lumps I'm seeing in BSPs; in this case, the vertices.
Gamedevs then: "We'll employ black magic to make a complex 3D game work on a glorified calculator"
Gamedevs now: "Why does my visual novel lag when there are three character portraits on the screen?"
@@sh-creative Autism?
To be fair Carmack was light years ahead of every one, back in the day.
I have literally seen newer "boomer shooters" with all 2D sprites and basic geometry and baked lighting perform about as poorly as these first 3D FPS games did when they came out. Except on modern machines.
@Creative yeah, basically not knowing the engine deeply enough and not knowing how to optimize it well enough. When developers DO know what they're doing, the performance can be absolutely limitless
@@Nothyn More like glorious calculator.
TL;DR a wizard did it
Just watched this for the 20th time and finally understood. SO SATISFYING!
Thanks for making more videos! You have been one of my favorite technical channels I have stumbled upon in the past few years and the recent resurgence of videos is well appreciated!
It's amazing how many calculations are made on each single frame when playing
I really appreciate the way you presented the book at the end, very funny.
if you just build the convex space adjacency tree, you get all the visible faces for free, for ray casting (inside a convex polygon space partition, you always see the inside walls)
Clean, clear, to the point.
Thank you very much!
Great breakdown, thank you. Having done my own reading of the Black Book, and other of Abrash's articles (which seem to be the most time capsule-y articles available on the Quake engine development) I'm struck by how pragmatic many of the choices made in building the engine were. For example, the PVS doesn't eliminate overdraw by itself, but it greatly reduces it. Carmack knew it was stupid to try to get the PVS perfect (in fact, fine-grained PVS is still an open problem today). PVS as implemented in Quake gets the set of polygons low enough that span buffers can be used -- span buffers being even for the time a slightly outdated algorithm, that was designed in a time when writing a pixel was very, very slow, to completely eliminate the remaining overdraw. Overdraw in drawing the world was very expensive because every pixel had to be textured and lit, and world polygons tend to be large. But then at model drawing time, Carmack just said fuck it, the polys are small enough that we'll just use the z-buffer.
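The split described above, where the world is drawn with no overdraw and models are then resolved with the z-buffer, can be sketched on a single scanline. This is a simplified illustration: the function names are hypothetical, and depth here is plain distance with smaller meaning nearer, which is an assumption made for readability rather than the representation the engine uses.

```python
from typing import List


def draw_world_span(frame: List[str], zbuf: List[float],
                    x0: int, x1: int, z: float, colour: str) -> None:
    """World spans never overlap, so write colour and depth with no test."""
    for x in range(x0, x1):
        frame[x] = colour
        zbuf[x] = z


def draw_model_span(frame: List[str], zbuf: List[float],
                    x0: int, x1: int, z: float, colour: str) -> None:
    """Model pixels are tested against the depth the world left behind."""
    for x in range(x0, x1):
        if z < zbuf[x]:
            frame[x] = colour
            zbuf[x] = z


def render_demo() -> str:
    width = 8
    frame = ["."] * width
    zbuf = [float("inf")] * width
    draw_world_span(frame, zbuf, 0, 4, 10.0, "A")   # far wall on the left
    draw_world_span(frame, zbuf, 4, 8, 2.0, "B")    # near wall on the right
    draw_model_span(frame, zbuf, 2, 6, 5.0, "m")    # model between the two
    return "".join(frame)
```

The model shows up in front of the far wall and is hidden behind the near one, while the world pass itself never reads the depth buffer.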
I studied the everliving crap out of that Black Book in high school
Great video as usual!
A much simpler approach that comes to mind is to keep a list of occupancy spans while plotting polygons front to back. Then there will be no more than a handful spans per each line, and at their size, they might as well be an array. What speaks against that?
One thing and a big one is integration of entities (enemies items etc) into the rendering. That you have discarded environment depth data from your buffer before you need it again to occlude the entities by the environment.
And another is cache optimisation; that you have execution alternating between various domains of code like BSP traversal, clipping, triangle setup and rasterisation, this means code and hot data get pushed out of the cache regularly. Cache optimisation is a big must to make something run fast on a P5.
Hello, to render enemies etc a depth buffer is indeed used, it is just that for the main level only depth writes are required, which were apparently much faster than depth reads. Your idea with spans sounds like it should work. I've no doubt it was considered by Carmack, Abrash, and co; it'd be interesting to know why they went with the AEL/APL approach. Edit: Chapter 66 of Abrash's book talks about this in the Sorted Spans section.
@@MattsRamblings I'm certain they tried more than a few things. By drawing one surface at a time and having no interleaving tasks, good texture fetch locality is achieved, and cache can be utilised especially if they use swizzled texture format.
@@SianaGearz Quake 2 sorts the output span list by surface (lit textures) to optimize cache hit rate, but I don't think Quake does this.
@@Ehal256 Once you start increasing geometric resolution, you're going to need better methods than just drawing one whole surface at a time to achieve better cache locality. But either is better than rendering the whole spanbuffer line by line top to bottom. Of course one can come up with tile based methods as well and so on but they can also be hit or miss.
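The occupancy-span idea from the question at the top of this thread can be sketched for one scanline as below. This is not the AEL/APL method Quake uses; it is an illustrative sketch assuming polygons arrive strictly front to back, and `insert_span` is a hypothetical helper name.

```python
def insert_span(occupied, x0, x1):
    """Clip the half-open span [x0, x1) against already-drawn pixels.

    `occupied` is a sorted list of disjoint (start, end) spans.
    Returns (visible pieces to draw, updated occupied list).
    """
    visible = []
    cursor = x0
    for a, b in occupied:
        if b <= cursor:
            continue              # occupied span lies entirely to the left
        if a >= x1:
            break                 # occupied span lies entirely to the right
        if a > cursor:
            visible.append((cursor, a))  # gap before this occupied span
        cursor = max(cursor, b)
        if cursor >= x1:
            break
    if cursor < x1:
        visible.append((cursor, x1))     # tail after the last occupied span

    # Merge the new pieces into the occupancy list, fusing touching spans.
    merged = []
    for a, b in sorted(occupied + visible):
        if merged and a <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], b))
        else:
            merged.append((a, b))
    return visible, merged

occupied = []
vis1, occupied = insert_span(occupied, 10, 20)  # nearest polygon
vis2, occupied = insert_span(occupied, 5, 15)   # partly hidden behind it
vis3, occupied = insert_span(occupied, 0, 30)   # only the edges show
vis4, occupied = insert_span(occupied, 12, 18)  # fully hidden
print(vis1, vis2, vis3, vis4, occupied)
```

Because the list only records which pixels are taken, the depth of what was drawn is gone, which is the entity-occlusion problem raised in the thread.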
Great presentation!
I used to make levels on an editor called QOOLE - I wish I'd known then some of the stuff you've put in your videos as it explains why some things just wouldn't behave themselves.
Could you explain how the objects are rendered?
Sometimes I wonder if stuff like this can be threaded to make it faster. Not that it really matters anymore, but as an academic experiment for those old dual and quad Pentium Pro setups.
The spans probably are better for avoiding memory cache misses as likely they are stored next to each other in memory...well...it's usually done that way anyhow, along widths rather than heights.
wow, another one
edit: great one
it looks so good in lowres
When I watch your videos on Quake I usually try to recognize which parts are relevant to modern graphics programming and which aren't. I suppose if you are taking advantage of the graphics card and modern graphics APIs then everything in this video after the "Back-face culling" chapter won't be particularly useful since you can have your graphics API automatically perform lots of tests to prevent overdraw.
Yes and no. Yes, because programming on the CPU, you don't need to worry about such. However, no because that's actually not far off what the GPU does, so understanding Quake's rasterization helps in understanding how a GPU works. While the algorithms might differ (sometimes significantly), the biggest difference (by far) is that Quake is single-threaded (it wasn't until after 64-bit became commonly available that multi-core CPUs were commonly available), while GPUs do the vast majority of the work in parallel.
Great video. Volume was low though.
Holy crap... I can see why Carmack & Romero mentioned that Doom was kind of the sweet spot where the average person can reasonably understand how the engine renderer works and how building out a custom level works, where Wolfenstein was too simple to create anything all that interesting and Quake is just that smidge too complex for most people to wrap their heads around.
Very nice video. While watching it, I was thinking which modern tech would be equally impressive. Maybe it's nanites in UE5. IMO the best presentation about them is called "Nanite | Inside Unreal". It's almost 3 hours long, but the stuff is not terribly complicated, these people simply tell too much information: history, requirements, their reasons to make various decisions, etc.
Very interesting!
Did the book mention the span algorithm you described, and did Carmack say it was more efficient than the regular approach?
The book does describe the AEL / APL span algorithm: Check out Chapters 66 and 67. There are some differences compared with the final version in the GPL release, so refer to source for the true method. It was the best technique they found, in terms of maximizing the worst case performance, that is to say, still performing well on the hardest scenes.
@@MattsRamblings That's awesome, thanks
Very nice❤
They explain to us the "truth", because they hide the real truth that Quake rendering engine is Black Magic
That's genius !!!
Imagine going to all this trouble, only for OpenGL and Direct3D to arrive shortly after. I often felt sorry for the poor person tasked with building the software renderer that most people never used by the time 3D graphics cards were more commonplace.
Is there any modern "Black book" of comparable technical details?
How are models and your viewmodel affected by lightmaps? Do they sample a point under them and get colored accordingly?
Fkn John Carmack, he's an alien if there are any living on earth.
Excellent as usual. If i might give some constructive criticism: Consider red-green colour blindness in your visualisation, perhaps switching to blue and orange. Also check that your audio is normalised, it is very quiet in places.
I'll bear this in mind in future vids, thanks for bringing it to my attention
I wish I had a big brain to understand this.
What about Z-fighting? I find it interesting that the Z-fighting traces bleed across horizontal scans consistently, and if the camera is kept still, the Z-fighting pattern does not update. You can get certain camera angles where one texture completely obscures the other, while at other camera angles there's a diagonal streak. Wouldn't Z-fighting be very costly rendering time wise with this algorithm?
I’m not sure if Z-fighting would really apply here given how the world geometry is generated from the CSG brushes. The span optimization is only done to the BSP geometry, which (if I understand correctly) already has hidden surface removal applied to it. Levels are paper-thin shells that have no exterior.
@@MaximumADHD If I remember correctly, some issues with Z-fighting in levels appeared with GLQuake that did not exist with the software renderer.
Finally everyone can make non laggy 3D engine for calculators!
I wrote a doom renderer for my Casio, works fine
@@thewhitefalcon8539 the talk was about rasterization of polygons, not raycasting
@@Z_Z.t Doom doesn't use raycasting, it's an interesting hybrid of polygon rasterization but restricted to mostly 2/2.5D.
@@Ehal256 whatever, Doom is a different and simpler engine, so some of the technologies shown in the video are not used there, because there's no need for them.
👍👍👍
...That's a lot of words to say "black magic is how it works".
Quake 2's software renderer is even better.
What music was used in this video?
Hello, the music is:
White Hex - Searching For You
The 129ers - I Can't Remmber I Can't Recall
Nico Staf - Smooth and Cool
Dan Henig - Midnight
how did they adapt the software renderer way of no redrawing into the OpenGL renderer of GL quake or in Quake 2?
Quake was software rendered; development started before Doom was released, and they likely added minimal OpenGL support by replacing the end of the graphics pipeline to draw polygons using OpenGL.
IdTech3 (Quake 3 Arena) was likely similar but more optimized to work with 3D hardware. IdTech4 (Doom 3) was an overhaul of the architecture. Sadly that failed because it used a patented algorithm and the engine was very dependent on those shadow volumes. IdTech3 was a very long-lived engine.
@@gruntaxeman3740 keep in mind that in the Quake 2 engine Carmack made it a point to not overdraw anything. Now how did he do it?
@@Biel7318
The Quake 2 engine is basically the same as Quake 1: a software renderer where the end of the rendering pipeline runs on the graphics card.
It is easy to make an engine with zero overdraw, for example by raytracing the whole image. Zero overdraw is not necessarily a good idea, as it is faster to let the GPU do the work than to use the CPU to minimize overdraw.
@@gruntaxeman3740 independent of how optimal it is when rendering, Carmack did make a point of zero overdraw with Quake 2, but using the GPU instead of the CPU, unlike Quake 1. I'm just asking how different that method is from Quake 1's.
I don't know if it was actually done this way in Quake 2, but with a GPU it's better to cull at the level of BSP nodes and not worry about a little bit of overdraw where one node overlaps behind another.
Matt, what tools are you using to render your videos?
Hello, mostly Blender and a tonne of Python scripts.
Can you do the same thing with "Minecraft"?! I want to see what "Minecraft" looks like on a software renderer!
Also: The visualizations are nice but some parts warrant an epilepsy warning, IMHO.
please epilepsy warning
Wow, this was too technical for me
In short, the scanline algorithm is used as a depth prepass... but then... quake can't use transparent textures?
Not quite: sprites use transparency (binary). However, yes, translucent water is a problem, and not just because of the overdraw elimination, but the palette too (though the water could be dithered for a probably rather ugly translucency, but only as a second pass, which is how the GL renderer does it (minus the dithering)).
@Bill Currie well that's Quake 2 water, but Quake 2 was designed with OpenGL acceleration in mind, so the different passes were expected. But Q1 is an entirely solid render. If anything, with transparent passes, the lines don't get discarded. Instead, they have a chance to get properly sorted and have correct transparency🤔
@@santitabnavascues8673 Quake had transparent water in OpenGL as a hack, but yeah, it might have come after Quake 2 was released. Certainly the software renderer was not designed with transparent water in mind. However, that's really after the bsp stage of the rendering pipeline. The bsp tree traversal itself is barely changed between software and OpenGL, and even my Vulkan renderer's bsp stage is very similar.
@Bill Currie the bsp is enough to render transparencies well, the way it is traversed sorts the polygons, Quake3 renders them just fine. Good luck with your renderer!🙂
@@santitabnavascues8673 Of course it is, I never said otherwise. The BSP tree is how the original legacy OpenGL renderer, and both the modernish OpenGL and Vulkan renderers (both of which I wrote), support transparency, because the BSP tree supports not only depth sorting, but also texture sorting.
It's the way spans are handled that (at least at first glance) make transparency awkward. I don't feel like digging into getting software transparency working as I have other things I want to work on (eg, shadows, diegetic UI, ...).
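The point in this thread that BSP traversal order already sorts the polygons can be sketched as follows. This is a toy 2D tree with hypothetical names (`Node`, `back_to_front`), not code from any Quake renderer: at each node the far side of the splitting plane is visited first, which gives a back-to-front order suitable for blending translucent surfaces.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class Node:
    normal: Tuple[float, float]   # splitting plane: dot(normal, p) == dist
    dist: float
    faces: List[str]              # faces lying on this node's plane
    front: Optional["Node"] = None
    back: Optional["Node"] = None

def back_to_front(node, cam, out):
    """Append face names to `out`, farthest from `cam` first."""
    if node is None:
        return
    side = node.normal[0] * cam[0] + node.normal[1] * cam[1] - node.dist
    # The child on the camera's side of the plane is the near one.
    near, far = (node.front, node.back) if side >= 0 else (node.back, node.front)
    back_to_front(far, cam, out)
    out.extend(node.faces)
    back_to_front(near, cam, out)

# Three parallel walls at x = -5, x = 0 and x = 5.
tree = Node((1.0, 0.0), 0.0, ["wall_x0"],
            front=Node((1.0, 0.0), 5.0, ["wall_x5"]),
            back=Node((1.0, 0.0), -5.0, ["wall_x-5"]))

order_right, order_left = [], []
back_to_front(tree, (7.0, 0.0), order_right)   # camera to the right of all walls
back_to_front(tree, (-7.0, 0.0), order_left)   # camera to the left of all walls
print(order_right, order_left)
```

Swapping the two recursive calls gives a front-to-back order instead, which is the direction an overdraw-avoiding opaque pass would want.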
I've been working with Unreal Engine 5 a lot and I love some of the modern optimisation methods possible with powerful GPUs and multi-core CPUs; it's hard to explain just how many advantages these have. But it gets frustrating when you see "new culling tech" being advertised when the problem was already solved 30 years ago on less powerful hardware, in a more efficient way. If I had the time to go through the Unreal Engine code and make some overhauls, I think it'd just end up using the same solutions found in Quake (though some problems can be solved better with modern hardware than they could be with the methods used in Quake).
In short, Quake did optimisation better than multi-billion-dollar companies 30 years before, and it's very frustrating.
Definitely a case of "yes, but no" as methods for transparency; in particular the lighting and shadows around them, have changed a lot and require greater efforts to prevent drawing errors. That said, having an expanded set of rendering options with a hierarchy of what other methods they rely on being parsable so the desired balance of older optimizations to realism could be achieved on demand would be a wet dream, especially if it could be injected into earlier titles. The specular mapping of the mid 2000's that carried on way too long is something I'd probably tweak in every title I could😅
@@Xeogin I completely agree. I love 3D graphics, and things like ray tracing are pretty much a must-have IMO, but binary space partitioning gave such big performance improvements (in indoor spaces) that it's kinda stupid not to use it.
@@hughjanes4883 BSP is highly focused on static geometry, and it also requires a fair bit of precomputation. While it's still very effective on modern CPUs (see Ironwail), I can see why it fell out of favor.
@@hughjanes4883 BSP goes very bad very quickly with complexity. Cross cutting planes are a minor problem in Quake but they get drastically worse once you start doing intricate geometry. It was a suitable solution for the time, not for all times.
Yeah, but think of the kind of content we handle nowadays. Most of it is dynamic and changing, whereas Quake levels are fully static. Also, current content has millions, almost literally, of triangles, so an active line list would become a large structure, and having ever smaller triangles (such as in the Nanite system) would make this structure even larger than an equivalent depth buffer, and sorting through them and referencing polygons would pose a computational complexity that would make their utility rather limited, putting it completely outside the scope of real-time rendering. But the core optimizations are still there: for example, many current engines do a depth prepass precisely to prevent the overdraw of complex pixel shaders.
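To illustrate the depth prepass mentioned above, here is a hedged CPU-side sketch that counts how many "expensive" shading calls happen with and without a prepass. The `render` function and the fragment tuples are made up for illustration; a real engine does this on the GPU with a depth-only pass followed by an equal-depth test.

```python
import math

def render(fragments, prepass):
    """fragments: (pixel, depth, material) tuples in submission order.

    Returns (image, number of expensive shade calls).
    Assumption: smaller depth is nearer the camera.
    """
    depth = {}
    if prepass:
        # Cheap pass: resolve the nearest depth per pixel, no shading.
        for px, z, _ in fragments:
            if z < depth.get(px, math.inf):
                depth[px] = z
    image, shaded = {}, 0
    for px, z, mat in fragments:
        if prepass:
            if z == depth[px]:       # only the nearest fragment gets shaded
                image[px] = mat
                shaded += 1
        elif z < depth.get(px, math.inf):
            depth[px] = z            # classic z-test: shade, maybe overwritten later
            image[px] = mat
            shaded += 1
    return image, shaded

# Worst case for a plain z-buffer: geometry submitted back to front.
frags = [(0, 30, "far"), (0, 20, "mid"), (0, 10, "near"),
         (1, 20, "mid"), (1, 10, "near")]
img_plain, cost_plain = render(frags, prepass=False)
img_pre, cost_pre = render(frags, prepass=True)
print(cost_plain, cost_pre)  # 5 2
```

Both runs produce the same image; the prepass only changes how many fragments pay for the expensive shading step.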
I can't watch this video, for the same reason i am leaving this comment, but i want to commend you on the best flashing lights warning i have seen on youtube c:
Sorry that you can't watch the video, I'll certainly pay closer attention to this in the future.
Ok here I was lost
Too complex video