Sparse Voxel Octree and Polygon hybrid

  • Published: 17 Aug 2014
  • Combining CUDA sparse-voxel-octree ray-casting and OpenGL polygon rendering to create a single frame.
    The SVO handles long draw distances with a general-purpose level-of-detail algorithm. For close-up content it switches to traditional OpenGL polygon rendering for continuous (non-boxy) surfaces.
    Currently rendering a heightmap generated with L3DT ("Large 3D Terrain generator"). In theory it should also be able to handle other content types like trees and foliage without having to resort to polygon-based level-of-detail strategies like billboarding.
    The ray-casting averages about 50 fps at 1280x720 on an Nvidia GeForce GTX 660. It actually speeds up when rendering more polygon content, because any pixels covered by polygons don't need to be ray-cast.

Comments • 82

  • @GonziHere · 3 years ago · +1

    Randomly stumbled upon this now and I must say this is a really clever way of dealing with voxels :-)

  • @TomMulgrew · 9 years ago · +10

    The Morrowind conversion is on hold indefinitely at the moment - I think I underestimated the complexity of the project, and the difficulty of teasing out the rendering logic from OpenMW and the Ogre rendering engine.
    If someone was interested in helping me with this (I basically need to turn a Morrowind sector into a list of textured polygons that I can pipe into my triangle-to-voxel renderer), I might pick it up again.

  • @boeingcrazy5797 · 3 years ago · +1

    Thanks for this, I just recently started experimenting with marching cubes and surface nets - this looks very interesting and is very inspiring!

  • @Tomkat53 · 9 years ago · +20

    This is pretty brilliant!

    • @TomMulgrew · 9 years ago · +2

      Thanks. I've put quite a lot of work into it :)

    • @duz2ht · 9 years ago

      Tom Mulgrew He's a DF player too hahaha. I told him about your video, it's magic, no doubt!

    • @Tomkat53 · 9 years ago

      Tom Mulgrew - when is the next video? :-)

    • @TomMulgrew · 9 years ago

      Tomkat53 when I next have something new to show I guess :-). I haven't had a lot of time to work on this lately, so could be a while.

  • @tranceemerson8325 · 4 years ago · +3

    I really like your hydraulic erosion technique, it looks VERY natural. Maybe the terrain is scaled a bit too tall, but other than that it looks fabulous!

    • @TomMulgrew · 4 years ago · +2

      Thanks. I can't claim credit for the erosion though. That was done by the L3DT terrain generator www.bundysoft.com/L3DT/

  • @DigitalLibrarian · 8 years ago · +3

    Love it. I'm totally stealing this idea. Yoink!

  • @Vistorri · 9 years ago

    Very very impressive mate

  • @AmaroqStarwind · 7 years ago · +4

    Don't know if you ever followed up on this, but if you used some compression algorithms with your height/displacement map, then you could significantly reduce the file size of the voxel model, since it renders directly from the height map.
    Tessellation would also be a good way to handle progressive level of detail for polygons.

  • @10bokaj · 4 years ago · +1

    cool stuff.

  • @dandymcgee · 4 years ago

    Very cool!

  • @Mundm28 · 8 years ago

    very nice!

  • @jaspersgames4094 · 3 months ago

    very cool!

  • @Channel-ds1cg · 7 years ago

    amazing

  • @blazejflorkiewicz9698 · 2 months ago

    brilliant :)

  • @riteshpatidar9184 · 3 years ago

    Great

  • @nopeynope1771 · 8 years ago

    Another method of mixing Voxels and Polygons is to store the model as an octree,
    Raytrace the distance, and use Manifold Dual Contouring to convert the octree to polygons.

  • @ML2005 · 9 years ago

    Awesome stuff! Are you still working on the Morrowind voxel conversion?

  • @clemdemort9613 · 3 years ago

    Crazy to think that we used to have voxels for far distances and polygons for up close, now it's the opposite haha

    • @clemdemort9613 · 3 years ago

      @@LegendLength yeah, partially. Some games use voxels for their aesthetic, others for physics, like Teardown

    • @clemdemort9613 · 3 years ago

      @@LegendLength well, voxels aren't anything new, but we've seen them get more and more popular with projects like Teardown or John Lin's sandbox.
      Both of them use octrees to render huge amounts of data.

  • @chien461 · 2 years ago

    Do you know any papers that talk about 6DOF raycasting like this? Amazing work btw, I've thought of doing a raycast engine like this but the jump from 4 to 6 DOF is quite daunting.

  • @necrowizzard · 9 years ago

    Looks cool. Wouldn't it make more sense to render more complex (non-heightmap) terrain? Did you compare the performance of this to a heightmap rendered with a tessellation shader and displacement mapping?

    • @TomMulgrew · 9 years ago

      necrowizzard Yes, it would. In fact it should be able to cope with all sorts of general purpose scenes theoretically. I just don't have the time or artistic ability to create a large complex scene myself, whereas large heightmaps are easy to generate with L3DT. I'm working on adding some trees (and boulders, bridges, houses, etc) though.
      And no, haven't compared it with any traditional polygon approach. I'd expect a tessellation shader to be faster.

    • @necrowizzard · 9 years ago

      Tom Mulgrew Sounds like a good approach, good luck with adding more geometry.
      I only know of an article producing nice results that works the other way around (http.developer.nvidia.com/GPUGems3/gpugems3_ch01.html). They start from volume data and generate a mesh via marching cubes. But, this is probably a bad reference due to the overhead in implementation. I don't know of any tool using their approach (probably there is one).

  • @hor6teen · 6 years ago

    Great work! Could you explain why these flickering aliasing artifacts appear (in the voxel-ray-traced parts) and how they could be avoided?

    • @TomMulgrew · 6 years ago · +2

      I think it's because the octree has the equivalent of mipmapping built in, but there's no linear filtering. You can't account for "this pixel is 50% between texel A and texel B" like linear texture filtering does.
      You could theoretically implement something based on where the ray hits the voxel, but you would need to find the neighbouring voxels to interpolate their colours, which would be slow and expensive.
      I can't think of any way to avoid this except casting multiple rays per pixel - which would slash the frame rate dramatically.

    • @hor6teen · 6 years ago

      That's a great answer, thanks :)

  • @chucksneedmoreland · 8 years ago

    Just curious how are you rendering the voxels, are they cubes created with opengl?

    • @TomMulgrew · 8 years ago · +1

      +Lando Calrissian Nope :). The cubes are stored in a tree structure and ray-cast using CUDA. It computes a ray for each pixel and calculates which voxel it will hit in the tree.
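      The core lookup such a ray-caster needs can be sketched as follows (my own illustration, not the author's code; all names are hypothetical). It descends from the root to the leaf containing a voxel coordinate using each node's 8-bit child mask; a real caster would march each pixel's ray through the grid and perform a query like this (or an incremental stack-based variant) at successive cells.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

struct Node {
    uint8_t childMask = 0;    // bit i set => child octant i exists
    uint32_t firstChild = 0;  // index of first child in the node pool
    uint32_t colour = 0;      // leaf payload (e.g. packed RGBA)
};

struct Octree {
    std::vector<Node> nodes;  // nodes[0] is the root
    int depth = 0;            // grid is 2^depth voxels per side

    // Leaf colour at integer voxel coordinate (x,y,z), or nothing if
    // the descent hits empty space.
    std::optional<uint32_t> lookup(int x, int y, int z) const {
        uint32_t idx = 0;
        for (int level = depth - 1; level >= 0; --level) {
            // Child octant: one bit each from x, y and z at this level.
            int cx = (x >> level) & 1, cy = (y >> level) & 1, cz = (z >> level) & 1;
            int octant = cx | (cy << 1) | (cz << 2);
            const Node& n = nodes[idx];
            if (!(n.childMask & (1u << octant)))
                return std::nullopt;               // empty space: ray passes through
            // Children are stored contiguously; skip the siblings that
            // precede this octant (popcount of the lower mask bits).
            int offset = __builtin_popcount(n.childMask & ((1u << octant) - 1));
            idx = n.firstChild + offset;
        }
        return nodes[idx].colour;
    }
};
```

      The popcount-based child addressing is what lets sparse nodes store only the children that exist, which matters at the data sizes discussed below.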

  • @eyezaryvideos · 9 years ago

    Do you use Unlimited Detail-like way of voxel traversal optimization (using CUDA) or each ray cast separately (pack of rays per thread)? Also do you use UD-like voxel data size reduction optimization (linkless octree)?

    • @TomMulgrew · 9 years ago

      Have UD published some details of their algorithm? I thought they kept it heavily under wraps... ?
      The rays are each cast separately. The initial stack state (traversing from the root to the voxel containing the camera) is computed once into shared memory then copied for each ray, but that's about all the rays share.
      The data is stored in fixed-depth sub-trees, which are paged in from disk and composed together to create the full tree. I don't store any topology data at all for the leaf nodes (just colour data), leaf parents store just an 8-bit mask of which children are present, and nodes above that store the mask plus a 3-byte index of the first child node within the sub-tree (so 1+3=4 bytes). The colour data is stored as per research.nvidia.com/publication/efficient-sparse-voxel-octrees.
      The bytes/voxel depends heavily on the average occupancy. The scene averages about 3.5 child voxels per parent, which leads to 2.25 bytes/voxel.
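      For concreteness, the 1+3=4 byte interior node described above could be packed like this (the exact bit order is my assumption for illustration, not taken from the author):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical packing of the 4-byte node: an 8-bit occupancy mask in
// the low byte plus a 24-bit index of the first child within the
// sub-tree in the high three bytes.
inline uint32_t packNode(uint8_t childMask, uint32_t firstChild) {
    assert(firstChild < (1u << 24));  // index must fit in 3 bytes
    return (firstChild << 8) | childMask;
}
inline uint8_t  maskOf(uint32_t node)  { return node & 0xFFu; }
inline uint32_t childOf(uint32_t node) { return node >> 8; }
```

      A 24-bit index caps each sub-tree at ~16.7 million nodes, which is one reason to page the scene in as fixed-depth sub-trees rather than one giant tree.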

    • @eyezaryvideos · 9 years ago

      Tom Mulgrew Algorithm in short: for the whole viewport, traverse octree nodes front-to-back (8 possible directions for the 8 child nodes; the order can easily be calculated by an XOR operation with a mask built from the three signs of (dx,dy,dz) from the camera origin to the parent node centre):
      1) If a child is behind the camera, skip it.
      2) If a child is partially behind the camera, subdivide (and repeat the same front-to-back algorithm for its children).
      3) If a child is in front of the camera but too big (not a single pixel), first test for occlusion (any shape that 100% covers all children vs. the stencil/z-buffer; a filled rect would suffice), then if not occluded, subdivide (add child indexes to the stack in back-to-front order so they are processed front-to-back).
      4) If a child is in front of the camera and small enough (a single pixel), draw it (and fill the stencil or z-buffer to help skip tons of occluded voxels); if it cannot be subdivided, draw a filled rect (same as a single pixel, but bigger).
      In fact we can save all pixel-on-screen-sized voxels in separate memory and render them on each frame, updating them dynamically via queries to load more voxels (when subdividing) and un-subdividing children when all of them are occluded.
      UD optimization: switch from perspective projection of nodes points to ortho-projection when estimated node size in pixels on screen becomes too small and perspective projection gives nearly same result as ortho-projection. This trick saves lots of division-by-z operations and could be performed in fixed-point arithmetics. It's also patented: www.google.com/patents/WO2014043735A1
      Linkless octree format: an 8-bit mask of children in the parent node; the next level of the tree contains only the valid children, and a child's index is calculated as the sum of set bits from the previous level plus the child's index inside its parent. Same for subsequent levels. Saves a lot of memory if only the surface voxels are encoded (not the volume). Requires lookup tables, but could be performed on the CPU.
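      The linkless indexing described above can be sketched like this (my illustration under those assumptions: each level is a flat array of 8-bit child masks for valid nodes only, and a child's slot in the next level is the count of set bits that precede it; a real implementation would use precomputed prefix sums rather than this linear scan):

```cpp
#include <cstdint>
#include <vector>

// Index in level L+1 of the child at bit `octant` of node `nodeIdx`
// in level L. `level` holds the child masks of level L's nodes in order;
// no per-node child pointers are stored anywhere.
uint32_t childIndex(const std::vector<uint8_t>& level,
                    uint32_t nodeIdx, int octant) {
    uint32_t idx = 0;
    for (uint32_t i = 0; i < nodeIdx; ++i)
        idx += __builtin_popcount(level[i]);  // children of earlier nodes
    // Earlier siblings within the same parent node.
    idx += __builtin_popcount(level[nodeIdx] & ((1u << octant) - 1));
    return idx;
}
```

      This is why the format saves memory: the child masks themselves double as the index structure, at the cost of popcount work (or lookup tables) during traversal.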

    • @TomMulgrew · 9 years ago

      Braid Acer Thanks. It'll take me a while to digest that. It does seem to me you'd need pretty efficient occlusion tests to make something like that work fast enough though (and not just testing 1 pixel at a time)?
      The linkless octree format sounds like what I was considering for storing sub-trees on disk (a sequential list of 8-bit child masks for each node in breadth-first order). But I'd have to put the links back in when loading into device memory, as I can't see how you could efficiently traverse it otherwise.
      I was also considering a compression scheme that involves taking each possible bitmask and building a Huffman tree for predicting what a child node's bitmask would be for a parent node with that bitmask. My theory is that specific surface directions would make certain bitmasks more likely, and that some redundancy could be squeezed out by exploiting that. All just theory at the moment.

    • @eyezaryvideos · 9 years ago

      Tom Mulgrew You should check the whole shape of pixels at once. It may not seem very efficient, but in the end it is, as it allows occlusion to be optimized in the best way possible; no invisible areas will be octree-depth traversed at all. Also, no checks are required for already subdivided nodes, only checks for further subdivisions and "mergebacks". The question is: could it be implemented in CUDA? My best variant was to split the viewport into N smaller views and use CUDA as a thread provider, if it is possible this way. You'll need read-only shared memory for active nodes (unpacked, with indexes of children and colour/material information) and a small per-thread writeable buffer to collect subdivision and "mergeback" queries to be processed asynchronously on the CPU, including the possibility to stream data the same way it works in UD. Can CUDA work this way?

    • @TomMulgrew · 9 years ago

      Braid Acer I'm not sure sorry. I don't have a clear enough picture of what you're trying to do.
      It sounds like for the general algorithm (sub-dividing, merging and branching) the challenge would be to minimize the amount of branching. CUDA (and GPUs in general) run batches of threads in lock-step, where each thread executes the same instruction but with different data. So if code branches the threads that don't follow the branch stall and wait for the ones that do to complete.
      Async CPU/GPU is quite doable with CUDA, and the synchronisation is pretty straightforward. You can also overlap data transfers with GPU processing.

  • @chasemarangu · 6 years ago

    I don't have much experience with rendering engines at all. However, I know what a raycaster is (though I haven't coded one yet) and what a voxel is, and I know what a polygon is, and I've heard that OpenGL is used for 3D rendering but it is too hard for me. You did a great explanation, and the visuals helped a lot.

  • @theb1rd · 10 years ago · +2

    Any particular reason for ray-casting in CUDA instead of a fragment shader?

    • @TomMulgrew · 10 years ago

      I don't have heaps of experience with fragment shaders but suspect they may not be flexible enough. The octree is several hundred megabytes in memory and so can't fit into a texture, and each thread needs an array for tracking visited nodes (I've tried kd-restart but couldn't get it fast enough).
      It could probably be ported to OpenCL though.

    • @theb1rd · 10 years ago

      Thanks for the info. One more question, if you don't mind: are the traversal stack arrays in shared memory, or an ordinary buffer?

    • @TomMulgrew · 10 years ago

      theb1rd it's a local array inside the kernel function - I think that means it's effectively global memory.
      I hadn't heard of the shared memory approach (at least not for thread-local storage). Sounds definitely worth trying. Thanks for the tip :)

  • @PunkOffice1 · 7 years ago

    Are you working for High Fidelity now? I think they are using the same technique

    • @TomMulgrew · 7 years ago

      Nope. Never heard of them. This is just a personal project I built a couple of years ago.

  • @DjChronokun · 9 years ago · +1

    is there any advantage to doing the ray-cast in CUDA vs a glsl compute shader?

    • @DjChronokun · 9 years ago

      Also, are you doing any reprojection of pixels that have already been ray-cast last frame so you only have to fill the holes or just ray-casting all those pixels every frame?

    • @TomMulgrew · 9 years ago

      DjChronokun I'm ray-casting everything each frame (excluding the pixels covered by polygons).
      Re-projection sounds interesting though. Is there any prior research I can read up on? A few things aren't obvious, like content coming in from the side of the frame and obscuring re-projected pixels from the previous frame.
      The main issue with a glsl shader is that it has to be able to randomly address several hundred megabytes of voxel data. Correct me if I'm wrong (I'm not too familiar with glsl) but I believe a shader would have to access the voxel data via texture objects, which would limit the amount of addressable memory and therefore the number of voxels in the scene.

    • @DjChronokun · 9 years ago

      Tom Mulgrew well, I did come across this: voxels.blogspot.co.nz/2014/05/raycaster-speed-up-up-to-400-by-image.html but it seems to come with some artifacts. I hadn't considered the case of things moving in front of existing pixels; that would seem to cause problems.
      A glsl shader would either have to access textures or shader storage buffers. I'm not sure if that would impose extra memory limits or not; each SSBO is guaranteed to be at least 16MB, but I don't know how much total memory you can have. I'd guess it would be limited to the available graphics memory, which I think is usually about 2x the amount of RAM on the card, so it might require some extra manual paging of textures/SSBOs.

  • @jupitersky · 5 years ago · +2

    Minecraft should use this,
    you can see so far it's amazing!

  • @joeseth05 · 9 years ago

    One question: Why do you render voxels as boxes? Wouldn't it be easier and more efficient to just use solid filled 2D shapes like circles or squares (either with hard or soft edges) and giving them a little bit of overlap to prevent unfilled pixels from showing up?

    • @TomMulgrew · 9 years ago

      It's essentially an aspect of the data structure and the fact that it's ray traced.
      In a SVO the scene is stored as a cube divided into 8 equal sized smaller cubes, which are in turn also divided into smaller cubes and so on.. until you eventually get down to the tiny cubes that you render.
      If you wanted different shapes (spheres etc), you'd need a different code path to handle the rendered nodes while still efficiently searching the higher level nodes. I'd expect that sort of branching to incur a significant performance penalty in CUDA.
      It may be possible.. (There may be a clever trick to it that I don't know about). But I can't see how to do it efficiently.

    • @joeseth05 · 9 years ago

      Tom Mulgrew I didn't mean that the space should be subdivided into sphere-shaped regions, but what the primitive shapes that each voxel get ultimately rendered by should look like.

    • @TomMulgrew · 9 years ago · +2

      joeseth05 it's the ray-tracing that makes it complicated. Rather than 1) Find voxels, 2) Render them, it's 1) Trace a ray through each pixel, 2) figure out what it hit, 3) draw the pixel in that colour.
      So it's not deliberately drawing the voxels cube shaped, but rather the pixels whose rays hit the same voxel happen to form a cube-shaped projection, due to the nature of the ray-traced data structure.

    • @joeseth05 · 9 years ago · +1

      Tom Mulgrew Ah, OK, so it sounds like it employs ray-casting (a special kind of ray-tracing) for rendering the voxels, as opposed to rasterization, which I initially assumed; then of course the cube shape makes sense.
      On the other hand, I wonder whether rasterizing voxel grids is faster than ray-casting them, because usually rasterizers are faster than ray-casters for all kinds of primitives, but rasterizers are more complicated. Still, these are voxel grids, and rendering voxel grids as primitives should be easier than rendering triangles, because voxel grids are even more primitive than triangles.
      The only problem I can attest that voxel grid rasterizers pose in practice is getting the neighbour overlap right.

    • @TomMulgrew · 9 years ago · +2

      joeseth05 sparse voxel octrees may be the point where ray-casting overtakes traditional rendering, at least when you get down to 1 pixel per voxel. Chiefly because rasterizing time is roughly proportional to the number of primitives (there's about half a billion voxels loaded at any one time in my scene), whereas with ray-casting it's proportional to the number of pixels.
      I can get 60-70fps at 720p. Jon Olick tweeted a few months back that he was managing 60fps @ 1080p on similar hardware (and seemed confident he could double it again).

  • @otonanoC · 9 years ago · +8

    Submit a paper to SIGGRAPH !

    • @HickoryDickory86 · 3 years ago · +1

      I'm five years late, but... Hear, hear!
      This is absolutely brilliant, and should definitely gain momentum in the larger gaming sphere. Plus, the more developers there are who start working on implementations, the more refinement and performance-saving solutions there will be.

    • @erich_l4644 · 3 years ago

      @@HickoryDickory86 is this a cutting edge idea? I need to do a graduation paper... hahahaha

    • @HickoryDickory86 · 3 years ago

      @@erich_l4644 Personally, I believe so. Haven't seen it used anywhere else by anyone else. Yes, voxels are used in some instance for LoD (see this Stack Exchange response: gamedev.stackexchange.com/a/70841 ), but this approach takes it to the next level altogether.
      I would much prefer a vendor-independent solution that didn't require Nvidia (CUDA) to compile the SVOs (e.g., OpenCL), but it makes sense for Tom to use the best option available to him.

  • @duz2ht · 9 years ago

    Man, you did what I'd been searching for for years. It's like an improved Voxel Space engine, the one used by NovaLogic in Delta Force 1 and 2. I'm not a programmer but I'm learning some things in Unity3D, trying to remake at least the multiplayer of Delta Force, which is flawless.
    Did you ever play Delta Force?

    • @TomMulgrew · 9 years ago

      No, never played it. I checked out a couple of YouTube videos though. Looks impressive for its time.

    • @duz2ht · 9 years ago

      Tom Mulgrew Yeah, it was! Look at the terrain in your video. Now imagine an 8vs8, basically a war in this flawless environment... trenches, lakes, hills, holes... just a few buildings. There are some DF1 and DF2 multiplayer videos on my channel.
      Here's the question: how could I use that kind of terrain as you did? I have some splat/normal maps and things to generate some of the old DF terrains, check this: imgur.com/a/iF6Se
      Is there a pack/files available out there, or is it a private thing?
      I've generated one of those terrains using the grayscale image (splat) with the default Unity3D terrain system (it takes 2 minutes), but the result is not as smooth as the Voxel Space engine created by NovaLogic; a lot of vertices/polygons.

    • @formaffinity · 9 years ago

      Eduardo Macedo Unity3D does have some voxel engines for download. I know there's one for about 80 bucks. There's also Efficient Sparse Voxel Octrees, which is an open source project. What Mr. Mulgrew has done here is truly impressive. DF2 was an expansive world full of great players during its time, and the type of game that is sorely missed in today's closed-down, tunnel-vision arcade shooters. A DF2-like game, with new maps, TXP maps, better graphics, more models etc., would be a dream.

    • @duz2ht · 9 years ago

      increality For sure, would be a dream! After DF1, DF2 and even DFLW and DFTFD I never saw a game with those characteristics. I've played BF2, BF3, BF4, CODs, Insurgency and a lot of FPSs and what Novalogic did, no one else did.
      The question is: The engine in his video seems perfect for the game concept, outstanding. I can't find anything in Unit3D store with these characteristics, I just saw these cubic voxel engines, but on this subject I'm a newbie. I'm wrong? There's similar things over there? The bad part is: I'm newbie and I can't waste $80 to test something...
      PS: You are in the DF2 Reunion group on Facebook? If yes, what's your name?

    • @formaffinity · 9 years ago

      Eduardo Macedo I was just accepted to the group, having applied for it last night. I go by Jesse; I'll add you over there now. I also played some BF2 and BFBC2. BF2 was still better than most; at least the maps were relatively larger than, say, COD or the like. Joint Ops was OK too, but too colour-saturated for my tastes, and lacking the natural feel DF2 had. You could also look into 3DEMs for map creation. There's a lot of free satellite imagery, which can be converted into full, repeatable and scalable terrains. I understand the issues with not wanting to buy/test voxel software. There are similar things, full voxel, but not the hybrid polygon/voxel that he's doing. Not to my knowledge.

  • @clankill3r · 9 years ago

    What is your technical background?

    • @TomMulgrew · 9 years ago

      clankill3r (Sorry about the delay.. been busy). My day job is a web developer. The 3D graphics and voxels are a spare-time hobby.

    • @clankill3r · 9 years ago

      Tom Mulgrew Nice, I want to make an engine as well as a hobby project. You being able to do this is really inspiring.

  • @darkrider1946 · 8 years ago

    Is the source from Voxlap, and in what language (C++ or Java)?

    • @TomMulgrew · 8 years ago

      +Dark Rider sorry, I haven't released the source.
      The main application is C++. The voxel rendering is CUDA.

    • @darkrider1946 · 8 years ago

      Tom Mulgrew Is there a Sparse Voxel Octree?

    • @TomMulgrew · 8 years ago

      Yes. All the voxels are stored in a sparse voxel octree and rendered from there.

  • @moloned · 9 years ago

    do you plan to publish your code?

    • @TomMulgrew · 9 years ago

      No concrete plans at this stage, sorry.
      I still have some more development before I start thinking about what to do with this.

  • @negrastormentas2865 · 6 months ago

    Have you ever used this in a project? Have you ever considered releasing the code?

  • @duz1ht · 9 years ago

    You rule, sir! Check my channel header; it's a Delta Force 1 voxel terrain. Tell me, could you reproduce the terrain with a pack of a normal map, a texture and the black-and-white image whose technical name I forget? (I could show you the three images that the Delta Force 1 engine uses to generate the map.)

  • @LanIost · 8 years ago

    Note: These are REAL voxels, not Minecraft 'voxels'.