Это видео недоступно.
Сожалеем об этом.

K-d Trees - Computerphile

Поделиться
HTML-код
  • Опубликовано: 20 янв 2022
  • One of the cleanest ways to cut down a search space when working out point proximity! Mike Pound explains K-Dimension Trees.
    EXTRA BITS: • EXTRA BITS: More on K-...
    / computerphile
    / computer_phile
    This video was filmed and edited by Sean Riley.
    Computer Science at the University of Nottingham: bit.ly/nottsco...
    Computerphile is a sister project to Brady Haran's Numberphile. More at www.bradyharan.com

Комментарии • 228

  • @JohnnyWednesday
    @JohnnyWednesday 2 года назад +589

    It's worth noting for any budding game developers that K-d trees have to be rebuilt if objects move so if you're looking to accelerate a locality query between many moving entities? a quad/oct tree is your best choice as it is possible to maintain them in real-time with minimal computation.

    • @ssvis2
      @ssvis2 2 года назад +35

      Yep. You also have to factor in computational time to split your data into whatever structure. Dynamic data makes it a lot harder and in some cases can actually cause you to spend more time on tree rebuilding than brute force searches.

    • @SerBallister
      @SerBallister 2 года назад +14

      Typically both can be used, KD-trees for static objects which are then placed (by their bounding boxes) in a spatial hierarchy like an Oct-tree.

    • @GilesBathgate
      @GilesBathgate 2 года назад +5

      So actually, you can do some transformation on the k-d tree such as scaling all axis by a factor or translation. With translation, you just move the splitting planes, relative to your origin or query point.

    • @SerBallister
      @SerBallister 2 года назад +4

      @@GilesBathgate Yeah. Or the opposite, the ray or whatever youre testing can be transformed from its world position into the kdtree's space.

    • @JohnnyWednesday
      @JohnnyWednesday 2 года назад +4

      ​@@SerBallister - No matter how much you transform the KD tree? it's not going to help you manage a world full of moving entities at interactive frame rates. Transforming coordinates between two different reference frames is a moot point.

  • @jacob_90s
    @jacob_90s 2 года назад +137

    Great video, but just want to note as well that multidimensional trees can also be used for things other than just geometric use cases; they can be also very useful if you need to search for data with multiple ranges. If, for instance, you had a database of retail transactions, and you wanted to search efficiently for all transactions within a certain date range AND with a minimum and maximum price, you would need to use a k-d tree or something like it, to be able to efficiently find the results.

    • @jacob_90s
      @jacob_90s 2 года назад +2

      @Sky Gardener Yep, but the issue comes down to type. Trying to use a geometric data type to hold dates, numbers, strings, etc is possible (to a limited extent) but an unnecessary pain. The problem is that this is an index that should be generic, and let the designer choose the types (like regular indexes) yet most database developers are treating it like an application specific problem, which should only ever be used for geometric problems

  • @nullablebool
    @nullablebool 2 года назад +145

    Dr. Pound - the teacher I wish I had. What a guy.

    • @GamerMartin96
      @GamerMartin96 2 года назад +16

      I was lucky enough to be taught by Mike for a couple of modules at uni. Definitely one of the best in the department, alongside Steve Bagley and the other UoN professors on Computerphile

    • @ghosttwo2
      @ghosttwo2 2 года назад +11

      Looks like Toby Maguire's dad.

    • @paulbunyangonewild7596
      @paulbunyangonewild7596 2 года назад +1

      This sounds like a dirty innuendo. If that is the case..... I would also love to have him as a teacher.

    • @JamienM
      @JamienM 2 года назад +1

      Pounding out the knowledge 😁

    • @danceswithdirt7197
      @danceswithdirt7197 2 года назад

      How funny. I was just thinking I'd love to take a course with the dude.

  • @zacharychristy8928
    @zacharychristy8928 2 года назад +22

    The KD tree really shines for me when you represent it with a simple heap. You get an advanced data structure with range/radius searches, and nearest neighbor queries, all in logarithmic time for no extra memory! The kd tree would just be an array of points!
    If you want to pick the split dimensions more smartly (like if you have a long, thin pointcloud) you only need one array of bytes that represents which split occurs at each node.

  • @astropgn
    @astropgn 2 года назад +44

    10:37 I think that a good visualization to see if it could be in another rectangle is to draw a circle with radius = the distance from the point to G (the closest distance so far). The circle goes to the other rectangle, so it means that it could have a point in there with a smaller distance. But since it doesn't go to any other square, it is impossible to have a point there it a smaller distance.

    • @iwanadai3065
      @iwanadai3065 Год назад

      does that accurately represent the process the computer takes with a kd tree?

  • @niks660097
    @niks660097 2 года назад +7

    K-d Trees are used extensively in RTS games, to calculate gameplay stats e.g damage based on range or units attacking closest enemy etc...

  • @Alig4realz
    @Alig4realz 2 года назад +5

    Every time I see Mike Pound on Computerphile its an instant click

    • @tissuepaper9962
      @tissuepaper9962 2 года назад

      Same with the OEIS guy whose name I can't remember.

  • @mickenev
    @mickenev 10 месяцев назад +2

    This is hands down the best explanation I have seen of K-d trees and it is done in a way that captivates attention. Well done and thank you from college students around the world.

  • @GilesBathgate
    @GilesBathgate 2 года назад +4

    It was mentioned that this doesn't really work for edges and facets, but actually you can do this. Objects are either on one side of the splitting plane, or the other, or they span the plane, in which case they are added to both left and right cells/leafs of the tree.

  • @NickAskew
    @NickAskew 2 года назад +14

    Ah, now I see where I possibly went wrong. We were using a k-d tree (actually 3D tree) looking for points and indeed sometimes the points were over the border and I guess we were not working back up the tree. I'll have to see if that was ever fixed. Actually I think we were looking for the nearest triangle on a surface and if I remember correctly we used the tree to determine likely nearest points and then build a list of triangles that met at that point and then calculated the nearest perpendicular to the plane of the triangles (or to the edge if the nearest point on the plane was outside the triangle) and then we could decide which triangle the point was nearest to.
    What I remember from years ago is that building the spatial index is far more costly than using it. So for purposes such as lidar I'd expect they need to refresh the index so often that using it the way you suggest seems like it might be expensive in terms of processing.

  • @moralboundaries1
    @moralboundaries1 2 года назад +12

    These data structure videos are fascinating! More please! Can you look into the data structures that are used for meteorological and cosmological simulations?

  • @Mutual_Information
    @Mutual_Information 2 года назад +22

    K-d Trees.. Computerphile doesn't shy away from the more esoteric topics!

    • @DantalionNl
      @DantalionNl 2 года назад +1

      If looking for more esoteric tree data structures I'd suggest BSP trees or octtrees as well

    • @lancecoleman7440
      @lancecoleman7440 2 года назад

      brave is brave

    • @MysticPing
      @MysticPing 2 года назад +3

      Both BSPs and KD-Trees are very common in computer graphics

    • @kiramaticc
      @kiramaticc 2 года назад +6

      Esoteric? I thought K-d trees were commonly taught in most Comp Sci degrees. I know I learned K-d trees at my university.

    • @Mutual_Information
      @Mutual_Information 2 года назад +3

      ​For RUclips, I'd consider an explainer on K-d trees pretty niche. It's one way to speed up nearest neighbor search - seems pretty specialized to me.
      But I see your point. A CS student would *not* think it's esoteric at all. And maybe I should factor that in. This is Computerphile after all

  • @erickweil4580
    @erickweil4580 2 года назад +3

    I just finished implementing a Bounding Volume Hierarchy for my 4D Game and this video pops up. To build the BVH I use a very similar approach as is used to build the kd tree using the objects centroid as points, but then spliting is stoped once some amount of objects, like 4, are on the leafs, then work from bottom up calculating the combined AABB's of the leafs, and the AABB's of each branch node so that the AABB contains every child (they may overlap).
    In the end you have a structure that with a few AABB tests let you decide which objects are colliding with you, in the best case scenario, with 4 childs each node, you only need to do 60 tests to find a object in the middle of 1M others (The tree will have a depth of 15, each level you check for 4 nodes to decide where to continue), but in practice, the nodes may overlap leading to a few more checks.

  • @oneratdylan
    @oneratdylan 2 года назад

    The timing of this is amazing. I learnt about KD trees last week and I'm using them in my game. Gotta love a KD tree

  • @archivedandgone
    @archivedandgone 2 года назад +2

    My favorite Computerphile professor!

  • @mbakenemdusink9757
    @mbakenemdusink9757 2 года назад +3

    Sounds like by far the most expensive operation is going to be the calculation of the actual distance between to points, requiring two multiplications and a square root: sqrt(x^2 + y^2), and you'll have to do that a lot.
    I once had to solve this problem back in the 90s for an interactive graph editor, so we had to determine the node closest to the mouse.
    We used the much faster Manhattan distance (x + y), and that worked really nice.
    Algorithms also derail in time when your starting point is far away from the actual nodes.

    • @EmersonPeters
      @EmersonPeters 2 года назад +3

      I was going to say that when comparing distances you can skip the square root, but then realized that the correct distance is needed if deciding if a branch can be pruned.
      Edit: Wait, no, you can just square the distance to the line instead lol

    • @tissuepaper9962
      @tissuepaper9962 2 года назад +1

      You don't need to take the square root. Squared distance works just as well. There are actually a *lot* of alternative distance functions you can use that match euclidean distance better than Manhattan distance does without breaking the computational bank.

  • @RobLang
    @RobLang 2 года назад

    Awesome video! The right balance of humour, description and algorithm. Brilliant.

  • @zxuiji
    @zxuiji 2 года назад +5

    9:52, around this point it occurred to me there's an optimisation for distance checking, normally sqrt() is used but it's actually sufficient to just multiply the difference between the coordinate axis against each other and use that as a distance for comparing to other distances of the same kind

    • @alegian7934
      @alegian7934 2 года назад +2

      what if you had points with the same x value? then x1-x2=0 so your product is 0, while their distance is not. Also intrestingly, your product dx*dy could be interpreted as the area of the rectangle between the points. Kind of how you can click & drag your mouse on your desktop and it highlights a rectangle

    • @tesfabpel
      @tesfabpel 2 года назад +7

      distance squared is used a lot to speed up the check... this avoids the costly sqrt.

    • @zxuiji
      @zxuiji 2 года назад

      @@alegian7934 just add 1 to both values then multiply

    • @zxuiji
      @zxuiji 2 года назад

      @@tesfabpel tbh, I think it's also possible to convert it to square root form using the distance between the furthest points on each axis, not gonna try to explain with my phone keyboard though

    • @the-mush
      @the-mush 2 года назад +1

      @@tesfabpel wouldn't it be even more efficient to just use manhattan distances?

  • @kaiweiyeo4279
    @kaiweiyeo4279 2 года назад +2

    I love Computerphile. Thanks for educating the public! Have you considered a series on a confusing thing called Design Patterns by the Gang of Four?

  • @dodogaz
    @dodogaz 2 года назад +9

    Really interesting algorithm.
    building the tree doesn't seem so fast: how do you choose the next point for the split? If it has to be in the middle, you still need to compute the position of every other point, right?
    Also, I guess the position of the starting point impacts the performance, i mean it will be faster to inspect closest points than the ones farther away.

    • @gothxx
      @gothxx 2 года назад

      In the middle of the section. So half each time. Also insert might be slower, but search is fast.

    • @ILMTitan
      @ILMTitan 2 года назад +5

      Building the tree shouldn't be too slow. You need two lists, sorted on X and Y. Then, when you split in one dimension, you just build two new lists, each maintaining the order from the other dimension. That would be O(n log(n) + n log(n) + n log(n)) = O(n log(n)). Mutation however, would be slow, as one extra point could easily change the entire tree.

    • @oscarsmith3942
      @oscarsmith3942 2 года назад +1

      @@ILMTitan That is a much less stupid n log(n) algorithm than the one I came up with. The other algorithm you can use is to use Quick Select to find the median in O(n) time, which gives a runtime of T(n) = 2T(n/2) + O(n) = O(n log(n)). I guess mine has the advantage of using O(1) space, but it still will be at least 10x slower in practice...

  • @edmunns8825
    @edmunns8825 2 года назад +1

    As an embedded engineer I like decision trees, It's all we can run on an MSP430. But you can do a lot with them if you know how to train them. Thanks WEKA!

  • @todayonthebench
    @todayonthebench 2 года назад +3

    The downside with the K-d trees is that the data needs to be prepared by making comparisons before it can be used to find the closest point to another point. Depending on how long it takes to prepare the data, it might actually be slower than other methods.
    Though, most methods also needs to prepare their data as well, making it rather an argument about how the data is prepared and how long that takes. (Though, for really small datasets the brute force approach can be statistically the fastest.)
    Another method of finding the closest point to another point or in a given direction of movement (as in ray tracing) is to group the points based on location. Quantizing it into smaller lists. Then we only need to check the points in the neighboring lists, or the lists relative to our direction. And quantizing data into lists is a relatively trivial task since it does no comparisons nor calculates any medians. This however has many downsides of its own, so it too won't be ideal in a lot of situations. (And one can recursively quantize the lists into new lists, how many items a list contains is though a debatable topic.)

  • @davidm2.johnston684
    @davidm2.johnston684 2 года назад +3

    Very interested in computer graphics and the bounding volume hierarchy over here!

  • @irfanjames6551
    @irfanjames6551 2 года назад +2

    I would really like you to do a video on "Bounding Volume Hierarchy".
    I think they implemented it for Unreal 5 (nanite).

  • @eliotcougar
    @eliotcougar 2 года назад +3

    Many people know how hard it is to find G...

  • @RonJohn63
    @RonJohn63 2 года назад

    The splitting reminds me of what's done in Quicksort, and the comparisons are tree pruning.

  • @nerdinator2807
    @nerdinator2807 2 года назад +2

    DR MIIIIKE POOOUND

  • @normalPlayerSeverEarth
    @normalPlayerSeverEarth 3 месяца назад

    Thank you so much for explaining this instructively.

  • @radoslawlanga1809
    @radoslawlanga1809 2 года назад

    This is not how RUclips works, one second you are watching kittens playing and next thing you see are K-d trees videos.;)

  • @CoolFire666
    @CoolFire666 2 года назад +1

    How do you pick the point you're going to use to split on? And what happens if you're splitting in the X direction and there's another point with the same X coordinate?

  • @gnagyusa
    @gnagyusa 2 года назад +3

    Mike's incorrect about triangles. In fact, many ray-tracers use K-d trees to store triangles, including Equinox3D. They are a bit slower to construct than a BVH, but faster to traverse.

  • @jameshamann465
    @jameshamann465 2 года назад

    I listen to these while doing other activities. Mostly just because I like Mike's calming British accent

  • @rylaczero3740
    @rylaczero3740 2 года назад +4

    Wikipedia:
    In particular, when the number of points is only slightly higher than the number of dimensions, the algorithm is only slightly better than a linear search of all of the points. As a general rule, if the dimensionality is k, the number of points in the data, n, should be n ≫ 2^k.

  • @yiwenju2735
    @yiwenju2735 Год назад

    Detecting the closest point from the video isn't robust, but the rest of the algorithm is correct. People refer to the wiki page of kdtree for finding nearest neighbors.

  • @finkelmann
    @finkelmann 2 года назад +5

    Nicely explained as always!
    However... I think you made a small mistake. Around 10:30, youre saying the distance from X to B IS already larger than G-B. But, If B were to move much to the left, but keep the same "vertical" distance from X, there could still be a point on that orange line which is quite close to X. In other words, not all points inside B are guaranteed to be farther from X than B is...
    I may be wrong here, but i think points above the B line should still be verified.

    • @Cookie-qu1gs
      @Cookie-qu1gs 2 года назад +5

      Although he seems to say the distance from X to B, he's actually talking about the distance from X to the line through B, which is an important distinction. Any point above that line must be *at minimum* as far away from X as the line is, and since the distance from X to the line is greater than the distance from G to X, we know that no points above that line could possibly be the closest point. We can find the distance from X to the line through B by simply comparing the Y values of X and B, which is a lot better than having to calculate the distance using both X and Y co-ordinates. So not only does checking the distance to the line rather than the point ensure you know when you can and cannot avoid checking points beyond the line, it is also faster to check than checking the distance to the point.
      Hope this cleared up your confusion! :D

    • @finkelmann
      @finkelmann 2 года назад +1

      @@Cookie-qu1gs exactly, you pretty much clarified what I pointed out. The distance to the line is what i referred to as the minimal "vertical distance". So yeah, in that case the area inside/above B really is all cleared.

  • @bramvandyck1273
    @bramvandyck1273 6 месяцев назад

    So that was clear! Can you explain the Ball-Tree algorithm too?

  • @aleksandar5323
    @aleksandar5323 2 года назад

    Hmm, what do we do, if the node on the right of E is not a leaf node? Do we go further down it and how does the algorithm go?

  • @martins4555
    @martins4555 10 месяцев назад

    Really great video and very well described. Tho the outro is a bit weird with two sounds running at the same time suddenly

  • @AndrewHelgeCox
    @AndrewHelgeCox 2 года назад

    BVHs have become the basis for ray tracing hardware but at the turn of the century, kd-trees were the state of the art in interactive ray tracing. Later, an interest in build and refit times for dynamic meshes led researchers to find ways to make BVHs competitive for the static mesh case.

    • @AndrewHelgeCox
      @AndrewHelgeCox 2 года назад

      Look for papers from Saarland by Ingo Wald, Carsten Benthin and others from twenty years ago to get the history of kd-trees and BVHs in fast ray tracing.

  • @Theraot
    @Theraot 2 года назад +2

    Two plus two is four minus one that's three, quick mafs. 11:33

  • @lerichroi4129
    @lerichroi4129 Год назад

    A teacher like no other. My university teacher only made this simple concept extra hard to understand.

  • @TheEdwinpako
    @TheEdwinpako 2 года назад

    So in nth dimension u would split with (n-1)th dimensional planes? For example in 3d with 2d planes

  • @Josh-tu9ji
    @Josh-tu9ji Год назад

    Sure, it's O(logn) to search through a k-d tree, but what's the time complexity of constructing such a tree. It seems like it would still be close to O(n) if you have to visit a lot of nodes to accurately construct a k-d tree.

  • @sudiptoborun
    @sudiptoborun 2 года назад

    I'm studying Graph Theory and I find this fascinating.

  • @DAG_42
    @DAG_42 2 года назад

    In the case of a vast quantity of points, what should you do if you don't expand the tree all the way down? Or do you always add all points to the tree?

  • @Voltra_
    @Voltra_ 2 года назад

    That's a neat data structure

  • @cannonball7
    @cannonball7 2 года назад +1

    Hell yeah, K-d Trees in CRISP 360p

    • @seanski44
      @seanski44 2 года назад +1

      Sometimes RUclips ain't that quick at processing :)

  • @Raptomic
    @Raptomic 2 года назад +2

    If the point in the example (in the space defined by the lines of E, G and B) was closer the the border and the line of B, shouldn't we be checking for other closer points above B ? In such a case, the line to B wouldn't be the shortest to the space above B (kind of like the situation explained for the space right of E so we would have to check for points there I believe.
    Nice video as always

    • @Refract3d
      @Refract3d 2 года назад +6

      I think (and don't quote me on this), he's comparing the distance to be in only 1 axis, therefore knowing that the minimum distance was already longer than the distance to G, therefore eliminating anything in that quadrant. I guess in that scenario though, if that distance was actually smaller, you'd have to start checking the points distance themselves...It's unclear how it would be usually calculated.

    • @ILMTitan
      @ILMTitan 2 года назад

      @@Refract3d Right. If (B.y - X.y) < Distance(X, G), you basically repeat the algorithm on the other side of B, finding a best guess first leaf, seeing if it is closer, and working your way back up the tree.

  • @baehyunsol3349
    @baehyunsol3349 2 года назад +1

    You had 15 points on the plane, which is 2^4 - 1. Once you chose A and divide others into 2, each side had 7 points, which is 2^3 - 1. 2^2 - 1 for the next iteration, and 2^1 - 1 for the final. Was it intentional to begin with 2^n - 1 points?
    Anyway, thanks for a great video!

    • @G12GilbertProduction
      @G12GilbertProduction 2 года назад

      Probably he draw a 14 points, not 15 at all. So, 14 ^ 2 - 1 it could be final output before (O(x)) goes pretend to reverse this iteration.

  • @stewdean
    @stewdean 2 года назад

    So it'll work in six dimensions. And what happens if you want to remove one of the dimensions dynamically - so everything on dimensions 2 and 5 doesn't matter?

  • @elietheprof5678
    @elietheprof5678 2 года назад +2

    What do you think of bubble-based space partitioning? Spheres which contain spheres which contain objects, for example. It seems like the easiest space-partition for beginner programmers to implement... but it must come with some disadvantages, right?

  • @TruePathLiving
    @TruePathLiving 2 года назад

    Hi, I noticed that when I try looking for coordinates nearest to the center, it won't give me the ones that I expected. I willl plot my coordinates on a scatterplot, I see that there are coordinates literally on (0.0) but when I try finding the 10 nearest points to the center it will give me points much farther than the ones I see exist on my scatterplot.. Can you explain why that is? Evem if the computer iterates through all the coordinates and finds the child coordinate how is it that it gives me different results than I am expecting, can someone explain?

  • @CattyKit_chill
    @CattyKit_chill 2 года назад

    I am currently doing something which requires me to calculate the distance to coastlines for every single pixel of water to a set of land (which includes the coastlines). This algorithm is something I've used and finding Mike talking about it really made me happy :)
    Any suggestion on what algorithm I could try instead?
    Detail of what I've done with KD-Tree: Given a reasonable map (2d array) of water and land, from which a set of land and water locations,
    1. Build a tree using the entire land locations
    2. Make a query for each water location and get the shortest distance

    • @finnaginfrost6297
      @finnaginfrost6297 2 года назад +1

      If you want each pixel of water to have a single 'distance to any land' number, you could write a flood-fill / A-Star compute shader to compute all your distances super quickly, and run it on a GPU each time you want to update it.
      Said differently, compute the Signed Distance Field (look it up on wikipedia) of your land/water 2d array.
      What are your 'land and water locations'? Are they special pixels on your array, are they vertices of your coastline?

    • @CattyKit_chill
      @CattyKit_chill 2 года назад

      @@finnaginfrost6297 Thank you for your suggestions, will look further into these.
      The land and water locations are two mutually exclusive and collectively exhaustive groups of pixels in my 2D array (snapshot of a map).

  • @nicolas.predella
    @nicolas.predella 4 месяца назад

    How could i go around removing elements from this kind of tree without rebuilding it?

  • @bejamartins
    @bejamartins 2 года назад

    I remember implementing a k-d structure for a geocahing project in university. Got a 19 in that project :D due to suberb preformance.

  • @MrGoatflakes
    @MrGoatflakes 2 года назад

    Oh sheeet Dr. Mike explaining K-D trees 😄

  • @Shannxy
    @Shannxy Год назад

    7:53 Can someone explain this little comment for me?
    Is the origin of top left for people doing computer graphics?

  • @richbuilds_com
    @richbuilds_com 2 года назад +3

    Can only view in 360p. Is this retro episode? ;-)

    • @DantalionNl
      @DantalionNl 2 года назад

      This is we don't wait for youtube to process the video before publishing and just hit publish as soon as the upload of the source file is finished~

    • @seanski44
      @seanski44 2 года назад +1

      @@DantalionNl or schedule it assuming RUclips have their act together to process something in a timely fashion...

  • @Jkauppa
    @Jkauppa 2 года назад

    gaussian mixture with noise explanation is the best, ie, probability distribution function estimation

    • @Jkauppa
      @Jkauppa 2 года назад

      or extending to polynomial function (+ normal noise) fitting, polynomial function mixture model fitting, automatic amount of components based on normal/any noise explanation, any numbers of dimensions

    • @Jkauppa
      @Jkauppa 2 года назад

      then you forgot what you tried to solve, initially, and got lost

    • @Jkauppa
      @Jkauppa 2 года назад

      closest points by x-sort, y-sort, etc, ie, dimensional sort bins

    • @Jkauppa
      @Jkauppa 2 года назад

      you get the closest, nearest, points very fast

    • @Jkauppa
      @Jkauppa 2 года назад

      I assume O(n log n)

  • @angelorf
    @angelorf 2 года назад

    Is the next video gonna be on the bounding volume hierarchy?

  • @usama57926
    @usama57926 2 года назад

    Nice explanation

  • @MattRiddell
    @MattRiddell 2 года назад

    Locality sensitive hashing next!

  • @defeatSpace
    @defeatSpace 2 года назад

    The computer processor speed for lidar scanners makes me think about how for the brain, processes emerge and dissolve in parallel in different parts of the brain at different frequency bands like theta (5-8 Hz), alpha (9-12 Hz), beta (14-28 Hz) and gamma (40-80 Hz), where as it might be slower you still have millions of these processors that generate to complete tasks with additional asynchronous verification for accuracy. Not taking into account the conditioning that occurs for natural neural-networks, animal brains are so much better at unconscious data processing despite using slower processors because they have a ton of evolutionary experience that enables practically infinite capability to flexibly allocate further resources for multiprocessing, it all seems to really boil down to how efficiently programmers can diversify tasks and the efficiency of multithreading capabilities for a processor, while still having limitations because processors cannot really support multiprocessing like brains do.

  • @adamkelemen4607
    @adamkelemen4607 2 года назад

    Is there a bound for k in a sense that for some k the method gets inefficient? I read somewhere that from k>6 it gets slow but wouldnt know..

    • @deanjohnson8233
      @deanjohnson8233 2 года назад

      Assuming well distributed points, each level of the tree reduces your problem space by a factor of 2 so I don’t think the algorithm would become inefficient for high values of k. It certainly gets slower as k increases, but I don’t think it would grow much worse than any other algorithm.

  • @nyuh
    @nyuh 2 года назад

    oooohhh, I've actually been coding stuff that uses k-d trees. what a nice coincidence!

  • @solsticeprojekt1937
    @solsticeprojekt1937 2 года назад +1

    Did he mention how to pick the point to split at? Because he's just picking optimals. If I have to search for the ones in the middles it's not actually worth it for dynamically moving items? Edit: Right. He picks the median point, he said.

    • @albertbatfinder5240
      @albertbatfinder5240 2 года назад +1

      Median implies the points undergo a sort every time the point array is presented for evaluation. If you’ve already sorted them by x-axis, then y-axis, any binary search is going to hone in on a set of nearest point candidates pretty quickly. Maybe that’s all he’s doing with the entire explanation. I may have implemented this myself, without calling it a k-d tree or even honouring it with a name at all.

    • @solsticeprojekt1937
      @solsticeprojekt1937 2 года назад +1

      @@albertbatfinder5240 Thanks for the reply!

    • @deanjohnson8233
      @deanjohnson8233 2 года назад

      You could also just pick a random splitting point. Some splits will not be very helpful, but on average it can work well. What strategy you use will depend on how often you have to rebuild the tree compared to the number of queries against the tree.

    • @solsticeprojekt1937
      @solsticeprojekt1937 2 года назад

      @@deanjohnson8233 Thanks! I think that should have been mentioned. I gotta try that some day. I think this can get really fast using cmovcc. I'm sure I'm not inventing anything new here. I knew of kd-trees, but never actually how to implement them. :D

  • @z4br4k98
    @z4br4k98 2 года назад

    just in time for my computer graphics exam!

  • @PMA65537
    @PMA65537 2 года назад

    8:55 Is there a difference if the closest is not a leaf node?

    • @KieronTaylor
      @KieronTaylor 2 года назад

      That's what the backtrack up the tree tells you. You don't know all the possibilities until you've visited a leaf, but once you have, any of the nodes on the branch is a viable solution. The most wasteful case is a point near your top level node, but at least it's the same number of moves for all solutions - preferable in all but "lucky" situations.

  • @arkhoseu348
    @arkhoseu348 2 года назад

    nice explanation

  • @sweepingtime
    @sweepingtime 2 года назад

    Even making an error where G is the child of D is illuminating, because it made me think: why isn't this the child of E? Well, it got me to follow along.

  • @Veptis
    @Veptis 2 года назад

    How frequently do people come up with these smart algorithms and structures to reduce time, memory complexity?

  • @cauchyschwarz3295
    @cauchyschwarz3295 2 года назад

    What has not been adressed is how you find the point that is in the middle when you decide the next split-point. Don't I have a similar problem as in the beginning of having to sort the points along that axis and check them individually?

    • @jtsiomb
      @jtsiomb 2 года назад +1

      There are kd-tree construction strategies. You can pick a point at random, you can pick 3 points at random and pick the middle of the 3, or you can go through the whole lot of them. But also in some cases you might not even want to split down the middle. A common construction strategy is to optimize splits to minimize the surface area of each subtree bounding box (surface area heuristic). It all depends on what you're going to use your kd-tree for, and how much time you have to spend on constructing it.

  • @jacklee5876
    @jacklee5876 2 года назад

    If the root represents the starting point of your search, does it mean that one must create a K-d tree for every point so each point knows how to traverse on their own tree?

    • @tscoffey1
      @tscoffey1 2 года назад

      It's like a binary search. A properly chosen root node is optimal for most searches of the tree.

  • @rickhackro
    @rickhackro 2 года назад

    Absolutely amazing.

  • @victornoagbodji
    @victornoagbodji 2 года назад

    This is an amazing explanation! Thanks 😊 🙏

  • @kebman
    @kebman 2 года назад

    I like the woolly sweater! I have one on just like that right now! Very tasteful!

  • @ForSquirel
    @ForSquirel 2 года назад

    I just wanna know, If I buy that sweater, jumper, whatever you call it, will I automatically get a +1 to intelligence? Cuz I'm down for that.

  • @jeremystott8188
    @jeremystott8188 2 года назад

    Great video, very interesting!

  • @theprofessionalfence-sitter
    @theprofessionalfence-sitter 2 года назад

    So, is this method just faster on average or even in the worst case?

    • @deanjohnson8233
      @deanjohnson8233 2 года назад +1

      Depending on how you pick the splitting points, it is theoretically possible that the tree just becomes a linked list where you could have to check every single point.
      In the video he mentioned using the median point as the split point. This would never lead to the linked list tree (and thus better worst case performance) but requires you to be able to sort the points along each dimension. Other strategies could remove the need to sort but could theoretically lead to very bad worst case performance. For example, you could pick a random splitting point - but if every random choice just happens to make the tree a linked list….

  • @U014B
    @U014B 2 года назад

    What is that picture on the whiteboard in the top-right? There's an arrow pointing from one guy's tie to another guy's badge? Is this supposed to be some kind of avant-garde office humor?

  • @runrickyrun157
    @runrickyrun157 2 года назад

    Man, that thumbnail. I really thought Gave from The Office was going to tell me about math.

  • @norliegh
    @norliegh Год назад

    reminds me so much of geohashing just slightly different.

  • @AldenLaslett
    @AldenLaslett 2 года назад +1

    Can you do R-Trees next?

  • @MGtvMusic
    @MGtvMusic Год назад

    Amazing!

  • @JamesNorthrup
    @JamesNorthrup 2 года назад

    why the tractor feed paper?

  • @SSJ0016
    @SSJ0016 2 года назад

    Could you implement a circle of a growing radius around your point of interest? The first point to contact the circumference would be the closest point.

    • @fugoair
      @fugoair 2 года назад

      Try implementing your idea. One of the issues you might discover is that the detection of that 'contact' is slower then all the complicated tree stuff in kd tree. As more points are added the time complexity gets worse while the kd tree gets better.

    • @deanjohnson8233
      @deanjohnson8233 2 года назад

      How would you determine which pints to check against? Without some spatial ordering, you can’t check in an efficient order. But if you have spatial ordering, you probably don’t need any explicit uses of a growing circle.

  • @ChrisWalshZX
    @ChrisWalshZX 2 года назад +3

    I thought Mike was just going to go into binary trees and I knew this wouldn't be enough to solve the problem. The splitting on a specific point's X/Y/Z axis is really interesting.
    As a side note, and I'm sure MIke will agree but it kind of off-topic but the program would only actually compare and store the SQUARE of the distances as this saves performing an expensive SQRT function for each distance measured but still allows you to compare which value is smaller between two different distances.

    • @APaleDot
      @APaleDot 2 года назад +5

      Using a k-d tree, you don't use the square because you are mostly comparing along a single axis.

    • @the-mush
      @the-mush 2 года назад +1

      @@APaleDot also, even if you where using more than one axis, wouldn't it be even faster if you used a "Manhattan" distance? Don't see the point of squaring the numbers if you don't need the real "euclidean" distance.

    • @APaleDot
      @APaleDot 2 года назад

      @@the-mush
      Maybe I'm misunderstanding what you mean, but the goal of the algorithm is to minimize the euclidean distance. To find the closest point.
      The k-d tree just allows you to rule out certain points because you know they are further away along one axis than the current smallest euclidean distance.

    • @deanjohnson8233
      @deanjohnson8233 2 года назад +1

      @@the-mush if you are trying to find a “close enough” point, then you can use the Manhattan distance. If you want to have the point that is truly the closest in the usual sense, you have to use Euclidean.

    • @the-mush
      @the-mush 2 года назад

      @APaleDot yeah! I wasn't talking about the k-d tree, just was thinking about the differences between squared distances and manhattan. Talking about it in other thread, I now realize that squaring the numbers first allows you to avoid a problem you get if you oversimplify the algorithm and only sum them up (as in manhattan distances).
      However, as you said, if you only consider one axis at a time, none of that matters since you only work with the deltas one-dimensionally. I think that detail of essentially working in straight lines along the axes made me remember the manhattan distances thing, but the relation is way thinner than I first thought.

  • @slam_slam_
    @slam_slam_ 2 года назад

    Awesome video!!

  • @CryingMG
    @CryingMG 2 года назад

    This guy is the best!

  • @mouazq
    @mouazq 2 года назад

    The search time was reduced at the expense of great spacial complexity (Storing all nodes in a tree structure in memory)

  • @_mvr_
    @_mvr_ 2 года назад

    Do a video on octrees!

  • @accountname1047
    @accountname1047 2 года назад

    Is this not just a separating hyperplane search?

  • @phyzix_phyzix
    @phyzix_phyzix 2 года назад

    I'm confused. If B was all the way to the left (x = 0) then a point above B could potentially be at a shorter distance than B.

    • @Tumbolisu
      @Tumbolisu 2 года назад

      Yes, but X is so far away from the line that goes through B, that nothing above that line could possibly be closer than G.
      He made the unfortunate mistake of placing B right above X. So whenever he was trying to point at the line, he also had to point at B itself.

  • @zr0724
    @zr0724 16 дней назад

    thank you

  • @JBantha
    @JBantha 2 года назад +1

    as a human this seems overly convoluted to compare a point but I'm happy for the computers to have a way to do it.

    • @tissuepaper9962
      @tissuepaper9962 2 года назад +1

      No fair, you've got a hardware accelerator for calculating euclidean distance.

  • @ac_nero
    @ac_nero 2 года назад

    neat in fact!

  • @372leonard
    @372leonard 2 года назад

    do a video on bounding volume hierachies please

  • @esquerte
    @esquerte 2 года назад +3

    Now we finally know how to find G point.

    • @miallo
      @miallo 2 года назад

      On that note: 11:42 "I need to not count my Exes"

  • @MysticPing
    @MysticPing 2 года назад +1

    If only you'd done this video two weeks ago and I would've done better on my last exam haha. Could not for the life of me understand them.

  • @thenerdyouknowabout
    @thenerdyouknowabout 2 года назад

    "I need to not count my x's [exs]" - Words to live by... Words to live by...