On building fast kd-Trees for Ray Tracing, and on doing that in O(N log N)
Ingo Wald, Valstimil Havran
Kd-trees seem to be the most widely used technique for spatial subdivisioning. They are believed to be the best known method for fast ray tracing. The Surface Area Heuristic is the construction technique usually used, and how to building efficient ones and traversing them quickly is well understood. However, constructing them is not as well understood, which is what this paper hopes to analyze.
All kd-tree construction schemes follow the same recursive scene:
function RecBuild( triangles T, voxel V ) returns node
if Terminate(T,V) then:
return new leaf node( T )
p = FindPlane( T, V )
(V_L, V_R ) = Split V with p
T_L = all members of T on one side of V_L
T_R = all members of T on the other side of V_L
return new node( p, RecBuild(T_L, V_L), RecBuild(T_R, V_R) )
The difference between a good and a naively build kd-tree is often a factor of two or more. The naive method often used, the “spatial median” method, is to just alternate axis and the plane being split at the spatial median of the voxel.
The Surface Area Heuristic(SAH) tends to perform quite well. It makes several assumptions:
- Rays are uniformly distributed, infinite lines.
- The cost for both a traversal and a triangle intersection are known.
- The cost of intersecting N triangles is roughly NK.
The expected cost for a given plane p then is one traversal step, plus the cost of intersecting the two children. Usually, instead of minimizing the cost of traversing the entire tree, a greedy version of the heuristic is used which minimizes it at each level, due to efficiency reasons. We also need to define a termination criterion; when to stop the recursive build: SAH gives us an elegant one, which is if the cost of this node as a leaf is better than even the best split.
The SAH tends to work best, and only a few modifications are known to consistently yield better improvements. One of these is to favor splits that cut off empty space. Requiring the tree to be split a certain number of times can prevent the tree from being stuck in a local minnima, but this number is hard to determine for general scenes.
There are several ways to construct Kd-trees once you have decided upon the cost function. The mmost trivial way is to iterate all triangles, determine their possible split candidates, and compare the cost for each one. This will end up taking O(N^2) operations, where N is the number of triangles in the scene. This is often too slow.
There is also a widely-known algorithm for constructing SAH trees in O(N log^2 N). I’m not going to summarize the algorithm here: if you are interested, read the paper. It essentially ’sweeps’ a plane, calculating costs incrementally. This is a huge improvement, but as the theoretical lower bound is O(N log N) we wish to be better.
This paper introduces a O(N log N) method of building a kd-tree. What caused the previous algorith to be O(N log^2 N ) instead of O(N log N ) is having to sort once per partition. This algorithm improves on that by sorting the evet list only once. This involves having to events for all dimensions. Again, I’m not going to recap specifics of the algorithm here.
The two algorithms were then compared in actual time, and not just asymptotic behavior. Realistic models of varying size were used. The O(N log N) variant is concisely faster than the O(N log^2 N) variant by a factor of 2-3.5. Analysis shows that the O(N log^2 N) variant spends more time re-sorting than the other variant uses to build the entire tree.