Changes

Bas Nieuwenhuizen · a1a38ade
--- a/Raytracing.md
+++ b/Raytracing.md
@@ -104,6 +104,8 @@ Couple of solutions:
 A good solution might be a combination of 1/2 + 4, keeping a small local part of the stack for frequent operations and then regularly pushing/popping big chunks into/from VMEM. Will need some work to avoid significant divergence.
+Stackless traversal is possible, but (a) we might not be able to fit a parent pointer in the fp16 box nodes without significant overhead, (b) it would need 1 load per level (instead of 1 load + 1 store per M (M=4/8?) levels in the combined solution above), and (c) it needs a bunch of logic to figure out the next child, which would probably involve doing the intersection test again, which leads us to ...
 ### Should we retest parent box nodes?
 The current algorithm can have effectively 3 nodes per BVH level on the stack which is quite inefficient and will blow up our stack size.