This implements the "tortoise and the hare" algorithm for detecting cycles in graphs. We use the caller's iterator as the hare and our own internal copy as the tortoise. Conveniently, VkBaseOutStructure (and VkBaseInStructure which is identical except the pointer type on pNext) have a pointer we can use for the tortoise and an sType which we can use for a counter to ensure we only increment the tortose every other loop iteration.
There are more efficient algorithms than tortoise and hare but they require allocating memory for something like a hash set of seen nodes. Since this for debug purposes only, it's ok for it to be a bit inefficient in the case where it hits the assert. In the usual case of no loops, it's the same runtime efficiency as the unchecked version except that it does a tiny bit of math and 50% more pointer chases.
Version 1 worked fine with clang and with GCC 12.1 with -O0 but not with
optimizations. See #6895 (closed).
Unfortunately, the first version required modifying a temporary declared
const inside the for loop and that seems to have been the problem. This
version instead has an iterator struct which is managed by an outer for
loop and the inner for loop exists to declare the user's requested
iteration variable and manage the actual iteration. Because the outer
for loop is effectively
for (bool done = false; !done; done = true),
it will execute exactly once, regardless of the inner loop, so break and
continue inside the inner loop should work the same as if it's a single
The other major difference with the new version is that the code is the same for debug and release except the half_iter and loop check are gone. I've verified by hand that this produces virtually identical code to the old simple iterators on both GCC andl clang with an optimized build.