dzn: Use new SM6.8 intrinsics to optimize indirect drawing
SM6.8 finally adds start vertex/instance location sysvals. We also added ExecuteIndirect tier 1.1, which is technically independent but belongs to the same bag of features and provides a way to get a draw ID. With both, we no longer need to internally allocate and patch indirect arg buffers, which also means we no longer need to break up render passes for indirect drawing.
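For context, a minimal sketch of why these sysvals matter: Vulkan's vertex and instance indices include the draw's first vertex/instance, while D3D's SV_VertexID and SV_InstanceID do not, so the offsets have to reach the shader some other way. The struct and helper names below are hypothetical illustrations, not dzn code.

```c
#include <stdint.h>

/* Hypothetical container for the per-invocation values a shader can read.
 * Not a dzn or D3D12 type; it exists only for this illustration. */
struct draw_sysvals {
   uint32_t sv_vertex_id;            /* D3D SV_VertexID: excludes the start vertex */
   uint32_t sv_instance_id;          /* D3D SV_InstanceID: excludes the start instance */
   int32_t  start_vertex_location;   /* SM6.8 SV_StartVertexLocation (signed) */
   uint32_t start_instance_location; /* SM6.8 SV_StartInstanceLocation */
};

/* Vulkan's vertex index includes the draw's first vertex / vertex offset.
 * The addition wraps modulo 2^32, so a negative offset works as expected. */
static uint32_t
vk_vertex_index(const struct draw_sysvals *s)
{
   return s->sv_vertex_id + (uint32_t)s->start_vertex_location;
}

/* Vulkan's instance index includes the draw's first instance. */
static uint32_t
vk_instance_index(const struct draw_sysvals *s)
{
   return s->sv_instance_id + s->start_instance_location;
}
```

With the offsets readable directly in the shader, nothing has to copy them out of the indirect arg buffer ahead of the draw.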
This is a prereq for landing !23843.
This work uncovered a D3D bug where validation in ExecuteIndirect was too strict. The validation is fixed in the 1.613.2 Agility SDK, so this series also picks up that new package.