Workaround for Duplicated Shader Compilation
This is a simple workaround for #180 (closed) that avoids a much larger change to the way TGSI is parsed and converted to GLSL.
Unnecessary program linking is eliminated by "selecting" shader variants twice per draw, which resolves any remaining circular dependencies between pipeline stages.
Unnecessary stage-specific shader compilation is eliminated by registering and compiling shaders in the host driver lazily at draw time, after those circular dependencies have already been resolved.
Performing shader variant selection twice per draw is cheap, as the currently selected variant is key-matched first and should short-circuit the variant list iteration in most cases.
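The short-circuit described above can be sketched roughly as follows. This is a minimal illustration, not the code from this change: the struct layouts, the key fields, and the names `select_variant` and `list_walks` are all hypothetical, and `list_walks` exists only to make the fast path observable.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical variant key: state from neighbouring pipeline stages that
 * affects the generated GLSL. The field names are illustrative only. */
struct variant_key {
    unsigned prev_stage_outputs;
    unsigned next_stage_inputs;
};

struct variant {
    struct variant_key key;
    struct variant *next;
};

struct shader_selector {
    struct variant *current;   /* variant chosen by the previous selection */
    struct variant *variants;  /* every variant created so far */
    unsigned list_walks;       /* instrumentation for this example only */
};

static bool key_equal(const struct variant_key *a, const struct variant_key *b)
{
    return a->prev_stage_outputs == b->prev_stage_outputs &&
           a->next_stage_inputs == b->next_stage_inputs;
}

/* Returns the variant matching `key`, or NULL if one still has to be created. */
static struct variant *select_variant(struct shader_selector *sel,
                                      const struct variant_key *key)
{
    /* Fast path: the currently selected variant is key-matched first. */
    if (sel->current && key_equal(&sel->current->key, key))
        return sel->current;

    /* Slow path: walk the variant list and remember the match. */
    sel->list_walks++;
    for (struct variant *v = sel->variants; v; v = v->next) {
        if (key_equal(&v->key, key)) {
            sel->current = v;
            return v;
        }
    }
    return NULL;
}
```

In this sketch, a second selection with an unchanged key returns from the fast path without touching the list, which is why running selection twice per draw adds little cost.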
Switching to lazy compilation restricts our ability to report host shader compilation errors synchronously at the point where the guest emits a shader for compilation, but these errors are still guaranteed to be reported at first use. We retain the ability to detect and report errors during TGSI-to-GLSL conversion, which still occurs when shaders are first emitted by the guest application.
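The lazy compilation and error-reporting behaviour can be sketched as below. This is an illustration under stated assumptions rather than the actual implementation: `lazy_shader`, `ensure_compiled`, and `fake_host_compile` are hypothetical names, and the fake compiler merely stands in for the host driver (it fails on an empty source string).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical shader record. The GLSL text is assumed to have been
 * produced eagerly, when the guest emitted the shader. */
struct lazy_shader {
    const char *glsl;
    unsigned host_id;      /* 0 until compiled in the host driver */
    bool compile_failed;   /* remembers a failure so it is not retried */
};

typedef unsigned (*host_compile_fn)(const char *glsl);

/* Stand-in for the host driver: counts calls and rejects empty source. */
static unsigned compile_calls;

static unsigned fake_host_compile(const char *glsl)
{
    compile_calls++;
    return glsl[0] != '\0' ? compile_calls : 0;
}

/* Called at draw time, after variant selection has settled.
 * Compiles on first use only; returns false if the host rejected the shader. */
static bool ensure_compiled(struct lazy_shader *s, host_compile_fn compile)
{
    if (s->host_id != 0)
        return true;               /* already compiled on an earlier draw */
    if (s->compile_failed)
        return false;              /* error was already reported at first use */

    s->host_id = compile(s->glsl);
    if (s->host_id == 0) {
        s->compile_failed = true;  /* first use is where the error surfaces */
        return false;
    }
    return true;
}
```

With this shape, a shader that is emitted but never drawn is never handed to the host compiler, and a host compile failure surfaces on the first draw that uses the shader.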