intel/fs: Emit HALT for discard on Gen4-5
Using HALT to immediately jump to the end of the shader is required to implement GL_EXT_gpu_shader4 and OpenGL 3.0. However, vanilla OpenGL 1.2 doesn't forbid it and it likely makes something somewhere faster. We should be consistent and implement the same discard behavior on all hardware if we can.
The rules for HALT on Gen4-5 are a bit different from Gen6+. On the older hardware, there is no stack for HALT; instead it's up to software to save and restore mask registers. However, there's no real saving needed since we only use HALT to jump to the end of the program where we're about to do our FB writes. All we need to do is reset AMask to DMask, the value it was initialized to at the start of the thread.
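
For illustration only, the mask handling described above can be modeled roughly as follows. This is a toy C++ simulation, not the generator code from this change: ThreadMasks, start_thread, discard_and_halt and halt_target are hypothetical names, the 16-bit mask width is an assumption, and the sketch assumes the set of discarded channels is recorded somewhere other than AMask so that it survives the reset.

```cpp
#include <cassert>
#include <cstdint>

// Toy model of one thread's channel masks. All names are hypothetical.
struct ThreadMasks {
    uint16_t dmask;   // dispatch mask: channels enabled at thread start
    uint16_t amask;   // active mask: channels currently executing
    uint16_t killed;  // channels that hit discard (assumed tracked separately)
};

// At thread start AMask is initialized to DMask.
ThreadMasks start_thread(uint16_t dispatch) {
    return ThreadMasks{dispatch, dispatch, 0};
}

// discard: record the killed channels, then HALT them by dropping
// them from AMask. Nothing is pushed anywhere; there is no HALT stack.
void discard_and_halt(ThreadMasks &t, uint16_t cond) {
    const uint16_t hit = static_cast<uint16_t>(t.amask & cond);
    t.killed = static_cast<uint16_t>(t.killed | hit);
    t.amask = static_cast<uint16_t>(t.amask & ~hit);
}

// HALT target at the end of the program, just before the FB writes.
// AMask only ever loses channels, so it is still a subset of DMask and
// restoring it needs no saved state: copy DMask back.
void halt_target(ThreadMasks &t) {
    assert((t.amask & ~t.dmask) == 0);
    t.amask = t.dmask;
}
```

In this model a thread dispatched with mask 0x00FF that discards channels 0x000F runs the rest of the shader with AMask 0x00F0, and halt_target brings AMask back to 0x00FF for the FB writes while killed still holds 0x000F.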