intel/fs: only avoid SIMD32 if strictly inferior in throughput
This enables SIMD32 in blorp shaders and seems to give a small FPS bump when using a DG2 GPU as the secondary GPU (which requires copies to linear buffers to exchange data with the main GPU).
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>