radeonsi: fix NGG culling on gfx10.3, enable by default, and optimize the culling shader

  • NGG culling is fixed for gfx10.3
  • NGG culling is enabled by default on gfx10.3 dGPUs
  • the culling shader is optimized, leading to slightly better culling performance in some cases
