Commit 24b41527 authored by Jordan Justen

intel/vulkan: Hard code CS scratch_ids_per_subslice for Cherryview



Ken suggested that we might be underallocating scratch space on HD
400. Allocating scratch space as though there were actually 8 EUs
per subslice seems to help with a GPU hang seen on synmark CSDof.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
parent 06e3bd02
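
To make the underallocation concern in the commit message concrete, here is a minimal sketch of the arithmetic. It assumes that compute scratch is sized as subslices × EUs per subslice × threads per EU × per-thread size; the helper name `example_cs_scratch_bytes` and the sample values are hypothetical and do not come from Mesa.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not a Mesa function): bytes of compute scratch
 * needed if every hardware thread gets its own per_thread_scratch slot.
 */
static uint64_t
example_cs_scratch_bytes(unsigned subslices, unsigned eus_per_subslice,
                         unsigned threads_per_eu, uint32_t per_thread_scratch)
{
   assert(subslices > 0);
   return (uint64_t)subslices * eus_per_subslice * threads_per_eu *
          per_thread_scratch;
}
```

With illustrative values of 2 subslices and 1024 bytes per thread, sizing for 6 EUs gives `example_cs_scratch_bytes(2, 6, 7, 1024)` = 86016 bytes, while sizing for 8 EUs gives 114688 bytes. If the hardware computes thread IDs as though 8 EUs were present, the smaller buffer could be indexed past its end.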
@@ -1097,24 +1097,35 @@ anv_scratch_pool_alloc(struct anv_device *device, struct anv_scratch_pool *pool,
       &device->instance->physicalDevice;
    const struct gen_device_info *devinfo = &physical_device->info;
-   /* WaCSScratchSize:hsw
-    *
-    * Haswell's scratch space address calculation appears to be sparse
-    * rather than tightly packed. The Thread ID has bits indicating which
-    * subslice, EU within a subslice, and thread within an EU it is.
-    * There's a maximum of two slices and two subslices, so these can be
-    * stored with a single bit. Even though there are only 10 EUs per
-    * subslice, this is stored in 4 bits, so there's an effective maximum
-    * value of 16 EUs. Similarly, although there are only 7 threads per EU,
-    * this is stored in a 3 bit number, giving an effective maximum value
-    * of 8 threads per EU.
-    *
-    * This means that we need to use 16 * 8 instead of 10 * 7 for the
-    * number of threads per subslice.
-    */
-   const unsigned subslices = MAX2(physical_device->subslice_total, 1);
-   const unsigned scratch_ids_per_subslice =
-      device->info.is_haswell ? 16 * 8 : devinfo->max_cs_threads;
+   const unsigned subslices = MAX2(physical_device->subslice_total, 1);
+   unsigned scratch_ids_per_subslice;
+   if (devinfo->is_haswell) {
+      /* WaCSScratchSize:hsw
+       *
+       * Haswell's scratch space address calculation appears to be sparse
+       * rather than tightly packed. The Thread ID has bits indicating
+       * which subslice, EU within a subslice, and thread within an EU it
+       * is. There's a maximum of two slices and two subslices, so these
+       * can be stored with a single bit. Even though there are only 10 EUs
+       * per subslice, this is stored in 4 bits, so there's an effective
+       * maximum value of 16 EUs. Similarly, although there are only 7
+       * threads per EU, this is stored in a 3 bit number, giving an
+       * effective maximum value of 8 threads per EU.
+       *
+       * This means that we need to use 16 * 8 instead of 10 * 7 for the
+       * number of threads per subslice.
+       */
+      scratch_ids_per_subslice = 16 * 8;
+   } else if (devinfo->is_cherryview) {
+      /* Cherryview devices have either 6 or 8 EUs per subslice, and each EU
+       * has 7 threads. The 6 EU devices appear to calculate thread IDs as if
+       * it had 8 EUs.
+       */
+      scratch_ids_per_subslice = 8 * 7;
+   } else {
+      scratch_ids_per_subslice = devinfo->max_cs_threads;
+   }
    uint32_t max_threads[] = {
       [MESA_SHADER_VERTEX] = devinfo->max_vs_threads,
......
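
The branch structure in the hunk, and the sparse ID packing described in the WaCSScratchSize comment, can be sketched in isolation as follows. This is only an illustration: `struct example_devinfo` is a hypothetical stand-in for the few `gen_device_info` fields the hunk reads, and the bit order used in `example_hsw_scratch_id` (subslice above a 4-bit EU field above a 3-bit thread field) is an assumption, since the comment gives the field widths but not their order.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the gen_device_info fields used in the hunk. */
struct example_devinfo {
   bool is_haswell;
   bool is_cherryview;
   unsigned max_cs_threads;
};

/* Mirrors the if / else-if / else chain added by the commit. */
static unsigned
example_scratch_ids_per_subslice(const struct example_devinfo *devinfo)
{
   if (devinfo->is_haswell)
      return 16 * 8;   /* 4-bit EU field times 3-bit thread field */
   else if (devinfo->is_cherryview)
      return 8 * 7;    /* IDs computed as if 8 EUs, 7 threads each */
   else
      return devinfo->max_cs_threads;
}

/* Assumed layout for illustration only: [subslice:1 | EU:4 | thread:3].
 * Inputs are limited to what the comment says real Haswell has:
 * 2 subslices, 10 EUs per subslice, 7 threads per EU.
 */
static unsigned
example_hsw_scratch_id(unsigned subslice, unsigned eu, unsigned thread)
{
   assert(subslice < 2 && eu < 10 && thread < 7);
   return (subslice << 7) | (eu << 3) | thread;
}
```

Under this assumed layout, the last thread of the last EU in subslice 0 gets ID `(9 << 3) | 6` = 78, which already exceeds the 70 slots a tightly packed 10 * 7 allocation would provide, and the first thread of subslice 1 starts at ID 128. That matches the comment's conclusion that each subslice needs 16 * 8 slots.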