nvc0: enable fp helper invocation memory loads on GPUs with firmware blobs

Karol Herbst requested to merge karolherbst/mesa:nvc0/helper_invocs into main

Reason: Nvidia hardware has a per-context switch to enable or disable memory loads in fragment program (fp) helper invocations. Those loads are disabled by default.

The bad news: this is an MMIO register that needs to be toggled.

The good news: Nvidia's firmware provides a way to toggle GR context-switched MMIO registers from userspace via MME macros.

On pre-Turing GPUs we'll just toggle it on the kernel side for every context. The main reason we do it via macros on Turing+ is that GSP will require us to do it this way anyway.

Shader test file to reproduce this problem:

[require]
GL >= 4.3
GLSL >= 4.30
GL_ARB_shader_atomic_counter_ops
GL_ARB_shader_ballot
GL_ARB_gpu_shader_int64

[vertex shader passthrough]

[fragment shader]
#extension GL_ARB_shader_atomic_counter_ops: require
#extension GL_ARB_gpu_shader_int64: require
#extension GL_ARB_shader_ballot : require

uniform int input_uniform;

layout(binding = 0) buffer bufblock {
	uint64_t input_ssbo;

	uint64_t test_uniform;
	uint64_t test_ssbo;
};

out vec4 color;

void main()
{
	if (gl_SubGroupInvocationARB != 0)
		discard;

	test_uniform     = ballotARB(input_uniform ==  5);
	test_ssbo        = ballotARB(input_ssbo    == 10);

	color = vec4(0.0, 1.0, 0.0, 1.0);
}

[test]
ssbo 0 32
ssbo 0 subdata uint64  0 10
ssbo 0 subdata uint64  8  0
ssbo 0 subdata uint64 16  0
ssbo 0 subdata uint64 24  0

uniform int input_uniform 5

clear color 0.5 0.5 0.5 0.5
clear

draw rect 0 0 1 1

# only passes if fp helper invocations can load from memory
probe ssbo uint64 0  8 == 15
probe ssbo uint64 0 16 == 15
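
To make the byte offsets in the `[test]` section easier to follow, here is a minimal C sketch of how the 32-byte buffer appears to map onto the shader's `bufblock`. It assumes the three `uint64_t` members are tightly packed, which is what the probe offsets 8 and 16 imply; the shader itself does not declare an explicit layout. The `unused` member and the `bufblock_initial` helper are hypothetical names introduced only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical CPU-side mirror of the shader's `bufblock`.
 * Assumption: members are tightly packed, as the probe offsets suggest. */
struct bufblock {
	uint64_t input_ssbo;   /* offset  0: written by "subdata uint64 0 10" */
	uint64_t test_uniform; /* offset  8: probed by "probe ssbo uint64 0 8" */
	uint64_t test_ssbo;    /* offset 16: probed by "probe ssbo uint64 0 16" */
	uint64_t unused;       /* offset 24: zeroed by the test, not declared in the shader */
};

/* Compile-time checks that the mirror matches the offsets used in the test. */
_Static_assert(offsetof(struct bufblock, input_ssbo) == 0, "input_ssbo at 0");
_Static_assert(offsetof(struct bufblock, test_uniform) == 8, "test_uniform at 8");
_Static_assert(offsetof(struct bufblock, test_ssbo) == 16, "test_ssbo at 16");
_Static_assert(sizeof(struct bufblock) == 32, "matches 'ssbo 0 32'");

/* Buffer contents after the four "subdata" commands, before the draw. */
static struct bufblock bufblock_initial(void)
{
	struct bufblock b = { .input_ssbo = 10 }; /* all other members are zero */
	return b;
}
```

Under this reading, both ballot results land in the two members that the final probes check, and the last 8 bytes of the buffer are never touched by the shader.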
Edited by Karol Herbst
