No API/user accessible sysfs files to get GPU stats

i915 does expose most of this data (not power or temperature, which belong to different drivers) via the standard Linux perf/PMU framework. Access to system wide data is however classed as security sensitive by default.
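To see which counters are on offer, the perf core publishes every registered PMU in sysfs. The following is a minimal sketch, assuming the driver registers its PMU under the name "i915" at /sys/bus/event_source/devices/i915; the function name `list_pmu_events` is made up for this illustration, and on a machine without that directory it simply returns nothing.

```python
from pathlib import Path

# Assumed location: the perf core publishes each PMU under
# /sys/bus/event_source/devices/<name>; "i915" is the name assumed here.
I915_PMU_DIR = Path("/sys/bus/event_source/devices/i915")


def list_pmu_events(pmu_dir=I915_PMU_DIR):
    """Return {event_name: config_string} for one PMU, or {} if it is absent."""
    events_dir = Path(pmu_dir) / "events"
    if not events_dir.is_dir():
        return {}
    events = {}
    for entry in sorted(events_dir.iterdir()):
        # Skip companion files such as "<event>.unit" and "<event>.scale".
        if entry.is_file() and "." not in entry.name:
            events[entry.name] = entry.read_text().strip()
    return events


if __name__ == "__main__":
    for name, config in list_pmu_events().items():
        print(f"{name}: {config}")
```

Listing the event names only shows what exists; opening the counters system wide is still subject to the perf access policy.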

If you do not want to read the PMU counters directly but via the intel_gpu_top tool, you can give it the required CAP_PERFMON privilege (along the lines of "sudo setcap cap_perfmon+ep .../intel_gpu_top"), or play with the /proc/sys/kernel/perf_event_paranoid tunable. The former is the recommended way.

> Access to system wide data is however classed as security sensitive by default.

AMD and NVIDIA don't think so. AMD and Intel CPUs expose this information freely. I have no idea what's sensitive about this.

On Windows, all this information is available via bog-standard, user-accessible DirectX APIs and, again, works out of the box for all users, including non-privileged ones.

> If you do not want to read the PMU counters directly but via the intel_gpu_top tool

This makes it unavailable for 99.9% of Linux users out there.
This makes it impossible to monitor GPU use and metrics in full screen games.
This is complicated and counter-intuitive.

It would be great if you reconsidered your policy.

I can't comment on what AMD or Nvidia expose and how, or on what Windows does or has.

But the default restriction on system wide counters is not an i915 decision but a kernel policy:

$ perf record -a /bin/true
Error:
Access to performance monitoring and observability operations is limited.
Consider adjusting /proc/sys/kernel/perf_event_paranoid setting to open
access to performance monitoring and observability operations for processes
without CAP_PERFMON, CAP_SYS_PTRACE or CAP_SYS_ADMIN Linux capability.
More information can be found at 'Perf events and tool security' document:
https://www.kernel.org/doc/html/latest/admin-guide/perf-security.html
perf_event_paranoid setting is 2:
  -1: Allow use of (almost) all events by all users
      Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK
>= 0: Disallow raw and ftrace function tracepoint access
>= 1: Disallow CPU event access
>= 2: Disallow kernel profiling
To make the adjusted perf_event_paranoid setting permanent preserve it
in /etc/sysctl.conf (e.g. kernel.perf_event_paranoid = <setting>)
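For anyone who wants to check the current policy from a program rather than by triggering the error, here is a minimal sketch. It only restates the levels printed in the message above; the function names `restrictions` and `read_level` are made up for this illustration, and the rules apply to processes that lack the capabilities the message lists.

```python
from pathlib import Path

PARANOID_PATH = Path("/proc/sys/kernel/perf_event_paranoid")


def restrictions(level):
    """List the restrictions the perf message above attaches to a level."""
    if level < 0:
        return []  # -1: (almost) all events allowed for all users
    rules = ["raw and ftrace function tracepoint access disallowed"]
    if level >= 1:
        rules.append("CPU event access disallowed")
    if level >= 2:
        rules.append("kernel profiling disallowed")
    return rules


def read_level(path=PARANOID_PATH):
    """Return the current setting as an int, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    level = read_level()
    if level is None:
        print("perf_event_paranoid could not be read")
    else:
        print(f"perf_event_paranoid = {level}")
        for rule in restrictions(level):
            print(f"  {rule}")
```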

If you want to read the data we expose directly, you have to handle the situation yourself with either CAP_PERFMON or with the procfs tunable. Or if you want to use intel_gpu_top as a helper tool to get to the data, perhaps we can apply the capability during post-install. Or we could improve the default error message which at the moment is indeed not the best:

$ tools/intel_gpu_top
Failed to initialize PMU! (Permission denied)

Make the error message say what needs to be done manually to make it work, at which point users can do it themselves like:

$ sudo setcap cap_perfmon+ep tools/intel_gpu_top
$ tools/intel_gpu_top -l
 Freq MHz      IRQ RC6     Power W     IMC MiB/s           RCS/0           BCS/0           VCS/0          VECS/0 
 req  act       /s   %   gpu   pkg     rd     wr       %  se  wa       %  se  wa       %  se  wa       %  se  wa 
 371  371      371  47  1.70 11.38   6950   3286   21.57   0   0    0.00   0   0    0.00   0   0    0.00   0   0
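Once the tool runs, its listing output can be consumed by another program. The following is a rough sketch of turning one data row into named fields. The column names in `COLUMNS` are assumptions read off the two header rows of this particular sample, and `parse_sample` and `parse_listing` are hypothetical helpers; a machine with different engines would print different columns.

```python
# Column names are assumptions derived from the header rows of the sample
# above (Freq, IRQ, RC6, Power, IMC, then %/se/wa for each engine).
COLUMNS = [
    "freq_req_mhz", "freq_act_mhz", "irq_per_s", "rc6_pct",
    "power_gpu_w", "power_pkg_w", "imc_rd_mib_s", "imc_wr_mib_s",
    "rcs0_pct", "rcs0_se", "rcs0_wa",
    "bcs0_pct", "bcs0_se", "bcs0_wa",
    "vcs0_pct", "vcs0_se", "vcs0_wa",
    "vecs0_pct", "vecs0_se", "vecs0_wa",
]


def parse_sample(line, columns=COLUMNS):
    """Map one whitespace-separated data row onto the given column names."""
    values = line.split()
    if len(values) != len(columns):
        raise ValueError(f"expected {len(columns)} fields, got {len(values)}")
    # float() raises ValueError for header rows, which callers can skip.
    return {name: float(value) for name, value in zip(columns, values)}


def parse_listing(text, columns=COLUMNS):
    """Parse every data row in a block of output, skipping headers and blanks."""
    samples = []
    for line in text.splitlines():
        try:
            samples.append(parse_sample(line, columns))
        except ValueError:
            continue
    return samples
```

Fed the data row above, `parse_sample` yields a dictionary in which, for example, `rcs0_pct` is 21.57 and `power_pkg_w` is 11.38.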

I am however not sure if applying capabilities at post-install is something our package should do, or perhaps it should be left for downstream vendors to control. @adrinael do you perhaps have any experience with this sort of question? I am not even sure the capability would survive the RPM/DEB packaging step. It may really have to be something distros set up in the package build process?

So, we are back to having no APIs/user accessible data points. Please tell me how everything you've posted can be used while gaming without using a console, scripting, running dangerous-looking commands, etc. Users are not developers and must not be subjected to this.

Would be great to be able to monitor:

GPU load in %
Frame Buffer load in %
Video encoder engine load in %
Video decoder engine load in %
Temperature in °C (if there are sensors of course)
VRAM usage in MB
VRAM usage per application in MB
Power use in watts

Please do. You can export everything and anything via sysfs files, e.g.

$ pwd
/sys/module/acpi/parameters
$ ls -la
total 0
drwxr-xr-x. 2 root root    0 Jan 30 22:13 .
drwxr-xr-x. 3 root root    0 Jan 30 22:13 ..
-r--r--r--. 1 root root 4096 Jan 31 15:57 acpica_version
-rw-r--r--. 1 root root 4096 Jan 31 15:57 aml_debug_output
-rw-r--r--. 1 root root 4096 Jan 31 15:57 ec_busy_polling
-rw-r--r--. 1 root root 4096 Jan 31 15:57 ec_delay
-rw-r--r--. 1 root root 4096 Jan 31 15:57 ec_event_clearing
-rw-r--r--. 1 root root 4096 Jan 31 15:57 ec_freeze_events
-rw-r--r--. 1 root root 4096 Jan 31 15:57 ec_max_queries
-rw-r--r--. 1 root root 4096 Jan 31 15:57 ec_no_wakeup
-rw-r--r--. 1 root root 4096 Jan 31 15:57 ec_polling_guard
-rw-r--r--. 1 root root 4096 Jan 31 15:57 ec_storm_threshold
-rw-r--r--. 1 root root 4096 Jan 31 15:57 sleep_no_lps0

All readable by everyone.
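The "readable by everyone" property in the listing above is the trailing `r--` in the mode column, and it can be checked programmatically. A minimal sketch, in which `world_readable` is a made-up helper name and the path is just the first file from the listing:

```python
import os
import stat


def world_readable(path):
    """Return True if the 'other' read bit is set on the file's mode."""
    return bool(os.stat(path).st_mode & stat.S_IROTH)


if __name__ == "__main__":
    path = "/sys/module/acpi/parameters/acpica_version"
    try:
        print(path, world_readable(path))
    except OSError as err:
        print(f"cannot stat {path}: {err}")
```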

I will talk with one of the distro representatives to see how they suggest granting extra privilege should be handled.

If you as an end user are interested in running intel_gpu_top, I suggest you file a bug report with your distro so they can talk with us directly.

As far as the MangoHud project is concerned, I am not familiar with it, but I would also ask if you could get the developers of that project in touch with me, so we can discuss how they are accessing the data we export.

Will do, thank you!

Patch for improved help text in the error case is on the igt-dev mailing list: https://patchwork.freedesktop.org/series/99576/

And I have confirmation from the Ubuntu package owner that granting CAP_PERFMON to intel_gpu_top is indeed something only the DEB package can do as part of the postinstall script. Therefore I suspect it should be the same process (file a distro bug) regardless of distro.

mentioned in commit igt-gpu-tools@e835d0d9

closed

The command to make intel_gpu_top available for normal users:

setcap cap_perfmon=+ep /usr/bin/intel_gpu_top
