intel_gpu_top does not show any clients, although clients are definitely rendering (hardware-accelerated video).
The paste at the very end of this entry doesn't just cut off the client list; the clients are simply not present, regardless of the state of the runtime toggles (sort/display/aggregate).
To reproduce:
sudo intel_gpu_top
start any application which consumes resources tracked by intel_gpu_top
Expected: some client information is rendered below the fancy graph rendition
Actual: only the graph rendition, no further details in the bottom section
System environment:
Dell Inspiron 16 Plus (7610) with Nvidia RTX 3060 dGPU
X11 / Xorg == reverse PRIME configuration, so Intel iGPU renders everything
external screen on HDMI port attached to Intel output port (and no MUXing involved; the Nvidia dGPU is literally powered down, not involved at all - also validated via nvidia-smi)
I suppose intel_gpu_top could go a step further and hide the column headers when there is no kernel support, by adding some logic. It might be convoluted to implement, so it needs someone with a bit of time.
FWIW, it might be easier to render dummy text saying "Linux kernel 5.19+ is required for a detailed list" - this way, the various hints and notes in the documentation ("h") do not have to be adjusted.
FWIW, Linux kernel 5.18 was just released, so not mentioning a specific future kernel version might also be worthwhile.
Best solution I can think of, as in easy and reliable versus potential feature backports, would be for intel_gpu_top to open a render node and detect presence of the feature itself. I was avoiding touching the device in that way so far, but maybe it is acceptable.
Nothing in the trace strikes me as odd. Certainly fdinfo files appear to be present and partial content shown looks reasonable.
One odd thing is that lsgpu shows card1 but renderD128. I wonder what card0 is on your machine. Do you see both /sys/class/drm/card0/ and /sys/class/drm/card1/ present, and what do they point to? Or would you be able to check whether lsgpu output is the same with the old kernel?
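A quick way to answer the card0/card1 question might be something like this (standard sysfs layout assumed):

```shell
# List the DRM class nodes and resolve what device each card points at.
ls -l /sys/class/drm/ 2>/dev/null
for c in /sys/class/drm/card0 /sys/class/drm/card1; do
    if [ -e "$c" ]; then
        echo "$c -> $(readlink -f "$c/device")"
    else
        echo "$c missing"
    fi
done
```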
Maybe also try intel_gpu_top -L and then pass the result to -d, in case auto selection somehow does not fully work.
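For example (the drm: filter string below is illustrative; use whatever -L actually prints on your machine):

```shell
# List devices intel_gpu_top can see, with their filter strings.
sudo intel_gpu_top -L

# Then select one explicitly instead of relying on auto selection,
# e.g. if -L listed card1:
sudo intel_gpu_top -d drm:/dev/dri/card1
```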
Or if you are comfortable patching and building it from source see what this line prints:
When building there's a warning about a type conversion: card.pci_slot_name[0] is an int and not a char *. Changing %s to %d still returns empty, and changing card.pci_slot_name[0] to card.pci_slot_name also returns empty.
I also changed printf("xxx '%s'\n", card.pci_slot_name[0]); to printf("xxx '%s'\n", &card.pci_slot_name[0]);, but still no luck.
Yes you fixed it correctly (%s and just card.pci_slot_name), my bad. Empty is okay, will make it use IGPU_PCI which matches "drm-pdev:\t0000:00:02.0" from your strace.
I am out of ideas what else could be going wrong.
Maybe try pressing 'H' once, and then if still nothing, 'i' once to change the display mode to no PID aggregation and no hiding of clients with no GPU utilization.
Yeah, not showing anything. I'm as out of ideas as you are, Tvrtko. Maybe someone else will find this issue and figure out what is going on, or a kernel update will fix it. Who knows?
If there is still nothing with no aggregation and no idle filtering, that means it is possibly not able to parse anything at all. Would you be able to paste the content of one relevant fdinfo file here? It would go like this: use lsof to find something interesting using the GPU:
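Something along these lines (the render node path, PID, and fd number are placeholders; substitute what lsof reports on your machine):

```shell
# Find processes holding the Intel render node open.
sudo lsof /dev/dri/renderD128

# From the lsof output, take the PID and the FD column value,
# then dump that descriptor's fdinfo (1234 and 5 are placeholders):
sudo cat /proc/1234/fdinfo/5
```

The interesting part is whether lines like drm-driver, drm-client-id and the drm-engine-* busyness counters are present at all.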
    /*
     * Temporarily skip showing client engine information with GuC submission till
     * fetching engine busyness is implemented in the GuC submission backend
     */
    if (GRAPHICS_VER(i915) < 8 || intel_uc_uses_guc_submission(&i915->gt0.uc))
            return;
Are you saying that enable_guc=2 should always be preferred to enable_guc=3?
Telling GuC FW to load HuC FW (enable_guc=2) has much less effect on how the kernel GPU driver works than overriding the kernel driver's decision about who is in control of GPU scheduling etc., kernel driver or GuC FW (enable_guc=3). Therefore the latter decision is better left to kernel developers, unless you're keen to debug those functionalities yourself. :-)
Indeed. Just leave enable_guc at the default value folks, aka don't touch it unless you really know what you are doing. Proliferation of incorrect or half-misleading wikis is not your friend here.
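If you want to verify what your system is actually running with, the module parameter is visible in sysfs (path assumes the i915 module is loaded):

```shell
# -1 means "let the driver decide", which is the recommended default.
if [ -f /sys/module/i915/parameters/enable_guc ]; then
    cat /sys/module/i915/parameters/enable_guc
else
    echo "i915 module not loaded"
fi
```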