Up to 30% regression in GPU media (and 3D) performance, with up to 2x higher power usage (based on RAPL)
Between the following drm-tip versions:
- e8c9ce4930: drm-tip: 2019y-11m-13d-16h-42m-12s UTC integration manifest
- 2bdb1f6c51: drm-tip: 2019y-11m-14d-17h-53m-35s UTC integration manifest
GPU performance dropped a lot in some tests on BXT (J4205).
The drop was largest, 5-30% depending on the case, in (HEVC) decoding and low-power (fixed-function) AVC transcoding. It is larger when only a single process is transcoding rather than several in parallel.
Of my test cases, the worst performance drop is in this one (HuC is needed for low-power mode bitrate control):
ffmpeg -hwaccel qsv -qsv_device /dev/dri/renderD128 -c:v h264_qsv -i 720x480p_30.00_4mb_h264_cabac_180s.264 -c:v h264_qsv -b:v 2000K -low_power 1 -compression_level 4 -an output.h264
According to intel_gpu_top, compared to the earlier commit, the new drm-tip kernel version:
- Has the same 85-90% video engine utilization, although performance is (on average) 30% slower
- Usually uses 2x more power, but on some runs 2.5x more
- Gets its GPU frequency limited by the increased power usage: in the first case from 800 MHz to 700-750 MHz, in the latter case to <600 MHz, which explains the performance drop
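For anyone wanting to cross-check the power numbers without intel_gpu_top, below is a minimal sketch that samples a RAPL energy counter through the Linux powercap sysfs interface. The zone path is an assumption: `intel-rapl:0` is normally the package zone, and the GPU ("uncore") subzone, where present, is often `intel-rapl:0:1`, but this varies by platform, so check each zone's `name` file first. Reading `energy_uj` may require root on recent kernels. The function names are illustrative only.

```python
import time
from pathlib import Path

# Assumption: package zone; adjust to the zone whose "name" file matches
# the domain you want to measure (e.g. "uncore" for the GPU).
RAPL_ZONE = Path("/sys/class/powercap/intel-rapl:0")


def read_uj(zone: Path, name: str) -> int:
    """Read an integer microjoule value from a powercap sysfs file."""
    return int((zone / name).read_text().strip())


def average_power_watts(start_uj: int, end_uj: int, seconds: float,
                        max_range_uj: int) -> float:
    """Average power over an interval, allowing for one counter wraparound."""
    delta_uj = end_uj - start_uj
    if delta_uj < 0:
        # energy_uj wrapped around after reaching max_energy_range_uj
        delta_uj += max_range_uj
    return delta_uj / 1e6 / seconds


def sample_power(zone: Path = RAPL_ZONE, seconds: float = 1.0) -> float:
    """Read the zone's energy counter twice and return average watts."""
    max_range = read_uj(zone, "max_energy_range_uj")
    start, t0 = read_uj(zone, "energy_uj"), time.monotonic()
    time.sleep(seconds)
    end, t1 = read_uj(zone, "energy_uj"), time.monotonic()
    return average_power_watts(start, end, t1 - t0, max_range)
```

Calling `sample_power()` repeatedly while the FFmpeg command above is running, once on each kernel, should give figures comparable to the ones intel_gpu_top reports.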
In 3D benchmarks, the performance drops on BXT are the following:
- 5-10% in high FPS windowed GLBenchmark tests, GpuTest Triangle, Plot3D and Julia FP32
- 5-6% GfxBench Manhattan 3.0 & 3.1, (low FPS) Unigine Heaven & Valley, SynMark TerrainFly*
- 3-4% GfxBench Carchase, T-Rex
- 3% GfxBench AztecRuins
On GEN9 Core devices the drops were much smaller: e.g. on the partly TDP-limited KBL GT3e NUC they were about 1/3 of the BXT ones, and on the non-TDP-limited SKL GT2 there were none.
On KBL GT3e, the above FFmpeg command also uses >2x more power on the GPU side (and a bit less on the CPU side), with 10% lower performance and the GPU getting slightly TDP-limited.
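To put the two effects together, here is a rough sketch of what the numbers above imply for GPU-side energy per transcoding job. It assumes that "30% slower" and "10% lower performance" mean throughput ratios of 0.70 and 0.90, takes 2.0 as the power ratio in both cases (for KBL this is only the lower bound of ">2x"), and ignores the CPU-side change. The function name is illustrative.

```python
def energy_per_job_ratio(power_ratio: float, performance_ratio: float) -> float:
    """Relative energy per job = relative power x relative run time.

    For a fixed amount of work, run time scales as 1 / performance.
    """
    if performance_ratio <= 0:
        raise ValueError("performance_ratio must be positive")
    return power_ratio / performance_ratio


# Assumed inputs taken from the figures reported above.
bxt = energy_per_job_ratio(2.0, 0.70)
kbl = energy_per_job_ratio(2.0, 0.90)
print(f"BXT: {bxt:.2f}x, KBL GT3e: {kbl:.2f}x")  # BXT: 2.86x, KBL GT3e: 2.22x
```

Under these assumptions, the same transcode costs roughly 2.9x the GPU energy on BXT and at least 2.2x on KBL GT3e.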