intel/compiler: Reuse information between the various scheduling modes
Makes the scheduling part of the backend compiler about 30% faster by changing how we store pass information and reusing it across the multiple pre-RA scheduling modes. In overall fossil runs this amounts to a 1.8% to 2.5% reduction in total time (updated results below; the original measurements showed 2% to 4.5%). The ~30% number was checked using perf/flamegraph data for Cyberpunk 2077, ROTTR and Total War Warhammer 3 (still valid in the updated measurements).
Summary of changes:
- Allocating memory in bulk
- Iterating through arrays instead of linked lists when possible
- Removing virtual functions while still sharing some code between Vec4 and FS
- Smaller changes to how/what data is stored
- Allowing scheduler to be reused for pre-RA modes
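
As a rough illustration of three of the points above (bulk allocation, array iteration, and reuse across modes), here is a minimal sketch. It is not the Mesa code: `Scheduler`, `Node`, `Mode` and the "pick lowest/highest ready index" heuristics are hypothetical stand-ins. The dependency graph is built once into two flat arrays, and each mode only resets a per-run counter before scheduling again.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-ins for two pre-RA scheduling modes.
enum class Mode { Pre, PreLifo };

struct Node {
    int first_child = 0;   // offset into Scheduler::children
    int child_count = 0;
    int parent_count = 0;  // mode-independent, computed once
    int pending = 0;       // per-run state, reset for every mode
};

class Scheduler {
public:
    // deps holds (parent, child) pairs. All nodes and edges are allocated
    // in bulk as two flat arrays instead of per-node linked lists.
    Scheduler(int n, const std::vector<std::pair<int, int>>& deps)
        : nodes(n), children(deps.size()) {
        for (const auto& d : deps) {
            nodes[d.first].child_count++;
            nodes[d.second].parent_count++;
        }
        int offset = 0;
        for (Node& node : nodes) {
            node.first_child = offset;
            offset += node.child_count;
            node.child_count = 0;  // refilled while placing the edges
        }
        for (const auto& d : deps) {
            Node& p = nodes[d.first];
            children[p.first_child + p.child_count++] = d.second;
        }
    }

    // Can be called once per mode; the graph itself is never rebuilt.
    std::vector<int> run(Mode mode) {
        for (Node& n : nodes) n.pending = n.parent_count;
        std::vector<int> ready, order;
        order.reserve(nodes.size());
        for (int i = 0; i < static_cast<int>(nodes.size()); i++)
            if (nodes[i].pending == 0) ready.push_back(i);
        while (!ready.empty()) {
            // Toy heuristic: the mode only changes which ready node is picked.
            auto it = (mode == Mode::PreLifo)
                          ? std::max_element(ready.begin(), ready.end())
                          : std::min_element(ready.begin(), ready.end());
            const int chosen = *it;
            ready.erase(it);
            order.push_back(chosen);
            const Node& n = nodes[chosen];
            for (int c = 0; c < n.child_count; c++) {
                const int child = children[n.first_child + c];
                if (--nodes[child].pending == 0) ready.push_back(child);
            }
        }
        return order;
    }

private:
    std::vector<Node> nodes;    // one allocation for all nodes
    std::vector<int> children;  // one allocation for all edges
};
```

With four instructions and the dependencies 0→2, 1→2, 2→3, constructing one `Scheduler` and calling `run(Mode::Pre)` followed by `run(Mode::PreLifo)` yields two different valid orders from the same stored graph.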
Fossil run measurements on TGL (compared originally against f54e06e206db3278d38e6dabe781d083594e889c)
RISE OF THE TOMB RAIDER (NATIVE)
N Min Max Median Avg Stddev
x 13 30.08 30.27 30.12 30.133077 0.047325929
+ 13 28.69 29.56 28.72 28.785385 0.23358137
Difference at 95.0% confidence
-1.34769 +/- 0.136431
-4.47247% +/- 0.452761%
(Student's t, pooled s = 0.168523)
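
For readers who want to check these figures, the sketch below reproduces the numbers in this first block, assuming the usual two-sample Student's t procedure with a pooled standard deviation. The names `Sample`, `Diff` and `compare` are hypothetical, and the critical value 2.064 (two-sided 95%, 24 degrees of freedom) is taken from a standard t-table rather than computed.

```cpp
#include <cassert>
#include <cmath>

struct Sample {
    int n;
    double mean;
    double stddev;
};

struct Diff {
    double delta;       // mean(b) - mean(a)
    double half_width;  // the "+/-" part of the confidence interval
    double pooled_s;    // pooled standard deviation
};

// t_crit is the two-sided critical value for a.n + b.n - 2 degrees of
// freedom; the caller supplies it (assumption: looked up in a t-table).
Diff compare(const Sample& a, const Sample& b, double t_crit) {
    const int dof = a.n + b.n - 2;
    const double pooled_var = ((a.n - 1) * a.stddev * a.stddev +
                               (b.n - 1) * b.stddev * b.stddev) / dof;
    const double s = std::sqrt(pooled_var);
    const double hw = t_crit * s * std::sqrt(1.0 / a.n + 1.0 / b.n);
    return {b.mean - a.mean, hw, s};
}
```

Calling `compare({13, 30.133077, 0.047325929}, {13, 28.785385, 0.23358137}, 2.064)` gives a difference of about -1.3477, a half-width of about 0.1364 and a pooled s of about 0.1685; dividing the difference by the baseline mean gives the -4.47% shown above.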
ASSASSIN'S CREED ODYSSEY (DXVK)
N Min Max Median Avg Stddev
x 13 14.66 14.72 14.69 14.686154 0.014455945
+ 13 14.18 14.23 14.2 14.199231 0.011875422
Difference at 95.0% confidence
-0.486923 +/- 0.0107096
-3.31552% +/- 0.0729229%
(Student's t, pooled s = 0.0132288)
BATMAN ARKHAM CITY (DXVK)
N Min Max Median Avg Stddev
x 13 529.45 530.68 529.7 529.75538 0.29912693
+ 13 506.25 506.91 506.67 506.64154 0.18165196
Difference at 95.0% confidence
-23.1138 +/- 0.200337
-4.36312% +/- 0.0378168%
(Student's t, pooled s = 0.247461)
CYBERPUNK 2077 (DXVK)
N Min Max Median Avg Stddev
x 13 118.75 119.15 118.92 118.93615 0.12593242
+ 13 116.12 116.42 116.3 116.27769 0.091209817
Difference at 95.0% confidence
-2.65846 +/- 0.0890123
-2.2352% +/- 0.0748404%
(Student's t, pooled s = 0.10995)
TOTAL WAR WARHAMMER 3 (DXVK)
N Min Max Median Avg Stddev
x 13 115.27 115.38 115.31 115.31385 0.03228479
+ 13 111.34 111.47 111.42 111.41846 0.035787836
Difference at 95.0% confidence
-3.89538 +/- 0.0275912
-3.37807% +/- 0.023927%
(Student's t, pooled s = 0.0340814)
Updated fossil run measurements on TGL with a newer GCC/Fedora (compared against 424df6a6)
// Time in seconds
// Difference at 95.0% confidence
RISE OF THE TOMB RAIDER (NATIVE) N=13
-0.702308 +/- 0.0185155
-2.32363% +/- 0.0612597%
ASSASSIN'S CREED ODYSSEY (DXVK) N=13
-0.294615 +/- 0.0129147
-1.99095% +/- 0.0872754%
BATMAN ARKHAM CITY (DXVK) N=7
-13.5871 +/- 0.428562
-2.545% +/- 0.0802737%
CYBERPUNK 2077 (DXVK) N=13
-2.74692 +/- 0.179814
-2.22448% +/- 0.145615%
TOTAL WAR WARHAMMER 3 (DXVK) N=13
-2.08615 +/- 0.02064
-1.80155% +/- 0.0178242%
I haven't measured vec4, but I would expect some improvement there too, though not the same amount, since it has no pre-RA scheduling modes.
Edited by Caio Oliveira