isaspec: encode size optimization

Rob Clark requested to merge robclark/mesa:isaspec/sizeopt into main

Originally I had hoped that the compiler's CSE would be able to de-duplicate identical encodings for similar instructions, but it was not. So this MR splits the bitset encoding out into helper "snippet" functions, which are de-duplicated by means of a hashtable. This drastically reduces the size of the resulting object file. For an aarch64 release build:

add/remove: 51/15 grow/shrink: 3/3 up/down: 40772/-200496 (-159724)
Function                                     old     new   delta
snippet__instruction_9.constprop.isra          -    2424   +2424
snippet__instruction_8.constprop.isra          -    2424   +2424
snippet__instruction_11.constprop.isra         -    2424   +2424
snippet__instruction_10.constprop.isra         -    2424   +2424
snippet__instruction_7.constprop.isra          -    2148   +2148
snippet__instruction_24.constprop              -    1984   +1984
snippet__instruction_22.constprop              -    1976   +1976
snippet__instruction_26.constprop              -    1968   +1968
snippet__instruction_23.constprop              -    1968   +1968
snippet__instruction_25.constprop              -    1956   +1956
snippet__instruction_27.constprop              -    1952   +1952
snippet__instruction_28.constprop              -    1940   +1940
snippet__instruction_20.constprop.isra         -    1708   +1708
snippet__instruction_18.constprop              -    1632   +1632
snippet__instruction_16.constprop              -    1632   +1632
snippet__instruction_17.constprop              -    1436   +1436
snippet__instruction_40.constprop.isra         -    1312   +1312
snippet__instruction_39.constprop.isra         -     624    +624
snippet__instruction_38.constprop.isra         -     624    +624
snippet__instruction_44.constprop.isra         -     592    +592
snippet__instruction_21.constprop              -     532    +532
snippet__instruction_13.constprop.isra         -     468    +468
snippet__instruction_12.constprop.isra         -     428    +428
snippet__cat5_src3_0                           -     344    +344
snippet__instruction_35.constprop.isra         -     328    +328
snippet__instruction_33.constprop.isra         -     296    +296
snippet__instruction_6.constprop               -     292    +292
snippet__reg_gpr_0.constprop.isra              -     256    +256
snippet__instruction_46.constprop              -     248    +248
snippet__instruction_5.constprop               -     232    +232
snippet__instruction_3.constprop               -     200    +200
snippet__instruction_1.constprop               -     200    +200
snippet__cat1_immed_src_0.constprop.isra       -     196    +196
snippet__instruction_42.constprop.isra         -     176    +176
snippet__instruction_2.constprop               -     164    +164
snippet__instruction_0.constprop               -     164    +164
encode__cat6_src.isra                          -     136    +136
snippet__multisrc_4.constprop.isra             -     128    +128
snippet__multisrc_1.constprop                  -     116    +116
encode__cat6_typed.isra                        -      76     +76
encode__cat6_base.isra                         -      76     +76
encode__cat5_s2en_bindless_base.isra           -      76     +76
encode__reg_relative_gpr.isra                  -      72     +72
encode__reg_const.isra                         -      72     +72
encode__cat5_tex.isra                          -      72     +72
encode__cat5_samp.isra                         -      72     +72
encode__cat5_type.isra                         -      44     +44
snippet__cat1_gpr_src_0.constprop.isra         -      40     +40
encode__cat1_relative_gpr_src.isra             -      36     +36
encode__cat1_const_src.isra                    -      36     +36
snippet__cat3_src_2.constprop.isra             -      32     +32
encode__cat5_src2.isra                        84      92      +8
encode__cat5_src1.isra                        80      84      +4
encode__cat1_dst.isra                         96     100      +4
encode__cat3_src.isra                        276     244     -32
encode__cat1_relative_gpr_src.constprop.isra      40       -     -40
encode__cat1_gpr_src.constprop.isra           40       -     -40
encode__cat1_const_src.constprop.isra         40       -     -40
encode__cat5_type.constprop.isra              44       -     -44
encode__reg_const.constprop.isra              68       -     -68
encode__cat5_tex.constprop.isra               68       -     -68
encode__cat5_samp.constprop.isra              68       -     -68
encode__reg_relative_gpr.constprop.isra       72       -     -72
encode__cat6_base.constprop.isra              72       -     -72
encode__cat5_s2en_bindless_base.constprop.isra      72       -     -72
encode__cat6_typed.constprop                  76       -     -76
encode__cat6_src.constprop.isra              132       -    -132
encode__cat1_immed_src.constprop.isra        196       -    -196
encode__reg_gpr.constprop.isra               256       -    -256
encode__multisrc                             848     556    -292
encode__cat5_src3                            336       -    -336
encode__instruction.constprop             211384   12792 -198592
Total: Before=215964, After=56240, chg -73.96%
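
To illustrate the de-duplication idea, here is a minimal sketch of how a generator could map identical encoding bodies onto shared snippet functions using a hashtable. This is not the actual isaspec code: `dedup_snippets`, the sample bodies, and the `pack()` helper inside them are all hypothetical, and only the `snippet__<name>_<n>` naming follows the symbols in the table above.

```python
def dedup_snippets(prefix, bodies):
    """De-duplicate generated encoding bodies (hypothetical helper).

    bodies maps a case name to the C source text generated for it.
    Returns (snippets, calls): snippets maps a snippet function name to
    its body, calls maps each case name to the snippet it should call.
    """
    seen = {}      # body text -> snippet function name (the hashtable)
    snippets = {}  # snippet function name -> body text, in emission order
    calls = {}     # case name -> snippet function name
    for name, body in bodies.items():
        fn = seen.get(body)
        if fn is None:
            # First time this exact body is seen: emit a new snippet.
            fn = f"snippet__{prefix}_{len(seen)}"
            seen[body] = fn
            snippets[fn] = body
        calls[name] = fn
    return snippets, calls


# Made-up encoding bodies; pack() is an assumed helper, not a real API.
bodies = {
    "add.f": "return pack(SRC1, 0, 7) | pack(SRC2, 16, 23);",
    "mul.f": "return pack(SRC1, 0, 7) | pack(SRC2, 16, 23);",
    "mov": "return pack(SRC, 0, 7);",
}
snippets, calls = dedup_snippets("instruction", bodies)
```

In this sketch `add.f` and `mul.f` share a single snippet, so the body is emitted once instead of being inlined into every case of one large encode function.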

/cc @austriancoder