meson: Enable GCing of functions and data from compilation units by default.

Normally, the linker pulls in every compilation unit (i.e. .c file) from
a static lib (such as our shared util code) that the code linking against
it depends on. Since that unit is already compiled, anything in its .text
section is allowed to jump anywhere else in .text, so the linker can't
garbage collect unused functions inside of a compilation unit.

Teasing callgraphs apart so that normal compilation-unit-level GCing can
reduce driver size is difficult and hurts the logical organization of the
code. As an example, once I'd split the format pack/unpack tables, I had
to split util_format_read/write() out of util_format.c to avoid pulling
in pack/unpack. Even that didn't help: turnip's pack calls pull in
util_format_bptc.c for bptc packing, that file also contains the unpack
impls, and those internally call util_format_unpack, so we still pulled
in all of unpack. Splitting all of this into separate files makes code
harder to find and maintain, and is a waste of dev time.

With -ffunction-sections and -fdata-sections set, the compiler puts each
function and data symbol in a separate ELF section, and the linker can
then safely GC unused text and data sections from a compilation unit that
gets pulled in. There's a bit of a space cost due to having those
separate sections, but it ends up being a huge win in disk space on my
personal release driver builds:
- i965_dri.so -213k
- x86 gallium dri.so -430k
- libvulkan_intel.so -272k
- aarch64 gallium dri.so -330k
- libvulkan_freedreno.so -783k
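
As a rough illustration of the per-function sections described above,
here is a minimal C sketch of a single compilation unit. The file and
function names (util_demo.c, demo_pack_rgba8, demo_unpack_rgba8) are
hypothetical and not part of Mesa, and the build commands in the trailing
comment assume GCC or Clang targeting ELF with a linker that supports
--gc-sections.

```c
/* util_demo.c: hypothetical stand-in for one compilation unit of a
 * static lib. None of these names exist in Mesa. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

uint32_t demo_pack_rgba8(const uint8_t *rgba);
void demo_unpack_rgba8(uint32_t packed, uint8_t *rgba);

/* Assume the driver calls this one, so it has to stay in the binary. */
uint32_t
demo_pack_rgba8(const uint8_t *rgba)
{
   assert(rgba != NULL);
   return (uint32_t)rgba[0] |
          ((uint32_t)rgba[1] << 8) |
          ((uint32_t)rgba[2] << 16) |
          ((uint32_t)rgba[3] << 24);
}

/* Assume the driver never calls this one. Without per-function sections
 * it shares a single .text section with demo_pack_rgba8(), so it comes
 * along whenever this object file is pulled in. With
 * -ffunction-sections it gets a section of its own, which the linker
 * may drop if nothing references it. */
void
demo_unpack_rgba8(uint32_t packed, uint8_t *rgba)
{
   assert(rgba != NULL);
   rgba[0] = (uint8_t)(packed & 0xff);
   rgba[1] = (uint8_t)((packed >> 8) & 0xff);
   rgba[2] = (uint8_t)((packed >> 16) & 0xff);
   rgba[3] = (uint8_t)((packed >> 24) & 0xff);
}

/*
 * Build sketch (driver.o is a hypothetical object that only calls
 * demo_pack_rgba8()):
 *
 *   cc -O2 -ffunction-sections -fdata-sections -c util_demo.c
 *   cc driver.o util_demo.o -Wl,--gc-sections -o driver
 */
```

Comparing the section headers of util_demo.o built with and without the
two compiler flags (for example with readelf -S) should show one .text.*
section per function in the first case and a single shared .text section
in the second.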

No difference on iris drawoverhead -compat -test 1 on my skylake (n=60).

Effect on debugoptimized build times (n=5):
- touch nir_lower_io.c build time (gold): +15.999% +/- 3.80377%
- touch freedreno fd6_gmem.c build time (gold): +13.5294% +/- 4.86363%
- touch nir_lower_io.c build time (lld): no change
- touch freedreno fd6_gmem.c build time (lld): +2.45375% +/- 2.2383%
Edited by Emma Anholt