  1. Jun 08, 2022
  2. May 09, 2022
    • intel/perf: Destination array calculation into function · bb738a5d
      Matt Turner authored
      
      Cuts 119 KiB from iris_dri.so and libvulkan_intel.so.
      
         text    data     bss     dec     hex filename
       917511       0       0  917511   e0007 meson-generated_.._intel_perf_metrics.c.o (before)
       796986       0       0  796986   c293a meson-generated_.._intel_perf_metrics.c.o (after)
      
         text    data     bss     dec     hex filename
      14130948 365708  210004 14706660 e067e4 iris_dri.so (before)
      14009332 365708  210004 14585044 de8cd4 iris_dri.so (after)
      
         text    data     bss     dec     hex filename
      8124225  214264   22820 8361309  7f955d libvulkan_intel.so (before)
      8002609  214264   22820 8239693  7dba4d libvulkan_intel.so (after)
      
      Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
      (cherry picked from commit 8860ff33)

      Part-of: <mesa/mesa!16405>
    • intel/perf: Fix mistake in description string · 2ebf4b15
      Matt Turner authored
      
      Along with fixing the grammar, this allows it to be deduplicated since
      the properly worded description exists in later generations' XMLs.
      
      Cuts 96 B from iris_dri.so and libvulkan_intel.so.
      
         text    data     bss     dec     hex filename
       917613       0       0  917613   e006d meson-generated_.._intel_perf_metrics.c.o (before)
       917511       0       0  917511   e0007 meson-generated_.._intel_perf_metrics.c.o (after)

         text    data     bss     dec     hex filename
      14131044 365708  210004 14706756 e06844 iris_dri.so (before)
      14130948 365708  210004 14706660 e067e4 iris_dri.so (after)

         text    data     bss     dec     hex filename
      8124321  214264   22820 8361405  7f95bd libvulkan_intel.so (before)
      8124225  214264   22820 8361309  7f955d libvulkan_intel.so (after)
      
      Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
      (cherry picked from commit d80d3c67)

      Part-of: <mesa/mesa!16405>
    • intel/perf: Mark intel_perf_counter_* enums as PACKED · eb1e25d1
      Matt Turner authored
      
      Reduces their sizes from 4 bytes to 1. Cuts 6 KiB from iris_dri.so and
      libvulkan_intel.so.
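
The size reduction described above can be illustrated with a minimal sketch. It assumes GCC or Clang; `EXAMPLE_PACKED` and both enum names are hypothetical stand-ins, not the identifiers used in Mesa.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a PACKED macro; GCC and Clang only. */
#define EXAMPLE_PACKED __attribute__((__packed__))

/* Without the attribute, the enum typically has the size of an int. */
enum example_units_plain {
   EXAMPLE_UNITS_PLAIN_BYTES,
   EXAMPLE_UNITS_PLAIN_HZ,
   EXAMPLE_UNITS_PLAIN_NS,
};

/* With the attribute, the compiler picks the smallest integer type
 * that can hold every enumerator, here a single byte. */
enum EXAMPLE_PACKED example_units_packed {
   EXAMPLE_UNITS_PACKED_BYTES,
   EXAMPLE_UNITS_PACKED_HZ,
   EXAMPLE_UNITS_PACKED_NS,
};

size_t example_plain_size(void)
{
   return sizeof(enum example_units_plain);
}

size_t example_packed_size(void)
{
   return sizeof(enum example_units_packed);
}
```

Every struct field of such an enum type shrinks accordingly, which is presumably where the savings in the generated metrics tables come from.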
      
         text    data     bss     dec     hex filename
       924401       0       0  924401   e1af1 meson-generated_.._intel_perf_metrics.c.o (before)
       917613       0       0  917613   e006d meson-generated_.._intel_perf_metrics.c.o (after)
      
         text    data     bss     dec     hex filename
      14137732 365708  210004 14713444 e08264 iris_dri.so (before)
      14131044 365708  210004 14706756 e06844 iris_dri.so (after)
      
         text    data     bss     dec     hex filename
      8131009  214264   22820 8368093  7fafdd libvulkan_intel.so (before)
      8124321  214264   22820 8361405  7f95bd libvulkan_intel.so (after)
      
      Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
      (cherry picked from commit 7024b8e0)

      Part-of: <mesa/mesa!16405>
    • intel/perf: Store indices to strings rather than pointers · 063863dd
      Matt Turner authored

      The compiler does a good job of deduplicating strings already, but we
      can eliminate the pointers to each string by combining the strings into
      a single char array and storing only an index into that array.
      
      The longest of the char arrays is the descriptions array, which is a
      little over 45 KiB, so still under MSVC's 64 KiB string literal limit
      [0]. Because the string length is under 64 KiB we can use uint16_t as
      the index type, which roughly doubles our savings as compared to an int.
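
A minimal sketch of the technique, with made-up counter names and hand-computed offsets (in practice a generator script would presumably emit them). All identifiers here are hypothetical and not taken from the Mesa sources.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* All names live in one char array; each "\0" terminates one string. */
static const char example_names[] =
   "GPU Busy\0"          /* starts at offset 0  */
   "GPU Core Clocks\0"   /* starts at offset 9  */
   "EU Active";          /* starts at offset 25 */

/* A uint16_t index is only valid while the array stays under 64 KiB. */
_Static_assert(sizeof(example_names) <= UINT16_MAX,
               "string table too large for uint16_t indices");

struct example_counter {
   uint16_t name_idx;   /* 2 bytes and no relocation, unlike a pointer */
};

static const struct example_counter example_counters[] = {
   { .name_idx = 0 },
   { .name_idx = 9 },
   { .name_idx = 25 },
};

/* Turn the stored index back into a C string. */
const char *example_counter_name(const struct example_counter *c)
{
   assert(c->name_idx < sizeof(example_names));
   return &example_names[c->name_idx];
}
```

Because the struct no longer holds an address, the table can sit in read-only data without any load-time relocation, which matches the drop in relocation counts reported in this commit.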
      
      This cuts 77 KiB from iris_dri.so (0.5%) and libvulkan_intel.so (0.9%).
      
         text    data     bss     dec     hex filename
       926811   25920       0  952731   e899b meson-generated_.._intel_perf_metrics.c.o (before)
       924401       0       0  924401   e1af1 meson-generated_.._intel_perf_metrics.c.o (after)
      
         text    data     bss     dec     hex filename
      14190852 391628  210004 14792484 e1b724 iris_dri.so (before)
      14137732 365708  210004 14713444 e08264 iris_dri.so (after)
      
         text    data     bss     dec     hex filename
      8184097  240184   22820 8447101  80e47d libvulkan_intel.so (before)
      8131009  214264   22820 8368093  7fafdd libvulkan_intel.so (after)
      
      relinfo:
      iris_dri.so (before): 17765 relocations, 17545 relative (98%), 452 PLT entries, 1 for local syms (0%), 0 users
      iris_dri.so (after) : 15605 relocations, 15385 relative (98%), 452 PLT entries, 1 for local syms (0%), 0 users
      
      libvulkan_intel.so (before): 10720 relocations, 6989 relative (65%), 355 PLT entries, 1 for local syms (0%), 0 users
      libvulkan_intel.so (after) :  8560 relocations, 4829 relative (56%), 355 PLT entries, 1 for local syms (0%), 0 users
      
      [0] https://docs.microsoft.com/en-us/cpp/cpp/string-and-character-literals-cpp?view=msvc-170&viewFallbackFrom=vs-2019

      Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
      (cherry picked from commit 6c0246dc)

      Part-of: <!16405>
    • intel/perf: Use slimmer intel_perf_query_counter_data struct · 39818465
      Matt Turner authored
      
      intel_perf_query_counter contains fields that we can't or don't want
      to store in our static data: runtime-determined max values, and
      oa_read_counter function pointers, which depend on the GPU gen and
      would make deduplication very ineffective.
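
The split between static and runtime data might look roughly like the sketch below. The struct layouts, field names, and `example_read_first` are assumptions made for illustration; they are not the actual Mesa definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Slim, generation-independent description: safe to keep in a static
 * const table and to share between GPU generations. */
struct example_counter_data {
   uint16_t name_idx;
   uint8_t  type;
   uint8_t  units;
};

static const struct example_counter_data example_data[] = {
   { .name_idx = 0, .type = 0, .units = 1 },
   { .name_idx = 9, .type = 1, .units = 2 },
};

/* Full runtime counter: the static part plus fields that are only
 * known once the GPU has been identified. */
struct example_counter {
   const struct example_counter_data *data;
   uint64_t (*read)(const uint64_t *accumulator);  /* depends on GPU gen */
   uint64_t max_value;                             /* found at runtime */
};

/* Hypothetical per-generation read callback. */
uint64_t example_read_first(const uint64_t *accumulator)
{
   return accumulator[0];
}

void example_counter_init(struct example_counter *c, size_t data_idx,
                          uint64_t (*read)(const uint64_t *),
                          uint64_t max_value)
{
   assert(data_idx < sizeof(example_data) / sizeof(example_data[0]));
   c->data = &example_data[data_idx];
   c->read = read;
   c->max_value = max_value;
}
```

Keeping the function pointer out of the static table means two generations with different read callbacks can still share one table entry.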
      
      Cuts 16 KiB from iris_dri.so and libvulkan_intel.so.
      
         text    data     bss     dec     hex filename
       926811   43200       0  970011   ecd1b meson-generated_.._intel_perf_metrics.c.o (before)
       926811   25920       0  952731   e899b meson-generated_.._intel_perf_metrics.c.o (after)
      
         text    data     bss     dec     hex filename
      14190852 408908  210004 14809764 e1faa4 iris_dri.so (before)
      14190852 391628  210004 14792484 e1b724 iris_dri.so (after)
      
         text    data     bss     dec     hex filename
      8184097  257464   22820 8464381  8127fd libvulkan_intel.so (before)
      8184097  240184   22820 8447101  80e47d libvulkan_intel.so (after)
      
      Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
      (cherry picked from commit df5e743c)

      Part-of: <!16405>
    • intel/perf: Use a function to initialize perf counters · c0e75fa0
      Matt Turner authored
      
      And specifically mark it with ATTRIBUTE_NOINLINE. Otherwise it will be
      inlined and actually slightly increase code size.
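
As a rough sketch of the idea, assuming GCC or Clang: the per-counter setup lives in one out-of-line helper, and the generated registration code becomes a series of short calls. `EXAMPLE_NOINLINE` and all struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a no-inline attribute macro. */
#define EXAMPLE_NOINLINE __attribute__((noinline))

struct example_counter {
   uint16_t data_idx;   /* which static description this counter uses */
   uint32_t offset;     /* where its value lands in the result buffer */
};

struct example_query {
   struct example_counter counters[8];
   unsigned n_counters;
   uint32_t data_size;
};

/* A single copy of the initialization code. Without the attribute the
 * compiler may inline this body at every call site. */
EXAMPLE_NOINLINE void
example_add_counter(struct example_query *q, uint16_t data_idx, uint32_t size)
{
   assert(q->n_counters < 8);
   struct example_counter *c = &q->counters[q->n_counters++];
   c->data_idx = data_idx;
   c->offset = q->data_size;   /* placed right after the previous one */
   q->data_size += size;
}

/* What a generated per-query registration function could reduce to. */
void example_register_query(struct example_query *q)
{
   q->n_counters = 0;
   q->data_size = 0;
   example_add_counter(q, 0, 8);
   example_add_counter(q, 3, 8);
   example_add_counter(q, 5, 4);
}
```

With thousands of call sites, each call costs a few instructions, whereas an inlined body repeats every store at every site.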
      
      Cuts 505 KiB from iris_dri.so and libvulkan_intel.so.
      
         text    data     bss     dec     hex filename
      1538720       0       0 1538720  177aa0 meson-generated_.._intel_perf_metrics.c.o (before)
       926811   43200       0  970011   ecd1b meson-generated_.._intel_perf_metrics.c.o (after)
      
         text    data     bss     dec     hex filename
      14751700 365708  210004 15327412 e9e0b4 iris_dri.so (before)
      14190852 408908  210004 14809764 e1faa4 iris_dri.so (after)
      
         text    data     bss     dec     hex filename
      8744913  214264   22820 8981997  890ded libvulkan_intel.so (before)
      8184097  257464   22820 8464381  8127fd libvulkan_intel.so (after)
      
      Relocations increase because the counter initializations are moved from
      code (in .text) to pointers (in the data section) into .rodata, which
      require relocations.
      
      relinfo:
      iris_dri.so (before): 15605 relocations, 15385 relative (98%), 452 PLT entries, 1 for local syms (0%), 0 users
      iris_dri.so (after) : 17765 relocations, 17545 relative (98%), 452 PLT entries, 1 for local syms (0%), 0 users
      
      libvulkan_intel.so (before):  8560 relocations, 4829 relative (56%), 355 PLT entries, 1 for local syms (0%), 0 users
      libvulkan_intel.so (after) : 10720 relocations, 6989 relative (65%), 355 PLT entries, 1 for local syms (0%), 0 users
      
      Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
      (cherry picked from commit bbbbb032)

      Part-of: <!16405>
    • intel/perf: Deduplicate perf counters · 06cb4be8
      Matt Turner authored
      
      No changes in resulting code (yes, seriously!). GCC constant-propagates
      the static const arrays into the code, yielding bit-for-bit identical
      results. This will, however, enable further cleanups.
      
      Before this patch, we emit 11916 different initializations of
      intel_perf_query_counter. With this patch we emit an array of 539 and
      initialize the intel_perf_query_counters in terms of those.
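
A minimal sketch of the deduplication, using three made-up counters and two made-up queries. The names and layout are hypothetical and far smaller than the 539-entry array described above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct example_counter_data {
   const char *name;
   const char *units;
};

/* Each distinct counter description appears exactly once. */
static const struct example_counter_data example_unique[] = {
   [0] = { "GPU Busy",        "percent" },
   [1] = { "GPU Core Clocks", "cycles"  },
   [2] = { "EU Active",       "percent" },
};

/* Two hypothetical queries share entries 0 and 1 rather than each
 * carrying its own copy of the initialization. */
static const unsigned char example_query_a[] = { 0, 1, 2 };
static const unsigned char example_query_b[] = { 0, 1 };

/* Look up the i-th counter of a query in the shared table. */
const struct example_counter_data *
example_query_counter(const unsigned char *query, size_t n, size_t i)
{
   assert(i < n);
   return &example_unique[query[i]];
}
```

Both queries resolve their first counter to the same table entry, so a description that many queries use is emitted only once.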
      
      Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
      (cherry picked from commit 5e6c7a57)

      Part-of: <!16405>
    • intel/perf: Don't print leading space from desc_units() · ca158e4d
      Matt Turner authored
      
      Just an annoyance I noticed when I needed to generate the description
      string in two different places.
      
      Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
      (cherry picked from commit 3172b5bb)

      Part-of: <!16405>
    • intel/perf: Move some static blocks of C code out of the python script. · f3500570
      Emma Anholt authored and Matt Turner committed
      
      Now my editor can help me format code as I type.
      
      Reviewed-by: Matt Turner <mattst88@gmail.com>
      Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
      (cherry picked from commit 12e065dd)

      Part-of: <!16405>
    • intel/perf: use a function to do common allocations · 502823ae
      Dave Airlie authored and Matt Turner committed
      
      This cuts the compile time for this file on my Ryzen from
        real 1m4.077s
      to
        real 0m30.827s
      
      Reviewed-by: Emma Anholt <emma@anholt.net>
      Reviewed-by: Matt Turner <mattst88@gmail.com>
      (cherry picked from commit acc2d08c)

      Part-of: <!16405>
  3. Mar 19, 2022
  4. Mar 18, 2022