 07 May, 2018 1 commit


Alyssa Rosenzweig authored
History preserved in a branch. Rebase meson.build Fix syntax errors in the meson.build Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Import ir3_cmdline.c from freedreno into panfrost Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Begin removing freedreno-specific code in midgard Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Fix panfrost include Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Fully decouple midgard_cmdline from freedreno This enables the module to compile, providing stubs for the NIR compiler. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Fix panfrost dependency Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> [midgard] Dump NIR and remove unnecessary passes Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Further reduce midgard Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Iterate NIR instructions Further simplification Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Ditto Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Trace out emit path for load_const Store output intrinsic Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Also vertex shaders Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Lower var copies Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Load uniform stub Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> String through compiler context Learn how to use util_dynarray for current_block Import midgard shader defines by Connor Abbott These were found in the original Midgard disassembler by cwabbott, extracted from the project cwabbotsopengputools under the license stated. They will be used here for instruction emission in the Midgard compiler. 
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Iterate midgard instruction types Remove type, next_type from load_store_t Instruction type tags Compute instruction lookahead Refactor get_lookahead_type Fix lookahead by lowering tag format Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Fill in part of load_uniform, other ALU tags, etc Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Dump load_store op Macro for load_uniform instructions Use for store_vary32 as well Register aliases reg, offset arguments to load_store Hack until we have initial output :) Swizzle macro Factor out emit_binary_instruction Refactor file I/O Begin emitting ALU ops ALU padding I misunderstood padding; fix it Demonstrate some tacked on constants Set sources Move ALU register work String through constants Correct registers Use correct register in fmov Refactor into M_ALU macro ALU_2 Factor out attach_constants Remove print Emit ALU Fixes to ' Make register resolution at least somewhat plausible Remove some debugging prints ALU source modifiers EMIT_ALU_CASE to macro fmul fmin, fmax load_vary Fix src Shader stage to differentiate varying/attrib load Algebraic pass Actual optimisation loop Import full list of known ALU opcodes Emit for remaining ALU ops (where possible) Update ALU ops Disable incorrect fsin/fcos for now Correctly implement sin and cos, extending NIR Explain midgard_instruction in relation to scheduler Any configuration in load_const is okay Comment half floats Don't break aliasing rules Begin eliminate_constant_mov pass Finish mov elimination Use raw SSA in the midgard compiler Register allocate stub fmov elimination is much easier in SSA space Switch to /dev/shm Try hash Search for constants Attach maybe I feel silly; fix move elimination Update compiler options Reflow constant move loop Pair load/store instructions Don't introduce a dependency chain Correct fmov argument ordering [midgard] Disable vertex shader compilation The vertex shader epilogue for 
these GPUs is not yet well understood; it's not worth trying to compile for it quite yet. [midgard] FMA does not exist for GL [midgard] Lowering vecs to movs will be useful [midgard] Fix fmov instruction ordering [midgard] Properly noop load/stores midgard: Introduce synthwrite to catch gl_FragColor Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> midgard: Stub framebuffer write Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> midgard: Introduce variadic EMIT syntax sugar Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> midgard: Second half of the fbwrite Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> midgard: Literal out for proper fbwrite Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> midgard: Use actual compact writeout fields Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> midgard: Begin ALU op combining Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Continue ALU combining work midgard: Cleanup printfs Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> midgard: ALU combining Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> midgard: Instruction-combining aware lookahead Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> midgard: Register allocation position Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> midgard: Workaround missing preliminary load/store errata Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> midgard: Synthwrite was a mistake Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> midgard: Fix warnings Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> midgard: Basic uniform loading support Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Set unknown field in varying load Saner load varying midgard: Use adder for add instructions Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> midgard: Rework load_input, etc to act like vc4/freedreno Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> midgard: Alias imov to fmov Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> 
midgard: Fix store out regression Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> midgard: Begin scalar work Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Don't lower fsat midgard: Fix build Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> midgard: Lower to source mods pass Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> midgard: Saturation arithmetic Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Refactor ALU emit to allow for scalar emit in future Remove unnecessary alu defs Allow scalar ops to be emitted midgard: Implement scalar_alu_modifiers Correct swizzle placement midgard: Correct order midgard: Account for scalar component special case midgard: Sort out memory safety regression from scalar refactor midgard: vlut mask midgard: Begin porting over vec4 pass from freedreno midgard: Fix vec4 midgard: Remove dead code Fix frcp support midgard: Fix bugs with scalar source modifiers midgard: Lower subtraction midgard: Begin debugging transcendental functions midgard: Proper SSA register aliasing midgard: General improvements relating to unused arguments midgard: Re-enable vertex and disable double print midgard: Only emit fragment epilogue for fragment shaders midgard: Load attribute midgard: Assign var locations midgard: Front half of SSA aliases midgard: Further progress on aliasing midgard: Optimise uniforms similarly midgard: Fix uniform special case midgard: Cleanup uniform aliasing midgard: Cleanup warnings midgard: Fix nondeterministic segfault midgard: Fix regression packing with unuseds midgard: Fix regression in regression fix midgard: Begin store vary emit midgard: Begin experimenting with nir_builder midgard: Write to special register from epilogue midgard: Load gl_Position in vertex epilogue midgard: Fix bug in aliasing implementation midgard: Further hack on vertex shader epilogue midgard: Defer stores to workaround hw errata? 
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> midgard: Fix early constant inline termination Cut off duplicated embedded constants midgard: Move vertex epilogue to after var assignment midgard: Import ugly internal code to fix vertex shader epilogue midgard: Get vertex shaders working.... somehow midgard: Re-enable fragment compilation midgard: Fix load/store noop emission Save real softpipe panfrost: Dump clears midgard: Workaround compact branch errata panfrost: XXX Hack in the trans library XXX Hook into panfrost, uglily Continue hacky panfrost integration panfrost: Begin ripping out drawing to enable shaders Begin interfacing with the hacky resource stuff Link in transfer map Hook up vertex functions Disable user buffers for now Solve some segfaults transfer_unmap Don't crash Work fixing varying writes Remove vertex epilogue varying magic Proceed implementing vertex 'epilogue' the Right way Remove cruft that has built up from previous refactor Update comments; nir_instr_remove old st_vary Remove now-unused defer_stores Remove redundant r0 move Note about the decaying issue Fix data hazard determination for ld/st pairing Finally get eliminate_varying_mov working nicely Cleanup from previous commit Dot products Call do_mat_op_to_vec Wrap do_mat_op_to_vec Get uniforms doing something somewhat sane Fix uniform access patterns Galliumify set_constant_buffer Cleanup comments Inline n2m_alu_outmod Compiler cleanup Begin watermark RA Fixes for watermark RA Proceed writing real RA? 
Get RA to work Quiet output Add some profiling stubs Remove redundant lower_io calls New information re varying registers Honour literal_out in ls4 Implement vertex epilogue as per 12.5.1 Perspective division Uniforms are backward; workaround buggy VLIW Fix crash on resource destroy (mesa half) Remove softpipeism Work towards correct resizable shm windows Map the surface in the right place Continue Remove what we can Remove more Cut more Strip further Continue ACCELERATED flag Remove Strip shaders Fix overzealous inline constants Encode inline vector constants Mark errata with ERRATA, not XXX Enable two instruction chains instead of one Embedded constants with ALU combining (fixes long-time regression) Bundle duplicate constants Cull ssa0 moves (missed from inline constant in luts) Embedded to inline constant for right-constant scalar ops Scalar op flip Remove prints Inline constants in vector ops Begin work on instruction unit switching Branch compact can be packed Continue unit hopping work Split out helpers to prepare for updating midgard.h Pull in new midgard.h from SPD f2i->u Basic support for integers Disable inline constants for the moment, since they're broken inot requires MUL apparently Import new ops Emit ball/bany from NIR Import backend algebraic NIR pass stuff nir: Implement optional b2f->iand lowering This pass is required by the Midgard compiler; our instruction set uses NIR-style booleans (~0 for true) but lacks a dedicated b2f instruction. Normally, this lowering pass would be implemented in a backend-specific algebraic pass, but this conflicts with the existing iand->b2f pass in nir_opt_algebraic.py, hanging the compiler. This patch thus makes the existing pass optional (default on, so all other backends should remain unaffected), adding an optional pass for lowering the opposite direction. 
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> f2b, b2f in midgard Small cleanup; fix floor/ceil LUT duplication Guarantee proper fragment writeout (incurring a temporary performance regression) Begin working on csel stuff midgard: Move fsinpi stuff to backend-specific pass Re-enable embedded_to_inline_constant by making it integer aware Fix constant attaching ushr opcode Fix issue with imin/imax blocking Remove prints Component-wise test for r0 breakup Try to debug When flipping arguments, also flip modifiers Lower b2i to iand Fix segfault with inot Flip vector constants isub is not commutative fne _is_ commutative Remove prints Get rid of constant moves: unnecessary complexity Remove STAGE_PROFILING Uniform base is no longer needed Remove unused macro Enable basic nir_register support in order to chuck out old vec4 pass Call convert_from_ssa weakly and generalise to registers in LUT duplication Fix st_vary input bug triggered by vertex epilogue refactor Mask for clarity Remove whitespace Fix annoying compiler segfault Re-enable constant inlining (unaffected by registerisation) Fix varying move regression and re-enable Stubs to emit textures from NIR Begin basic texture op emission Get texture handles correct Set flags Set .cont and .last Hardcode mask/filter for now Hardcode a swizzle as well Force texture full for now Do something with the input swizzle Fix spelling error in header midgard: Emit fmov for source/dest texture midgard: Lower vars as necessary Rescale for the replay :v Handle weird 3D texture swizzle Stub for cubemap Hook up texture/sampler functions in softpipe shim Don't advertise compute/geometry shaders Import softpipe meson.build into panfrost Move shim into ~/panfrost Include panfrost_dri.so Register as fake swr Use the panfrost name Restore original softpipe
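
The b2f lowering described in the message above depends on NIR-style booleans being all ones (~0) for true. A minimal sketch of that bit trick follows, assuming 32-bit values. The helper names are hypothetical and this is not the Mesa implementation, only an illustration of why an integer AND with the bit pattern of 1.0f can stand in for b2f.

```python
import struct

TRUE = 0xFFFFFFFF          # NIR-style boolean true (~0), as stated in the commit message
FALSE = 0x00000000         # NIR-style boolean false
ONE_F32_BITS = 0x3F800000  # IEEE 754 single-precision bit pattern of 1.0


def bits_to_float(bits):
    """Reinterpret a 32-bit unsigned integer as an IEEE 754 single."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def b2f_via_iand(b):
    """Hypothetical lowering: b2f(b) becomes iand(b, bits(1.0f)), read back as a float."""
    return bits_to_float(b & ONE_F32_BITS)


if __name__ == "__main__":
    print(b2f_via_iand(TRUE), b2f_via_iand(FALSE))
```

Because true is all ones, the AND leaves exactly the bits of 1.0f; because false is all zeros, the result is +0.0. The trick would not hold for booleans encoded as 1.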

 08 Mar, 2018 2 commits


Ian Romanick authored
A bunch of shaders have sequences like: i2b(u2i(floatBitsToUint(intBitsToFloat(x == y ? 1 : 0)))) Other optimizations (and NIR's typeless nature) reduce this to i2b(x == y), which is silly. Skylake total instructions in shared programs: 14498698 > 14497948 (<.01%) instructions in affected programs: 74480 > 73730 (1.01%) helped: 277 HURT: 0 helped stats (abs) min: 1 max: 32 x̄: 2.71 x̃: 2 helped stats (rel) min: 0.04% max: 13.79% x̄: 1.45% x̃: 0.68% 95% mean confidence interval for instructions value: 3.35 2.06 95% mean confidence interval for instructions %change: 1.74% 1.16% Instructions are helped. total cycles in shared programs: 532015500 > 531999238 (<.01%) cycles in affected programs: 5943878 > 5927616 (0.27%) helped: 251 HURT: 74 helped stats (abs) min: 1 max: 13149 x̄: 127.89 x̃: 14 helped stats (rel) min: 0.01% max: 17.31% x̄: 1.55% x̃: 0.53% HURT stats (abs) min: 1 max: 4550 x̄: 214.04 x̃: 15 HURT stats (rel) min: <.01% max: 44.43% x̄: 2.81% x̃: 0.33% 95% mean confidence interval for cycles value: 158.51 58.43 95% mean confidence interval for cycles %change: 1.07% 0.04% Inconclusive result (value mean confidence interval includes 0). total loops in shared programs: 4753 > 4735 (0.38%) loops in affected programs: 18 > 0 helped: 18 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00% 95% mean confidence interval for loops value: 1.00 1.00 95% mean confidence interval for loops %change: 100.00% 100.00% Loops are helped. Haswell and Broadwell had similar results. 
(Broadwell shown) total instructions in shared programs: 14791877 > 14791127 (<.01%) instructions in affected programs: 77326 > 76576 (0.97%) helped: 278 HURT: 1 helped stats (abs) min: 1 max: 32 x̄: 2.70 x̃: 2 helped stats (rel) min: 0.04% max: 13.79% x̄: 1.42% x̃: 0.68% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.49% max: 0.49% x̄: 0.49% x̃: 0.49% 95% mean confidence interval for instructions value: 3.33 2.05 95% mean confidence interval for instructions %change: 1.70% 1.13% Instructions are helped. total cycles in shared programs: 558250067 > 558252872 (<.01%) cycles in affected programs: 5806328 > 5809133 (0.05%) helped: 235 HURT: 83 helped stats (abs) min: 1 max: 10630 x̄: 81.73 x̃: 16 helped stats (rel) min: 0.03% max: 18.58% x̄: 1.60% x̃: 0.51% HURT stats (abs) min: 1 max: 10590 x̄: 265.19 x̃: 20 HURT stats (rel) min: <.01% max: 15.28% x̄: 1.89% x̃: 0.54% 95% mean confidence interval for cycles value: 89.87 107.51 95% mean confidence interval for cycles %change: 1.06% 0.32% Inconclusive result (value mean confidence interval includes 0). total loops in shared programs: 4735 > 4717 (0.38%) loops in affected programs: 18 > 0 helped: 18 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00% 95% mean confidence interval for loops value: 1.00 1.00 95% mean confidence interval for loops %change: 100.00% 100.00% Loops are helped. total fills in shared programs: 83111 > 83110 (<.01%) fills in affected programs: 28 > 27 (3.57%) helped: 1 HURT: 0 Ivy Bridge total instructions in shared programs: 11774173 > 11773436 (<.01%) instructions in affected programs: 70819 > 70082 (1.04%) helped: 267 HURT: 0 helped stats (abs) min: 1 max: 48 x̄: 2.76 x̃: 2 helped stats (rel) min: 0.21% max: 19.51% x̄: 1.57% x̃: 0.63% 95% mean confidence interval for instructions value: 3.51 2.01 95% mean confidence interval for instructions %change: 1.94% 1.21% Instructions are helped. 
total cycles in shared programs: 257153833 > 257148932 (<.01%) cycles in affected programs: 585341 > 580440 (0.84%) helped: 167 HURT: 100 helped stats (abs) min: 1 max: 1327 x̄: 44.89 x̃: 16 helped stats (rel) min: 0.04% max: 26.54% x̄: 2.41% x̃: 0.88% HURT stats (abs) min: 1 max: 200 x̄: 25.95 x̃: 16 HURT stats (rel) min: 0.04% max: 9.81% x̄: 1.34% x̃: 0.65% 95% mean confidence interval for cycles value: 33.25 3.46 95% mean confidence interval for cycles %change: 1.47% 0.54% Cycles are helped. total loops in shared programs: 3416 > 3398 (0.53%) loops in affected programs: 18 > 0 helped: 18 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00% 95% mean confidence interval for loops value: 1.00 1.00 95% mean confidence interval for loops %change: 100.00% 100.00% Loops are helped. LOST: 2 GAINED: 0 Sandy Bridge total instructions in shared programs: 10499306 > 10499094 (<.01%) instructions in affected programs: 6051 > 5839 (3.50%) helped: 43 HURT: 0 helped stats (abs) min: 1 max: 32 x̄: 4.93 x̃: 2 helped stats (rel) min: 0.39% max: 12.90% x̄: 4.29% x̃: 2.45% 95% mean confidence interval for instructions value: 7.66 2.20 95% mean confidence interval for instructions %change: 5.47% 3.12% Instructions are helped. total cycles in shared programs: 145862568 > 145861370 (<.01%) cycles in affected programs: 61733 > 60535 (1.94%) helped: 36 HURT: 2 helped stats (abs) min: 16 max: 66 x̄: 36.61 x̃: 35 helped stats (rel) min: 0.45% max: 17.31% x̄: 4.92% x̃: 2.81% HURT stats (abs) min: 18 max: 102 x̄: 60.00 x̃: 60 HURT stats (rel) min: 1.10% max: 1.85% x̄: 1.48% x̃: 1.48% 95% mean confidence interval for cycles value: 41.28 21.77 95% mean confidence interval for cycles %change: 6.16% 3.00% Cycles are helped. 
total loops in shared programs: 1803 > 1785 (1.00%) loops in affected programs: 18 > 0 helped: 18 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00% 95% mean confidence interval for loops value: 1.00 1.00 95% mean confidence interval for loops %change: 100.00% 100.00% Loops are helped. LOST: 4 GAINED: 0 No changes on Iron Lake or GM45. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
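
The conversion chain quoted at the top of this commit message is a round trip through bit casts, so the whole expression collapses to the comparison itself. The sketch below models it on 32-bit values in Python. The function names mirror the GLSL/NIR spellings from the message, but the bodies are illustrative stand-ins, not Mesa code.

```python
import struct


def int_bits_to_float(i):
    """Reinterpret the low 32 bits of an integer as an IEEE 754 single."""
    return struct.unpack("<f", struct.pack("<I", i & 0xFFFFFFFF))[0]


def float_bits_to_uint(f):
    """Reinterpret an IEEE 754 single as an unsigned 32-bit integer."""
    return struct.unpack("<I", struct.pack("<f", f))[0]


def u2i(u):
    """Reinterpret an unsigned 32-bit value as signed (two's complement)."""
    return u - (1 << 32) if u & 0x80000000 else u


def i2b(i):
    """Integer to boolean: any non-zero value is true."""
    return i != 0


def long_form(x, y):
    # The sequence from the commit message, written out step by step.
    return i2b(u2i(float_bits_to_uint(int_bits_to_float(1 if x == y else 0))))


def short_form(x, y):
    # What the sequence is equivalent to once the casts are seen through.
    return x == y


if __name__ == "__main__":
    print(long_form(3, 3), long_form(3, 4))
```

Only the bit patterns 0 and 1 ever pass through the float reinterpretation here, so NaN payloads and denormal flushing on real hardware are outside what this sketch models.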

Ian Romanick authored
On vector platforms, this helps elide some constant loads. v2: Reorder the transformations. No changes on Broadwell or Skylake. Haswell total instructions in shared programs: 13093793 > 13060163 (0.26%) instructions in affected programs: 1277532 > 1243902 (2.63%) helped: 13216 HURT: 95 helped stats (abs) min: 1 max: 18 x̄: 2.56 x̃: 2 helped stats (rel) min: 0.21% max: 20.00% x̄: 3.63% x̃: 2.78% HURT stats (abs) min: 1 max: 6 x̄: 1.77 x̃: 1 HURT stats (rel) min: 0.09% max: 5.56% x̄: 1.25% x̃: 1.19% 95% mean confidence interval for instructions value: 2.57 2.49 95% mean confidence interval for instructions %change: 3.65% 3.54% Instructions are helped. total cycles in shared programs: 409580819 > 409268463 (0.08%) cycles in affected programs: 71730652 > 71418296 (0.44%) helped: 9898 HURT: 2352 helped stats (abs) min: 2 max: 16014 x̄: 37.08 x̃: 16 helped stats (rel) min: <.01% max: 35.55% x̄: 6.26% x̃: 4.50% HURT stats (abs) min: 2 max: 276 x̄: 23.25 x̃: 6 HURT stats (rel) min: <.01% max: 40.00% x̄: 3.54% x̃: 1.97% 95% mean confidence interval for cycles value: 33.19 17.80 95% mean confidence interval for cycles %change: 4.50% 4.26% Cycles are helped. total fills in shared programs: 82059 > 82052 (<.01%) fills in affected programs: 21 > 14 (33.33%) helped: 7 HURT: 0 Sandy Bridge and Ivy Bridge had similar results (Ivy Bridge shown) total instructions in shared programs: 11811851 > 11780605 (0.26%) instructions in affected programs: 1155007 > 1123761 (2.71%) helped: 12304 HURT: 95 helped stats (abs) min: 1 max: 18 x̄: 2.55 x̃: 2 helped stats (rel) min: 0.21% max: 20.00% x̄: 3.69% x̃: 2.86% HURT stats (abs) min: 1 max: 6 x̄: 1.77 x̃: 1 HURT stats (rel) min: 0.09% max: 5.56% x̄: 1.25% x̃: 1.19% 95% mean confidence interval for instructions value: 2.56 2.48 95% mean confidence interval for instructions %change: 3.71% 3.59% Instructions are helped. 
total cycles in shared programs: 257618409 > 257316805 (0.12%) cycles in affected programs: 71999580 > 71697976 (0.42%) helped: 9155 HURT: 2380 helped stats (abs) min: 2 max: 16014 x̄: 38.44 x̃: 16 helped stats (rel) min: <.01% max: 35.75% x̄: 6.39% x̃: 4.62% HURT stats (abs) min: 2 max: 290 x̄: 21.14 x̃: 4 HURT stats (rel) min: <.01% max: 41.55% x̄: 3.14% x̃: 1.33% 95% mean confidence interval for cycles value: 34.32 17.97 95% mean confidence interval for cycles %change: 4.55% 4.29% Cycles are helped. GM45 and Iron Lake had nearly identical results (Iron Lake shown) total instructions in shared programs: 7886750 > 7879944 (0.09%) instructions in affected programs: 373781 > 366975 (1.82%) helped: 3715 HURT: 47 helped stats (abs) min: 1 max: 8 x̄: 1.86 x̃: 1 helped stats (rel) min: 0.22% max: 16.67% x̄: 2.88% x̃: 2.06% HURT stats (abs) min: 1 max: 6 x̄: 2.55 x̃: 2 HURT stats (rel) min: 1.09% max: 5.00% x̄: 1.93% x̃: 2.35% 95% mean confidence interval for instructions value: 1.85 1.77 95% mean confidence interval for instructions %change: 2.91% 2.73% Instructions are helped. total cycles in shared programs: 178114636 > 178095452 (0.01%) cycles in affected programs: 7227666 > 7208482 (0.27%) helped: 3349 HURT: 301 helped stats (abs) min: 2 max: 90 x̄: 6.55 x̃: 4 helped stats (rel) min: <.01% max: 14.18% x̄: 0.95% x̃: 0.63% HURT stats (abs) min: 2 max: 42 x̄: 9.13 x̃: 10 HURT stats (rel) min: 0.01% max: 11.19% x̄: 1.22% x̃: 1.50% 95% mean confidence interval for cycles value: 5.52 4.99 95% mean confidence interval for cycles %change: 0.81% 0.73% Cycles are helped. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [v1]

 06 Mar, 2018 7 commits


Ian Romanick authored
All Gen7+ platforms had similar results. (Skylake shown) total instructions in shared programs: 14514555 > 14514547 (<.01%) instructions in affected programs: 1972 > 1964 (0.41%) helped: 8 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.39% max: 0.42% x̄: 0.41% x̃: 0.41% 95% mean confidence interval for instructions value: 1.00 1.00 95% mean confidence interval for instructions %change: 0.41% 0.40% Instructions are helped. total cycles in shared programs: 533141444 > 533136780 (<.01%) cycles in affected programs: 164728 > 160064 (2.83%) helped: 181 HURT: 3 helped stats (abs) min: 2 max: 94 x̄: 26.17 x̃: 30 helped stats (rel) min: 0.12% max: 5.33% x̄: 3.42% x̃: 3.80% HURT stats (abs) min: 4 max: 54 x̄: 24.00 x̃: 14 HURT stats (rel) min: 0.20% max: 2.39% x̄: 1.09% x̃: 0.68% 95% mean confidence interval for cycles value: 27.12 23.58 95% mean confidence interval for cycles %change: 3.54% 3.16% Cycles are helped. Sandy Bridge total instructions in shared programs: 10533667 > 10533539 (<.01%) instructions in affected programs: 10148 > 10020 (1.26%) helped: 124 HURT: 0 helped stats (abs) min: 1 max: 2 x̄: 1.03 x̃: 1 helped stats (rel) min: 0.39% max: 4.35% x̄: 2.20% x̃: 2.04% 95% mean confidence interval for instructions value: 1.06 1.00 95% mean confidence interval for instructions %change: 2.46% 1.95% Instructions are helped. total cycles in shared programs: 146136887 > 146132122 (<.01%) cycles in affected programs: 206382 > 201617 (2.31%) helped: 171 HURT: 0 helped stats (abs) min: 2 max: 40 x̄: 27.87 x̃: 30 helped stats (rel) min: 0.08% max: 5.73% x̄: 2.98% x̃: 2.67% 95% mean confidence interval for cycles value: 29.19 26.54 95% mean confidence interval for cycles %change: 3.20% 2.76% Cycles are helped. 
Iron Lake total instructions in shared programs: 7886515 > 7886507 (<.01%) instructions in affected programs: 3016 > 3008 (0.27%) helped: 8 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.25% max: 0.28% x̄: 0.27% x̃: 0.27% 95% mean confidence interval for instructions value: 1.00 1.00 95% mean confidence interval for instructions %change: 0.27% 0.26% Instructions are helped. total cycles in shared programs: 178100396 > 178100388 (<.01%) cycles in affected programs: 156128 > 156120 (<.01%) helped: 4 HURT: 4 helped stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4 helped stats (rel) min: 0.02% max: 0.04% x̄: 0.03% x̃: 0.03% HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: <.01% max: 0.01% x̄: <.01% x̃: <.01% 95% mean confidence interval for cycles value: 3.68 1.68 95% mean confidence interval for cycles %change: 0.03% <.01% Inconclusive result (value mean confidence interval includes 0). GM45 total instructions in shared programs: 4857872 > 4857868 (<.01%) instructions in affected programs: 1544 > 1540 (0.26%) helped: 4 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.25% max: 0.27% x̄: 0.26% x̃: 0.26% 95% mean confidence interval for instructions value: 1.00 1.00 95% mean confidence interval for instructions %change: 0.28% 0.24% Instructions are helped. total cycles in shared programs: 122167654 > 122167662 (<.01%) cycles in affected programs: 96248 > 96256 (<.01%) helped: 0 HURT: 4 HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: <.01% max: 0.01% x̄: <.01% x̃: <.01% 95% mean confidence interval for cycles value: 2.00 2.00 95% mean confidence interval for cycles %change: <.01% 0.02% Cycles are HURT. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>

Ian Romanick authored
The replacement of the comparison operators must happen during this step. If it does not, the next pass of nir_opt_algebraic will reapply De Morgan's Law in the "opposite direction" before performing dead code elimination. The resulting infinite loop will eventually get OOM killed. Haswell, Broadwell, and Skylake had similar results. (Broadwell shown) total instructions in shared programs: 14808185 > 14808036 (<.01%) instructions in affected programs: 13758 > 13609 (1.08%) helped: 39 HURT: 0 helped stats (abs) min: 1 max: 10 x̄: 3.82 x̃: 3 helped stats (rel) min: 0.44% max: 1.55% x̄: 0.98% x̃: 1.01% 95% mean confidence interval for instructions value: 4.67 2.97 95% mean confidence interval for instructions %change: 1.09% 0.88% Instructions are helped. total cycles in shared programs: 559438333 > 559435832 (<.01%) cycles in affected programs: 199160 > 196659 (1.26%) helped: 42 HURT: 3 helped stats (abs) min: 2 max: 184 x̄: 61.50 x̃: 51 helped stats (rel) min: 0.02% max: 6.94% x̄: 1.41% x̃: 1.40% HURT stats (abs) min: 2 max: 40 x̄: 27.33 x̃: 40 HURT stats (rel) min: 0.05% max: 0.74% x̄: 0.51% x̃: 0.74% 95% mean confidence interval for cycles value: 71.47 39.69 95% mean confidence interval for cycles %change: 1.64% 0.93% Cycles are helped. Sandy Bridge and Ivy Bridge had similar results. (Ivy Bridge shown) total instructions in shared programs: 11811776 > 11811553 (<.01%) instructions in affected programs: 15201 > 14978 (1.47%) helped: 39 HURT: 0 helped stats (abs) min: 1 max: 20 x̄: 5.72 x̃: 6 helped stats (rel) min: 0.44% max: 2.53% x̄: 1.30% x̃: 1.26% 95% mean confidence interval for instructions value: 7.21 4.23 95% mean confidence interval for instructions %change: 1.48% 1.12% Instructions are helped. 
total cycles in shared programs: 257617270 > 257614589 (<.01%) cycles in affected programs: 212107 > 209426 (1.26%) helped: 45 HURT: 0 helped stats (abs) min: 2 max: 180 x̄: 59.58 x̃: 54 helped stats (rel) min: 0.02% max: 6.02% x̄: 1.30% x̃: 1.32% 95% mean confidence interval for cycles value: 74.02 45.14 95% mean confidence interval for cycles %change: 1.59% 1.01% Cycles are helped. Iron Lake total instructions in shared programs: 7886648 > 7886515 (<.01%) instructions in affected programs: 14106 > 13973 (0.94%) helped: 29 HURT: 0 helped stats (abs) min: 1 max: 10 x̄: 4.59 x̃: 4 helped stats (rel) min: 0.35% max: 1.83% x̄: 0.90% x̃: 0.81% 95% mean confidence interval for instructions value: 5.65 3.52 95% mean confidence interval for instructions %change: 1.03% 0.76% Instructions are helped. total cycles in shared programs: 178100812 > 178100396 (<.01%) cycles in affected programs: 67970 > 67554 (0.61%) helped: 29 HURT: 0 helped stats (abs) min: 2 max: 40 x̄: 14.34 x̃: 12 helped stats (rel) min: 0.15% max: 1.69% x̄: 0.58% x̃: 0.54% 95% mean confidence interval for cycles value: 18.30 10.39 95% mean confidence interval for cycles %change: 0.71% 0.45% Cycles are helped. GM45 total instructions in shared programs: 4857939 > 4857872 (<.01%) instructions in affected programs: 7426 > 7359 (0.90%) helped: 15 HURT: 0 helped stats (abs) min: 1 max: 10 x̄: 4.47 x̃: 4 helped stats (rel) min: 0.33% max: 1.80% x̄: 0.87% x̃: 0.77% 95% mean confidence interval for instructions value: 6.06 2.87 95% mean confidence interval for instructions %change: 1.06% 0.67% Instructions are helped. total cycles in shared programs: 122167930 > 122167654 (<.01%) cycles in affected programs: 43118 > 42842 (0.64%) helped: 15 HURT: 0 helped stats (abs) min: 4 max: 40 x̄: 18.40 x̃: 16 helped stats (rel) min: 0.15% max: 1.69% x̄: 0.62% x̃: 0.54% 95% mean confidence interval for cycles value: 25.03 11.77 95% mean confidence interval for cycles %change: 0.82% 0.41% Cycles are helped. 
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
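
The message above applies De Morgan's Law and, in the same step, replaces the comparison operators with their opposites. The sketch below is a quick sanity check of that combined identity. It uses integer compares so that NaN ordering does not enter, and the function names are illustrative only: nothing here mirrors the actual nir_opt_algebraic patterns.

```python
from itertools import product


def before(a, b, c, d):
    # Shape before the rewrite: inot(ior(a < b, c < d))
    return not ((a < b) or (c < d))


def after(a, b, c, d):
    # De Morgan plus comparison replacement: iand(a >= b, c >= d)
    return (a >= b) and (c >= d)


def equivalent(values):
    """Exhaustively compare both forms over every 4-tuple drawn from values."""
    return all(before(*t) == after(*t) for t in product(values, repeat=4))


if __name__ == "__main__":
    print(equivalent(range(-2, 3)))
```

For floating-point operands the swap from `<` to `>=` is exact only when no operand is NaN, since `not (a < b)` is true for NaN while `a >= b` is false.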

Ian Romanick authored
All of the affected shaders are HDR mappers from Serious Sam 3. All Gen7+ platforms had similar results. (Skylake shown) total instructions in shared programs: 14516285 > 14516273 (<.01%) instructions in affected programs: 348 > 336 (3.45%) helped: 12 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 2.08% max: 6.67% x̄: 4.31% x̃: 4.17% 95% mean confidence interval for instructions value: 1.00 1.00 95% mean confidence interval for instructions %change: 5.55% 3.06% Instructions are helped. total cycles in shared programs: 533163876 > 533163808 (<.01%) cycles in affected programs: 1144 > 1076 (5.94%) helped: 4 HURT: 0 helped stats (abs) min: 16 max: 18 x̄: 17.00 x̃: 17 helped stats (rel) min: 5.80% max: 6.08% x̄: 5.94% x̃: 5.94% 95% mean confidence interval for cycles value: 18.84 15.16 95% mean confidence interval for cycles %change: 6.20% 5.68% Cycles are helped. Sandy Bridge total instructions in shared programs: 10533321 > 10533309 (<.01%) instructions in affected programs: 372 > 360 (3.23%) helped: 12 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 2.00% max: 5.88% x̄: 3.91% x̃: 3.85% 95% mean confidence interval for instructions value: 1.00 1.00 95% mean confidence interval for instructions %change: 4.96% 2.86% Instructions are helped. total cycles in shared programs: 146136632 > 146136428 (<.01%) cycles in affected programs: 11668 > 11464 (1.75%) helped: 12 HURT: 0 helped stats (abs) min: 16 max: 18 x̄: 17.00 x̃: 17 helped stats (rel) min: 0.99% max: 3.44% x̄: 2.20% x̃: 2.29% 95% mean confidence interval for cycles value: 17.66 16.34 95% mean confidence interval for cycles %change: 2.82% 1.58% Cycles are helped. 
Iron Lake total instructions in shared programs: 7886301 > 7886277 (<.01%) instructions in affected programs: 576 > 552 (4.17%) helped: 12 HURT: 0 helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 helped stats (rel) min: 2.94% max: 6.06% x̄: 4.51% x̃: 4.65% 95% mean confidence interval for instructions value: 2.00 2.00 95% mean confidence interval for instructions %change: 5.30% 3.72% Instructions are helped. total cycles in shared programs: 178113176 > 178113176 (0.00%) cycles in affected programs: 2116 > 2116 (0.00%) helped: 2 HURT: 4 helped stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4 helped stats (rel) min: 1.14% max: 1.14% x̄: 1.14% x̃: 1.14% HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 0.50% max: 0.65% x̄: 0.58% x̃: 0.58% 95% mean confidence interval for cycles value: 3.25 3.25 95% mean confidence interval for cycles %change: 0.93% 0.94% Inconclusive result (value mean confidence interval includes 0). GM45 total instructions in shared programs: 4857756 > 4857744 (<.01%) instructions in affected programs: 294 > 282 (4.08%) helped: 6 HURT: 0 helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 helped stats (rel) min: 2.94% max: 5.71% x̄: 4.40% x̃: 4.55% 95% mean confidence interval for instructions value: 2.00 2.00 95% mean confidence interval for instructions %change: 5.71% 3.09% Instructions are helped. total cycles in shared programs: 122178730 > 122178722 (<.01%) cycles in affected programs: 700 > 692 (1.14%) helped: 2 HURT: 0 Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>

Ian Romanick authored
All platforms had similar results. (Skylake shown) total instructions in shared programs: 14516592 > 14516586 (<.01%) instructions in affected programs: 500 > 494 (1.20%) helped: 2 HURT: 0 total cycles in shared programs: 533167044 > 533166998 (<.01%) cycles in affected programs: 6988 > 6942 (0.66%) helped: 2 HURT: 0 Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>

Ian Romanick authored
I noticed the fge version while looking at a shader for an unrelated reason. The feq version prevents a regression in a later change that performs strength reduction of some compares. Broadwell and Skylake had similar results. (Skylake shown) total instructions in shared programs: 14514808 > 14514796 (<.01%) instructions in affected programs: 750 > 738 (1.60%) helped: 4 HURT: 0 helped stats (abs) min: 1 max: 5 x̄: 3.00 x̃: 3 helped stats (rel) min: 0.83% max: 1.96% x̄: 1.40% x̃: 1.40% 95% mean confidence interval for instructions value: 6.67 0.67 95% mean confidence interval for instructions %change: 2.43% 0.36% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 533144939 > 533144853 (<.01%) cycles in affected programs: 8911 > 8825 (0.97%) helped: 4 HURT: 0 helped stats (abs) min: 16 max: 32 x̄: 21.50 x̃: 19 helped stats (rel) min: 0.60% max: 1.89% x̄: 1.28% x̃: 1.31% 95% mean confidence interval for cycles value: 32.94 10.06 95% mean confidence interval for cycles %change: 2.30% 0.26% Cycles are helped. Haswell total instructions in shared programs: 13093785 > 13093775 (<.01%) instructions in affected programs: 924 > 914 (1.08%) helped: 4 HURT: 2 helped stats (abs) min: 1 max: 5 x̄: 3.00 x̃: 3 helped stats (rel) min: 0.82% max: 1.95% x̄: 1.39% x̃: 1.39% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 1.19% max: 1.19% x̄: 1.19% x̃: 1.19% 95% mean confidence interval for instructions value: 4.53 1.20 95% mean confidence interval for instructions %change: 2.02% 0.97% Inconclusive result (value mean confidence interval includes 0). 
total cycles in shared programs: 409580553 > 409580118 (<.01%) cycles in affected programs: 10909 > 10474 (3.99%) helped: 5 HURT: 1 helped stats (abs) min: 6 max: 222 x̄: 89.60 x̃: 18 helped stats (rel) min: 0.16% max: 24.72% x̄: 9.54% x̃: 1.78% HURT stats (abs) min: 13 max: 13 x̄: 13.00 x̃: 13 HURT stats (rel) min: 0.39% max: 0.39% x̄: 0.39% x̃: 0.39% 95% mean confidence interval for cycles value: 180.68 35.68 95% mean confidence interval for cycles %change: 19.55% 3.79% Inconclusive result (value mean confidence interval includes 0). Ivy Bridge total instructions in shared programs: 11811851 > 11811840 (<.01%) instructions in affected programs: 1032 > 1021 (1.07%) helped: 5 HURT: 1 helped stats (abs) min: 1 max: 5 x̄: 2.40 x̃: 1 helped stats (rel) min: 0.63% max: 1.95% x̄: 1.13% x̃: 0.97% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 1.19% max: 1.19% x̄: 1.19% x̃: 1.19% 95% mean confidence interval for instructions value: 4.17 0.51 95% mean confidence interval for instructions %change: 1.86% 0.36% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 257618403 > 257618168 (<.01%) cycles in affected programs: 10784 > 10549 (2.18%) helped: 4 HURT: 2 helped stats (abs) min: 4 max: 220 x̄: 64.50 x̃: 17 helped stats (rel) min: 0.50% max: 24.34% x̄: 7.07% x̃: 1.72% HURT stats (abs) min: 9 max: 14 x̄: 11.50 x̃: 11 HURT stats (rel) min: 0.24% max: 0.42% x̄: 0.33% x̃: 0.33% 95% mean confidence interval for cycles value: 133.11 54.78 95% mean confidence interval for cycles %change: 14.79% 5.59% Inconclusive result (value mean confidence interval includes 0). GM45, Iron Lake, and Sandy Bridge had similar results. 
(Sandy Bridge shown) total instructions in shared programs: 10533871 > 10533859 (<.01%) instructions in affected programs: 865 > 853 (1.39%) helped: 4 HURT: 0 helped stats (abs) min: 1 max: 5 x̄: 3.00 x̃: 3 helped stats (rel) min: 0.63% max: 1.83% x̄: 1.22% x̃: 1.21% 95% mean confidence interval for instructions value: 6.67 0.67 95% mean confidence interval for instructions %change: 2.16% 0.29% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 146139904 > 146139852 (<.01%) cycles in affected programs: 15213 > 15161 (0.34%) helped: 4 HURT: 0 helped stats (abs) min: 3 max: 18 x̄: 13.00 x̃: 15 helped stats (rel) min: 0.15% max: 0.84% x̄: 0.39% x̃: 0.29% 95% mean confidence interval for cycles value: 23.79 2.21 95% mean confidence interval for cycles %change: 0.88% 0.09% Inconclusive result (%change mean confidence interval includes 0). Signedoffby: Ian Romanick <ian.d.romanick@intel.com> Reviewedby: Samuel Iglesias Gonsálvez <siglesias@igalia.com>

Ian Romanick authored
These transformations are inexact because section 4.7.1 (Range and Precision) says: "Operations and built-in functions that operate on a NaN are not required to return a NaN as the result." The fmin or fmax might not return NaN in cases where the original expression would be required to return NaN. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
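The NaN caveat above can be illustrated with a small sketch. The commit title is not shown in this listing, so the "original expression" here is an assumption: a select of the form `a if a < b else b` replaced by an fmin. The helper names `select_min` and `fmin` are hypothetical, and the NaN handling in `fmin` is just one behaviour the quoted rule permits, not necessarily what any given GPU does.

```python
import math


def select_min(a: float, b: float) -> float:
    """Assumed original pattern: pick a when a < b, otherwise b."""
    return a if a < b else b


def fmin(a: float, b: float) -> float:
    """Hypothetical fmin that returns the non-NaN operand when one input is NaN."""
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return a if a < b else b


nan = float("nan")
for a, b in [(1.0, 2.0), (2.0, 1.0), (1.0, nan), (nan, 1.0)]:
    # The two forms agree on ordinary inputs but can differ once a NaN is involved.
    print(a, b, select_min(a, b), fmin(a, b))
```

For the input pair (1.0, NaN) the select yields NaN (the comparison is false, so the second operand is chosen) while this fmin yields 1.0, which is the kind of divergence that makes the rewrite inexact.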

Ian Romanick authored
This transformation is inexact because section 4.7.1 (Range and Precision) says: "Operations and built-in functions that operate on a NaN are not required to return a NaN as the result." The fmin or fmax might not return NaN in cases where the original expression would be required to return NaN. v2: Reorder operands and mark as inexact. The latter was suggested by Jason. shader-db results: Haswell, Broadwell, and Skylake had similar results. (Skylake shown) total instructions in shared programs: 14514817 > 14514808 (<.01%) instructions in affected programs: 229 > 220 (3.93%) helped: 3 HURT: 0 helped stats (abs) min: 1 max: 4 x̄: 3.00 x̃: 4 helped stats (rel) min: 2.86% max: 4.12% x̄: 3.70% x̃: 4.12% total cycles in shared programs: 533145211 > 533144939 (<.01%) cycles in affected programs: 37268 > 36996 (0.73%) helped: 8 HURT: 0 helped stats (abs) min: 2 max: 134 x̄: 34.00 x̃: 2 helped stats (rel) min: 0.02% max: 14.22% x̄: 3.53% x̃: 0.05% Sandy Bridge and Ivy Bridge had similar results. (Ivy Bridge shown) total cycles in shared programs: 257618409 > 257618403 (<.01%) cycles in affected programs: 12582 > 12576 (0.05%) helped: 3 HURT: 0 helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 helped stats (rel) min: 0.05% max: 0.05% x̄: 0.05% x̃: 0.05% No changes on Iron Lake or GM45. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>

 27 Feb, 2018 1 commit


Timothy Arceri authored
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

 22 Feb, 2018 2 commits


Samuel Pitoiset authored
Similar for the 4 case. Suggested by Bas. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

Samuel Pitoiset authored
Otherwise the code size increases because the original fexp2() instructions can't be deleted. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

 30 Jan, 2018 8 commits


Ian Romanick authored
This was specifically designed to simplify 1+mix(0, a-1, condition) to mix(1, a, condition) by pushing the 1+ inside. Skylake, Broadwell, and Haswell had similar results. Skylake shown. total instructions in shared programs: 14521753 > 14521716 (<.01%) instructions in affected programs: 10619 > 10582 (0.35%) helped: 51 HURT: 14 helped stats (abs) min: 1 max: 12 x̄: 1.43 x̃: 1 helped stats (rel) min: 0.20% max: 3.58% x̄: 1.01% x̃: 0.95% HURT stats (abs) min: 1 max: 11 x̄: 2.57 x̃: 1 HURT stats (rel) min: 0.22% max: 1.75% x̄: 1.20% x̃: 1.32% 95% mean confidence interval for instructions value: 1.31 0.17 95% mean confidence interval for instructions %change: 0.80% 0.27% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 533000205 > 533003533 (<.01%) cycles in affected programs: 110610 > 113938 (3.01%) helped: 43 HURT: 28 helped stats (abs) min: 6 max: 440 x̄: 27.12 x̃: 16 helped stats (rel) min: 0.39% max: 4.84% x̄: 1.60% x̃: 1.67% HURT stats (abs) min: 2 max: 3066 x̄: 160.50 x̃: 14 HURT stats (rel) min: 0.08% max: 77.78% x̄: 5.16% x̃: 0.62% 95% mean confidence interval for cycles value: 43.81 137.56 95% mean confidence interval for cycles %change: 1.47% 3.60% Inconclusive result (value mean confidence interval includes 0). Ivy Bridge total instructions in shared programs: 10018840 > 10018713 (<.01%) instructions in affected programs: 9431 > 9304 (1.35%) helped: 51 HURT: 3 helped stats (abs) min: 1 max: 80 x̄: 2.76 x̃: 1 helped stats (rel) min: 0.20% max: 16.43% x̄: 1.16% x̃: 0.81% HURT stats (abs) min: 1 max: 12 x̄: 4.67 x̃: 1 HURT stats (rel) min: 0.22% max: 1.33% x̄: 0.59% x̃: 0.22% 95% mean confidence interval for instructions value: 5.36 0.66 95% mean confidence interval for instructions %change: 1.66% 0.46% Inconclusive result (value mean confidence interval includes 0). 
total cycles in shared programs: 87571944 > 87572785 (<.01%) cycles in affected programs: 117234 > 118075 (0.72%) helped: 42 HURT: 23 helped stats (abs) min: 2 max: 114 x̄: 51.90 x̃: 30 helped stats (rel) min: 0.11% max: 11.01% x̄: 4.45% x̃: 2.74% HURT stats (abs) min: 1 max: 2341 x̄: 131.35 x̃: 10 HURT stats (rel) min: 0.06% max: 37.11% x̄: 2.75% x̃: 0.61% 95% mean confidence interval for cycles value: 61.05 86.93 95% mean confidence interval for cycles %change: 3.47% 0.33% Inconclusive result (value mean confidence interval includes 0). Sandy Bridge total instructions in shared programs: 10542933 > 10542844 (<.01%) instructions in affected programs: 11487 > 11398 (0.77%) helped: 52 HURT: 3 helped stats (abs) min: 1 max: 40 x̄: 1.96 x̃: 1 helped stats (rel) min: 0.08% max: 8.16% x̄: 0.90% x̃: 0.72% HURT stats (abs) min: 1 max: 11 x̄: 4.33 x̃: 1 HURT stats (rel) min: 0.22% max: 1.22% x̄: 0.55% x̃: 0.22% 95% mean confidence interval for instructions value: 3.17 0.07 95% mean confidence interval for instructions %change: 1.13% 0.52% Instructions are helped. total cycles in shared programs: 146098397 > 146097094 (<.01%) cycles in affected programs: 128140 > 126837 (1.02%) helped: 47 HURT: 8 helped stats (abs) min: 2 max: 333 x̄: 29.21 x̃: 18 helped stats (rel) min: 0.13% max: 5.04% x̄: 1.18% x̃: 0.95% HURT stats (abs) min: 1 max: 16 x̄: 8.75 x̃: 9 HURT stats (rel) min: 0.08% max: 0.43% x̄: 0.30% x̃: 0.34% 95% mean confidence interval for cycles value: 37.49 9.90 95% mean confidence interval for cycles %change: 1.22% 0.71% Cycles are helped. 
Iron Lake total instructions in shared programs: 7886711 > 7886509 (<.01%) instructions in affected programs: 10425 > 10223 (1.94%) helped: 50 HURT: 2 helped stats (abs) min: 1 max: 78 x̄: 4.08 x̃: 1 helped stats (rel) min: 0.34% max: 15.38% x̄: 1.12% x̃: 0.54% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.86% max: 0.91% x̄: 0.89% x̃: 0.89% 95% mean confidence interval for instructions value: 8.05 0.28 95% mean confidence interval for instructions %change: 1.83% 0.26% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 178115324 > 178114612 (<.01%) cycles in affected programs: 765726 > 765014 (0.09%) helped: 39 HURT: 1 helped stats (abs) min: 2 max: 276 x̄: 18.31 x̃: 8 helped stats (rel) min: <.01% max: 8.47% x̄: 0.39% x̃: 0.04% HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 0.03% max: 0.03% x̄: 0.03% x̃: 0.03% 95% mean confidence interval for cycles value: 32.07 3.53 95% mean confidence interval for cycles %change: 0.86% 0.10% Inconclusive result (%change mean confidence interval includes 0). GM45 total instructions in shared programs: 4857762 > 4857661 (<.01%) instructions in affected programs: 5523 > 5422 (1.83%) helped: 25 HURT: 1 helped stats (abs) min: 1 max: 78 x̄: 4.08 x̃: 1 helped stats (rel) min: 0.34% max: 13.61% x̄: 1.04% x̃: 0.52% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.86% max: 0.86% x̄: 0.86% x̃: 0.86% 95% mean confidence interval for instructions value: 9.99 2.22 95% mean confidence interval for instructions %change: 2.01% 0.08% Inconclusive result (value mean confidence interval includes 0). 
total cycles in shared programs: 122179674 > 122179194 (<.01%) cycles in affected programs: 530162 > 529682 (0.09%) helped: 22 HURT: 1 helped stats (abs) min: 2 max: 292 x̄: 21.91 x̃: 7 helped stats (rel) min: <.01% max: 8.65% x̄: 0.44% x̃: 0.04% HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 0.03% max: 0.03% x̄: 0.03% x̃: 0.03% 95% mean confidence interval for cycles value: 46.56 4.82 95% mean confidence interval for cycles %change: 1.20% 0.36% Inconclusive result (value mean confidence interval includes 0). Signedoffby: Ian Romanick <ian.d.romanick@intel.com> Reviewedby: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewedby: Elie Tournier <elie.tournier@collabora.com>
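The "push the 1+ inside the mix" rewrite in this commit can be checked numerically. The sketch below takes the second mix operand to be a - 1, the only reading under which both sides agree, and assumes the usual linear-interpolation definition mix(x, y, t) = x*(1-t) + y*t. The function names are illustrative only and are not taken from Mesa.

```python
import math


def mix(x: float, y: float, t: float) -> float:
    """Linear interpolation: x * (1 - t) + y * t."""
    return x * (1.0 - t) + y * t


def before(a: float, t: float) -> float:
    """Expression before the rewrite: 1 + mix(0, a - 1, t)."""
    return 1.0 + mix(0.0, a - 1.0, t)


def after(a: float, t: float) -> float:
    """Expression after the rewrite: mix(1, a, t)."""
    return mix(1.0, a, t)


# Compare both forms on a small grid; allow for floating-point rounding.
all_close = all(
    math.isclose(before(a, t), after(a, t), rel_tol=1e-9, abs_tol=1e-12)
    for a in (-2.0, -1.0, 0.1, 1.0, 3.5)
    for t in (0.0, 0.25, 0.3, 0.5, 1.0)
)
print(all_close)
```

Algebraically, 1 + (a - 1)t = (1 - t) + at, so the rewritten form drops the separate add.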

Ian Romanick authored
Skylake and Broadwell had similar results. Skylake shown. total instructions in shared programs: 14521769 > 14521753 (<.01%) instructions in affected programs: 8782 > 8766 (0.18%) helped: 16 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.12% max: 0.40% x̄: 0.20% x̃: 0.18% 95% mean confidence interval for instructions value: 1.00 1.00 95% mean confidence interval for instructions %change: 0.23% 0.16% Instructions are helped. total cycles in shared programs: 533000376 > 533000205 (<.01%) cycles in affected programs: 447035 > 446864 (0.04%) helped: 9 HURT: 9 helped stats (abs) min: 2 max: 40 x̄: 35.78 x̃: 40 helped stats (rel) min: 0.02% max: 0.18% x̄: 0.10% x̃: 0.09% HURT stats (abs) min: 1 max: 52 x̄: 16.78 x̃: 10 HURT stats (rel) min: <.01% max: 1.11% x̄: 0.29% x̃: 0.12% 95% mean confidence interval for cycles value: 25.07 6.07 95% mean confidence interval for cycles %change: 0.08% 0.27% Inconclusive result (value mean confidence interval includes 0). No changes on GM45, Iron Lake, Sandy Bridge, Ivy Bridge, or Haswell. Signedoffby: Ian Romanick <ian.d.romanick@intel.com> Reviewedby: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewedby: Elie Tournier <elie.tournier@collabora.com>

Ian Romanick authored
If both comparisons are used as sources for instructions other than the iand, this transformation is detrimental. If the non-identical value in both compares is constant, the fmin or fmax will be constant-folded away, so the transformation is always a win. It is interesting to me that on Iron Lake only 81 shaders have instruction counts changed, but 726 shaders have cycle counts changed. shader-db results: Skylake total instructions in shared programs: 14525728 > 14521017 (0.03%) instructions in affected programs: 1164726 > 1160015 (0.40%) helped: 1692 HURT: 5 helped stats (abs) min: 1 max: 637 x̄: 2.79 x̃: 2 helped stats (rel) min: 0.07% max: 16.36% x̄: 0.81% x̃: 0.33% HURT stats (abs) min: 1 max: 12 x̄: 3.20 x̃: 1 HURT stats (rel) min: 0.38% max: 2.86% x̄: 2.36% x̃: 2.86% 95% mean confidence interval for instructions value: 3.52 2.03 95% mean confidence interval for instructions %change: 0.86% 0.74% Instructions are helped. total cycles in shared programs: 533115449 > 532991404 (0.02%) cycles in affected programs: 119401803 > 119277758 (0.10%) helped: 1145 HURT: 467 helped stats (abs) min: 1 max: 34644 x̄: 145.92 x̃: 18 helped stats (rel) min: <.01% max: 45.33% x̄: 1.58% x̃: 0.42% HURT stats (abs) min: 1 max: 1590 x̄: 92.15 x̃: 15 HURT stats (rel) min: <.01% max: 13.48% x̄: 1.26% x̃: 0.39% 95% mean confidence interval for cycles value: 122.16 31.74 95% mean confidence interval for cycles %change: 0.94% 0.57% Cycles are helped. 
total spills in shared programs: 9597 > 9534 (0.66%) spills in affected programs: 403 > 340 (15.63%) helped: 1 HURT: 1 total fills in shared programs: 13904 > 13790 (0.82%) fills in affected programs: 1627 > 1513 (7.01%) helped: 2 HURT: 1 LOST: 0 GAINED: 2 Broadwell total instructions in shared programs: 14816966 > 14812590 (0.03%) instructions in affected programs: 1499885 > 1495509 (0.29%) helped: 1672 HURT: 15 helped stats (abs) min: 1 max: 455 x̄: 2.70 x̃: 2 helped stats (rel) min: 0.05% max: 16.36% x̄: 0.81% x̃: 0.33% HURT stats (abs) min: 1 max: 21 x̄: 9.20 x̃: 8 HURT stats (rel) min: 0.08% max: 2.86% x̄: 1.06% x̃: 0.53% 95% mean confidence interval for instructions value: 3.14 2.05 95% mean confidence interval for instructions %change: 0.85% 0.73% Instructions are helped. total cycles in shared programs: 559353622 > 559345595 (<.01%) cycles in affected programs: 139893703 > 139885676 (<.01%) helped: 921 HURT: 697 helped stats (abs) min: 1 max: 42424 x̄: 143.45 x̃: 18 helped stats (rel) min: <.01% max: 36.23% x̄: 2.02% x̃: 0.87% HURT stats (abs) min: 1 max: 2370 x̄: 178.03 x̃: 38 HURT stats (rel) min: <.01% max: 17.35% x̄: 0.71% x̃: 0.14% 95% mean confidence interval for cycles value: 59.64 49.72 95% mean confidence interval for cycles %change: 1.02% 0.66% Inconclusive result (value mean confidence interval includes 0). total spills in shared programs: 78902 > 78861 (0.05%) spills in affected programs: 2418 > 2377 (1.70%) helped: 1 HURT: 11 total fills in shared programs: 83782 > 83678 (0.12%) fills in affected programs: 3515 > 3411 (2.96%) helped: 2 HURT: 11 LOST: 0 GAINED: 5 Haswell and Ivy Bridge had similar results. Haswell shown. 
total instructions in shared programs: 9033898 > 9032010 (0.02%) instructions in affected programs: 308064 > 306176 (0.61%) helped: 921 HURT: 4 helped stats (abs) min: 1 max: 20 x̄: 2.05 x̃: 1 helped stats (rel) min: 0.17% max: 17.54% x̄: 0.80% x̃: 0.35% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 3.23% max: 3.23% x̄: 3.23% x̃: 3.23% 95% mean confidence interval for instructions value: 2.21 1.87 95% mean confidence interval for instructions %change: 0.88% 0.68% Instructions are helped. total cycles in shared programs: 84628949 > 84620520 (<.01%) cycles in affected programs: 2164913 > 2156484 (0.39%) helped: 518 HURT: 359 helped stats (abs) min: 1 max: 440 x̄: 41.52 x̃: 20 helped stats (rel) min: <.01% max: 17.17% x̄: 1.95% x̃: 1.01% HURT stats (abs) min: 1 max: 586 x̄: 36.43 x̃: 8 HURT stats (rel) min: 0.04% max: 18.65% x̄: 1.47% x̃: 0.40% 95% mean confidence interval for cycles value: 15.17 4.05 95% mean confidence interval for cycles %change: 0.77% 0.32% Cycles are helped. LOST: 0 GAINED: 4 Sandy Bridge total instructions in shared programs: 10544860 > 10542933 (0.02%) instructions in affected programs: 360019 > 358092 (0.54%) helped: 931 HURT: 4 helped stats (abs) min: 1 max: 20 x̄: 2.07 x̃: 1 helped stats (rel) min: 0.11% max: 15.52% x̄: 0.68% x̃: 0.30% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 3.33% max: 3.33% x̄: 3.33% x̃: 3.33% 95% mean confidence interval for instructions value: 2.23 1.89 95% mean confidence interval for instructions %change: 0.76% 0.58% Instructions are helped. 
total cycles in shared programs: 146106820 > 146098397 (<.01%) cycles in affected programs: 3435047 > 3426624 (0.25%) helped: 572 HURT: 329 helped stats (abs) min: 1 max: 1289 x̄: 32.52 x̃: 15 helped stats (rel) min: <.01% max: 26.29% x̄: 0.97% x̃: 0.33% HURT stats (abs) min: 1 max: 1714 x̄: 30.93 x̃: 6 HURT stats (rel) min: 0.02% max: 41.31% x̄: 1.13% x̃: 0.19% 95% mean confidence interval for cycles value: 16.85 1.85 95% mean confidence interval for cycles %change: 0.39% 0.01% Cycles are helped. LOST: 1 GAINED: 0 Iron Lake total instructions in shared programs: 7886925 > 7886711 (<.01%) instructions in affected programs: 25763 > 25549 (0.83%) helped: 75 HURT: 6 helped stats (abs) min: 1 max: 13 x̄: 3.33 x̃: 1 helped stats (rel) min: 0.35% max: 17.57% x̄: 1.96% x̃: 0.53% HURT stats (abs) min: 1 max: 16 x̄: 6.00 x̃: 1 HURT stats (rel) min: 2.86% max: 4.79% x̄: 3.49% x̃: 2.86% 95% mean confidence interval for instructions value: 3.69 1.60 95% mean confidence interval for instructions %change: 2.54% 0.57% Instructions are helped. total cycles in shared programs: 178116888 > 178115324 (<.01%) cycles in affected programs: 5858790 > 5857226 (0.03%) helped: 484 HURT: 242 helped stats (abs) min: 2 max: 76 x̄: 5.27 x̃: 6 helped stats (rel) min: 0.01% max: 10.70% x̄: 0.18% x̃: 0.06% HURT stats (abs) min: 2 max: 76 x̄: 4.07 x̃: 2 HURT stats (rel) min: 0.01% max: 3.99% x̄: 0.19% x̃: 0.03% 95% mean confidence interval for cycles value: 2.76 1.55 95% mean confidence interval for cycles %change: 0.12% 0.01% Inconclusive result (%change mean confidence interval includes 0). 
GM45 total instructions in shared programs: 4857870 > 4857762 (<.01%) instructions in affected programs: 13994 > 13886 (0.77%) helped: 39 HURT: 5 helped stats (abs) min: 1 max: 13 x̄: 3.28 x̃: 2 helped stats (rel) min: 0.33% max: 17.11% x̄: 1.86% x̃: 0.48% HURT stats (abs) min: 1 max: 16 x̄: 4.00 x̃: 1 HURT stats (rel) min: 2.86% max: 4.71% x̄: 3.23% x̃: 2.86% 95% mean confidence interval for instructions value: 3.86 1.05 95% mean confidence interval for instructions %change: 2.61% 0.04% Inconclusive result (%change mean confidence interval includes 0). total cycles in shared programs: 122180744 > 122179674 (<.01%) cycles in affected programs: 3686646 > 3685576 (0.03%) helped: 273 HURT: 141 helped stats (abs) min: 2 max: 76 x̄: 5.81 x̃: 6 helped stats (rel) min: 0.01% max: 10.70% x̄: 0.18% x̃: 0.06% HURT stats (abs) min: 2 max: 76 x̄: 3.66 x̃: 2 HURT stats (rel) min: 0.01% max: 3.99% x̄: 0.16% x̃: 0.02% 95% mean confidence interval for cycles value: 3.42 1.75 95% mean confidence interval for cycles %change: 0.15% 0.03% Inconclusive result (%change mean confidence interval includes 0). Signedoffby: Ian Romanick <ian.d.romanick@intel.com> Reviewedby: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewedby: Elie Tournier <elie.tournier@collabora.com>
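The commit title is not shown in this listing, so the exact patterns are an assumption. One plausible shape for an "iand of two compares sharing an operand" rewrite is (a < b) && (a < c) becoming a < min(b, c), with the mirrored form (b < a) && (c < a) becoming max(b, c) < a. A minimal sketch that checks both equivalences on NaN-free inputs:

```python
from itertools import product


def and_of_compares(a: float, b: float, c: float) -> bool:
    """Two compares joined by a logical and (assumed original form)."""
    return (a < b) and (a < c)


def compare_with_min(a: float, b: float, c: float) -> bool:
    """One compare against the minimum (assumed rewritten form)."""
    return a < min(b, c)


def and_of_mirrored(a: float, b: float, c: float) -> bool:
    """Mirrored original form, with the shared operand on the right."""
    return (b < a) and (c < a)


def compare_with_max(a: float, b: float, c: float) -> bool:
    """Mirrored rewritten form, comparing the maximum against a."""
    return max(b, c) < a


values = [-2.0, -0.5, 0.0, 0.5, 2.0]
mismatches = [
    (a, b, c)
    for a, b, c in product(values, repeat=3)
    if and_of_compares(a, b, c) != compare_with_min(a, b, c)
    or and_of_mirrored(a, b, c) != compare_with_max(a, b, c)
]
print(mismatches)
```

The rewritten form trades two compares and an and for one min (or max) and one compare, which is only a saving when the individual compare results are not needed elsewhere, matching the caveat in the commit message.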

Ian Romanick authored
min(a+b, c+d) >= 0 becomes (a+b >= 0 && c+d >= 0). No shader-db changes, but it does prevent 6 to 12 instruction regressions in the next patch on all measured Intel platforms. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
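As a quick sanity check of the rewrite stated above, the following sketch compares the fused and split forms on random NaN-free inputs. The function names and the sampling range are illustrative choices, not part of the commit.

```python
import random


def fused(a: float, b: float, c: float, d: float) -> bool:
    """Original form: a single compare on the minimum of two sums."""
    return min(a + b, c + d) >= 0


def split(a: float, b: float, c: float, d: float) -> bool:
    """Rewritten form: each sum is compared against zero separately."""
    return (a + b >= 0) and (c + d >= 0)


rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [tuple(rng.uniform(-1.0, 1.0) for _ in range(4)) for _ in range(10_000)]
disagreements = sum(fused(*s) != split(*s) for s in samples)
print(disagreements)  # 0 for NaN-free inputs
```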

Ian Romanick authored
v2: Rebase on almost 2 years. Require that one of the arguments to fmin or fmax be used only once. This prevents some regressions. shader-db results: Skylake and Broadwell had similar results. Skylake shown. total instructions in shared programs: 14526021 > 14525913 (<.01%) instructions in affected programs: 4613 > 4505 (2.34%) helped: 31 HURT: 0 helped stats (abs) min: 1 max: 4 x̄: 3.48 x̃: 4 helped stats (rel) min: 0.62% max: 6.67% x̄: 3.31% x̃: 2.42% total cycles in shared programs: 533118710 > 533118403 (<.01%) cycles in affected programs: 34334 > 34027 (0.89%) helped: 24 HURT: 0 helped stats (abs) min: 4 max: 24 x̄: 12.79 x̃: 14 helped stats (rel) min: 0.25% max: 2.40% x̄: 1.08% x̃: 1.03% No changes on GM45, Iron Lake, Sandy Bridge, Ivy Bridge, or Haswell. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Elie Tournier <elie.tournier@collabora.com>

Ian Romanick authored
shaderdb results: Skylake and Broadwell had similar results (Skylake shown) total instructions in shared programs: 14525898 > 14525836 (<.01%) instructions in affected programs: 1964 > 1902 (3.16%) helped: 14 HURT: 0 helped stats (abs) min: 1 max: 25 x̄: 4.43 x̃: 1 helped stats (rel) min: 0.68% max: 9.77% x̄: 2.10% x̃: 0.86% 95% mean confidence interval for instructions value: 9.46 0.60 95% mean confidence interval for instructions %change: 3.97% 0.24% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 533119892 > 533115756 (<.01%) cycles in affected programs: 96061 > 91925 (4.31%) helped: 13 HURT: 1 helped stats (abs) min: 60 max: 596 x̄: 318.77 x̃: 300 helped stats (rel) min: 1.15% max: 5.49% x̄: 4.27% x̃: 4.42% HURT stats (abs) min: 8 max: 8 x̄: 8.00 x̃: 8 HURT stats (rel) min: 0.46% max: 0.46% x̄: 0.46% x̃: 0.46% 95% mean confidence interval for cycles value: 379.43 211.43 95% mean confidence interval for cycles %change: 4.84% 3.01% Cycles are helped. Haswell, Ivy Bridge and Sandy Bridge had similar results (Haswell shown). total instructions in shared programs: 9033948 > 9033898 (<.01%) instructions in affected programs: 535 > 485 (9.35%) helped: 2 HURT: 0 total cycles in shared programs: 84631402 > 84628949 (<.01%) cycles in affected programs: 63197 > 60744 (3.88%) helped: 13 HURT: 2 helped stats (abs) min: 1 max: 594 x̄: 189.62 x̃: 140 helped stats (rel) min: 0.07% max: 5.04% x̄: 3.79% x̃: 4.01% HURT stats (abs) min: 4 max: 8 x̄: 6.00 x̃: 6 HURT stats (rel) min: 0.17% max: 0.45% x̄: 0.31% x̃: 0.31% 95% mean confidence interval for cycles value: 253.40 73.67 95% mean confidence interval for cycles %change: 4.24% 2.25% Cycles are helped. No changes on GM45 or Iron Lake. v2: Add a couple more tautological compares. Suggested by Elie. Signedoffby: Ian Romanick <ian.d.romanick@intel.com> Reviewedby: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewedby: Elie Tournier <elie.tournier@collabora.com>

Ian Romanick authored
If both comparisons are used as sources for instructions other than the ior, this transformation is detrimental. If the nonidentical value in both compares is constant, the fmin or fmax will be constantfolded away, so the transformation is always a win. shaderdb results: Skylake total instructions in shared programs: 14526147 > 14525898 (<.01%) instructions in affected programs: 70239 > 69990 (0.35%) helped: 102 HURT: 0 helped stats (abs) min: 1 max: 8 x̄: 2.44 x̃: 1 helped stats (rel) min: 0.07% max: 2.30% x̄: 0.38% x̃: 0.20% 95% mean confidence interval for instructions value: 2.86 2.02 95% mean confidence interval for instructions %change: 0.46% 0.31% Instructions are helped. total cycles in shared programs: 533120531 > 533119892 (<.01%) cycles in affected programs: 994875 > 994236 (0.06%) helped: 76 HURT: 26 helped stats (abs) min: 1 max: 324 x̄: 27.09 x̃: 13 helped stats (rel) min: <.01% max: 4.21% x̄: 0.45% x̃: 0.18% HURT stats (abs) min: 1 max: 167 x̄: 54.62 x̃: 26 HURT stats (rel) min: <.01% max: 4.36% x̄: 1.01% x̃: 0.39% 95% mean confidence interval for cycles value: 19.44 6.91 95% mean confidence interval for cycles %change: 0.30% 0.15% Inconclusive result (value mean confidence interval includes 0). Broadwell total instructions in shared programs: 14816005 > 14815787 (<.01%) instructions in affected programs: 64658 > 64440 (0.34%) helped: 97 HURT: 0 helped stats (abs) min: 1 max: 8 x̄: 2.25 x̃: 1 helped stats (rel) min: 0.07% max: 2.30% x̄: 0.38% x̃: 0.20% 95% mean confidence interval for instructions value: 2.62 1.87 95% mean confidence interval for instructions %change: 0.45% 0.30% Instructions are helped. 
total cycles in shared programs: 559340386 > 559339907 (<.01%) cycles in affected programs: 1090491 > 1090012 (0.04%) helped: 66 HURT: 28 helped stats (abs) min: 2 max: 198 x̄: 23.83 x̃: 16 helped stats (rel) min: 0.01% max: 4.21% x̄: 0.47% x̃: 0.27% HURT stats (abs) min: 2 max: 226 x̄: 39.07 x̃: 11 HURT stats (rel) min: <.01% max: 4.61% x̄: 0.64% x̃: 0.20% 95% mean confidence interval for cycles value: 15.94 5.75 95% mean confidence interval for cycles %change: 0.35% 0.07% Inconclusive result (value mean confidence interval includes 0). LOST: 0 GAINED: 1 Haswell total instructions in shared programs: 9034106 > 9033948 (<.01%) instructions in affected programs: 24096 > 23938 (0.66%) helped: 38 HURT: 0 helped stats (abs) min: 1 max: 8 x̄: 4.16 x̃: 4 helped stats (rel) min: 0.42% max: 2.29% x̄: 0.71% x̃: 0.64% 95% mean confidence interval for instructions value: 4.71 3.60 95% mean confidence interval for instructions %change: 0.84% 0.58% Instructions are helped. total cycles in shared programs: 84631628 > 84631402 (<.01%) cycles in affected programs: 148674 > 148448 (0.15%) helped: 14 HURT: 14 helped stats (abs) min: 1 max: 114 x̄: 22.14 x̃: 12 helped stats (rel) min: 0.02% max: 2.98% x̄: 0.66% x̃: 0.21% HURT stats (abs) min: 1 max: 10 x̄: 6.00 x̃: 5 HURT stats (rel) min: 0.01% max: 0.20% x̄: 0.12% x̃: 0.11% 95% mean confidence interval for cycles value: 19.42 3.28 95% mean confidence interval for cycles %change: 0.59% 0.05% Inconclusive result (value mean confidence interval includes 0). Ivy Bridge total instructions in shared programs: 10015456 > 10015293 (<.01%) instructions in affected programs: 27701 > 27538 (0.59%) helped: 38 HURT: 0 helped stats (abs) min: 1 max: 9 x̄: 4.29 x̃: 4 helped stats (rel) min: 0.33% max: 2.79% x̄: 0.66% x̃: 0.52% 95% mean confidence interval for instructions value: 4.87 3.71 95% mean confidence interval for instructions %change: 0.82% 0.51% Instructions are helped. 
total cycles in shared programs: 87524771 > 87524569 (<.01%) cycles in affected programs: 112324 > 112122 (0.18%) helped: 6 HURT: 12 helped stats (abs) min: 2 max: 111 x̄: 44.67 x̃: 20 helped stats (rel) min: 0.02% max: 2.94% x̄: 1.45% x̃: 1.26% HURT stats (abs) min: 1 max: 16 x̄: 5.50 x̃: 5 HURT stats (rel) min: <.01% max: 0.16% x̄: 0.08% x̃: 0.08% 95% mean confidence interval for cycles value: 29.14 6.69 95% mean confidence interval for cycles %change: 0.93% 0.08% Inconclusive result (value mean confidence interval includes 0). LOST: 0 GAINED: 2 Sandy Bridge total instructions in shared programs: 10545655 > 10545465 (<.01%) instructions in affected programs: 37198 > 37008 (0.51%) helped: 42 HURT: 0 helped stats (abs) min: 1 max: 8 x̄: 4.52 x̃: 4 helped stats (rel) min: 0.31% max: 2.15% x̄: 0.58% x̃: 0.49% 95% mean confidence interval for instructions value: 5.14 3.91 95% mean confidence interval for instructions %change: 0.68% 0.47% Instructions are helped. total cycles in shared programs: 146113059 > 146112427 (<.01%) cycles in affected programs: 423514 > 422882 (0.15%) helped: 32 HURT: 10 helped stats (abs) min: 4 max: 162 x̄: 24.34 x̃: 12 helped stats (rel) min: 0.06% max: 2.74% x̄: 0.37% x̃: 0.11% HURT stats (abs) min: 12 max: 19 x̄: 14.70 x̃: 14 HURT stats (rel) min: 0.10% max: 0.18% x̄: 0.16% x̃: 0.14% 95% mean confidence interval for cycles value: 26.03 4.07 95% mean confidence interval for cycles %change: 0.43% 0.05% Cycles are helped. Iron Lake total instructions in shared programs: 7886959 > 7886925 (<.01%) instructions in affected programs: 1340 > 1306 (2.54%) helped: 4 HURT: 0 helped stats (abs) min: 2 max: 15 x̄: 8.50 x̃: 8 helped stats (rel) min: 0.63% max: 4.30% x̄: 2.45% x̃: 2.43% 95% mean confidence interval for instructions value: 20.44 3.44 95% mean confidence interval for instructions %change: 5.78% 0.89% Inconclusive result (value mean confidence interval includes 0). 
total cycles in shared programs: 178116996 > 178116888 (<.01%) cycles in affected programs: 6262 > 6154 (1.72%) helped: 2 HURT: 2 helped stats (abs) min: 44 max: 78 x̄: 61.00 x̃: 61 helped stats (rel) min: 3.31% max: 3.94% x̄: 3.62% x̃: 3.62% HURT stats (abs) min: 6 max: 8 x̄: 7.00 x̃: 7 HURT stats (rel) min: 0.34% max: 0.68% x̄: 0.51% x̃: 0.51% 95% mean confidence interval for cycles value: 93.27 39.27 95% mean confidence interval for cycles %change: 5.38% 2.27% Inconclusive result (value mean confidence interval includes 0). GM45 total instructions in shared programs: 4857887 > 4857870 (<.01%) instructions in affected programs: 674 > 657 (2.52%) helped: 2 HURT: 0 total cycles in shared programs: 122180816 > 122180744 (<.01%) cycles in affected programs: 3764 > 3692 (1.91%) helped: 1 HURT: 1 helped stats (abs) min: 78 max: 78 x̄: 78.00 x̃: 78 helped stats (rel) min: 3.94% max: 3.94% x̄: 3.94% x̃: 3.94% HURT stats (abs) min: 6 max: 6 x̄: 6.00 x̃: 6 HURT stats (rel) min: 0.34% max: 0.34% x̄: 0.34% x̃: 0.34% Signedoffby: Ian Romanick <ian.d.romanick@intel.com> Reviewedby: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewedby: Elie Tournier <elie.tournier@collabora.com>
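The remark that the fmin or fmax is constant-folded away when the non-identical operands are constants can be sketched with a toy expression rewriter. The commit title is not shown here, so the pattern (a < b) || (a < c) becoming a < fmax(b, c) is an assumption. The tuple notation is only loosely modelled on how algebraic patterns are commonly written, and `fold` and `rewrite_ior` are hypothetical helpers, not Mesa code.

```python
def fold(expr):
    """Fold ('fmax', x, y) to a number when both operands are float constants."""
    op, x, y = expr
    if op == "fmax" and isinstance(x, float) and isinstance(y, float):
        return max(x, y)
    return expr


def rewrite_ior(expr):
    """Rewrite ('ior', ('flt', a, b), ('flt', a, c)) to ('flt', a, ('fmax', b, c))."""
    if isinstance(expr, tuple) and len(expr) == 3 and expr[0] == "ior":
        lhs, rhs = expr[1], expr[2]
        if (
            isinstance(lhs, tuple)
            and isinstance(rhs, tuple)
            and len(lhs) == 3
            and len(rhs) == 3
            and lhs[0] == "flt"
            and rhs[0] == "flt"
            and lhs[1] == rhs[1]  # both compares must share the same first operand
        ):
            return ("flt", lhs[1], fold(("fmax", lhs[2], rhs[2])))
    return expr  # anything else is left untouched


# With constant bounds the fmax disappears: two compares and an or become one compare.
folded = rewrite_ior(("ior", ("flt", "a", 0.25), ("flt", "a", 0.75)))
# With variable bounds an fmax instruction remains in the result.
unfolded = rewrite_ior(("ior", ("flt", "a", "b"), ("flt", "a", "c")))
print(folded)
print(unfolded)
```

In the constant case the result is a single compare, which is why the commit message can call the rewrite a guaranteed win there; in the variable case the saving depends on whether the original compares have other users.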

Ian Romanick authored
Doing the same for the existing feq and fne transformations didn't help anything in shader-db.

shader-db results:

Broadwell and Skylake (Skylake shown)
total instructions in shared programs: 14529463 -> 14526147 (0.02%)
instructions in affected programs: 402420 -> 399104 (0.82%)
helped: 2136
HURT: 131
helped stats (abs) min: 1 max: 10 x̄: 1.61 x̃: 1
helped stats (rel) min: 0.03% max: 16.22% x̄: 3.14% x̃: 1.12%
HURT stats (abs) min: 1 max: 2 x̄: 1.01 x̃: 1
HURT stats (rel) min: 0.13% max: 7.69% x̄: 0.75% x̃: 0.57%
95% mean confidence interval for instructions value: 1.51 1.41
95% mean confidence interval for instructions %change: 3.06% 2.78%
Instructions are helped.

total cycles in shared programs: 533146915 -> 533120531 (<.01%)
cycles in affected programs: 10356261 -> 10329877 (0.25%)
helped: 1933
HURT: 844
helped stats (abs) min: 1 max: 490 x̄: 29.44 x̃: 16
helped stats (rel) min: <.01% max: 28.57% x̄: 3.43% x̃: 1.88%
HURT stats (abs) min: 1 max: 423 x̄: 36.17 x̃: 12
HURT stats (rel) min: <.01% max: 23.75% x̄: 1.90% x̃: 0.59%
95% mean confidence interval for cycles value: 11.78 7.22
95% mean confidence interval for cycles %change: 1.98% 1.65%
Cycles are helped.

Haswell
total instructions in shared programs: 9037416 -> 9034106 (0.04%)
instructions in affected programs: 389831 -> 386521 (0.85%)
helped: 2184
HURT: 120
helped stats (abs) min: 1 max: 11 x̄: 1.57 x̃: 1
helped stats (rel) min: 0.03% max: 25.00% x̄: 2.73% x̃: 1.02%
HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel) min: 0.19% max: 7.69% x̄: 0.81% x̃: 0.57%
95% mean confidence interval for instructions value: 1.49 1.39
95% mean confidence interval for instructions %change: 2.68% 2.41%
Instructions are helped.

total cycles in shared programs: 84636243 -> 84631628 (<.01%)
cycles in affected programs: 4745058 -> 4740443 (0.10%)
helped: 1904
HURT: 960
helped stats (abs) min: 1 max: 466 x̄: 30.21 x̃: 18
helped stats (rel) min: 0.02% max: 36.36% x̄: 3.57% x̃: 2.38%
HURT stats (abs) min: 1 max: 1080 x̄: 55.11 x̃: 14
HURT stats (rel) min: 0.02% max: 51.33% x̄: 2.77% x̃: 0.81%
95% mean confidence interval for cycles value: 4.51 1.29
95% mean confidence interval for cycles %change: 1.64% 1.25%
Inconclusive result (value mean confidence interval includes 0).

LOST: 1
GAINED: 0

Sandy Bridge and Ivy Bridge (Ivy Bridge shown)
total instructions in shared programs: 10018873 -> 10015456 (0.03%)
instructions in affected programs: 512820 -> 509403 (0.67%)
helped: 2268
HURT: 162
helped stats (abs) min: 1 max: 11 x̄: 1.62 x̃: 1
helped stats (rel) min: 0.03% max: 25.00% x̄: 2.47% x̃: 0.88%
HURT stats (abs) min: 1 max: 4 x̄: 1.59 x̃: 1
HURT stats (rel) min: 0.09% max: 7.69% x̄: 0.86% x̃: 0.50%
95% mean confidence interval for instructions value: 1.46 1.35
95% mean confidence interval for instructions %change: 2.38% 2.12%
Instructions are helped.

total cycles in shared programs: 87538223 -> 87524771 (0.02%)
cycles in affected programs: 5435520 -> 5422068 (0.25%)
helped: 1916
HURT: 946
helped stats (abs) min: 1 max: 1392 x̄: 29.44 x̃: 18
helped stats (rel) min: <.01% max: 34.51% x̄: 3.34% x̃: 1.97%
HURT stats (abs) min: 1 max: 633 x̄: 45.41 x̃: 11
HURT stats (rel) min: 0.02% max: 25.95% x̄: 2.41% x̃: 0.62%
95% mean confidence interval for cycles value: 7.34 2.06
95% mean confidence interval for cycles %change: 1.62% 1.26%
Cycles are helped.

LOST: 1
GAINED: 0

Iron Lake
total instructions in shared programs: 7888446 -> 7886959 (0.02%)
instructions in affected programs: 331581 -> 330094 (0.45%)
helped: 1160
HURT: 97
helped stats (abs) min: 1 max: 10 x̄: 1.37 x̃: 1
helped stats (rel) min: 0.02% max: 9.68% x̄: 0.93% x̃: 0.43%
HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel) min: 0.17% max: 4.17% x̄: 0.37% x̃: 0.25%
95% mean confidence interval for instructions value: 1.25 1.12
95% mean confidence interval for instructions %change: 0.91% 0.75%
Instructions are helped.

total cycles in shared programs: 178130766 -> 178116996 (<.01%)
cycles in affected programs: 12534564 -> 12520794 (0.11%)
helped: 1856
HURT: 187
helped stats (abs) min: 2 max: 202 x̄: 7.78 x̃: 4
helped stats (rel) min: <.01% max: 6.47% x̄: 0.28% x̃: 0.11%
HURT stats (abs) min: 2 max: 26 x̄: 3.55 x̃: 2
HURT stats (rel) min: 0.01% max: 2.14% x̄: 0.08% x̃: 0.02%
95% mean confidence interval for cycles value: 7.41 6.07
95% mean confidence interval for cycles %change: 0.28% 0.22%
Cycles are helped.

GM45
total instructions in shared programs: 4858912 -> 4857887 (0.02%)
instructions in affected programs: 237565 -> 236540 (0.43%)
helped: 867
HURT: 57
helped stats (abs) min: 1 max: 10 x̄: 1.25 x̃: 1
helped stats (rel) min: 0.02% max: 9.38% x̄: 0.87% x̃: 0.43%
HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel) min: 0.16% max: 3.85% x̄: 0.34% x̃: 0.22%
95% mean confidence interval for instructions value: 1.18 1.04
95% mean confidence interval for instructions %change: 0.88% 0.71%
Instructions are helped.

total cycles in shared programs: 122189118 -> 122180816 (<.01%)
cycles in affected programs: 8776418 -> 8768116 (0.09%)
helped: 1213
HURT: 166
helped stats (abs) min: 2 max: 202 x̄: 7.30 x̃: 4
helped stats (rel) min: <.01% max: 6.43% x̄: 0.25% x̃: 0.11%
HURT stats (abs) min: 2 max: 26 x̄: 3.35 x̃: 2
HURT stats (rel) min: 0.01% max: 2.14% x̄: 0.06% x̃: 0.02%
95% mean confidence interval for cycles value: 6.78 5.26
95% mean confidence interval for cycles %change: 0.24% 0.18%
Cycles are helped.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>

 01 Aug, 2017 1 commit


Connor Abbott authored
The optimizations are only valid for 32-bit integers. They were mistakenly firing for 64-bit integers as well.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Matt Turner <mattst88@gmail.com>

 20 Jul, 2017 1 commit


Matt Turner authored
Two of the ARB_shader_ballot piglit tests hit the find_lsb case; removing some of the noise allowed me to better debug the test when it was failing.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>

 24 Apr, 2017 3 commits


Timothy Arceri authored
This shuffles constants down, the reverse of what the previous patch does, and applies some simplifications that may be made possible by doing so.

Shader-db results BDW:

total instructions in shared programs: 12980814 -> 12977822 (0.02%)
instructions in affected programs: 281889 -> 278897 (1.06%)
helped: 1231
HURT: 128

total cycles in shared programs: 246562852 -> 246567288 (0.00%)
cycles in affected programs: 11271524 -> 11275960 (0.04%)
helped: 1630
HURT: 1378

V2: mark float opts as inexact

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

Timothy Arceri authored
V2: mark float opts as inexact

If one of the inputs to a mul/add is the result of another mul/add, there is a chance that we can reuse the result of that mul/add in other calls if we do the multiplication in the right order.

Also, by attempting to move all constants to the top we increase the chance of constant folding.

For example, it is a fairly common pattern for shaders to do something similar to this:

    const float a = 0.5;

    in vec4 b;
    in float c;

    ...

    b.x = b.x * c;
    b.y = b.y * c;

    ...

    b.x = b.x * a + a;
    b.y = b.y * a + a;

So by simply detecting that constant a is part of the multiplication in ffma and switching it with the previous fmul that updates b, we end up with:

    ...

    c = a * c;

    ...

    b.x = b.x * c + a;
    b.y = b.y * c + a;

Shader-db results BDW:

total instructions in shared programs: 13011050 -> 12967888 (0.33%)
instructions in affected programs: 4118366 -> 4075204 (1.05%)
helped: 17739
HURT: 1343

total cycles in shared programs: 246717952 -> 246410716 (0.12%)
cycles in affected programs: 166870802 -> 166563566 (0.18%)
helped: 18493
HURT: 7965

total spills in shared programs: 14937 -> 14560 (2.52%)
spills in affected programs: 9331 -> 8954 (4.04%)
helped: 284
HURT: 33

total fills in shared programs: 20211 -> 19671 (2.67%)
fills in affected programs: 12586 -> 12046 (4.29%)
helped: 286
HURT: 33

LOST: 39
GAINED: 33

Some of the hurt will go away when we shuffle things back down to the bottom in the following patch. It's also noteworthy that almost all of the spill changes are in Deus Ex, both hurt and helped.

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
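The reordering described in this message can be sketched in plain Python. This is only an illustration under stated assumptions: Python floats stand in for shader values, and the function names `before` and `after` are made up here, not Mesa code. Each function also returns a hand-counted number of multiplies so the saving is visible.

```python
# Sketch only: plain Python floats stand in for shader values. The function
# names are hypothetical and do not correspond to anything in Mesa.

def before(bx, by, c, a=0.5):
    """Original order: scale b by c, then multiply-add with the constant a."""
    bx = bx * c          # fmul
    by = by * c          # fmul
    bx = bx * a + a      # fmul + fadd (or one ffma)
    by = by * a + a      # fmul + fadd (or one ffma)
    return bx, by, 4     # four multiplies in total (counted by hand)


def after(bx, by, c, a=0.5):
    """Reordered: fold the constant a into c once, then reuse the product."""
    c = a * c            # single fmul shared by both components
    bx = bx * c + a
    by = by * c + a
    return bx, by, 3     # three multiplies in total (counted by hand)
```

For inputs such as `before(3.0, 5.0, 2.0)` and `after(3.0, 5.0, 2.0)` the two orders give the same components, but floating-point multiplication is not associative in general, which is consistent with the V2 note that the float rules are marked inexact.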

Timothy Arceri authored
Didn't turn out as useful as I'd hoped, but it will help a lot more on i965 by reducing regressions when we drop brw_do_channel_expressions() and brw_do_vector_splitting().

I'm not sure how much sense 'is_not_used_by_conditional' makes on platforms other than i965, but since this is a new opt it at least won't do any harm.

shader-db BDW:

total instructions in shared programs: 13029581 -> 13029415 (0.00%)
instructions in affected programs: 15268 -> 15102 (1.09%)
helped: 86
HURT: 0

total cycles in shared programs: 247038346 -> 247036198 (0.00%)
cycles in affected programs: 692634 -> 690486 (0.31%)
helped: 183
HURT: 27

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

 14 Mar, 2017 1 commit


Jason Ekstrand authored
The NIR story on conversion opcodes is a mess. We've had way too many of them, naming is inconsistent, and which ones have explicit sizes was sort of random. This commit reorganizes things and makes them all consistent:

 - All non-bool conversion opcodes now have the explicit size in the destination and are named <src_type>2<dst_type><size>.

 - Integer <-> integer conversion opcodes now only come in i2i and u2u forms (i2u and u2i have been removed) since the only difference between the different integer conversions is whether or not they sign-extend when upconverting.

 - Boolean conversion opcodes all have the explicit size on the bool and are named <src_type>2<dst_type>.

Making things consistent also allows nir_type_conversion_op to be moved to nir_opcodes.c and autogenerated using mako. This will make adding int8, int16, and float16 versions much easier when the time comes.

Reviewed-by: Eric Anholt <eric@anholt.net>
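As a rough illustration of the naming scheme in this message, the sketch below builds an opcode name from a source type, a destination type and a destination size. The helper `conversion_op_name` and the `PREFIX` table are hypothetical and are not Mesa's nir_type_conversion_op. One point is an assumption rather than something the message states: for mixed-signedness integer conversions the sketch picks i2i or u2u from the source type, on the reasoning that the source's signedness decides whether upconversion sign-extends.

```python
# Hypothetical helper, not Mesa's nir_type_conversion_op: it only builds the
# opcode *name* implied by the naming rules in the commit message.

PREFIX = {"float": "f", "int": "i", "uint": "u", "bool": "b"}


def conversion_op_name(src_type, dst_type, dst_bits):
    if "bool" in (src_type, dst_type):
        # Boolean conversions are named <src_type>2<dst_type>, no size suffix.
        return f"{PREFIX[src_type]}2{PREFIX[dst_type]}"
    integers = ("int", "uint")
    if src_type in integers and dst_type in integers:
        # Only i2i and u2u exist. Assumption: the source's signedness selects
        # the form, since it decides whether upconversion sign-extends.
        p = PREFIX[src_type]
        return f"{p}2{p}{dst_bits}"
    # Everything else: <src_type>2<dst_type><size>.
    return f"{PREFIX[src_type]}2{PREFIX[dst_type]}{dst_bits}"
```

For example, `conversion_op_name("float", "int", 32)` yields a name of the form f2i32, while `conversion_op_name("float", "bool", 32)` drops the size suffix.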

 10 Mar, 2017 1 commit


Emil Velikov authored
Analogous to earlier commit(s).

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>

 17 Feb, 2017 2 commits


Jason Ekstrand authored
This reduces the instruction count in some fp64 and int64 piglit tests.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

Jason Ekstrand authored
NIR is a typeless IR and the two opcodes, when considered bitwise, do exactly the same thing. There's no reason to have two versions.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

 20 Jan, 2017 2 commits


Ian Romanick authored
Previously both sources were unsized. This caused problems when the thing being shifted was 64-bit but the shift count was 32-bit. The expectation in NIR is that all unsized sources (and destination) will ultimately have the same size.

The changes in nir_opt_algebraic.py are to prevent errors like:

    Failed to parse transformation:
    03:12:25 (('extract_i8', 'a', 'b'), ('ishr', ('ishl', 'a', ('imul', ('isub', 3, 'b'), 8)), 24), 'options->lower_extract_byte')
    03:12:25 Traceback (most recent call last):
    03:12:25   File "/home/jenkins/workspace/Leeroy_2/repos/mesa/src/compiler/nir/nir_algebraic.py", line 610, in __init__
    03:12:25     xform = SearchAndReplace(xform)
    03:12:25   File "/home/jenkins/workspace/Leeroy_2/repos/mesa/src/compiler/nir/nir_algebraic.py", line 495, in __init__
    03:12:25     BitSizeValidator(varset).validate(self.search, self.replace)
    03:12:25   File "/home/jenkins/workspace/Leeroy_2/repos/mesa/src/compiler/nir/nir_algebraic.py", line 311, in validate
    03:12:25     validate_dst_class = self._validate_bit_class_up(replace)
    03:12:25   File "/home/jenkins/workspace/Leeroy_2/repos/mesa/src/compiler/nir/nir_algebraic.py", line 414, in _validate_bit_class_up
    03:12:25     src_class = self._validate_bit_class_up(val.sources[i])
    03:12:25   File "/home/jenkins/workspace/Leeroy_2/repos/mesa/src/compiler/nir/nir_algebraic.py", line 420, in _validate_bit_class_up
    03:12:25     assert src_class == src_type_bits
    03:12:25 AssertionError

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Suggested-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
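The rule that all unsized operands must end up with one common size can be illustrated with a toy checker. Everything here is invented for illustration (`check_bit_sizes`, `OLD_SHIFT`, `NEW_SHIFT`); it is not the BitSizeValidator from nir_algebraic.py. It also assumes that the fix pins the shift count to 32 bits, which the message implies but does not spell out.

```python
# Illustrative only: a toy version of the "unsized operands must agree" rule.
# The opcode tables and function name are invented, not NIR's.

def check_bit_sizes(declared, actual):
    """declared: per-operand size from the opcode definition (0 = unsized).
    actual: bit size of the value supplied for each operand.
    Returns the common size of the unsized operands, or raises ValueError."""
    common = None
    for want, got in zip(declared, actual):
        if want != 0:
            # Explicitly sized operand: must match exactly.
            if want != got:
                raise ValueError(f"operand must be {want}-bit, got {got}-bit")
        elif common is None:
            common = got
        elif common != got:
            raise ValueError(f"unsized operands disagree: {common} vs {got}")
    return common


# Operand order: (destination, value being shifted, shift count).
OLD_SHIFT = (0, 0, 0)    # everything unsized
NEW_SHIFT = (0, 0, 32)   # assumption: shift count pinned to 32 bits
```

With a 64-bit value and a 32-bit shift count, `check_bit_sizes(OLD_SHIFT, (64, 64, 32))` raises because the unsized operands disagree, while `check_bit_sizes(NEW_SHIFT, (64, 64, 32))` accepts the same operands.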

Elie Tournier authored
Add the following optimisations:

min(x, -x) = -abs(x)
min(x, -abs(x)) = -abs(x)
min(x, abs(x)) = x
max(x, -abs(x)) = x
max(x, abs(x)) = abs(x)
max(x, -x) = abs(x)

shader-db:

total instructions in shared programs: 13067779 -> 13067775 (0.00%)
instructions in affected programs: 249 -> 245 (1.61%)
helped: 4
HURT: 0

total cycles in shared programs: 252054838 -> 252054806 (0.00%)
cycles in affected programs: 504 -> 472 (6.35%)
helped: 2
HURT: 0

Signed-off-by: Elie Tournier <tournier.elie@gmail.com>
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
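The six min/max identities can be spot-checked numerically. The sketch below writes them with the negations they need in order to hold (for instance min(x, -x) = -abs(x)); the sample values are arbitrary, NaN is deliberately left out, and `failing_identities` is a name made up for this check.

```python
# Quick numeric spot-check of the identities; sample values are arbitrary and
# NaN is excluded on purpose. The helper name is hypothetical.

IDENTITIES = [
    ("min(x, -x) = -abs(x)",      lambda x: min(x, -x),      lambda x: -abs(x)),
    ("min(x, -abs(x)) = -abs(x)", lambda x: min(x, -abs(x)), lambda x: -abs(x)),
    ("min(x, abs(x)) = x",        lambda x: min(x, abs(x)),  lambda x: x),
    ("max(x, -abs(x)) = x",       lambda x: max(x, -abs(x)), lambda x: x),
    ("max(x, abs(x)) = abs(x)",   lambda x: max(x, abs(x)),  lambda x: abs(x)),
    ("max(x, -x) = abs(x)",       lambda x: max(x, -x),      lambda x: abs(x)),
]


def failing_identities(samples=(-7, -1.5, 0, 2, 3.25)):
    """Return the names of identities that fail for any sample value."""
    return [name
            for name, lhs, rhs in IDENTITIES
            for x in samples
            if lhs(x) != rhs(x)]
```

An empty list from `failing_identities()` means every identity held for every sample; it is a sanity check, not a proof.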

 14 Jan, 2017 1 commit


Timothy Arceri authored
shader-db results BDW:

total instructions in shared programs: 13060410 -> 13060313 (0.00%)
instructions in affected programs: 24533 -> 24436 (0.40%)
helped: 88
HURT: 0

total cycles in shared programs: 256585692 -> 256586698 (0.00%)
cycles in affected programs: 647290 -> 648296 (0.16%)
helped: 35
HURT: 30

Reviewed-by: Matt Turner <mattst88@gmail.com>

 11 Jan, 2017 3 commits


Timothy Arceri authored
Otherwise we will end up with an extra instruction to compare the result of the inot.

On BDW:

total instructions in shared programs: 13060620 -> 13060481 (0.00%)
instructions in affected programs: 103379 -> 103240 (0.13%)
helped: 127
HURT: 0

total cycles in shared programs: 256590950 -> 256587408 (0.00%)
cycles in affected programs: 11324730 -> 11321188 (0.03%)
helped: 114
HURT: 21

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>

Timothy Arceri authored
We turn these from bcsel into inot/b2f combos so that other optimisation passes can get further. Once we have finished, we turn the ones that remain and are used in more than a single expression back into a bcsel.

On BDW:

total instructions in shared programs: 13060965 -> 13060297 (0.01%)
instructions in affected programs: 835701 -> 835033 (0.08%)
helped: 670
HURT: 2

total cycles in shared programs: 256599536 -> 256598006 (0.00%)
cycles in affected programs: 114655488 -> 114653958 (0.00%)
helped: 419
HURT: 240

LOST: 0
GAINED: 1

The 2 HURT is because inserting bcsel creates the only use of const 1.0 in two shaders from tri-of-friendship-and-madness.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
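The round trip between a bcsel and an inot/b2f combo relies on the two forms computing the same value. The sketch below models the opcodes as plain Python functions. The specific pattern shown, bcsel(a, 0.0, 1.0) versus b2f(inot(a)), is an assumption chosen to match the mention of const 1.0; the message does not list the exact rules.

```python
# Toy scalar models of the NIR opcodes involved. The names mirror the opcodes,
# but the bodies are plain Python and the rewrite pattern is an assumption.

def bcsel(cond, if_true, if_false):
    """Select between two values based on a boolean."""
    return if_true if cond else if_false


def b2f(cond):
    """Boolean to float: True -> 1.0, False -> 0.0."""
    return 1.0 if cond else 0.0


def inot(cond):
    """Logical negation of a boolean."""
    return not cond


def forms_agree():
    """Check both forms give the same result for every boolean input."""
    return all(bcsel(a, 0.0, 1.0) == b2f(inot(a)) for a in (False, True))
```

Because the two forms agree for both boolean inputs, a pass can rewrite in either direction without changing results, which is what allows converting back to bcsel at the end.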

Timothy Arceri authored
On BDW:

total instructions in shared programs: 13061890 -> 13061877 (0.00%)
instructions in affected programs: 2441 -> 2428 (0.53%)
helped: 13
HURT: 0

total cycles in shared programs: 256612254 -> 256611784 (0.00%)
cycles in affected programs: 16418 -> 15948 (2.86%)
helped: 10
HURT: 2

V2: don't use ffma directly

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>

 09 Jan, 2017 3 commits


Timothy Arceri authored
On BDW:

total instructions in shared programs: 13061877 -> 13060965 (0.01%)
instructions in affected programs: 133569 -> 132657 (0.68%)
helped: 566
HURT: 0

total cycles in shared programs: 256611784 -> 256599536 (0.00%)
cycles in affected programs: 861016 -> 848768 (1.42%)
helped: 379
HURT: 73

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>

Kenneth Graunke authored
On BDW:

total instructions in shared programs: 13074882 -> 13068703 (0.05%)
instructions in affected programs: 1823116 -> 1816937 (0.34%)
helped: 4187
HURT: 537

total cycles in shared programs: 256622718 -> 256425382 (0.08%)
cycles in affected programs: 123790120 -> 123592784 (0.16%)
helped: 3823
HURT: 2037

total spills in shared programs: 15276 -> 14929 (2.27%)
spills in affected programs: 9446 -> 9099 (3.67%)
helped: 352
HURT: 1

total fills in shared programs: 20496 -> 20144 (1.72%)
fills in affected programs: 13040 -> 12688 (2.70%)
helped: 352
HURT: 1

LOST: 2
GAINED: 21

v2: Rely on 'a' being a well-formed boolean (Connor, Eric).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>

Kenneth Graunke authored
On BDW:

total instructions in shared programs: 13071119 -> 13070371 (0.01%)
instructions in affected programs: 83424 -> 82676 (0.90%)
helped: 505
HURT: 45 (all TCS, all hurt by a single instruction)

total cycles in shared programs: 256601322 -> 256588932 (0.00%)
cycles in affected programs: 819410 -> 807020 (1.51%)
helped: 450
HURT: 57

total loops in shared programs: 2950 -> 2942 (0.27%)
loops in affected programs: 8 -> 0
helped: 7
HURT: 0

v2: Drop unnecessary 'a@bool' annotation (Connor, Eric). Add a comment explaining the rule (Ian).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v1]
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>

 23 Dec, 2016 1 commit


Jason Ekstrand authored
This sequence shows up in The Talos Principle, at least under Vulkan, and prevents loop analysis from properly computing trip counts in a few loops.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
