    nir/algebraic: Reassociate open-coded flrp(1, b, c) · ab869261
    Ian Romanick authored
    
    
    In a previous version of this patch, Jason commented,
    
       "Re-associating based on whether or not something has a constant
       value of 1.0 seems a bit sneaky.  I think it's well within the rules
       but it seems like something that could bite you."
    
    That is possibly true.  The reassociation will generate different
    results if fabs(b) >= 2**24 and fabs(c) < 0.5.  The delta increases as
    fabs(c) approaches 0.
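
    As an illustration only (not part of the patch), the sketch below
    emulates float32 in Python by rounding after every operation.  It
    assumes the open-coded form is 1 + c*(b - 1) and the reassociated form
    is (1 - c) + b*c.  The helper names f32, open_coded, and reassociated
    are hypothetical, and the values of b and c are hand-picked.

```python
import struct

def f32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def open_coded(b, c):
    # 1 + c * (b - 1), rounded to float32 after every operation.
    return f32(1.0 + f32(c * f32(b - 1.0)))

def reassociated(b, c):
    # (1 - c) + b * c, rounded to float32 after every operation.
    return f32(f32(1.0 - c) + f32(b * c))

b = f32(2.0**24 + 2.0)  # 16777218.0, exactly representable in float32
c = f32(0.375)          # fabs(c) < 0.5

print(open_coded(b, c))    # 6291457.0
print(reassociated(b, c))  # 6291457.5
```

    The exact real-number result for these inputs is 6291457.375, so
    neither form is exact.  They simply round differently once b - 1 can
    no longer be represented.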
    
    However, i965 has done this same reassociation indirectly for years.
    We would previously allow nir_op_flrp on all pre-Gen11 hardware even
    though Gen4 and Gen5 do not have a LRP instruction.  Optimizations in
    nir_opt_algebraic would convert expressions like a+c(b-a) into flrp(a,
    b, c).  On Gen7+, the hardware performs the same arithmetic as
    a(1-c)+bc.  Gen6 seems to implement LRP as a+c(b-a).  On Gen4 and
    Gen5, we would lower LRP to a sequence of instructions that implement
    a(1-c)+bc.  The lowering happens after all constant folding, so we
    would literally generate a 1+(-1) instruction sequence in this
    scenario: one instruction to load either 1 or -1 in a register, and
    another instruction to add either -1 or 1 to it.
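
    For reference, the two LRP formulations mentioned above agree in exact
    arithmetic and differ only in how floating-point rounding falls out.
    A minimal sketch using exact rationals follows.  The function names
    lrp_weighted and lrp_offset are hypothetical labels for the two
    expressions, not driver code.

```python
from fractions import Fraction

def lrp_weighted(a, b, c):
    # a(1-c) + bc
    return a * (1 - c) + b * c

def lrp_offset(a, b, c):
    # a + c(b-a)
    return a + c * (b - a)

# Arbitrary sample inputs with a = 1, the case this patch targets.
a, b, c = Fraction(1), Fraction(16777218), Fraction(3, 8)

exact = lrp_weighted(a, b, c)
assert exact == lrp_offset(a, b, c)  # identical without rounding
print(float(exact))                  # 6291457.375
```

    With a = 1 the first form reduces to (1 - c) + b*c and the second to
    1 + c*(b - 1).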
    
    This patch just cuts out the middle man.  Do the reassociation that
    we've always done, but do it explicitly at a time when we can benefit
    from other optimizations.
    
    A few cases that were hurt by "nir: Lower flrp(±1, b, c) and flrp(a,
    ±1, c) differently" are restored by this patch.  This includes a few
    shaders in ET:QW.
    
    I tried a similar thing for open-coded flrp(-1, b, c), and it hurt
    instruction counts on 35 shaders for ILK without helping any.  The
    helped / hurt cycle counts were about even.
    
    No changes on any other Intel platforms.
    
    Iron Lake and GM45 had similar results. (Iron Lake shown)
    total instructions in shared programs: 8172020 -> 8164367 (-0.09%)
    instructions in affected programs: 1089851 -> 1082198 (-0.70%)
    helped: 3285
    HURT: 64
    helped stats (abs) min: 1 max: 6 x̄: 2.35 x̃: 2
    helped stats (rel) min: 0.13% max: 12.00% x̄: 1.15% x̃: 0.83%
    HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
    HURT stats (rel)   min: 0.24% max: 0.64% x̄: 0.39% x̃: 0.38%
    95% mean confidence interval for instructions value: -2.32 -2.25
    95% mean confidence interval for instructions %-change: -1.16% -1.09%
    Instructions are helped.
    
    total cycles in shared programs: 188758338 -> 188719974 (-0.02%)
    cycles in affected programs: 20004922 -> 19966558 (-0.19%)
    helped: 3012
    HURT: 477
    helped stats (abs) min: 2 max: 142 x̄: 13.41 x̃: 12
    helped stats (rel) min: 0.01% max: 6.37% x̄: 0.52% x̃: 0.24%
    HURT stats (abs)   min: 2 max: 328 x̄: 4.27 x̃: 4
    HURT stats (rel)   min: <.01% max: 1.55% x̄: 0.14% x̃: 0.11%
    95% mean confidence interval for cycles value: -11.38 -10.62
    95% mean confidence interval for cycles %-change: -0.46% -0.41%
    Cycles are helped.
    
    Reviewed-by: Matt Turner <mattst88@gmail.com>