    llvmpipe: avoid most 64 bit math in rasterization · 49ec647c
    Roland Scheidegger authored
    The trick here is to recognize that in the c + n * dcdx calculations,
    the lower FIXED_ORDER bits cannot change (as the dcdx values have those
    bits all zero), and that this in turn means the sign bit of the result
    cannot differ either, that is
    sign(c + n*dcdx) == sign((c >> FIXED_ORDER) + n*(dcdx >> FIXED_ORDER)).
    That shaves off more than enough bits to never require 64bit masks.
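
The sign identity above can be checked with a small standalone sketch. This is not the llvmpipe code: the function names are hypothetical, FIXED_ORDER is assumed to be 8 (matching the 8 subpixel bits mentioned for the piglit run), and the shifted values are assumed to fit in 32 bits. It also relies on >> being an arithmetic shift for negative values, which is implementation-defined in C but holds on common compilers.

```c
#include <assert.h>
#include <stdint.h>

#define FIXED_ORDER 8   /* assumption: 8 subpixel bits */

/* Reference test: 1 if c + n*dcdx is negative, computed in 64-bit math. */
int sign_full(int64_t c, int64_t dcdx, int n)
{
   return (c + (int64_t)n * dcdx) < 0;
}

/* Reduced test: drop the low FIXED_ORDER bits first, then use 32-bit math.
 * Only valid when the low FIXED_ORDER bits of dcdx are all zero. */
int sign_shifted(int64_t c, int64_t dcdx, int n)
{
   assert((dcdx & ((1 << FIXED_ORDER) - 1)) == 0);
   /* Assumes arithmetic right shift, i.e. floor division by 2^FIXED_ORDER. */
   int32_t cs = (int32_t)(c >> FIXED_ORDER);
   int32_t ds = (int32_t)(dcdx >> FIXED_ORDER);
   return (cs + n * ds) < 0;
}

/* Returns 1 if both tests agree for every step n in [0, max_n]. */
int signs_agree(int64_t c, int64_t dcdx, int max_n)
{
   for (int n = 0; n <= max_n; n++) {
      if (sign_full(c, dcdx, n) != sign_shifted(c, dcdx, n))
         return 0;
   }
   return 1;
}
```

The reasoning is that the shift acts as a floor division by 2^FIXED_ORDER: because n*dcdx is an exact multiple of 2^FIXED_ORDER, it passes through the floor unchanged, and a value is negative exactly when its floor-divided value is negative.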
    A shifted plane c value could still easily exceed 32 bits. However, since we
    throw out planes which are trivial accept even before binning (and similarly
    don't even get to see tris for which there was a trivial reject plane), this
    is never a problem.
    The idea isn't all that revolutionary; in fact something similar was tried
    ages ago (9773722c) back when the values were only 32 bit anyway. I now
    believe it didn't quite work then because the adjustment needed for testing
    trivial reject / partial masks wasn't handled correctly.
    This still keeps the separate 32/64 bit paths for now, as the 32 bit one still
    looks minimally simpler (and also because if we passed in dcdx/dcdy/eo unscaled
    from setup, which would be a good reason to ditch the 32 bit path, we'd need to
    change the special-purpose rasterization functions for small tris).
    
    This passes piglit triangle-rasterization (-fbo -auto -max_size
    -subpixelbits 8) and triangle-rasterization-overdraw (with some hacks
    to make it work correctly with large sizes) easily (full piglit as
    well of course, but most tests wouldn't use triangles large enough to
    be affected, that is tris with a bounding box over 128x128).
    The profiler indeed shows that time spent in the rast_tri functions is reduced
    substantially, but of course only if the tris are large. I measured a 3%
    improvement in mesa gloss demo when supersized to twice the screen size...
    
    Reviewed-by: Brian Paul <brianp@vmware.com>
    Reviewed-by: Jose Fonseca <jfonseca@vmware.com>