    i965/tiled_memcpy: inline movntdqa loads in tiled_to_linear · 11b1afdc
    Scott D Phillips authored and Tapani Pälli committed
    
    
    The reference for MOVNTDQA says:
    
        For WC memory type, the nontemporal hint may be implemented by
        loading a temporary internal buffer with the equivalent of an
        aligned cache line without filling this data to the cache.
        [...] Subsequent MOVNTDQA reads to unread portions of the WC
        cache line will receive data from the temporary internal
        buffer if data is available.
    
    This hidden, cache-line-sized temporary buffer can improve
    read performance from WC (write-combining) maps.
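
As a rough illustration of the access pattern described above (not the code in this commit), the sketch below reads one 64-byte cache line at a time with four MOVNTDQA loads via the SSE4.1 intrinsic _mm_stream_load_si128, preceded by the mfence mentioned in the v2 note. The helper name streaming_load_copy, its alignment requirements, and the memcpy fallback for non-x86 targets are assumptions made for this example; it also assumes GCC or Clang for the target attribute.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__x86_64__) || defined(__i386__)
#include <smmintrin.h> /* SSE4.1: _mm_stream_load_si128 (MOVNTDQA) */

/* Hypothetical helper, not the Mesa function.
 * Assumes src is 16-byte aligned and bytes is a multiple of 64. */
__attribute__((target("sse4.1")))
static void
streaming_load_copy(void *dst, const void *src, size_t bytes)
{
   assert(((uintptr_t)src & 15) == 0);
   assert(bytes % 64 == 0);

   /* The intrinsic takes a non-const pointer in some headers. */
   __m128i *s = (__m128i *)(uintptr_t)src;
   uint8_t *d = (uint8_t *)dst;

   /* Fence once before the streaming loads, as in the v2 note. */
   _mm_mfence();

   for (size_t i = 0; i < bytes; i += 64) {
      /* Four 16-byte streaming loads cover one 64-byte cache line. */
      __m128i r0 = _mm_stream_load_si128(s + 0);
      __m128i r1 = _mm_stream_load_si128(s + 1);
      __m128i r2 = _mm_stream_load_si128(s + 2);
      __m128i r3 = _mm_stream_load_si128(s + 3);

      /* The destination is ordinary memory; unaligned stores are fine. */
      _mm_storeu_si128((__m128i *)(d + 0), r0);
      _mm_storeu_si128((__m128i *)(d + 16), r1);
      _mm_storeu_si128((__m128i *)(d + 32), r2);
      _mm_storeu_si128((__m128i *)(d + 48), r3);

      s += 4;
      d += 64;
   }
}
#else
/* Fallback for targets without SSE4.1: a plain copy. */
static void
streaming_load_copy(void *dst, const void *src, size_t bytes)
{
   memcpy(dst, src, bytes);
}
#endif
```

On ordinary write-back memory MOVNTDQA behaves like a regular aligned load, so the sketch still copies correctly there; any speedup would only be expected when src is a WC mapping.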
    
    v2: Add mfence at start of tiled_to_linear for streaming loads (Chris)
    v3: add Android build support (Tapani)
    v4: squash 'fix i915: Fix streaming loads for intel_tiled_memcpy'
        separate sse41 into its own static library (Tapani)
    
    Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v2)
    Reviewed-by: Matt Turner <mattst88@gmail.com> (v2)
    Acked-by: Kenneth Graunke <kenneth@whitecape.org>
    Reviewed-by: Matt Turner <mattst88@gmail.com>
    Signed-off-by: Tapani Pälli <tapani.palli@intel.com>