    lib: Provide an accelerated routine for readback from WC · 6a06d014
    Chris Wilson authored
    Reading from write-combining (WC) memory is awfully slow, as each access
    is uncached and so performed synchronously, stalling for the memory load.
    For this purpose, x86 introduced a streaming-load instruction in SSE 4.1
    (MOVNTDQA) that uses a small internal buffer to accelerate reading back
    a cacheline at a time from uncached memory.
    
    v2: Don't be lazy and handle misalignment.
    v3: Switch out of sse41 before emitting the generic memcpy routine
    v4: Replace open-coded memcpy_from_wc
    v5: Always flush the internal buffer before use (Eric)
    v6: Assume bulk moves, so check for dst alignment.
    v7: Use _mm_mfence instead of __builtin_ia32_mfence for consistency,
    remove superfluous defines (Ville)
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Eric Anholt <eric@anholt.net>
    Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Name
assembler
benchmarks
debugger
docs
include
lib
m4
man
overlay
scripts
shaders
tests
tools
.editorconfig
.gitignore
CONTRIBUTING
COPYING
MAINTAINERS
Makefile.am
NEWS
README
autogen.sh
configure.ac
meson.build
meson.sh