  1. Dec 17, 2010
  2. Dec 07, 2010
      Fix for potential unaligned memory accesses · 3d094997
      Siarhei Siamashka authored
      The temporary scanline buffer allocated on stack was declared
      as uint8_t array. As a result, the compiler was free to select
      any arbitrary alignment for it (even though there is typically
      no reason to use really weird alignments here and the stack is
      normally at least 4 bytes aligned on most platforms). Having
      improper alignment is non-portable and can impact performance
      or even make the code misbehave depending on the target platform.
      
      Using uint64_t type for this array should ensure that any possible
      memory accesses done by pixman code are going to be handled correctly
      (pixman-combine64.c can access this buffer via uint64_t * pointer).
      
      An alignment-related problem was reported in:
      http://lists.freedesktop.org/archives/pixman/2010-November/000747.html
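The alignment argument above can be illustrated with a minimal sketch. This is not the pixman code: `SCANLINE_BYTES`, `is_aligned_for_u64` and `scanline_buffer_demo` are hypothetical names chosen for the example. It only shows that a stack buffer declared as a `uint64_t` array can be viewed as bytes and still be accessed safely through a `uint64_t *` pointer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SCANLINE_BYTES 256 /* hypothetical buffer size, not taken from pixman */

/* Returns nonzero if p is suitably aligned for uint64_t access. */
static int is_aligned_for_u64(const void *p)
{
    return (uintptr_t)p % _Alignof(uint64_t) == 0;
}

/* Declares the scratch buffer as uint64_t so the compiler must align it
 * for 64-bit access, then uses it both as bytes and as 64-bit words. */
static int scanline_buffer_demo(void)
{
    uint64_t storage[SCANLINE_BYTES / sizeof(uint64_t)];
    uint8_t *bytes = (uint8_t *)storage; /* byte view of the same memory */
    uint64_t *wide = storage;            /* 64-bit view, always aligned */

    memset(bytes, 0xff, SCANLINE_BYTES);
    wide[0] = 0;                         /* clears the first 8 bytes */

    return is_aligned_for_u64(bytes) && bytes[0] == 0 && bytes[8] == 0xff;
}
```

Had the buffer been declared as a `uint8_t` array, the alignment check would not be guaranteed to hold.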
      ARM: added 'neon_src_rpixbuf_8888' fast path · 985e59a8
      Siarhei Siamashka authored
      With this optimization added, pixman assisted conversion from
      non-premultiplied to premultiplied alpha format is now fully
      NEON optimized (both with and without R/B color components
      swapping in the process).
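For reference, the conversion that this fast path accelerates can be sketched in scalar C. This is an illustrative sketch only, not the NEON code: the function names, the assumed 32-bit layout (alpha in the top byte) and the rounding formula are assumptions made for the example.

```c
#include <stdint.h>

/* Rounded (c * a) / 255 using the common add-and-shift approximation. */
static uint8_t mul_div_255(uint8_t c, uint8_t a)
{
    uint32_t t = (uint32_t)c * a + 0x80;
    return (uint8_t)((t + (t >> 8)) >> 8);
}

/* Converts one non-premultiplied pixel (alpha in bits 24..31) to
 * premultiplied form, optionally swapping the two outer color channels. */
static uint32_t premultiply_pixel(uint32_t p, int swap_rb)
{
    uint8_t a  = (uint8_t)(p >> 24);
    uint8_t hi = mul_div_255((uint8_t)(p >> 16), a); /* bits 16..23 */
    uint8_t g  = mul_div_255((uint8_t)(p >> 8), a);  /* bits 8..15 */
    uint8_t lo = mul_div_255((uint8_t)p, a);         /* bits 0..7 */

    if (swap_rb) {
        uint8_t tmp = hi;
        hi = lo;
        lo = tmp;
    }
    return ((uint32_t)a << 24) | ((uint32_t)hi << 16) |
           ((uint32_t)g << 8) | lo;
}
```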
  3. Dec 03, 2010
  4. Nov 22, 2010
  5. Nov 21, 2010
  6. Nov 19, 2010
      Fix argument quoting for AC_INIT. · e7ee43c3
      Cyril Brulebois authored
      
      This gets rid of the following warnings:
      | autoreconf -vfi
      | autoreconf: Entering directory `.'
      | autoreconf: configure.ac: not using Gettext
      | autoreconf: running: aclocal --force
      | configure.ac:61: warning: AC_INIT: not a literal: "pixman@lists.freedesktop.org"
      | autoreconf: configure.ac: tracing
      | configure.ac:61: warning: AC_INIT: not a literal: "pixman@lists.freedesktop.org"
      
      Signed-off-by: Cyril Brulebois <kibi@debian.org>
  7. Nov 16, 2010
  8. Nov 12, 2010
      Improve conical gradients opacity check · da0176e8
      Andrea Canciani authored
      Conical gradients are completely opaque if all of their stops are
      opaque and the repeat mode is not 'none'.
      Fix opacity check · 151f2554
      Andrea Canciani authored
      Radial gradients are "conical", thus they can have some non-opaque
      parts even if all of their stops are completely opaque.
      
      To guarantee that a radial gradient is actually opaque, it needs to
      also have one of the two circles containing the other one. In this
      case when extrapolating, the whole plane is completely covered (as
      explained in the comment in pixman-radial-gradient.c).
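The containment condition described above can be sketched as follows. This is a hypothetical illustration, not the code from pixman: all type and function names are invented, the stops are reduced to plain alpha values, and the strict inequality in the containment test is an assumption. The repeat-mode requirement is carried over from the conical gradient check.

```c
#include <stdbool.h>
#include <stddef.h>

typedef enum { REPEAT_NONE, REPEAT_NORMAL, REPEAT_PAD, REPEAT_REFLECT } repeat_t;
typedef struct { double x, y, radius; } circle_t;

/* True if every stop has full alpha (1.0). */
static bool all_stops_opaque(const double *stop_alpha, size_t n_stops)
{
    for (size_t i = 0; i < n_stops; i++)
        if (stop_alpha[i] < 1.0)
            return false;
    return true;
}

/* One circle contains the other when the distance between the centers
 * is smaller than the difference of the radii (compared as squares). */
static bool one_circle_contains_other(circle_t c1, circle_t c2)
{
    double dx = c2.x - c1.x;
    double dy = c2.y - c1.y;
    double dr = c2.radius - c1.radius;
    return dx * dx + dy * dy < dr * dr;
}

/* A radial gradient is treated as opaque only if all three hold. */
static bool radial_gradient_is_opaque(circle_t c1, circle_t c2,
                                      repeat_t repeat,
                                      const double *stop_alpha, size_t n_stops)
{
    return repeat != REPEAT_NONE &&
           all_stops_opaque(stop_alpha, n_stops) &&
           one_circle_contains_other(c1, c2);
}
```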
      Remove unused stop_range field · 19ed415b
      Andrea Canciani authored
  9. Nov 10, 2010
      ARM: optimization for scaled src_0565_0565 with nearest filter · d8fe87a6
      Siarhei Siamashka authored
      The performance improvement is only in the ballpark of 5% when
      compared against C code built with a reasonably good compiler
      (gcc 4.5.1). But gcc 4.4 produces approximately 30% slower code
      here, so assembly optimization makes sense to avoid dependency
      on the compiler quality and/or optimization options.
      
      Benchmark from ARM11:
          == before ==
          op=1, src_fmt=10020565, dst_fmt=10020565, speed=34.86 MPix/s
      
          == after ==
          op=1, src_fmt=10020565, dst_fmt=10020565, speed=36.62 MPix/s
      
      Benchmark from ARM Cortex-A8:
          == before ==
          op=1, src_fmt=10020565, dst_fmt=10020565, speed=89.55 MPix/s
      
          == after ==
          op=1, src_fmt=10020565, dst_fmt=10020565, speed=94.91 MPix/s
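As a point of comparison, a plain C scanline loop for nearest scaling of 16bpp pixels might look roughly like the sketch below. It is not the code from the commit: the function name, the `fixed_16_16` typedef and the parameter names are assumptions. The source position is tracked in 16.16 fixed point and truncated to select the nearest source pixel.

```c
#include <stdint.h>

typedef int32_t fixed_16_16; /* hypothetical name for a 16.16 fixed-point value */

/* Copies w destination pixels, sampling the source row at vx, vx + unit_x,
 * vx + 2 * unit_x, ... where the integer part selects the source pixel. */
static void scaled_nearest_scanline_0565(uint16_t *dst, const uint16_t *src,
                                         int w, fixed_16_16 vx,
                                         fixed_16_16 unit_x)
{
    while (w-- > 0) {
        *dst++ = src[vx >> 16];
        vx += unit_x;
    }
}
```

With `unit_x` set to 0x8000 (one half), each source pixel is written twice, which corresponds to a 2x horizontal upscale.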
      ARM: NEON optimization for scaled src_0565_8888 with nearest filter · b8007d04
      Siarhei Siamashka authored
      Benchmark from ARM Cortex-A8 @720MHz:
          == before ==
          op=1, src_fmt=10020565, dst_fmt=20028888, speed=8.99 MPix/s
      
          == after ==
          op=1, src_fmt=10020565, dst_fmt=20028888, speed=76.98 MPix/s
      
          == unscaled ==
          op=1, src_fmt=10020565, dst_fmt=20028888, speed=137.78 MPix/s
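The per-pixel work in this kind of fast path is a 16bpp to 32bpp format conversion. A scalar sketch is shown below, assuming an r5g6b5 source and an a8r8g8b8 destination with alpha forced to 0xff. The function name and the bit-replication approach are assumptions for illustration, not the NEON implementation.

```c
#include <stdint.h>

/* Expands one r5g6b5 pixel to a8r8g8b8 by replicating the top bits of
 * each channel into the low bits, so 0x1f maps to 0xff and 0 maps to 0. */
static uint32_t convert_0565_to_8888(uint16_t s)
{
    uint32_t r = (s >> 11) & 0x1f;
    uint32_t g = (s >> 5) & 0x3f;
    uint32_t b = s & 0x1f;

    r = (r << 3) | (r >> 2);
    g = (g << 2) | (g >> 4);
    b = (b << 3) | (b >> 2);

    return 0xff000000u | (r << 16) | (g << 8) | b;
}
```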
      ARM: NEON optimization for scaled src_8888_0565 with nearest filter · 2e855a2b
      Siarhei Siamashka authored
      Benchmark from ARM Cortex-A8 @720MHz:
          == before ==
          op=1, src_fmt=20028888, dst_fmt=10020565, speed=42.51 MPix/s
      
          == after ==
          op=1, src_fmt=20028888, dst_fmt=10020565, speed=55.61 MPix/s
      
          == unscaled ==
          op=1, src_fmt=20028888, dst_fmt=10020565, speed=117.99 MPix/s
      ARM: NEON optimization for scaled over_8888_0565 with nearest filter · 4a09e472
      Siarhei Siamashka authored
      Benchmark from ARM Cortex-A8 @720MHz:
          == before ==
          op=3, src_fmt=20028888, dst_fmt=10020565, speed=10.29 MPix/s
      
          == after ==
          op=3, src_fmt=20028888, dst_fmt=10020565, speed=36.36 MPix/s
      
          == unscaled ==
          op=3, src_fmt=20028888, dst_fmt=10020565, speed=79.40 MPix/s
      ARM: NEON optimization for scaled over_8888_8888 with nearest filter · 67a4991f
      Siarhei Siamashka authored
      Benchmark from ARM Cortex-A8 @720MHz:
          == before ==
          op=3, src_fmt=20028888, dst_fmt=20028888, speed=12.73 MPix/s
      
          == after ==
          op=3, src_fmt=20028888, dst_fmt=20028888, speed=28.75 MPix/s
      
          == unscaled ==
          op=3, src_fmt=20028888, dst_fmt=20028888, speed=53.03 MPix/s
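The OVER operation benchmarked here can be written per pixel in scalar C as a rough sketch. It assumes premultiplied a8r8g8b8 pixels. The function names and the rounding formula are assumptions for illustration and do not reproduce the NEON code.

```c
#include <stdint.h>

/* Rounded (c * a) / 255 using the common add-and-shift approximation. */
static uint32_t mul_255(uint32_t c, uint32_t a)
{
    uint32_t t = c * a + 0x80;
    return (t + (t >> 8)) >> 8;
}

/* dst = src + dst * (255 - src_alpha) / 255, applied to each 8-bit channel
 * of premultiplied pixels, with saturation at 255. */
static uint32_t over_8888_8888(uint32_t src, uint32_t dst)
{
    uint32_t inv_alpha = 255 - (src >> 24);
    uint32_t result = 0;

    for (int shift = 0; shift < 32; shift += 8) {
        uint32_t s = (src >> shift) & 0xff;
        uint32_t d = (dst >> shift) & 0xff;
        uint32_t c = s + mul_255(d, inv_alpha);
        if (c > 255)
            c = 255;
        result |= c << shift;
    }
    return result;
}
```

An opaque source replaces the destination, and a fully transparent source leaves it unchanged.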
      ARM: performance tuning of NEON nearest scaled pixel fetcher · 0b56244a
      Siarhei Siamashka authored
      Interleaving the use of NEON registers helps to avoid some stalls
      in NEON pipeline and provides a small performance improvement.
      ARM: macro template in C code to simplify using scaled fast paths · 6e76af0d
      Siarhei Siamashka authored
      This template can be used to instantiate scaled fast path functions
      by providing main loop code and calling NEON assembly optimized
      scanline processing functions from it. Another macro can be used
      to simplify adding entries to fast path tables.
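The idea of instantiating a main loop from a macro and plugging in a scanline function can be sketched as below. Every name here (`DEFINE_NEAREST_MAIN_LOOP`, `scanline_copy_8888`, `fixed_t`) is hypothetical, and a C function stands in for the NEON assembly scanline routine. The actual template in pixman has a different shape.

```c
#include <stdint.h>

typedef int32_t fixed_t; /* 16.16 fixed point, hypothetical name */
#define FIXED_1 ((fixed_t)1 << 16)

/* Stand-in for an assembly optimized scanline function. */
static void scanline_copy_8888(uint32_t *dst, const uint32_t *src,
                               int w, fixed_t vx, fixed_t unit_x)
{
    while (w-- > 0) {
        *dst++ = src[vx >> 16];
        vx += unit_x;
    }
}

/* Generates a main loop that walks destination rows, picks the nearest
 * source row for each one, and hands it to the scanline function. */
#define DEFINE_NEAREST_MAIN_LOOP(name, scanline_func, src_type, dst_type)     \
static void name(dst_type *dst, int dst_stride,                               \
                 const src_type *src, int src_stride,                         \
                 int width, int height,                                       \
                 fixed_t vx, fixed_t unit_x, fixed_t vy, fixed_t unit_y)      \
{                                                                             \
    while (height-- > 0) {                                                    \
        scanline_func(dst, src + (vy >> 16) * src_stride, width, vx, unit_x); \
        dst += dst_stride;                                                    \
        vy += unit_y;                                                         \
    }                                                                         \
}

DEFINE_NEAREST_MAIN_LOOP(scaled_nearest_copy_8888, scanline_copy_8888,
                         uint32_t, uint32_t)
```

Each new fast path then needs only a scanline function and a one-line instantiation.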
      ARM: nearest scaling support for NEON scanline compositing functions · 88014a0e
      Siarhei Siamashka authored
      Now it is possible to generate scanline processing functions
      for the case when the source image is scaled with NEAREST filter.
      
      Only 16bpp and 32bpp pixel formats are supported for now, but
      others can be added later when needed. All the existing NEON
      fast path functions should be quite easy to reuse for implementing
      fast paths which can work with scaled source images.
      ARM: NEON: source image pixel fetcher can be overridden now · 324712e4
      Siarhei Siamashka authored
      Added a special macro 'pixld_src' which is now responsible for fetching
      pixels from the source image. Right now it just passes all its arguments
      directly to 'pixld' macro, but it can be used in the future to provide
      a special pixel fetcher for implementing nearest scaling.
      
      The 'pixld_src' has a lot of arguments which define its behavior. But
      for each particular fast path implementation, we already know the NEON
      register allocation and how many pixels are processed in a single block.
      That's why a higher level macro 'fetch_src_pixblock' is also introduced
      (it's easier to use because it has no arguments) and used everywhere
      in 'pixman-arm-neon-asm.S' instead of VLD instructions.
      
      This patch does not introduce any functional changes and the resulting code
      in the compiled object file is exactly the same.
      ARM: fix 'vld1.8'->'vld1.32' typo in add_8888_8888 NEON fast path · cb3f1830
      Siarhei Siamashka authored
      This was mostly harmless and had no effect on little endian systems.
      But the wrong vector element size is at least inconsistent and
      could theoretically cause problems on big endian ARM systems.
  10. Nov 05, 2010
      Do CPU features detection from 'constructor' function when compiled with gcc · fed4a2fd
      Siarhei Siamashka authored
      gcc has supported the 'constructor' attribute since version 2.7. It
      allows a library to run a constructor function for initialization. This
      eliminates an extra branch for each composite operation and also helps
      to avoid complaints from race condition detection tools like helgrind.
      
      Other compilers may or may not support this attribute properly.
      Ideally, a compiler should fail to compile code with an unknown
      attribute, so that the configure check does the right thing. In
      practice, problems are still possible. Fortunately, they should
      be easy to find, because a NULL pointer dereference should
      happen almost immediately if the constructor fails to run.
      
      clang 2.7:
        supports __attribute__((constructor)) properly and pretends to be gcc
      
      tcc 0.9.25:
        ignores __attribute__((constructor)), but does not pretend to be gcc
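A minimal sketch of the technique, assuming a gcc-compatible compiler: the function pointer is filled in by a constructor before `main` runs, so the per-call path needs no "is it initialized yet" branch. The names `composite`, `composite_generic` and `pick_implementation` are hypothetical and not taken from pixman.

```c
#include <stddef.h>

typedef int (*composite_fn)(int);

/* Stand-in for an implementation chosen after CPU feature detection. */
static int composite_generic(int x)
{
    return x + 1;
}

static composite_fn implementation = NULL;

#if defined(__GNUC__)
/* Runs automatically at load time, before main(). */
__attribute__((constructor))
static void pick_implementation(void)
{
    implementation = composite_generic;
}

/* No initialization check needed on the hot path. */
static int composite(int x)
{
    return implementation(x);
}
#else
/* Fallback for other compilers: lazy initialization with an extra branch. */
static int composite(int x)
{
    if (!implementation)
        implementation = composite_generic;
    return implementation(x);
}
#endif
```

If a compiler accepted the attribute but silently ignored it, the first call to `composite` would dereference a NULL pointer, which is the failure mode the commit message describes.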
      Delete the source_image_t struct. · 99699771
      Søren Sandmann Pedersen authored
      It serves no purpose anymore now that the source_class_t field is gone.
      [mmx] Mark some of the output variables as early-clobber. · f405b407
      Søren Sandmann Pedersen authored
      
      GCC assumes that input variables in inline assembly are fully consumed
      before any output variable is written. This means it may allocate the
      variables in the same register unless the output variables are marked
      as early-clobber.
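The effect of the early-clobber modifier can be shown with a small sketch that uses general-purpose registers instead of MMX ones. It is guarded so that it only uses inline assembly on x86 with a gcc-compatible compiler. The function name `fill_pair` is hypothetical.

```c
#include <stdint.h>

/* Writes the same value to two outputs. The first output is written while
 * the input is still needed by the second instruction, so it must be marked
 * early-clobber ("=&r") to keep it out of the input's register. The last
 * output may share a register with the input. */
static void fill_pair(uint32_t fill, uint32_t *a, uint32_t *b)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    uint32_t v1, v2;
    __asm__ ("movl %2, %0\n\t"
             "movl %2, %1\n\t"
             : "=&r" (v1), "=r" (v2)
             : "r" (fill));
    *a = v1;
    *b = v2;
#else
    /* Portable fallback for other targets and compilers. */
    *a = fill;
    *b = fill;
#endif
}
```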
      
      From Jeremy Huddleston:
      
          I noticed a problem building pixman with clang and reported it to
          the clang developers.  They responded back with a comment about
          the inline asm in pixman-mmx.c and suggested a fix:
      
          """
          Incidentally, Jeremy, in the asm that reads
          __asm__ (
          "movq %7, %0\n"
          "movq %7, %1\n"
          "movq %7, %2\n"
          "movq %7, %3\n"
          "movq %7, %4\n"
          "movq %7, %5\n"
          "movq %7, %6\n"
          : "=y" (v1), "=y" (v2), "=y" (v3),
            "=y" (v4), "=y" (v5), "=y" (v6), "=y" (v7)
          : "y" (vfill));
      
          all the output operands except the last one should be marked as
          earlyclobber ("=&y"). This is working by accident with gcc.
          """
      
      Cc: jeremyhu@apple.com
      Reviewed-by: Matt Turner <mattst88@gmail.com>
      Remove workaround for a bug in the 1.6 X server. · 9c19a85b
      Søren Sandmann Pedersen authored
      There used to be a bug in the X server where it would rely on
      out-of-bounds accesses when it was asked to composite with a
      window as the source. It would create a pixman image pointing
      to some bogus position in memory, but then set a clip region
      to the position where the actual bits were.
      
      Due to a bug in old versions of pixman, where it would not clip
      against the image bounds when a clip region was set, this would
      actually work. So when the pixman bug was fixed, a workaround was
      added to allow certain out-of-bound accesses.
      
      However, the 1.6 X server is so old now that we can remove this
      workaround. This does mean that if you update pixman to 0.22 or later,
      you will need to use a 1.7 X server or later.
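Clipping against the image bounds, which the old pixman versions skipped when a clip region was set, amounts to intersecting the clip box with the image rectangle. The sketch below is a hypothetical illustration with invented names (`box_t`, `clip_to_image_bounds`) and a single clip box rather than a full region.

```c
#include <stdbool.h>

typedef struct { int x1, y1, x2, y2; } box_t; /* x2 and y2 are exclusive */

/* Intersects a clip box with the rectangle (0, 0, width, height).
 * Returns false if nothing is left, meaning no pixel may be accessed. */
static bool clip_to_image_bounds(box_t clip, int width, int height, box_t *out)
{
    out->x1 = clip.x1 > 0 ? clip.x1 : 0;
    out->y1 = clip.y1 > 0 ? clip.y1 : 0;
    out->x2 = clip.x2 < width ? clip.x2 : width;
    out->y2 = clip.y2 < height ? clip.y2 : height;
    return out->x1 < out->x2 && out->y1 < out->y2;
}
```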
  11. Nov 01, 2010