1. 25 Jun, 2019 1 commit
  2. 24 Jun, 2019 1 commit
  3. 20 Jun, 2019 10 commits
  4. 18 Jun, 2019 3 commits
  5. 17 Jun, 2019 5 commits
  6. 12 Jun, 2019 2 commits
    • panfrost: Remove "vertex/tiler render target" silliness · 8c88bd02
      Alyssa Rosenzweig authored
      I don't think these are actual structures, just figments of
      cargo-culting dumped memory without making any sense of it. Nothing
      seems to break if the region is zeroed out, anyway.
      Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
      8c88bd02
    • panfrost: Replace pantrace with direct decoding · fc7bcee8
      Alyssa Rosenzweig authored
      History lesson! In the early days of Panfrost, we had a library
      independent of the driver called `panwrap` which would be LD_PRELOAD'ed
      into a driver to decode its cmdstream in real-time. When upstreaming
      Panfrost, we realized that we would much rather have this decode
      functionality maintained in-tree to avoid divergence, but that we could
      not upstream panwrap because of its use with the legacy API. So we
      instead dumped GPU memory to the filesystem with an out-of-tree panwrap,
      and decoded that with the in-tree pandecode module. When we migrated to
      the new kernel, we just added support for doing this memory dump
      directly from the driver (via a module "pantrace").
      
      This works, but dumping memory every frame is sloooooooooooooow and
      error-prone. I figured if we have pandecode in-tree, we might as well
      link to it directly in the driver, allowing us to decode Panfrost's
      command streams without dumping memory to the filesystem first. This
      cleans up the code *substantially* and improves dumping performance by a
      HUGE margin. I'm talking "several seconds per frame" to "dumping in
      real-time" kind of jump.
      
      Note to users: this removes the environment variable "PANTRACE_BASE".
      Instead, for equivalent functionality set "PAN_MESA_DEBUG=trace" and
      redirect stdout to the file of your choosing.
      
      This should make debugging Panfrost much more pleasant.
      Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
      fc7bcee8
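The note to users above amounts to running something like `PAN_MESA_DEBUG=trace ./app > trace.txt`, where `./app` and `trace.txt` are placeholders. As a rough illustration of how a driver might test for such a flag, here is a minimal standalone sketch. It assumes the variable holds a comma-separated list of flag names; the helper `debug_flag_set` is hypothetical and is not Mesa's actual option parser.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns true if `flag` appears as a whole token in
 * the comma-separated list `value`. The comma-separated format is an
 * assumption for illustration, not taken from the commit. */
bool
debug_flag_set(const char *value, const char *flag)
{
        assert(flag != NULL);

        size_t flag_len = strlen(flag);

        /* An unset variable or an empty flag name never matches. */
        if (value == NULL || flag_len == 0)
                return false;

        const char *p = value;

        while (*p != '\0') {
                /* Length of the current token, up to the next comma. */
                size_t tok_len = strcspn(p, ",");

                if (tok_len == flag_len && strncmp(p, flag, flag_len) == 0)
                        return true;

                p += tok_len;

                /* Skip the separator, if any. */
                if (*p == ',')
                        ++p;
        }

        return false;
}
```

A caller would pass the environment value directly, for example `debug_flag_set(getenv("PAN_MESA_DEBUG"), "trace")` with `<stdlib.h>` included, and decode the command stream to stdout only when it returns true.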
  7. 10 Jun, 2019 1 commit
    • panfrost: Refactor texture/sampler upload · 416fc3b5
      Alyssa Rosenzweig authored
      We move some code packing the texture/sampler descriptors into
      dedicated functions (out of the terrifyingly long emit_for_draw
      monolith), cleaning them up as we go.
      
      The discovery triggering the cleanup is the format for including manual
      strides in the presence of mipmaps/cubemaps. Rather than being placed at
      the end, as previously assumed, they are interleaved after each address.
      This difference is relevant when handling NPOT linear mipmaps.
      Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
      416fc3b5
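The difference between the two stride placements described in this commit can be sketched as follows. This is an illustration only: the function names, the 64-bit payload words and the flat array are assumptions, not Panfrost's actual descriptor structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Previously assumed layout (hypothetical representation): all addresses
 * first, then all strides appended at the end.
 *   [addr0, addr1, ..., stride0, stride1, ...]
 * Returns the number of words written. */
size_t
pack_strides_at_end(uint64_t *out, const uint64_t *addrs,
                    const uint32_t *strides, size_t count)
{
        assert(out != NULL && addrs != NULL && strides != NULL);

        for (size_t i = 0; i < count; ++i)
                out[i] = addrs[i];

        for (size_t i = 0; i < count; ++i)
                out[count + i] = strides[i];

        return 2 * count;
}

/* Layout described in the commit: each stride directly follows the
 * address it belongs to.
 *   [addr0, stride0, addr1, stride1, ...]
 * Returns the number of words written. */
size_t
pack_strides_interleaved(uint64_t *out, const uint64_t *addrs,
                         const uint32_t *strides, size_t count)
{
        assert(out != NULL && addrs != NULL && strides != NULL);

        for (size_t i = 0; i < count; ++i) {
                out[2 * i] = addrs[i];
                out[2 * i + 1] = strides[i];
        }

        return 2 * count;
}
```

With a single address the two layouts are identical, which may be why the distinction only shows up once several mip levels or cube faces carry their own strides.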
  8. 05 Jun, 2019 1 commit
    • panfrost: Don't flip scanout · 2adf35e4
      Alyssa Rosenzweig authored
      mesa/st already flips the viewport, so we respect that rather than
      trying to flip the framebuffer ourselves while ignoring the viewport
      and relying on a messy heuristic.
      
      However, this brings an underlying disagreement about the interpretation
      of winding order to light. The blob uses a different strategy than Mesa
      for handling viewport Y flipping, so the meanings of the winding order
      bit are flipped for it. To keep things clean on our end, we rename to
      explicitly use Gallium (rather than flipped OpenGL) conventions.
      
      Fixes upside-down Xwayland/egl windows.
      
      v2: Adjust lowering configuration to correctly flip gl_PointCoord.y and
      gl_FragCoord.y. v1 was R-b'd by Tomeu, but then retracted due to these
      regressions, which are now fixed.
      Suggested-by: Rob Clark <robdclark@chromium.org>
      Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
      Sort-of-reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
      2adf35e4
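The link between Y flipping and winding order in this commit follows from a general geometric fact: mirroring a triangle vertically reverses its winding. A small self-contained sketch of that fact is below. The types and function names are made up for illustration and do not come from the driver.

```c
#include <assert.h>
#include <stdbool.h>

struct vec2 {
        float x, y;
};

/* Twice the signed area of triangle (a, b, c). In a Y-up coordinate
 * system a positive value means counter-clockwise winding. */
float
signed_area2(struct vec2 a, struct vec2 b, struct vec2 c)
{
        return (b.x - a.x) * (c.y - a.y) - (c.x - a.x) * (b.y - a.y);
}

bool
is_counter_clockwise(struct vec2 a, struct vec2 b, struct vec2 c)
{
        return signed_area2(a, b, c) > 0.0f;
}

/* Mirror a point vertically within a surface of the given height. */
struct vec2
flip_y(struct vec2 v, float height)
{
        assert(height >= 0.0f);

        struct vec2 flipped = { v.x, height - v.y };
        return flipped;
}
```

Because the sign of the area changes under `flip_y`, a front-face test written for one Y convention gives the opposite answer under the other, which is consistent with the winding bit appearing to have a flipped meaning.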
  9. 19 May, 2019 2 commits
  10. 16 May, 2019 2 commits
  11. 07 May, 2019 1 commit
    • panfrost: Refactor blend descriptors · 050b934a
      Alyssa Rosenzweig authored
      This commit does a fairly large cleanup of blend descriptors, although
      there should not be any functional changes. In particular, we split
      apart the Midgard and Bifrost blend descriptors, since they are
      radically different. From there, we can identify that the Midgard
      descriptor as previously written was really two render targets'
      descriptors stuck together. From this observation, we split the Midgard
      descriptor into what a single RT actually needs. This enables us to
      correctly dump blending configuration for MRT samples on Midgard. It
      also allows the Midgard and Bifrost blend code to peacefully coexist,
      with runtime selection rather than a #ifdef. So, as a bonus, this will
      help the future Bifrost effort, eliminating one major source of
      compile-time architectural divergence.
      Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
      050b934a
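The move from compile-time to runtime selection described in this commit can be sketched roughly as below. The descriptor fields, sizes and the `emit_blend_rt` helper are all invented for illustration; the real Midgard and Bifrost blend descriptors are different and are not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-render-target descriptors. The fields are placeholders
 * chosen only so that the two layouts differ. */
struct midgard_blend_rt {
        uint32_t equation;
        uint32_t constant;
};

struct bifrost_blend_rt {
        uint64_t shader;
        uint32_t equation;
        uint32_t flags;
};

/* Both layouts coexist in one build, instead of one being compiled out
 * behind an #ifdef. */
union blend_rt {
        struct midgard_blend_rt midgard;
        struct bifrost_blend_rt bifrost;
};

/* Fill in the descriptor for whichever architecture is detected at
 * runtime, returning the number of bytes that are meaningful. */
size_t
emit_blend_rt(bool is_bifrost, uint32_t equation, union blend_rt *out)
{
        assert(out != NULL);
        memset(out, 0, sizeof(*out));

        if (is_bifrost) {
                out->bifrost.equation = equation;
                return sizeof(out->bifrost);
        }

        out->midgard.equation = equation;
        return sizeof(out->midgard);
}
```

The design point is that a single binary carries both code paths, so architectural differences are handled by an ordinary branch rather than by separate builds.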
  12. 01 May, 2019 1 commit
  13. 07 Apr, 2019 1 commit
  14. 31 Mar, 2019 2 commits
  15. 19 Mar, 2019 2 commits
  16. 12 Mar, 2019 1 commit
    • panfrost: Identify fragment_extra flags · 587ad37e
      Alyssa Rosenzweig authored
      The fragment_extra structure contains additional fields extending the
      MRT framebuffer descriptor, snuck in between the main framebuffer
      descriptor and the render targets. Its fields include those related to
      transaction elimination and depth/stencil buffers. This patch identifies
      the flags field (previously just "unk" with some magic values) as well
      as identifying some (but not all) flags set by the driver.
      
      The process of identifying flags brought a bug to light where
      transaction elimination (checksumming) could not be enabled unless AFBC
      was in use. This issue is now resolved.
      Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
      Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
      587ad37e
  17. 25 Feb, 2019 1 commit
    • panfrost: Decode render target swizzle/channels · f943047e
      Alyssa Rosenzweig authored
      On MRT-capable systems, the framebuffer format is encoded as a 64-bit
      word in the render target descriptor. Previously, the two 32-bit
      words were exposed as opaque hex values. This commit identifies a 12-bit
      Mali swizzle and a 2-bit channel counter, removing some of the magic. It
      also adds decoding support for the AFBC and MSAA enable bits, which were
      already known but otherwise ignored in pandecode.
      Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
      f943047e
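Turning two opaque 32-bit words into named fields, as this commit does, comes down to bitfield extraction on the combined 64-bit word. The sketch below shows the general technique. The field widths (12 and 2 bits) come from the commit message, but the bit positions, macro names and struct are assumptions made for illustration and do not reflect the real hardware layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit positions: only the widths are taken from the commit
 * message. */
#define RT_SWIZZLE_SHIFT     0
#define RT_SWIZZLE_BITS      12
#define RT_NR_CHANNELS_SHIFT 12
#define RT_NR_CHANNELS_BITS  2

struct rt_format {
        unsigned swizzle;       /* raw 12-bit swizzle field */
        unsigned channel_field; /* raw 2-bit channel counter field */
};

/* Extract `bits` bits starting at bit `shift` from a 64-bit word. */
uint64_t
extract_bits(uint64_t word, unsigned shift, unsigned bits)
{
        assert(bits > 0 && bits < 64 && shift + bits <= 64);
        return (word >> shift) & ((UINT64_C(1) << bits) - 1);
}

/* Combine the two 32-bit halves and pull out the named fields. */
struct rt_format
decode_rt_format(uint32_t lo, uint32_t hi)
{
        uint64_t word = ((uint64_t) hi << 32) | lo;

        struct rt_format fmt = {
                .swizzle = (unsigned) extract_bits(word, RT_SWIZZLE_SHIFT,
                                                   RT_SWIZZLE_BITS),
                .channel_field = (unsigned) extract_bits(word,
                                                         RT_NR_CHANNELS_SHIFT,
                                                         RT_NR_CHANNELS_BITS),
        };

        return fmt;
}
```

The sketch deliberately keeps the channel field raw, since the commit message does not say how the 2-bit value maps to an actual channel count.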
  18. 21 Feb, 2019 1 commit
    • panfrost: Add pandecode (command stream debugger) · f6117820
      Alyssa Rosenzweig authored
      The `panwrap` utility can be LD_PRELOAD'd into a GLES app, intercepting
      communication between the driver and the kernel. Modern panwrap versions
      do no processing of their own; instead, they create a trace directory.
      This directory contains the following files:
      
       - control.log: a line-by-line plain text file, denoting important
         syscalls (mmaps and job submits) along with their arguments
      
       - memory_*.bin, shader_*.bin: binary dumps of mapped memory
      
      Together, these files contain enough information to reconstruct the
      command stream and shaders of (at minimum) a single frame.
      
      The `pandecode` utility takes this directory structure as input,
      reconstructing the mapped memory and using the job submit command as an
      entrypoint. It then walks the descriptors as the hardware would, parsing
      and pretty-printing. Its final output is the pretty-printed command
      stream interleaved with the disassembled shaders, suitable for driver
      debugging. For instance, the behaviour of two driver versions (one
      working, one broken) can be compared by diff'ing their decoded logs.
      
      pandecode/decode.c was originally a part of `panwrap`; it is the oldest
      living code in the project. Its history is generally not worth
      preserving.
      
      panwrap itself will continue to live downstream for the foreseeable
      future, as it is specifically written for the vendor kernel. It is
      possible, however, to produce equivalent traces directly from Panfrost,
      bypassing the intermediate wrapping layer for well-behaved drivers.
      Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
      f6117820
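The decode flow described in this commit (reconstruct mapped memory, start from the submitted job, then follow descriptors as the hardware would) can be sketched in a very reduced form. Everything concrete here is an assumption for illustration: the `mapped_region` table, the `job_header` layout with its `next_job` pointer, and both function names are hypothetical and are not pandecode's real data structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One reconstructed memory dump: a GPU address range backed by host data. */
struct mapped_region {
        uint64_t gpu_va;
        size_t size;
        const uint8_t *data;
};

/* Hypothetical job header: just a link to the next job and a type tag. */
struct job_header {
        uint64_t next_job; /* GPU address of the next job, 0 ends the chain */
        uint32_t job_type;
        uint32_t pad;
};

/* Translate a GPU address into a host pointer, or NULL if the requested
 * range is not fully contained in any mapped region. */
const void *
fetch_gpu_mem(const struct mapped_region *regions, size_t nr_regions,
              uint64_t gpu_va, size_t len)
{
        assert(regions != NULL || nr_regions == 0);

        for (size_t i = 0; i < nr_regions; ++i) {
                const struct mapped_region *r = &regions[i];

                if (gpu_va < r->gpu_va)
                        continue;

                uint64_t offset = gpu_va - r->gpu_va;

                if (offset <= r->size && len <= r->size - offset)
                        return r->data + offset;
        }

        return NULL;
}

/* Follow the job chain from the entry point. Returns the number of jobs
 * visited, or -1 if a job points outside mapped memory. `max_jobs` guards
 * against cyclic chains. */
int
walk_job_chain(const struct mapped_region *regions, size_t nr_regions,
               uint64_t first_job, unsigned max_jobs)
{
        int count = 0;
        uint64_t va = first_job;

        while (va != 0 && (unsigned) count < max_jobs) {
                const void *p = fetch_gpu_mem(regions, nr_regions, va,
                                              sizeof(struct job_header));
                if (p == NULL)
                        return -1;

                /* Copy out to avoid unaligned access into the dump. */
                struct job_header hdr;
                memcpy(&hdr, p, sizeof(hdr));

                /* A real decoder would pretty-print the job here. */
                ++count;
                va = hdr.next_job;
        }

        return count;
}
```

A real decoder would print each descriptor it visits, which is what makes the resulting logs of two driver versions comparable with a plain diff.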