  1. Feb 09, 2013
  2. Feb 08, 2013
    • glsl: Support transform feedback of varying structs. · 0419b7a3
      Paul Berry authored
      
      Since transform feedback needs to be able to access individual fields
      of varying structs, we can no longer match up the arguments to
      glTransformFeedbackVaryings() with variables in the vertex shader.
      
      Instead, we build up a hashtable which records information about each
      possible name that is a candidate for transform feedback, and then
      match up the arguments to glTransformFeedbackVaryings() with the
      contents of that hashtable.
      
      Populating the hashtable uses the program_resource_visitor
      infrastructure, so the logic is shared with how we handle uniforms.
      
      NOTE: This is a candidate for the 9.1 branch.
      
      Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
      Reviewed-by: Matt Turner <mattst88@gmail.com>
      (cherry picked from commit 99b78337)
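The hashtable approach described in 0419b7a3 can be illustrated with a minimal sketch. It is not Mesa's code: the Type struct, the helper functions, and the "slot" value stored per name are hypothetical. It assumes that structs and arrays of aggregates are expanded field by field, while arrays of plain values stay a single entry.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical, simplified type description (not Mesa's glsl_type).
struct Type {
    enum Kind { Leaf, Array, Struct };
    Kind kind;
    unsigned length;                 // Array: number of elements
    std::vector<std::string> names;  // Struct: field names
    std::vector<Type> children;      // Array: element type; Struct: field types
};

Type leaf() { return Type{Type::Leaf, 0, {}, {}}; }
Type array_of(const Type &elem, unsigned n) { return Type{Type::Array, n, {}, {elem}}; }
Type structure(std::vector<std::string> names, std::vector<Type> fields)
{
    return Type{Type::Struct, 0, std::move(names), std::move(fields)};
}

// Walk the type and record every fully qualified name that could be
// requested for transform feedback, e.g. "v[1].pos".
void collect_names(const std::string &name, const Type &type,
                   std::unordered_map<std::string, unsigned> &table,
                   unsigned &next_slot)
{
    if (type.kind == Type::Struct) {
        for (std::size_t i = 0; i < type.children.size(); i++)
            collect_names(name + "." + type.names[i], type.children[i], table, next_slot);
    } else if (type.kind == Type::Array && type.children[0].kind != Type::Leaf) {
        for (unsigned i = 0; i < type.length; i++)
            collect_names(name + "[" + std::to_string(i) + "]", type.children[0], table, next_slot);
    } else {
        table[name] = next_slot++;   // leaf, or array of leaves kept as one entry
    }
}

// Match one requested varying name against the table.
bool is_candidate(const std::unordered_map<std::string, unsigned> &table,
                  const std::string &requested)
{
    return table.count(requested) != 0;
}
```

With a table like this, a requested name such as "v[1].pos" is a plain lookup, and a name that only identifies the enclosing struct ("v") is not found.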
    • glsl: Use parse_program_resource_name to parse transform feedback varyings. · 5be2e143
      Paul Berry authored
      
      Previously, transform feedback varyings were parsed in an ad hoc
      fashion that wasn't compatible with structs (or arrays of structs).
      This patch makes it use parse_program_resource_name(), which correctly
      handles both.
      
      Note that parse_program_resource_name()'s technique for handling
      malformed input strings is to simply let them through and rely on the
      fact that a future name lookup will fail.  Because of this,
      tfeedback_decl::init() no longer needs to return a boolean error
      code: it always succeeds, and if the input was malformed the error
      will be detected later.
      
      NOTE: This is a candidate for the 9.1 branch.
      
      Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
      Reviewed-by: Matt Turner <mattst88@gmail.com>
      (cherry picked from commit 53febac0)
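The "let malformed input through and fail at lookup" strategy described in 5be2e143 can be sketched as follows. This is an illustration only, not the real parse_program_resource_name(): the names split_resource_name, ParsedName, and name_is_known are hypothetical, and only a single trailing "[N]" subscript is handled.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <unordered_set>

struct ParsedName {
    std::string base;  // name with a well-formed trailing "[N]" removed
    long index;        // N, or -1 when no well-formed subscript was found
};

// Never reports an error: a malformed name is returned unchanged as the base.
ParsedName split_resource_name(const std::string &name)
{
    ParsedName result{name, -1};
    if (name.empty() || name.back() != ']')
        return result;
    const std::size_t open = name.rfind('[');
    if (open == std::string::npos)
        return result;
    const std::size_t digits = name.size() - open - 2;
    if (digits == 0 || digits > 9)  // "[]" or an index too long for this sketch
        return result;
    long value = 0;
    for (std::size_t i = open + 1; i + 1 < name.size(); i++) {
        const unsigned char c = static_cast<unsigned char>(name[i]);
        if (!std::isdigit(c))
            return result;          // e.g. "foo[x]" passes through untouched
        value = value * 10 + (c - '0');
    }
    result.base = name.substr(0, open);
    result.index = value;
    return result;
}

// The error surfaces here: a malformed name keeps its brackets in the base
// and therefore matches nothing in the set of known names.
bool name_is_known(const std::unordered_set<std::string> &known,
                   const std::string &requested)
{
    return known.count(split_resource_name(requested).base) != 0;
}
```

Because the parser cannot fail, a caller in the position of tfeedback_decl::init() has nothing to report; the failed lookup is the only error path.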
    • glsl: Rename uniform_field_visitor to program_resource_visitor. · 11e4347b
      Paul Berry authored
      
      There's actually nothing uniform-specific in uniform_field_visitor.
      It is potentially useful for all kinds of program resources (in
      particular, future patches will use it for transform feedback
      varyings).
      
      This patch renames it to program_resource_visitor, and clarifies
      several comments, to reflect the fact that it is useful for more than
      just uniforms.
      
      NOTE: This is a candidate for the 9.1 branch.
      
      Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
      Reviewed-by: Matt Turner <mattst88@gmail.com>
      (cherry picked from commit b4db34cc)
    • mesa/glsl: Separate parsing logic from _mesa_get_uniform_location. · 49a5f829
      Paul Berry authored
      
      The parsing logic is moved to a new function in the GLSL module,
      parse_program_resource_name().  This name was chosen because it should
      eventually be useful for handling everything that OpenGL 4.3 calls
      "program resources" (e.g. uniforms, vertex inputs, fragment outputs,
      and transform feedback varyings).
      
      Future patches will make use of this function for linking transform
      feedback varyings.
      
      NOTE: This is a candidate for the 9.1 branch.
      
      Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
      Reviewed-by: Matt Turner <mattst88@gmail.com>
      (cherry picked from commit b92900d2)
    • i965/blorp: Support blits between ARGB and XRGB formats. · 5265c42e
      Kenneth Graunke authored
      
      Now that we have support for overriding alpha to 1.0, we can handle
      blitting between these formats in either direction.
      
      For now, we only support two XRGB formats: MESA_FORMAT_XRGB8888 and
      MESA_FORMAT_RGBX8888_REV.  Most places only appear to worry about the
      former, so ignore the latter for now.  We can always add it later.
      
      Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
      Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
      Tested-by: Martin Steigerwald <martin@lichtvoll.de>
      (cherry picked from commit 7d467f3c)
    • i965/blorp: Support overriding destination alpha to 1.0. · 3114f5ac
      Kenneth Graunke authored
      
      Currently, Blorp requires the source and destination formats to be
      equal.  However, we'd really like to be able to blit between XRGB and
      ARGB formats; our BLT engine paths have supported this for a long time.
      
      For ARGB -> XRGB, nothing needs to occur: the missing alpha is already
      interpreted as 1.0.  For XRGB -> ARGB, we need to smash the alpha
      channel to 1.0 when writing the destination colors.  This is fairly
      straightforward with blending.
      
      For now, this code is never used, as the source and destination formats
      still must be equal.  The next patch will relax that restriction.
      
      Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
      Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
      Tested-by: Martin Steigerwald <martin@lichtvoll.de>
      (cherry picked from commit c0554141)
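The intended result of the alpha override in 3114f5ac can be shown on the CPU. The commit does this in hardware via blending; the sketch below only models the per-pixel outcome, and it assumes pixels packed as 0xAARRGGBB. The function name copy_pixel is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumed packing: alpha (or the unused X byte) in the top 8 bits.
constexpr std::uint32_t kAlphaMask = 0xFF000000u;

std::uint32_t copy_pixel(std::uint32_t src, bool src_has_alpha, bool dst_has_alpha)
{
    // XRGB -> ARGB: the X byte holds no meaningful data, so force alpha to 1.0.
    if (!src_has_alpha && dst_has_alpha)
        return src | kAlphaMask;
    // ARGB -> XRGB (and same-format copies): nothing to do, because a missing
    // alpha channel is already interpreted as 1.0 when the pixel is read.
    return src;
}
```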
    • i965: Implement CopyTexSubImage2D via BLORP (and use it by default). · 332c50b6
      Kenneth Graunke authored
      
      The BLT engine has many limitations.  Currently, it can only blit
      X-tiled buffers (since we don't have a kernel API to whack the BLT
      tiling mode register), which means all depth/stencil operations get
      punted to meta code, which can be very CPU-intensive.
      
      Even if we used the BLT engine, it can't blit between buffers with
      different tiling modes, such as an X-tiled non-MSAA ARGB8888 texture
      and a Y-tiled CMS ARGB8888 renderbuffer.  This is a fundamental
      limitation, and the only way around that is to use BLORP.
      
      Previously, BLORP only handled BlitFramebuffer.  This patch adds an
      additional frontend for doing CopyTexSubImage.  It also makes it the
      default.  This is partly to increase testing and avoid hiding bugs,
      and partly because the BLORP path can already handle more cases.  With
      trivial extensions, it should be able to handle everything the BLT can.
      
      This helps PlaneShift massively, which tries to CopyTexSubImage2D
      between depth buffers whenever a player casts a spell.  Since these
      are Y-tiled, we hit meta and software ReadPixels paths, eating 99% CPU
      while delivering ~1 FPS.  This is particularly bad in an MMO setting
      because people cast spells all the time.
      
      It also helps Xonotic in 4X MSAA mode.  At default power management
      settings, I measured a 6.35138% +/- 0.672548% performance boost (n=5).
      (This data is from v1 of the patch.)
      
      No Piglit regressions on Ivybridge (v3) or Sandybridge (v2).
      
      v2: Create a fake intel_renderbuffer to wrap the destination texture
          image and then reuse do_blorp_blit rather than reimplementing most
          of it.  Remove unnecessary clipping code and conditional rendering
          check.
      
      v3: Reuse formats_match() to centralize checks; delete temporary
          renderbuffers.  Reorganize the code.
      
      v4: Actually copy stencil when dealing with separate stencil buffers but
          packed depth/stencil formats.  Tested by a new Piglit test.
      
      Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
      Reviewed-by: Paul Berry <stereotype441@gmail.com> [v4]
      Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v3]
      Reviewed-and-tested-by: Carl Worth <cworth@cworth.org> [v2]
      Tested-by: Martin Steigerwald <martin@lichtvoll.de> [v3]
      (cherry picked from commit 0b3bebba)
    • mesa: Put extern "C" guards in renderbuffer.h. · 55e3f79d
      Kenneth Graunke authored
      
      I need to use this from C++ code.
      
      Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
      Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
      (cherry picked from commit 29aef6cc)
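For readers unfamiliar with the guards mentioned in 55e3f79d, the usual pattern looks like the sketch below. The function example_renderbuffer_row_bytes is hypothetical and is not part of renderbuffer.h; it only gives the guards something to wrap.

```cpp
#include <cassert>

// Header part: when compiled as C++, the declarations get C linkage so
// they resolve to the symbols produced by the C implementation.
#ifdef __cplusplus
extern "C" {
#endif

int example_renderbuffer_row_bytes(int width, int bytes_per_pixel);

#ifdef __cplusplus
}
#endif

// Definition, placed here only to keep the sketch self-contained; in a real
// project it would live in a separate .c file.
extern "C" int example_renderbuffer_row_bytes(int width, int bytes_per_pixel)
{
    assert(width >= 0 && bytes_per_pixel > 0);
    return width * bytes_per_pixel;
}
```

Without the guards, a C++ caller would look for a name-mangled symbol and fail at link time.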
    • i965: Fix the SF Vertex URB Read Length calculation for Gen7 platforms. · 1d2ef430
      Kenneth Graunke authored
      
      Ivybridge doesn't appear to have the same errata as Sandybridge; no
      corruption was observed when setting it to more than the minimal correct
      value.  It's possible that we were simply lucky, since the URB entries
      are 1024-bit on Ivybridge vs. 512-bit on Sandybridge.  Or perhaps the
      underlying hardware issue is fixed.
      
      Either way, we may as well program the minimum value since it's now
      readily available, likely to be more efficient, and possibly more
      correct.
      
      v2: Use GEN7_SBE_* defines rather than GEN6_SF_*.  (A copy and paste
          mistake.)  They're the same, but using the right names is better.
      
      NOTE: This is a candidate for all stable branches.
      Reviewed-by: Paul Berry <stereotype441@gmail.com>
      Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
      (cherry picked from commit 44aa2e15)
    • i965: Fix the SF Vertex URB Read Length calculation for Sandybridge. · 3acd5ed7
      Kenneth Graunke authored
      (This commit message was primarily written by Paul Berry, who explained
       what's going on far better than I would have.)
      
      Previous to this patch, we thought that the only restrictions on
      3DSTATE_SF's URB read length were (a) it needs to be large enough to
      read all the VUE data that the SF needs, and (b) it can't be so large
      that it tries to read VUE data that doesn't exist.  Since the VUE map
      already tells us how much VUE data exists, we didn't bother worrying
      about restriction (a); we just did the easy thing and programmed the
      read length to satisfy restriction (b).
      
      However, we didn't notice this erratum in the hardware docs: "[errata]
      Corruption/Hang possible if length programmed larger than recommended".
      Judging by the context surrounding this erratum, it's pretty clear that
      it means "URB read length must be exactly the size necessary to read all
      the VUE data that the SF needs, and no larger".  Which means that we
      can't program the read length based on restriction (b)--we have to
      program it based on restriction (a).
      
      The URB read size needs to precisely match the amount of data that the
      SF consumes; it doesn't work to simply base it on the size of the VUE.
      
      Thankfully, the PRM contains the precise formula the hardware expects.
      
      Fixes random UI corruption in Steam's "Big Picture Mode", random terrain
      corruption in PlaneShift, and Piglit's fbo-5-varyings test.
      
      NOTE: This is a candidate for all stable branches.
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=56920
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60172
      
      
      Tested-by: Jordan Justen <jordan.l.justen@intel.com> (v1/Piglit)
      Tested-by: Martin Steigerwald <martin@lichtvoll.de> (PlaneShift)
      Reviewed-by: Paul Berry <stereotype441@gmail.com>
      Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
      (cherry picked from commit 09fbc298)
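The commit message for 3acd5ed7 refers to a PRM formula without quoting it. The sketch below assumes the formula is read_length = ceiling((max_source_attr + 1) / 2), with the read length counted in pairs of attributes; this is an assumption that should be checked against the PRM before relying on it. The function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Highest VUE attribute index that the SF stage actually reads.
unsigned max_source_attr(const std::vector<unsigned> &source_attrs)
{
    unsigned highest = 0;
    for (unsigned attr : source_attrs)
        highest = std::max(highest, attr);
    return highest;
}

// Assumed formula: ceiling((max_source_attr + 1) / 2). Adding 2 before the
// integer division performs the rounding up.
unsigned urb_read_length(unsigned highest_attr)
{
    return (highest_attr + 2) / 2;
}
```

The point of the fix is that the input is the highest attribute the SF consumes, not the total size of the VUE.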
    • i965: Compute the maximum SF source attribute. · 697f8e56
      Kenneth Graunke authored
      
      The maximum SF source attribute is necessary to compute the Vertex URB
      read length properly, which will be done in the next commit.
      
      NOTE: This is a candidate for all stable branches.
      Reviewed-by: Paul Berry <stereotype441@gmail.com>
      Tested-by: Martin Steigerwald <martin@lichtvoll.de>
      Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
      (cherry picked from commit 5e9bc7bd)
    • i965: Refactor Gen6+ SF attribute override code. · 45ae093e
      Kenneth Graunke authored
      
      The next patch will benefit from easy access to the source attribute
      number and whether or not we're swizzling.  It doesn't want the final
      attr_override DWord form, however.
      
      NOTE: This is a candidate for all stable branches.
      Reviewed-by: Paul Berry <stereotype441@gmail.com>
      Tested-by: Martin Steigerwald <martin@lichtvoll.de>
      Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
      (cherry picked from commit b3efc5be)
    • i965: Add chipset limits for Haswell GT1/GT2. · 535e9529
      Kenneth Graunke authored
      
      The maximum number of URB entries comes from the 3DSTATE_URB_VS and
      3DSTATE_URB_GS state packet documentation; the thread count information
      comes from the 3DSTATE_VS and 3DSTATE_PS state packet documentation.
      
      Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
      Signed-off-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
      (cherry picked from commit 9add4e80)
    • i965: Fix assignment instead of comparison in asserts. · a7e2c615
      Vinson Lee authored and Kenneth Graunke committed
      
      Fixes side effect in assertion defects reported by Coverity.
      
      Signed-off-by: Vinson Lee <vlee@freedesktop.org>
      Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
      (cherry picked from commit 1559994c)
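The class of defect fixed in a7e2c615 can be reproduced in a few lines. This is a generic illustration, not the i965 code; both function names are hypothetical.

```cpp
#include <cassert>

// Buggy pattern: '=' assigns instead of comparing. The assertion passes for
// any nonzero right-hand side and overwrites the variable, but only in
// builds where assert() is active. The extra parentheses merely silence the
// compiler warning so that the sketch compiles cleanly.
int buggy_check(int mode)
{
    assert((mode = 2));
    return mode;
}

// Fixed pattern: '==' compares and has no side effect.
int fixed_check(int mode)
{
    assert(mode == 2);
    return mode;
}
```

In a debug build buggy_check(1) returns 2, while a build with NDEBUG defined returns 1, which is why side effects inside assertions are flagged by static analysis.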
    • mesa: Don't check (offset + size <= bufObj->Size) in BindBufferRange. · 5611a5a3
      Paul Berry authored
      
      In the documentation for BindBufferRange, OpenGL specs from 3.0
      through 4.1 contain this language:
      
          "The error INVALID_VALUE is generated if size is less than or
          equal to zero or if offset + size is greater than the value of
          BUFFER_SIZE."
      
      This text was dropped from OpenGL 4.2, and it does not appear in the
      GLES 3.0 spec.
      
      Presumably the reason for the change is that some clients change
      the size of the buffer after calling BindBufferRange.  We don't want
      to generate an error at the time of the BindBufferRange call just
      because the old size of the buffer was too small, when the buffer is
      about to be resized.
      
      Since this is a deliberate relaxation of error conditions in order to
      allow clients to work, it seems sensible to apply it to all versions
      of GL, not just GL 4.2 and above.
      
      (Note that there is no danger of this change allowing a client to
      access data beyond the end of a buffer.  We already have code to
      ensure that that doesn't happen in the case where the client shrinks
      the buffer after calling BindBufferRange).
      
      Eliminates a spurious error message in the gles3 conformance test
      "transform_feedback_offset_size".
      
      Reviewed-by: Eric Anholt <eric@anholt.net>
      Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
      (cherry picked from commit 04f0d6cc)
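The change in 5611a5a3 amounts to moving the bounds concern from bind time to use time. The sketch below models only the size conditions quoted from the spec; the function names are hypothetical, and the use-time clamp is an assumed stand-in for the existing protection the commit message mentions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Bind-time check following the GL 3.0 through 4.1 wording.
bool bind_range_valid_strict(std::int64_t offset, std::int64_t size,
                             std::int64_t buffer_size)
{
    return size > 0 && offset + size <= buffer_size;
}

// Bind-time check after the relaxation: the buffer size is not consulted.
bool bind_range_valid_relaxed(std::int64_t size)
{
    return size > 0;
}

// Use-time clamp: never expose bytes beyond the buffer's current size.
std::int64_t usable_size(std::int64_t offset, std::int64_t size,
                         std::int64_t buffer_size)
{
    if (offset >= buffer_size)
        return 0;
    return std::min(size, buffer_size - offset);
}
```

A client that binds a 64-byte range on a 32-byte buffer and resizes the buffer afterwards is rejected by the strict check but accepted by the relaxed one; until the resize, the clamp limits access to the 32 bytes that exist.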
    • i965: Set UniformBufferOffsetAlignment to sizeof(vec4) · a48e5526
      Ian Romanick authored
      
      This matches the behavior of the Windows driver, but a bspec reference
      would be nice.
      
      NOTE: This is a candidate for the 9.0 and 9.1 branches.
      
      Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
      Reviewed-by: Eric Anholt <eric@anholt.net>
      Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
      (cherry picked from commit f29ab4ec)
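From an application's point of view, the alignment set in a48e5526 means that uniform buffer range offsets must be multiples of sizeof(vec4). The sketch below assumes a vec4 is four 32-bit floats (16 bytes); align_up and kVec4Size are hypothetical names, and no GL calls are made.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: vec4 is four 32-bit floats, so 16 bytes.
constexpr std::size_t kVec4Size = 4 * sizeof(float);

// Round an offset up to the next multiple of the alignment.
constexpr std::size_t align_up(std::size_t offset, std::size_t alignment)
{
    return (offset + alignment - 1) / alignment * alignment;
}
```

An application packing several uniform blocks into one buffer would place each block at align_up(previous_end, alignment), using the alignment value it queried from the driver.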
    • mesa: Allow glGet* queries of MAX_VARYING_COMPONENTS in ES 3 · c59808c7
      Matt Turner authored
      
      Should have been done in d9948e49 but I missed it because
      MAX_VARYING_FLOATS doesn't appear in the ES 3 spec, but is the same
      value as MAX_VARYING_COMPONENTS.
      
      NOTE: Candidate for the 9.1 branch
      Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
  3. Feb 07, 2013
  4. Feb 06, 2013
  5. Feb 05, 2013
  6. Feb 04, 2013
  7. Feb 01, 2013
  8. Jan 31, 2013
    • r600g: add cs memory usage accounting and limit it v3 · 9d8a866d
      Jerome Glisse authored
      
      We are now seeing command streams (cs) that can exceed the vram+gtt
      size.  To avoid failing, flush early any cs that goes over 70% of
      (gtt+vram) usage.  The 70% threshold leaves room for some fragmentation.
      
      The idea is to compute a rough estimate of the memory requirement of
      each draw call.  After each draw call, memory is precisely
      accounted, so the uncertainty applies only to the current draw call.
      In practice this gave a very good estimate (+/- 10% of the target
      memory limit).
      
      v2: Remove left over from testing version, remove useless NULL
          checking. Improve commit message.
      v3: Add comment to code on memory accounting precision
      
      Signed-off-by: Jerome Glisse <jglisse@redhat.com>
      Reviewed-by: Marek Olšák <maraeo@gmail.com>
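The accounting scheme described in 9d8a866d can be sketched as a small tracker. This is a simplified model, not the r600g code: the CsMemoryTracker type and its members are hypothetical, and sizes are plain byte counts.

```cpp
#include <cassert>
#include <cstdint>

struct CsMemoryTracker {
    std::uint64_t limit;     // 70% of (vram + gtt), leaving room for fragmentation
    std::uint64_t used = 0;  // precisely accounted memory of the draws queued so far
    unsigned flushes = 0;

    CsMemoryTracker(std::uint64_t vram, std::uint64_t gtt)
        : limit((vram + gtt) / 10 * 7) {}

    // Before a draw: only a rough estimate is known. Flush first if adding
    // it to the precise total would cross the limit.
    void begin_draw(std::uint64_t estimate)
    {
        if (used + estimate > limit)
            flush();
    }

    // After a draw: replace the estimate with the exact amount.
    void end_draw(std::uint64_t actual) { used += actual; }

    void flush()
    {
        used = 0;
        flushes++;
    }
};
```

Since every completed draw is accounted exactly, the only imprecision at any moment is the estimate for the draw currently being queued.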
    • r600g: fix htile buffer leak · 3b8d4f94
      Marek Olšák authored
      NOTE: This is a candidate for the 9.1 branch.
  9. Jan 29, 2013