- Feb 09, 2013
-
Jerome Glisse authored
Make sure one can distinguish a virtual address failure from an allocation failure.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
(cherry picked from commit 9a476845)
-
Jerome Glisse authored
The EXA core will already have set the pointer to NULL prior to calling the callback function, so don't bail out of the callback if it's already NULL.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
(cherry picked from commit 3310acdf)
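A minimal stand-alone sketch of the calling contract described above; the types and names here are invented for illustration and are not the EXA API:

```c
#include <stddef.h>
#include <stdio.h>

/* Stand-in for a driver pixmap with a CPU mapping. */
struct pixmap {
    void *ptr;   /* mapping pointer, cleared by the core */
};

/* Driver callback: NULL is the expected state here, not an error. */
static void driver_finish_access(struct pixmap *pix)
{
    /* Don't bail on pix->ptr == NULL -- the core already cleared it. */
    printf("finish_access: ptr=%p (NULL is fine)\n", pix->ptr);
    /* ... driver-side teardown continues unconditionally ... */
}

/* Core side: the pointer is set to NULL *before* the callback runs. */
static void core_finish_access(struct pixmap *pix)
{
    pix->ptr = NULL;
    driver_finish_access(pix);
}

int main(void)
{
    struct pixmap pix = { .ptr = &pix };   /* any non-NULL value */
    core_finish_access(&pix);
    return 0;
}
```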
-
- Feb 08, 2013
-
-
Paul Berry authored
Since transform feedback needs to be able to access individual fields of varying structs, we can no longer match up the arguments to glTransformFeedbackVaryings() with variables in the vertex shader. Instead, we build up a hashtable which records information about each possible name that is a candidate for transform feedback, and then match up the arguments to glTransformFeedbackVaryings() with the contents of that hashtable. Populating the hashtable uses the program_resource_visitor infrastructure, so the logic is shared with how we handle uniforms.

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 99b78337)
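A hedged illustration of the matching step, with a linear scan standing in for the hash lookup; the candidate data and types are invented:

```c
#include <stdio.h>
#include <string.h>

/* Every name usable for transform feedback -- including struct fields
 * such as "s.v" -- is recorded up front while walking the shader. */
struct tfeedback_candidate {
    const char *name;
    unsigned offset;   /* location of the value in the vertex output */
};

static const struct tfeedback_candidate *
find_candidate(const struct tfeedback_candidate *table, size_t n,
               const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(table[i].name, name) == 0)
            return &table[i];
    return NULL;   /* an unmatched varying becomes a link error */
}

int main(void)
{
    const struct tfeedback_candidate table[] = {
        { "gl_Position", 0 },
        { "s.v", 4 },   /* an individual field of a varying struct */
    };
    const struct tfeedback_candidate *c = find_candidate(table, 2, "s.v");

    printf("%s -> offset %u\n", c->name, c->offset);
    return 0;
}
```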
-
Paul Berry authored
Previously, transform feedback varyings were parsed in an ad-hoc fashion that wasn't compatible with structs (or arrays of structs). This patch makes it use parse_program_resource_name(), which correctly handles both. Note that parse_program_resource_name()'s technique for handling mal-formed input strings is to simply let them through and rely on the fact that a future name lookup will fail. Because of this, tfeedback_decl::init() no longer needs to return a boolean error code; it always succeeds, and if the input was mal-formed the error will be detected later.

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 53febac0)
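A rough sketch of what a helper in the spirit of parse_program_resource_name() does; the simplified signature and behavior are illustrative, not Mesa's actual interface:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Split "foo[3]" into a base-name length and an array index.
 * Mal-formed names are deliberately let through: a later lookup
 * of the nonsensical result simply fails. */
static void
parse_resource_name(const char *name, size_t *base_len, long *array_index)
{
    const char *bracket = strrchr(name, '[');

    if (!bracket) {
        *base_len = strlen(name);   /* plain name, no subscript */
        *array_index = -1;
    } else {
        *base_len = (size_t)(bracket - name);
        *array_index = strtol(bracket + 1, NULL, 10);
    }
}

int main(void)
{
    const char *name = "palette[3]";
    size_t len;
    long idx;

    parse_resource_name(name, &len, &idx);
    printf("base=%.*s index=%ld\n", (int)len, name, idx);
    return 0;
}
```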
-
Paul Berry authored
There's actually nothing uniform-specific in uniform_field_visitor. It is potentially useful for all kinds of program resources (in particular, future patches will use it for transform feedback varyings). This patch renames it to program_resource_visitor, and clarifies several comments, to reflect the fact that it is useful for more than just uniforms.

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit b4db34cc)
-
Paul Berry authored
The parsing logic is moved to a new function in the GLSL module, parse_program_resource_name(). This name was chosen because it should eventually be useful for handling everything that OpenGL 4.3 calls "program resources" (e.g. uniforms, vertex inputs, fragment outputs, and transform feedback varyings). Future patches will make use of this function for linking transform feedback varyings.

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit b92900d2)
-
Kenneth Graunke authored
Now that we have support for overriding alpha to 1.0, we can handle blitting between these formats in either direction. For now, we only support two XRGB formats: MESA_FORMAT_XRGB8888 and MESA_FORMAT_RGBX8888_REV. Most places only appear to worry about the former, so ignore the latter for now. We can always add it later.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Martin Steigerwald <martin@lichtvoll.de>
(cherry picked from commit 7d467f3c)
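A hedged sketch of what a relaxed formats_match()-style check could look like; the enum and helper are illustrative, not Mesa's actual code:

```c
#include <stdbool.h>

/* Illustrative stand-ins for Mesa's format enums. */
enum fmt { FMT_XRGB8888, FMT_ARGB8888, FMT_RGB565 };

/* Equal formats always match; an XRGB/ARGB pair is now also allowed,
 * because the missing alpha can be overridden to 1.0 during the blit. */
static bool formats_compatible(enum fmt src, enum fmt dst)
{
    if (src == dst)
        return true;
    return (src == FMT_XRGB8888 && dst == FMT_ARGB8888) ||
           (src == FMT_ARGB8888 && dst == FMT_XRGB8888);
}
```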
-
Kenneth Graunke authored
Currently, BLORP requires the source and destination formats to be equal. However, we'd really like to be able to blit between XRGB and ARGB formats; our BLT engine paths have supported this for a long time. For ARGB -> XRGB, nothing needs to occur: the missing alpha is already interpreted as 1.0. For XRGB -> ARGB, we need to smash the alpha channel to 1.0 when writing the destination colors. This is fairly straightforward with blending. For now, this code is never used, as the source and destination formats still must be equal; the next patch will relax that restriction.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Martin Steigerwald <martin@lichtvoll.de>
(cherry picked from commit c0554141)
-
Kenneth Graunke authored
The BLT engine has many limitations. Currently, it can only blit X-tiled buffers (since we don't have a kernel API to whack the BLT tiling mode register), which means all depth/stencil operations get punted to meta code, which can be very CPU-intensive. Even if we used the BLT engine, it can't blit between buffers with different tiling modes, such as an X-tiled non-MSAA ARGB8888 texture and a Y-tiled CMS ARGB8888 renderbuffer. This is a fundamental limitation, and the only way around it is to use BLORP.

Previously, BLORP only handled BlitFramebuffer. This patch adds an additional frontend for doing CopyTexSubImage, and makes it the default. This is partly to increase testing and avoid hiding bugs, and partly because the BLORP path can already handle more cases. With trivial extensions, it should be able to handle everything the BLT can.

This helps PlaneShift massively: it tries to CopyTexSubImage2D between depth buffers whenever a player casts a spell. Since these are Y-tiled, we hit meta and software ReadPixels paths, eating 99% CPU while delivering ~1 FPS. This is particularly bad in an MMO setting, because people cast spells all the time.

It also helps Xonotic in 4X MSAA mode. At default power management settings, I measured a 6.35138% +/- 0.672548% performance boost (n=5). (This data is from v1 of the patch.)

No Piglit regressions on Ivybridge (v3) or Sandybridge (v2).

v2: Create a fake intel_renderbuffer to wrap the destination texture image, then reuse do_blorp_blit rather than reimplementing most of it. Remove unnecessary clipping code and the conditional rendering check.
v3: Reuse formats_match() to centralize checks; delete temporary renderbuffers. Reorganize the code.
v4: Actually copy stencil when dealing with separate stencil buffers but packed depth/stencil formats. Tested by a new Piglit test.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com> [v4]
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v3]
Reviewed-and-tested-by: Carl Worth <cworth@cworth.org> [v2]
Tested-by: Martin Steigerwald <martin@lichtvoll.de> [v3]
(cherry picked from commit 0b3bebba)
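A minimal sketch of the v2 wrapping trick under invented types; the real code builds a fake intel_renderbuffer around the destination image:

```c
#include <stdlib.h>

/* Stand-in types, not the intel_renderbuffer/intel_texture definitions. */
struct texture_image { int width, height; };
struct renderbuffer  { struct texture_image *wrapped; };

/* Wrap the destination texture image in a throwaway renderbuffer so the
 * existing renderbuffer-to-renderbuffer blit path can be reused for
 * CopyTexSubImage; the caller frees the wrapper after the blit. */
static struct renderbuffer *
wrap_texture_image(struct texture_image *img)
{
    struct renderbuffer *rb = calloc(1, sizeof(*rb));

    if (rb)
        rb->wrapped = img;
    return rb;
}
```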
-
Kenneth Graunke authored
I need to use this from C++ code.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 29aef6cc)
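The message doesn't say how this was achieved; the usual technique for making a C header consumable from C++ is extern "C" guards, sketched generically here (the header and function names are hypothetical):

```c
/* example.h -- a C header made safe for C++ inclusion. */
#ifndef EXAMPLE_H
#define EXAMPLE_H

#ifdef __cplusplus
extern "C" {
#endif

/* C++ translation units now see these declarations with C linkage. */
void example_function(int arg);

#ifdef __cplusplus
}
#endif

#endif /* EXAMPLE_H */
```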
-
Kenneth Graunke authored
Ivybridge doesn't appear to have the same errata as Sandybridge; no corruption was observed when setting it to more than the minimal correct value. It's possible that we were simply lucky, since the URB entries are 1024-bit on Ivybridge vs. 512-bit on Sandybridge. Or perhaps the underlying hardware issue is fixed. Either way, we may as well program the minimum value, since it's now readily available, likely to be more efficient, and possibly more correct.

v2: Use GEN7_SBE_* defines rather than GEN6_SF_* (a copy-and-paste mistake). They're the same, but using the right names is better.

NOTE: This is a candidate for all stable branches.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 44aa2e15)
-
Kenneth Graunke authored
(This commit message was primarily written by Paul Berry, who explained what's going on far better than I would have.)

Previous to this patch, we thought that the only restrictions on 3DSTATE_SF's URB read length were (a) it needs to be large enough to read all the VUE data that the SF needs, and (b) it can't be so large that it tries to read VUE data that doesn't exist. Since the VUE map already tells us how much VUE data exists, we didn't bother worrying about restriction (a); we just did the easy thing and programmed the read length to satisfy restriction (b).

However, we didn't notice this erratum in the hardware docs: "[errata] Corruption/Hang possible if length programmed larger than recommended". Judging by the context surrounding this erratum, it's pretty clear that it means "URB read length must be exactly the size necessary to read all the VUE data that the SF needs, and no larger". Which means that we can't program the read length based on restriction (b); we have to program it based on restriction (a). The URB read size needs to precisely match the amount of data that the SF consumes; it doesn't work to simply base it on the size of the VUE. Thankfully, the PRM contains the precise formula the hardware expects.

Fixes random UI corruption in Steam's "Big Picture Mode", random terrain corruption in PlaneShift, and Piglit's fbo-5-varyings test.

NOTE: This is a candidate for all stable branches.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=56920
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60172
Tested-by: Jordan Justen <jordan.l.justen@intel.com> (v1/Piglit)
Tested-by: Martin Steigerwald <martin@lichtvoll.de> (PlaneShift)
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 09fbc298)
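A hedged sketch of the flavor of formula involved, assuming each SF URB read fetches a 256-bit unit holding two 128-bit varyings; the helper and the exact formula are illustrative, not quoted from the PRM:

```c
#include <stdio.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Read just enough pairs to cover the highest source attribute the SF
 * actually consumes -- no more, per the erratum. */
static unsigned urb_entry_read_length(unsigned max_source_attr)
{
    return DIV_ROUND_UP(max_source_attr + 1, 2);
}

int main(void)
{
    /* Attributes 0..4 in use -> read 3 pairs, not the whole VUE. */
    printf("%u\n", urb_entry_read_length(4));
    return 0;
}
```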
-
Kenneth Graunke authored
The maximum SF source attribute is necessary to compute the Vertex URB read length properly, which will be done in the next commit.

NOTE: This is a candidate for all stable branches.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
Tested-by: Martin Steigerwald <martin@lichtvoll.de>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 5e9bc7bd)
-
Kenneth Graunke authored
The next patch will benefit from easy access to the source attribute number and whether or not we're swizzling. It doesn't want the final attr_override DWord form, however.

NOTE: This is a candidate for all stable branches.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
Tested-by: Martin Steigerwald <martin@lichtvoll.de>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit b3efc5be)
-
Kenneth Graunke authored
The maximum number of URB entries comes from the 3DSTATE_URB_VS and 3DSTATE_URB_GS state packet documentation; the thread count information comes from the 3DSTATE_VS and 3DSTATE_PS state packet documentation.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
(cherry picked from commit 9add4e80)
-
Vinson Lee authored
Fixes 'side effect in assertion' defects reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
(cherry picked from commit 1559994c)
-
Paul Berry authored
In the documentation for BindBufferRange, OpenGL specs from 3.0 through 4.1 contain this language: "The error INVALID_VALUE is generated if size is less than or equal to zero or if offset + size is greater than the value of BUFFER_SIZE."

This text was dropped from OpenGL 4.2, and it does not appear in the GLES 3.0 spec. Presumably the reason for the change is that some clients change the size of the buffer after calling BindBufferRange. We don't want to generate an error at the time of the BindBufferRange call just because the old size of the buffer was too small, when the buffer is about to be resized.

Since this is a deliberate relaxation of error conditions in order to allow clients to work, it seems sensible to apply it to all versions of GL, not just GL 4.2 and above. (Note that there is no danger of this change allowing a client to access data beyond the end of a buffer. We already have code to ensure that that doesn't happen in the case where the client shrinks the buffer after calling BindBufferRange.)

Eliminates a spurious error message in the GLES3 conformance test "transform_feedback_offset_size".

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 04f0d6cc)
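A sketch of the client pattern this relaxation tolerates, assuming a current GL context and GL 3.0 headers with prototypes; the sizes are arbitrary:

```c
#define GL_GLEXT_PROTOTYPES
#include <GL/gl.h>
#include <GL/glext.h>

/* Bind a range larger than the buffer's current storage, then resize
 * the buffer before the range is ever read.  Raising INVALID_VALUE at
 * bind time would break this sequence. */
static void bind_then_resize(GLuint buf)
{
    glBindBuffer(GL_TRANSFORM_FEEDBACK_BUFFER, buf);

    /* offset + size may exceed the *current* BUFFER_SIZE here... */
    glBindBufferRange(GL_TRANSFORM_FEEDBACK_BUFFER, 0, buf, 0, 4096);

    /* ...because the buffer is resized before it is used. */
    glBufferData(GL_TRANSFORM_FEEDBACK_BUFFER, 4096, NULL, GL_STREAM_READ);
}
```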
-
Ian Romanick authored
This matches the behavior of the Windows driver, but a bspec reference would be nice.

NOTE: This is a candidate for the 9.0 and 9.1 branches.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit f29ab4ec)
-
Matt Turner authored
Should have been done in d9948e49, but I missed it because MAX_VARYING_FLOATS doesn't appear in the ES 3 spec; it is, however, the same value as MAX_VARYING_COMPONENTS.

NOTE: Candidate for the 9.1 branch.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
-
- Feb 07, 2013
-
Michel Dänzer authored
Also, add assertions to stress that render targets don't support scaled formats. 20 more little piglit tests pass. (cherry picked from commit 46dd16bca8b4526e46badc9cb1d7c058a8e6173e)
-
Michel Dänzer authored
The hardware can't do it. (cherry picked from commit f6e9430da2d3510f84baefa0fdf26ec5c457f146)
-
Michel Dänzer authored
The proper return type is assigned at the end of the function. (cherry picked from commit 180db2bcb28e94bb1ce18d76b2b3a5818d76262c)
-
Michel Dänzer authored
Append the overloaded vector type used for passing in the addressing parameters. Without this, LLVM uses the same function signature for all those types, which cannot work. Fixes problems e.g. with FlightGear and Red Eclipse. (cherry picked from commit 1b3afea30de757815555d9eb1d6e72e2586d6a0c)
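A toy illustration of the mangling idea; the base intrinsic name is only an example:

```c
#include <stdio.h>

int main(void)
{
    char name[64];
    unsigned num_elems = 4;   /* e.g. a 4-component address vector */

    /* Encode the overloaded vector type into the intrinsic's name so
     * each addressing variant gets a distinct function signature. */
    snprintf(name, sizeof(name), "llvm.SI.sample.v%ui32", num_elems);
    printf("%s\n", name);     /* prints: llvm.SI.sample.v4i32 */
    return 0;
}
```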
-
Jerome Glisse authored
Was using the pixel size instead of the number of blocks for the slice tile max computation, which resulted in the DMA engine writing to the wrong address.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
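A hedged sketch of the distinction, with invented names; the actual slice tile max register formula is not reproduced here:

```c
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* The slice size must be derived from the number of blocks, not from
 * width * height * bytes_per_pixel, or DMA lands at the wrong address
 * for block-based formats. */
static unsigned slice_size_in_blocks(unsigned width, unsigned height,
                                     unsigned block_w, unsigned block_h)
{
    unsigned nbx = DIV_ROUND_UP(width, block_w);    /* blocks per row */
    unsigned nby = DIV_ROUND_UP(height, block_h);   /* rows of blocks */

    return nbx * nby;
}
```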
-
- Feb 06, 2013
-
Marek Olšák authored
NOTE: This is a candidate for the stable branches. (cherry picked from commit f40a7fc3)
-
- Feb 05, 2013
-
Michel Dänzer authored
It has new PCI IDs and an important tiled surface layout fix. (cherry picked from commit 02a423b2)
-
- Feb 04, 2013
-
Alex Deucher authored
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Note: this is a candidate for the 9.1 branch.
(cherry picked from commit 4161d70b)
-
Alex Deucher authored
That should work in all cases.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Note: this is a candidate for the 9.1 branch.
(cherry picked from commit af0af758)
-
Alex Deucher authored
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Note: this is a candidate for the 9.1 branch.
(cherry picked from commit 83e4407f)
-
Michel Dänzer authored
Was broken since commit bf469f4e ('gallium: add void *user_buffer in pipe_index_buffer'). Fixes 11 piglit tests and lots of missing geometry e.g. in TORCS. NOTE: This is a candidate for the 9.1 branch. (cherry picked from commit a8a5055f)
-
Michel Dänzer authored
The hardware can't do it, and these were causing warnings in some piglit tests. NOTE: This is a candidate for the 9.1 branch. (cherry picked from commit 6455d40b)
-
Michel Dänzer authored
28/30 piglit tests pass. NOTE: This is a candidate for the 9.1 branch. (cherry picked from commit 6bcb8238)
-
Michel Dänzer authored
In particular, the LOD bias and depth comparison values are packed before the 'normal' texture coordinates, and the array slice and LOD values are appended. NOTE: This is a candidate for the 9.1 branch. (cherry picked from commit 120efeef)
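An illustrative packing routine following the order described above; the signature is hypothetical, not the radeonsi code:

```c
/* Optional parameters are passed as NULL-able pointers; returns the
 * number of packed elements. */
static unsigned
pack_tex_address(float *addr,
                 const float *coords, unsigned num_coords,
                 const float *bias, const float *compare,
                 const float *slice, const float *lod)
{
    unsigned n = 0;

    if (bias)
        addr[n++] = *bias;      /* packed before the coordinates */
    if (compare)
        addr[n++] = *compare;   /* likewise */
    for (unsigned i = 0; i < num_coords; i++)
        addr[n++] = coords[i];  /* the 'normal' coordinates */
    if (slice)
        addr[n++] = *slice;     /* appended after the coordinates */
    if (lod)
        addr[n++] = *lod;
    return n;
}
```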
-
Michel Dänzer authored
Fix up intrinsic names, and bitcast texture address parameters to integers. NOTE: This is a candidate for the 9.1 branch. (cherry picked from commit e5fb7347)
-
- Feb 01, 2013
-
Marek Olšák authored
glRasterPos doesn't exist in the core profile.

NOTE: This is a candidate for the stable branches (9.0 and 9.1).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit cc5fdaf2)
-
Marek Olšák authored
This, along with the latest drm-fixes branch, should help with the bad performance of MSAA. Remember: Nx MSAA can't be more than N times slower (where N = 2, 4, or 6). Anyway, I recommend at least 512 MB of VRAM for Full HD 6x MSAA. NOTE: This is a candidate for the 9.1 branch. (cherry picked from commit a06f03d7)
-
- Jan 31, 2013
-
Jerome Glisse authored
We are now seeing command streams that can go over the vram+gtt size. To avoid failing, flush early any cs that goes over 70% of (gtt+vram) usage; the 70% threshold leaves room for some fragmentation. The idea is to compute a gross estimate of the memory requirement of each draw call. After each draw call, memory is precisely accounted, so the uncertainty is only on the current draw call. In practice this gives a very good estimate (+/- 10% of the target memory limit).

v2: Remove leftovers from the testing version, remove useless NULL checking, improve commit message.
v3: Add a comment in the code on memory accounting precision.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
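A hedged sketch of the accounting idea with hypothetical names:

```c
#include <stdbool.h>
#include <stdint.h>

struct cs_accounting {
    uint64_t vram_size;
    uint64_t gtt_size;
    uint64_t used_estimate;   /* gross estimate, refined after each draw */
};

/* Flush early once the estimate crosses 70% of vram+gtt; the remaining
 * 30% is slack for fragmentation. */
static bool cs_needs_flush(const struct cs_accounting *a,
                           uint64_t draw_estimate)
{
    uint64_t limit = (a->vram_size + a->gtt_size) * 7 / 10;

    return a->used_estimate + draw_estimate > limit;
}
```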
-
Marek Olšák authored
NOTE: This is a candidate for the 9.1 branch.
-
- Jan 29, 2013
-
Matt Turner authored
Reported-by: Lauri Kasanen <curaga@operamail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=47248#c15
-
Marek Olšák authored
(cherry picked from commit 84513095)
-