Commits · 11c0ff60ef19cca84452aa989fb8bb25127473e0 · Luca Weiss / mesa

Mar 13, 2015
- Add release notes for the 10.5.1 release · 11c0ff60
  Emil Velikov authored 9 years ago
  
  Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
  mesa-10.5.1
  
  11c0ff60
- Update version to 10.5.1 · 0f32ac39
  Emil Velikov authored 9 years ago
  
  Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
  0f32ac39
Mar 12, 2015

freedreno/ir3: fix failed assert in grouping · ce13666f

Rob Clark authored 10 years ago and

Emil Velikov committed 9 years ago


Turns out there are scenarios where we need to insert mov's in "front"
of an input.  Triggered by shaders like:

  VERT
  DCL IN[0]
  DCL IN[1]
  DCL OUT[0], POSITION
  DCL OUT[1], GENERIC[9]
  DCL SAMP[0]
  DCL TEMP[0], LOCAL
    0: MOV TEMP[0].xy, IN[1].xyyy
    1: MOV TEMP[0].w, IN[1].wwww
    2: TXF TEMP[0], TEMP[0], SAMP[0], 1D_ARRAY
    3: MOV OUT[1], TEMP[0]
    4: MOV OUT[0], IN[0]
    5: END

Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit 27648efa)

ce13666f

freedreno/ir3: handle flat bypass for a4xx · 065a24bd

Rob Clark authored 9 years ago and

Emil Velikov committed 9 years ago


We may not need this for later a4xx patchlevels, but we do at least need
this for patchlevel 0.  Bypass bary.f for fetching varyings when flat
shading is needed (rather than configure via cmdstream).  This requires
a special dummy bary.f w/ (ei) flag to signal to scheduler when all
varyings are consumed.  And requires shader variants based on rasterizer
flatshade state to handle TGSI_INTERPOLATE_COLOR.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit e9f2abe3)

065a24bd

freedreno/ir3: add support for memory (cat6) instructions · 1dec8bbb

Rob Clark authored 9 years ago and

Emil Velikov committed 9 years ago


Scheduled basically the same as texture (cat5) instructions, using (sy)
flag for synchronization.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit 9d732d31)

1dec8bbb

freedreno/ir3: fix up cat6 instruction encodings · af4d1096

Rob Clark authored 9 years ago and

Emil Velikov committed 9 years ago


I think there is at least one more sub-encoding, but these two should be
enough to cover the common load/store instructions.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit 20b50a07)

af4d1096

freedreno/a4xx: aniso filtering · 645d7f46
Rob Clark authored 9 years ago and Emil Velikov committed 9 years ago
```
Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit dd70e786)
```
645d7f46
freedreno: update generated headers · 80c4ba0c
Rob Clark authored 9 years ago and Emil Velikov committed 9 years ago
```
Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit c70097ae)
```
80c4ba0c

freedreno/a4xx: set PC_PRIM_VTX_CNTL.VAROUT properly · aca5fdae

Rob Clark authored 9 years ago and

Emil Velikov committed 9 years ago


Fixes xonotic, some webgl stuff, and really pretty much anything with
more than 4 varyings.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit 51e33574)

aca5fdae

freedreno: update generated headers · 7abc57b6

Rob Clark authored 9 years ago and

Emil Velikov committed 9 years ago


Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit fb1301e4)

Conflicts:
	src/gallium/drivers/freedreno/a3xx/a3xx.xml.h

7abc57b6

freedreno/a4xx: bit of cleanup · 20ea65be
Rob Clark authored 9 years ago and Emil Velikov committed 9 years ago
```
Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit bdf02348)
```
20ea65be

freedreno/a2xx: fix increment in assert · 38777e13

Rob Clark authored 10 years ago and

Emil Velikov committed 9 years ago

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88883


Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit 68552266)

38777e13

Mar 11, 2015

i965: Fix out-of-bounds accesses into pull_constant_loc array · 4de2f250

Iago Toral authored 9 years ago and

Emil Velikov committed 9 years ago

The piglit test glsl-fs-uniform-array-loop-unroll.shader_test was designed
to do an out-of-bounds access into a uniform array to make sure that we
handle that situation gracefully inside the driver. However, as Ken describes
in bug 79202, Valgrind reports that this leads to an out-of-bounds access
in fs_visitor::demote_pull_constants().

Before accessing the pull_constant_loc array we should make sure that
the uniform we are trying to access is valid.
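
The guard described above can be sketched roughly as follows. This is only an illustration, not the driver's code: `lookup_pull_constant_loc`, its parameters, and the `-1` "not found" value are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: return the pull-constant slot for a uniform,
 * or -1 when the uniform index lies outside the table. */
static int
lookup_pull_constant_loc(const int *pull_constant_loc, size_t num_uniforms,
                         size_t uniform)
{
   assert(pull_constant_loc != NULL);

   /* Validate the index first so an out-of-bounds uniform never
    * reaches the array access below. */
   if (uniform >= num_uniforms)
      return -1;

   return pull_constant_loc[uniform];
}
```

A caller would treat `-1` as "this uniform is not valid here" and skip the demotion for it.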

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79202


Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 6ac1bc90)
Nominated-by: Matt Turner <mattst88@gmail.com>

4de2f250

i965/fs: Don't issue FB writes for bound but unwritten color targets. · fbd06fe6

Kenneth Graunke authored 9 years ago and

Emil Velikov committed 9 years ago

We used to loop over all color attachments, and emit FB writes for each
one, even if the shader didn't write to a corresponding output variable.
Those color attachments would be filled with garbage (undefined values).

Football Manager binds a framebuffer with 4 color attachments, but draws
to it using a shader that only writes to gl_FragData[0..2].  This meant
that color attachment 3 would be filled with garbage, resulting in
rendering artifacts.  Now we skip writing to it, fixing rendering.

Writes to gl_FragColor initialize outputs[0..nr_color_regions-1] to
GRFs, while writes to gl_FragData[i] initialize outputs[i].
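
As a rough sketch of the new behaviour, assuming a per-target "was written" flag (the function `select_fb_write_targets` and its arguments are hypothetical, not taken from brw_fs_visitor.cpp), the loop only keeps targets the shader actually wrote:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical sketch: collect the color targets that should get an
 * FB write. Targets the shader never wrote are skipped, so they are
 * not filled with undefined values. Returns the number of targets
 * stored in `targets`. */
static int
select_fb_write_targets(const bool *output_written, int nr_color_regions,
                        int *targets)
{
   assert(nr_color_regions >= 0);

   int count = 0;
   for (int target = 0; target < nr_color_regions; target++) {
      if (!output_written[target])
         continue;   /* bound but unwritten: emit nothing */
      targets[count++] = target;
   }
   return count;
}
```

With four bound attachments and a shader that writes only outputs 0..2, as in the Football Manager case, this selects three targets and leaves attachment 3 untouched.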

Thanks to Jason Ekstrand for tracking this down.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86747


Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit e95969cd)

Conflicts:
	src/mesa/drivers/dri/i965/brw_fs_visitor.cpp

fbd06fe6

i965/fs: Make emit_shader_time_end() insert before EOT. · c232d765

Kenneth Graunke authored 9 years ago and

Emil Velikov committed 9 years ago


Previously, we emitted the shader-time epilogue from emit_fb_writes(),
during the middle of looping through color regions (or emit_urb_writes
for the VS).  This is duplicated several times and rather awkward.

I need to fix a bug in our FB write handling, and it will be a lot
easier if we move emit_shader_time_end() out of there.

Now, we simply emit FB writes/URB writes, and subsequently have
emit_shader_time_end() insert instructions before the final SEND with
EOT.  Not only is this simpler, it's actually a slight improvement:
we now include the MOVs to set up the final FB write payload in our
shader-time measurements.

Note that INTEL_DEBUG=shader_time only exists on Gen7+, and uses
send-from-GRF.  (In the past, we might have hit trouble where both
attempt to use MRFs for messages; that's not a problem now.)

v2: Rebase on v3 of the previous patch and other shader_time fixes.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> [v1]
Acked-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 4ebeb715)

Conflicts:
	src/mesa/drivers/dri/i965/brw_fs.cpp

c232d765

i965/fs: Make get_timestamp() pass back the MOV rather than emitting it. · 0d625e1a

Kenneth Graunke authored 9 years ago and

Emil Velikov committed 9 years ago


This makes another part of the INTEL_DEBUG=shader_time code emittable
at arbitrary locations, rather than just at the end of the instruction
stream.

v2: Don't lose smear!  Caught by Topi Pohjolainen.
v3: Don't set smear on the destination of the MOV.  Thanks Topi!

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit e43af8d0)

0d625e1a

i965/fs: Make emit_shader_time_write return rather than emit. · e9e18265

Kenneth Graunke authored 9 years ago and

Emil Velikov committed 9 years ago


Instead of emit_shader_time_write, we now do emit(SHADER_TIME_ADD(...)).
The advantage is that we can also insert a shader time write at an
arbitrary location in the instruction stream, rather than being
restricted to emitting at the end.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit bea854c7)

e9e18265

i965/fs: Set smear on shader_time diff register. · 82ef4994

Kenneth Graunke authored 9 years ago and

Emil Velikov committed 9 years ago

The ADD(diff, diff, fs_reg(-2u)) instruction reads diff, which is a
width 1 register.  We need to read it as <0,1,0> with a subreg of 0,
which is what smear accomplishes.

Fixes assertion:
brw_eu_emit.c:285: validate_reg: Assertion `hstride == 0' failed.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86974


Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit f1adc45d)

Conflicts:
	src/mesa/drivers/dri/i965/brw_fs.cpp

82ef4994

i965/fs: Set force_writemask_all on shader_time instructions. · c3fc8b28

Kenneth Graunke authored 9 years ago and

Emil Velikov committed 9 years ago

These computations don't have anything to do with the currently
executing channels, so they should use force_writemask_all.

This fixes assert failures.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86974


Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit ef9cc7d0)

Conflicts:
	src/mesa/drivers/dri/i965/brw_fs.cpp

c3fc8b28

r300g: fix sRGB->sRGB blits · aea510a9
Marek Olšák authored 9 years ago and Emil Velikov committed 9 years ago
```
Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit c939231e)
```
aea510a9
r300g: fix a crash when resolving into an sRGB texture · c898d5c9
Marek Olšák authored 9 years ago and Emil Velikov committed 9 years ago
```
Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 9953586a)
```
c898d5c9
r300g: fix RGTC1 and LATC1 SNORM formats · 32a7f119
Marek Olšák authored 9 years ago and Emil Velikov committed 9 years ago
```
Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 74a757f9)
```
32a7f119

r300g: Fix the ATI1N swizzle (RGTC1 and LATC1) · 578ac079

Stefan Dösinger authored 9 years ago and

Emil Velikov committed 9 years ago

This fixes the GL_COMPRESSED_RED_RGTC1 part of piglit's rgtc-teximage-01
test as well as the precision part of Wine's 3dc format test (fd.o bug
89156).

The Z component seems to contain a lower precision version of the
result, probably a temporary value from the decompression computation.
The Y and W components contain different data that depends on the input
values as well, but I could not make sense of them (not that I tried
very hard).

GL_COMPRESSED_SIGNED_RED_RGTC1 still seems to have precision problems in
piglit, and both formats are affected by a compiler bug if they're
sampled by the shader with a swizzle other than .xyzw. Wine uses .xxxx,
which returns random garbage.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89156


Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f710b990)

578ac079

freedreno/ir3: fix silly typo for binning pass shaders · 0ea3c150

Rob Clark authored 9 years ago and

Emil Velikov committed 9 years ago


Was resulting in gl_PointSize write being optimized out, causing
particle system type shaders to hang if hw binning enabled.

Fixes neverball, OGLES2ParticleSystem, etc.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit 60096ed9)

0ea3c150

freedreno/ir3: get the # of miplevels from getinfo · b542424a

Ilia Mirkin authored 9 years ago and

Emil Velikov committed 9 years ago


This fixes ARB_texture_query_levels to actually return the desired
value.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit cb3eb43a)

b542424a

freedreno/ir3: fix array count returned by TXQ · d8ed6aa4

Ilia Mirkin authored 9 years ago and

Emil Velikov committed 9 years ago


Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 8ac957a5)

d8ed6aa4

freedreno: move fb state copy after checking for size change · 5b1bd4fc

Ilia Mirkin authored 9 years ago and

Emil Velikov committed 9 years ago


Fixes: 1f3ca56b ("freedreno: use util_copy_framebuffer_state()")
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit f3dfe651)

5b1bd4fc

glsl: Mark array access when copying to a temporary for the ?: operator. · cddbb3a7

Kenneth Graunke authored 9 years ago and

Emil Velikov committed 9 years ago


Piglit's spec/glsl-1.20/compiler/structure-and-array-operations/
array-selection.vert test contains the following code:

   gl_Position = (pick_from_a_or_b ? a : b)[i];

where "a" and "b" are uniform vec4[2] variables.

ast_to_hir creates a temporary vec4[2] variable, conditional_tmp, and
generates an if-block to copy one or the other:

   (declare (temporary) (array vec4 2) conditional_tmp)
   (if (var_ref pick_from_a_or_b)
     ((assign () (var_ref conditional_tmp) (var_ref a)))
     ((assign () (var_ref conditional_tmp) (var_ref b))))

However, we failed to update max_array_access for "a" and "b", so it
remained 0 - here, the whole array is being accessed.  At link time,
update_array_sizes() used this bogus information to change the types
of "a" and "b" to vec4[1].  We then had assignments from a vec4[1] to
a vec4[2], which is highly illegal.
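
The interaction can be modelled in a few lines. This is a simplified sketch: `struct array_var` and both functions are hypothetical stand-ins, and the "length = max_array_access + 1" rule is an assumption inferred from the vec4[1] result described above.

```c
#include <assert.h>

/* Hypothetical model of an array variable as seen by the linker. */
struct array_var {
   int length;            /* declared array length */
   int max_array_access;  /* highest index known to be accessed */
};

/* A whole-array copy touches every element, so record the last index. */
static void
mark_whole_array_accessed(struct array_var *var)
{
   assert(var->length > 0);
   if (var->max_array_access < var->length - 1)
      var->max_array_access = var->length - 1;
}

/* Assumed link-time resize rule: keep only the elements that were used. */
static int
link_time_array_length(const struct array_var *var)
{
   return var->max_array_access + 1;
}
```

Without the marking step, a vec4[2] whose `max_array_access` is still 0 would be resized to length 1; with it, the length stays 2.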

This tripped assertions in nir_split_var_copies with scalar VS.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 9f1e250e)

cddbb3a7

meta: Fix the y offset for 1D_ARRAY in _mesa_meta_pbo_TexSubImage · e4d3bd68

Neil Roberts authored 9 years ago and

Emil Velikov committed 9 years ago


The yoffset needs to be interpreted as a slice offset for 1D array
textures. This patch implements that by moving the yoffset into
zoffset similar to how it moves the height into depth.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: "10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 7286a689)

e4d3bd68

meta: Allow GL_UN/PACK_IMAGE_HEIGHT in _mesa_meta_pbo_Get/TexSubImage · 614e7ebd

Neil Roberts authored 9 years ago and

Emil Velikov committed 9 years ago

Now that a layered source PBO is interpreted as a single tall 2D image
it's quite easy to accept the image height packing option by just
creating an image that is tall enough to include the image padding.

I'm not sure whether the image height property should affect 1D_ARRAY
textures. My intuition and interpretation of the GL spec (which is a
bit vague) would be that it shouldn't. However the software fallback
path in Mesa uses the property for packing but not for unpacking. The
binary NVidia driver uses it for both. This patch doesn't use it for
either case so it is different from the software fallback. There is
some discussion about this here:

http://lists.freedesktop.org/archives/mesa-dev/2015-February/077925.html



This is tested by the texsubimage Piglit test with the array and pbo
arguments. Previously this test was skipping this code path because it
always sets the image height.

I've also tested it by modifying the getteximage-targets test. It
wasn't using this code path before because it was using the default
texture object so this code couldn't successfully create a frame
buffer. I also modified it to add some image padding with the image
height in the PBO.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: "10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a08bff1e)

614e7ebd

Revert "common: Fix PBOs for 1D_ARRAY." · 7f32fa0d

Neil Roberts authored 9 years ago and

Emil Velikov committed 9 years ago


This reverts commit 546aba14.

I think the changes to the calls to glBlitFramebuffer from this patch
are no different to what it was doing previously because it used to
set height to 1 before doing the blits. However it was introducing
some problems with the blit for layer 0 because this was no longer
special cased. It didn't fix problems with the yoffset which needs to
be interpreted as a slice offset. I think a better solution would be
to modify the original if statement to cope with the yoffset.

Conflicts:
	src/mesa/drivers/common/meta_tex_subimage.c

Cc: "10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
(cherry picked from commit 7d10d2fe)

7f32fa0d

meta: In pbo_{Get,}TexSubImage don't repeatedly rebind the source tex · a15de1ae

Neil Roberts authored 9 years ago and

Emil Velikov committed 9 years ago


A layered PBO image is now interpreted as a single tall 2D image so
the z argument in _mesa_meta_bind_fbo_image is ignored. Therefore this
was just redundantly rebinding the same image repeatedly.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
(cherry picked from commit a44606eb)

a15de1ae

Mar 07, 2015

i965: Avoid applying negate to wrong MAD source. · 31fcb21e

Matt Turner authored 9 years ago and

Emil Velikov committed 9 years ago

For some given GLSL IR like (+ (neg x) (* 1.2 x)), the try_emit_mad
function would see that one of the +'s sources was a negate expression
and set mul_negate = true without confirming that it was actually a
multiply.
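
The missing check can be sketched on a toy expression tree. Everything here (`struct expr`, `enum expr_op`, `match_mad_multiply`) is a hypothetical simplification for illustration and not the actual try_emit_mad code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum expr_op { EXPR_VAR, EXPR_CONST, EXPR_NEG, EXPR_MUL, EXPR_ADD };

/* Hypothetical, minimal IR node. */
struct expr {
   enum expr_op op;
   const struct expr *src[2];
};

/* Decide whether one source of an add can serve as the multiply of a
 * MAD. A negate only counts when the thing it wraps is a multiply. */
static bool
match_mad_multiply(const struct expr *add_src, bool *mul_negate)
{
   assert(add_src != NULL && mul_negate != NULL);
   *mul_negate = false;

   if (add_src->op == EXPR_NEG) {
      const struct expr *inner = add_src->src[0];
      if (inner == NULL || inner->op != EXPR_MUL)
         return false;   /* (neg x): not a negated multiply */
      *mul_negate = true;
      return true;
   }

   return add_src->op == EXPR_MUL;
}
```

For (+ (neg x) (* 1.2 x)), the first source is rejected and the second matches as a plain multiply, so no negate is applied to it.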

Cc: 10.5 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89315
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89095


Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit d528907f)
[Emil Velikov: drop the changes in brw_vec4_visitor.cpp]
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>

Conflicts:
	src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
	src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp

31fcb21e

main: Fix target checking for CopyTexSubImage*D. · 0cd8e357

Laura Ekstrand authored 9 years ago and

Emil Velikov committed 9 years ago

This fixes a dEQP test failure.  In the test,
glCopyTexSubImage2D was called with target = 0 and failed to throw
GL_INVALID_ENUM. This failure was caused by _mesa_get_current_tex_object(ctx,
target) being called before the target checking.  To remedy this, target
checking was separated from the main error-checking function and
called prior to _mesa_get_current_tex_object.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89312



Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
(cherry picked from commit ca65764d)

0cd8e357

main: Fix target checking for CompressedTexSubImage*D. · 8b4db9c6

Laura Ekstrand authored 9 years ago and

Emil Velikov committed 9 years ago

This fixes a dEQP test failure.  In the test,
glCompressedTexSubImage2D was called with target = 0 and failed to throw
GL_INVALID_ENUM. This failure was caused by _mesa_get_current_tex_object(ctx,
target) being called before the target checking.  To remedy this, target
checking was made into its own function and called prior to
_mesa_get_current_tex_object.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89311



Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
(cherry picked from commit 549078cb)

8b4db9c6

intel: fix EGLImage renderbuffer _BaseFormat · b0400a58

Frank Henigman authored 9 years ago and

Emil Velikov committed 9 years ago


Correctly set _BaseFormat field when creating a gl_renderbuffer
with EGLImage storage.

Change-Id: I8c9f7302d18b617f54fa68304d8ffee087ed8a77
Signed-off-by: Frank Henigman <fjhenigman@google.com>
Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
Reviewed-by: Chad Versace <chad.versace@intel.com>
(cherry picked from commit e4372994)
Nominated-by: Chad Versace <chad.versace@intel.com>

b0400a58

Revert SHA1 additions. · ef1c87ba

Matt Turner authored 9 years ago and

Emil Velikov committed 9 years ago


The shader-cache isn't finished, so the configure checks are a bit
premature and will only stand to confuse users of Mesa 10.5.0.

This is a squash of the following four reverts:

   Revert "Rename sha1.c and sha1.h to mesa-sha1.c and mesa-sha1.h"
   Revert "configure: Add machinery for --enable-shader-cache (and --disable-shader-cache)"
   Revert "sha1: Fix gcry_md_hd_t typo."
   Revert "mesa: Add mesa SHA-1 functions"

Reviewed-by: Carl Worth <cworth@cworth.org>

ef1c87ba

i965/vec4: Don't lose the saturate modifier in copy propagation. · a71223eb

Andrey Sudnik authored 9 years ago and

Emil Velikov committed 9 years ago

Cc: 10.4, 10.5 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89224


Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 0dfec59a)

a71223eb

i965: Split Gen4-5 BlitFramebuffer code; prefer BLT over Meta. · 47a3ae1f

Kenneth Graunke authored 9 years ago and

Emil Velikov committed 9 years ago

A while back I switched intel_blit_framebuffer to prefer Meta over the
BLT.  This meant that Gen8 platforms would start using the 3D engine
for blits, just like we do on Gen6-7.5.

However, I hadn't considered Gen4-5 when making that change.  The BLT
engine appears to be substantially faster on 965GM than using Meta to
drive the 3D engine.  This isn't too surprising: original Gen4 doesn't
support tile offsets (that came on G45), and the level/layer fields
don't work for cubemap rendering, so for inconvenient miplevel
alignments, we end up blitting or copying data to/from temporaries
in order to render to it.  We may as well just use the blitter.

I chose to use the BLT on Gen4-5 because they use the same ring for
both 3D and BLT; Gen6+ splits it out.

Fixes regressions on 965GM due to botched tile offset code (we should
fix those properly as well, but they're longstanding bugs - for now,
put things back to the status quo).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89430


Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: "10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit aa0705c0)

47a3ae1f

i965: Tell intel_get_memcpy() which direction the memcpy() is going. · dbf97463

Matt Turner authored 9 years ago and

Emil Velikov committed 9 years ago

The SSSE3 swizzling code was written for fast uploads to the GPU and
assumed the destination was always 16-byte aligned. When we began using
this code for fast downloads as well we didn't do anything to account
for the fact that the destination pointer given by glReadPixels() or
glGetTexImage() is not guaranteed to be suitably aligned.

With SSSE3 enabled (at compile-time), some applications would crash when
an SSE aligned-store instruction tried to store to an unaligned
destination (or an assertion that the destination is aligned would
trigger).

To remedy this, tell intel_get_memcpy() whether we're uploading or
downloading so that it can select whether to assume the destination or
source is aligned, respectively.

Cc: 10.5 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89416


Tested-by: Uriy Zhuravlev <stalkerg@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
(cherry picked from commit 2e4c95df)

dbf97463
