Commits · mesa-10.5.2 · Vlad Schiller / mesa

Mar 28, 2015
- Add release notes for the 10.5.2 release · 5e59f895
  Emil Velikov authored 9 years ago
  
  Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
  mesa-10.5.2
  
  5e59f895
- Update version to 10.5.2 · ebbfa797
  Emil Velikov authored 9 years ago
  
  Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
  ebbfa797
- cherry-ignore: add commit non applicable for 10.5 · fda3bc1e
  Emil Velikov authored 9 years ago
  
  Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
  fda3bc1e
Mar 26, 2015

configure: Introduce new output variable to ax_check_python_mako_module.m4 · e98909b0

Samuel Iglesias Gonsálvez authored 9 years ago and

Emil Velikov committed 9 years ago


This output variable gives more flexibility for future changes
in autoconf to detect whether files need to be auto-generated and
to check for the auto-generation dependencies.

It still returns an error when Python is not installed.

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Kai Wasserbäch <kai@dev.carbon-project.org>
(cherry picked from commit ced94253)

Squashed with commit

configure.ac: move AC_MSG_RESULT reporting back into the m4 macro

The one who does AC_MSG_CHECKING should provide the AC_MSG_RESULT.

Fixes: ced94253 ("configure: Introduce new output variable to
ax_check_python_mako_module.m4")

Cc: "10.5" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89328


Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
(cherry picked from commit 248eb54e)

e98909b0

glsl: Generate link error for non-matching gl_FragCoord redeclarations · d83d2ea9

Anuj Phogat authored 9 years ago and

Emil Velikov committed 9 years ago


Generate the link error when gl_FragCoord redeclarations do not match
in different fragment shaders. This also applies to the case where gl_FragCoord
is redeclared with no layout qualifiers in one fragment shader and used
but not declared in another fragment shader.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Khronos Bug#12957
Cc: "10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>

(cherry picked from commit d8208312)

d83d2ea9

mapi: Make private copies of name strings provided by client. · d6413ed9

Mario Kleiner authored 9 years ago and

Emil Velikov committed 9 years ago


glXGetProcAddress("glFoo") ends up in stub_add_dynamic() to
create dynamic stubs for dynamic functions. stub_add_dynamic()
doesn't store the caller provided name string "Foo" in a mesa
private copy, but just stores a pointer to the "glFoo" string
passed to glXGetProcAddress - a pointer into arbitrary memory
outside mesa's control.

If the caller passes some dynamically allocated/changing
memory buffer to glXGetProcAddress(), or the caller gets unmapped
from memory, e.g., some dynamically loaded application
plugin which uses OpenGL, this ends badly - with a dangling
pointer.

strdup() the name string provided by the client to avoid
this problem.
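
The aliasing problem described above can be modelled in a few lines. This is only a sketch: Python cannot produce a real dangling pointer, so a mutable `bytearray` stands in for the caller-owned `char *` buffer, and `StubTable`, `add_dynamic` and `lookup` are hypothetical names, not Mesa's actual data structures.

```python
class StubTable:
    """Hypothetical stand-in for a table of dynamic stubs."""

    def __init__(self, copy_names):
        self.copy_names = copy_names
        self.names = []

    def add_dynamic(self, name_buf):
        # name_buf is a caller-owned, mutable buffer (models a char *).
        if self.copy_names:
            # Private copy, analogous to strdup() in the fix.
            self.names.append(bytes(name_buf))
        else:
            # Borrowed reference only, analogous to storing the raw pointer.
            self.names.append(name_buf)
        return len(self.names) - 1

    def lookup(self, index):
        return bytes(self.names[index])


caller_buf = bytearray(b"Foo")
borrowing = StubTable(copy_names=False)
copying = StubTable(copy_names=True)
i = borrowing.add_dynamic(caller_buf)
j = copying.add_dynamic(caller_buf)

# The caller reuses its buffer after the call has returned.
caller_buf[:] = b"Bar"

borrowed_name = borrowing.lookup(i)  # follows the caller's later write
copied_name = copying.lookup(j)      # still the name that was registered
```

In C the borrowed case is worse than a changed name: once the caller's memory is freed or unmapped, reading it is undefined behaviour.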

Cc: "10.3 10.4 10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 1110113a)

d6413ed9

clover: Return 0 as storage size for local kernel args that are not set v2 · 3147f0bd

Tom Stellard authored 9 years ago and

Emil Velikov committed 9 years ago


The storage size for local kernel args can be queried before the
arguments are set by using the CL_KERNEL_LOCAL_MEM_SIZE param
of clGetKernelWorkGroupInfo().

The spec says that if local kernel arguments have not been specified,
then we should assume their size is 0.

v2:
  - Implement using c++11 member initialization.

Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>

Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit dfb1ae9d)

3147f0bd

glsl: fix names in lower_constant_arrays_to_uniforms · c2760f0a

Tapani Pälli authored 9 years ago and

Emil Velikov committed 9 years ago


Change the lowering pass to use a unique name for each uniform
so that arrays from different stages cannot end up with the same
name.

v2: instead of a global counter, use a pointer to achieve a
    unique name (Kenneth Graunke)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89590


Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 3cf99701)

c2760f0a

i965: Set nr_params to the number of uniform components in the VS/GS path. · 859b4afc

Francisco Jerez authored 10 years ago and

Emil Velikov committed 9 years ago


Both do_vs_prog and do_gs_prog initialize brw_stage_prog_data::nr_params to
the number of uniform *vectors* required by the shader rather than the number
of uniform components, contradicting the comment.  This is inconsistent with
what the state upload code and scalar path expect but it happens to work until
Gen8 because vec4_visitor interprets it as a number of vectors on construction
and later on overwrites its original value with the number of uniform
components referenced by the shader.

Also there's no need to add the number of samplers, they're not actually
passed in as uniforms.

Fixes a memory corruption issue on BDW with SIMD8 VS.

Cc: "10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit fd149628)
[Emil Velikov: s/DIV_ROUND_UP/CEILING/]
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>

859b4afc

Mar 25, 2015

radeonsi: increase coords array size for radeon_llvm_emit_prepare_cube_coords · d33bf815

Marek Olšák authored 9 years ago and

Emil Velikov committed 9 years ago


radeon_llvm_emit_prepare_cube_coords uses coords[4] in some cases (TXB2 etc.)

Discovered by Coverity. Reported by Ilia Mirkin.

Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit a984abda)

d33bf815

glx: Handle out-of-sequence swap completion events correctly. (v2) · 8ebda1f1

Mario Kleiner authored 9 years ago and

Emil Velikov committed 9 years ago


The code for emitting INTEL_swap_events swap completion
events needs to translate from 32-Bit sbc on the wire to
64-Bit sbc for the events and handle wraparound accordingly.

It assumed that events would be sent by the server in the
order their corresponding swap requests were emitted from
the client, i.e. that the sbc count would always be increasing.
This was correct for DRI2.

This is not always the case under the DRI3/Present backend,
where the Present extension can execute presents and send out
completion events in a different order than the submission
order of the present requests, due to client code specifying
targetMSC target vblank counts which are not strictly
monotonically increasing. This confused the wraparound
handling. This patch fixes the problem by handling 32-Bit
wraparound in both directions. As long as the real 64-Bit
sbc's of successive swap completion events don't differ by
more than 2^30, this should be able to do the right thing.

How this is supposed to work:

awire->sbc contains the low 32-Bits of the true 64-Bit sbc
of the current swap event, transmitted over the wire.

glxDraw->lastEventSbc contains the low 32-Bits of the 64-Bit
sbc of the most recently processed swap event.

glxDraw->eventSbcWrap is a 64-Bit offset which tracks the upper
32-Bits of the current sbc. The final 64-Bit output sbc
aevent->sbc is computed from the sum of awire->sbc and
glxDraw->eventSbcWrap.

Under DRI3/Present, swap completion events can be received
slightly out of order due to non-monotonic targetMsc specified
by client code, e.g., present request submission:

Submission sbc:   1   2   3
targetMsc:        10  11  9

Reception of completion events:
Completion sbc:   3   1   2

The completion sequence 3, 1, 2 would confuse the old wraparound
handling made for DRI2 as 1 < 3 --> Assumes a 32-Bit wraparound
has happened when it hasn't.

The client can queue multiple present requests, in the case of
Mesa up to n requests for n-buffered rendering, e.g., n =  2-4 in
the current Mesa GLX DRI3/Present implementation. In the case of
direct Pixmap presents via xcb_present_pixmap() the number n is
limited by the amount of memory available.

We reasonably assume that the number of outstanding requests n is
much less than 2 billion due to memory constraints and common sense.
Therefore while the order of received sbc's can be a bit scrambled,
successive 64-Bit sbc's won't deviate by much, a given sbc may be
a few counts lower or higher than the previous received sbc.

Therefore any large difference between the incoming awire->sbc and
the last recorded glxDraw->lastEventSbc will be due to 32-Bit
wraparound and we need to adapt glxDraw->eventSbcWrap accordingly
to adjust the upper 32-Bits of the sbc.

Two cases, corresponding to the two if-statements in the patch:

a) Previous sbc event was below the last 2^32 boundary, in the previous
glxDraw->eventSbcWrap epoch, the new sbc event is in the next 2^32
epoch, therefore the low 32-Bit awire->sbc wrapped around to zero,
or close to zero --> awire->sbc is apparently much lower than the
glxDraw->lastEventSbc recorded for the previous epoch

--> We need to increment glxDraw->eventSbcWrap by 2^32 to adjust
the current epoch to be one higher than the previous one.

--> Case a) also handles the old DRI2 behaviour.

b) Previous sbc event was above closest 2^32 boundary, but now a
late event from the previous 2^32 epoch arrives, with a true sbc
that belongs to the previous 2^32 segment, so the awire->sbc of
this late event has a high count close to 2^32, whereas
glxDraw->lastEventSbc is closer to zero --> awire->sbc is much
greater than glxDraw->lastEventSbc.

--> We need to decrement glxDraw->eventSbcWrap by 2^32 to adjust
the current epoch back to the previous lower epoch of this late
completion event.

We assume such a wraparound to a higher (a) epoch or lower (b)
epoch has happened if awire->sbc and glxDraw->lastEventSbc differ
by more than 2^30 counts, as such a difference can only happen
on wraparound, or if somehow 2^30 present requests would be pending
for a given drawable inside the server, which is rather unlikely.
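
The two cases above can be sketched as follows. This is a model written from the description in this message, not the patch itself: `SbcTracker`, `on_event` and `old_on_event` are hypothetical names, the fields mirror `glxDraw->lastEventSbc` and `glxDraw->eventSbcWrap`, and the old DRI2-era check is reconstructed only from the "1 < 3" remark above.

```python
WRAP = 1 << 32       # size of one 32-bit epoch
THRESHOLD = 1 << 30  # differences larger than this are treated as wraparound


class SbcTracker:
    """Hypothetical model of the per-drawable swap event state."""

    def __init__(self, last_event_sbc=0, event_sbc_wrap=0):
        self.last_event_sbc = last_event_sbc  # low 32 bits of the last event
        self.event_sbc_wrap = event_sbc_wrap  # offset carrying the upper bits

    def on_event(self, wire_sbc):
        # Case a): the wire value dropped far below the last one, so the
        # true sbc moved into the next 2^32 epoch.
        if wire_sbc < self.last_event_sbc - THRESHOLD:
            self.event_sbc_wrap += WRAP
        # Case b): the wire value jumped far above the last one, so this
        # is a late event from the previous 2^32 epoch.
        if wire_sbc > self.last_event_sbc + THRESHOLD:
            self.event_sbc_wrap -= WRAP
        self.last_event_sbc = wire_sbc
        return wire_sbc + self.event_sbc_wrap

    def old_on_event(self, wire_sbc):
        # Assumed old behaviour: any decrease counts as a wraparound.
        if wire_sbc < self.last_event_sbc:
            self.event_sbc_wrap += WRAP
        self.last_event_sbc = wire_sbc
        return wire_sbc + self.event_sbc_wrap


# Out-of-order completion 3, 1, 2 from the example above.
new = SbcTracker()
reordered = [new.on_event(s) for s in (3, 1, 2)]

old = SbcTracker()
reordered_old = [old.old_on_event(s) for s in (3, 1, 2)]

# A real wraparound with one late event from the previous epoch.
near_wrap = SbcTracker(last_event_sbc=WRAP - 3)
wrapped = [near_wrap.on_event(s) for s in (WRAP - 2, 1, WRAP - 1, 2)]
```

With the new checks the reordered sequence maps back to 3, 1, 2, while the assumed old check pushes the second and third events a full epoch too high. The last sequence exercises both if-statements: the offset goes up for wire value 1, back down for the late `WRAP - 1`, and up again for 2.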

v2: Explain the reason for this patch and the new wraparound handling
    much more extensively in the commit message; no code change wrt. the
    initial version.

Cc: "10.3 10.4 10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
(cherry picked from commit cc5ddd58)

8ebda1f1

auxiliary/os: fix the android build - s/drm_munmap/os_munmap/ · 0410d9b1

Emil Velikov authored 9 years ago


Squash this silly typo introduced with commit c63eb5dd (auxiliary/os: get
the mmap/munmap wrappers working with android)

Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 55f0c0a2)

0410d9b1

loader: include <sys/stat.h> for non-sysfs builds · af3e6e28

Emil Velikov authored 9 years ago

Required by fstat(), otherwise we'll error out due to implicit function
declaration.

Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89530


Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reported-by: Vadim Rutkovsky <vrutkovs@redhat.com>
Tested-by: Vadim Rutkovsky <vrutkovs@redhat.com>
(cherry picked from commit 771cd266)

af3e6e28

c11/threads: Use PTHREAD_MUTEX_RECURSIVE by default · 29810e43

Felix Janda authored 9 years ago and

Emil Velikov committed 9 years ago

Previously PTHREAD_MUTEX_RECURSIVE_NP had been used on linux for
compatibility with old glibc. Since mesa defines __GNU_SOURCE__
on linux PTHREAD_MUTEX_RECURSIVE is also available since at least
1998. So we can unconditionally use the portable version
PTHREAD_MUTEX_RECURSIVE.

Cc: "10.5" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88534


Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit aead7fe2)

29810e43

freedreno: update generated headers · 2e0f2ad5

Rob Clark authored 9 years ago and

Emil Velikov committed 9 years ago


Fix a3xx texture layer-size.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit e92bc6b3)

2e0f2ad5

freedreno: fix slice pitch calculations · 411f975a

Ilia Mirkin authored 9 years ago and

Emil Velikov committed 9 years ago


For example if width were 65, the first slice would get 96 while the
second would get 32. However the hardware appears to expect the second
pitch to be 64, based on halving the 96 (and aligning up to 32).

This fixes texelFetch piglit tests on a3xx below a certain size. Going
higher they break again, but most likely due to unrelated reasons.
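
The width-65 example can be worked through with a small sketch. The helper names are hypothetical, the 32 alignment is taken from the text above, and it is an assumption here that later levels keep halving the previous aligned pitch in the same way.

```python
def align(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def minify(value, levels=1):
    """Halve value `levels` times, never going below 1."""
    return max(value >> levels, 1)


def pitches_from_original_width(width, num_levels):
    # Behaviour before the fix as described: minify the original width
    # per level, then align.
    return [align(minify(width, level), 32) for level in range(num_levels)]


def pitches_from_previous_pitch(width, num_levels):
    # Behaviour the hardware appears to expect: halve the previous
    # aligned pitch, then align again.
    pitches = []
    for _ in range(num_levels):
        width = align(width, 32)
        pitches.append(width)
        width = minify(width)
    return pitches


before = pitches_from_original_width(65, 2)
after = pitches_from_previous_pitch(65, 2)
```

For a width of 65 the first function gives pitches of 96 and 32, the second gives 96 and 64, matching the numbers quoted above.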

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit 620e29b7)

411f975a

freedreno/a3xx: use the same layer size for all slices · 3fa76f3f

Ilia Mirkin authored 9 years ago and

Emil Velikov committed 9 years ago


We only program in one layer size per texture, so that means that all
levels must share one size. This makes the piglit test

bin/texelFetch fs sampler2DArray

have the same breakage as its non-array version instead of being
completely off, and makes

bin/ext_texture_array-gen-mipmap

start passing.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit 89b26d5a)

3fa76f3f

glsl: optimize (0 cmp x + y) into (-x cmp y). · 5e572b1c

Samuel Iglesias Gonsálvez authored 9 years ago and

Emil Velikov committed 9 years ago


The optimization done by commit 34ec1a24 did not take it into account.

Fixes:

dEQP-GLES3.functional.shaders.random.all_features.fragment.20
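
The rewrite named in the subject line can be sanity-checked over exact integers. This is only an illustration of the algebraic identity, under the assumption that "cmp" ranges over the usual comparison operators; it does not model the GLSL IR, floating-point rounding or NaN handling.

```python
import operator
from itertools import product

# Assumed set of comparison operators covered by "cmp".
COMPARISONS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "==": operator.eq,
    "!=": operator.ne,
}


def rewrite_holds(cmp, x, y):
    """Check that (0 cmp x + y) and (-x cmp y) agree for exact integers."""
    return cmp(0, x + y) == cmp(-x, y)


all_hold = all(
    rewrite_holds(cmp, x, y)
    for cmp in COMPARISONS.values()
    for x, y in product(range(-8, 9), repeat=2)
)
```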

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b43bbfa9)

5e572b1c

st/egl: don't ship the dri2.c link at the tarball · 2beab3c0

Emil Velikov authored 9 years ago

During 'make dist' the path of the symbolic link (x11/dri2.c) becomes
too long, and tar converts it to a hard link. To complicate matters,
on Haiku tar errors out (due to lack of hardlink support) rather than
falling back to the next best thing.
So remove the symlink from git, and disable the scons x11_drm egl code.
The offending code is not built with either automake or android.

Brian, Jose, would you have any objections to this? I was
playing around to get the symlink resolved, although I could not get the
dependency tracking resolved, so env.Command() was never executed :-\

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89680


Cc: mesa-stable@lists.freedesktop.org
Cc: Alexander von Gluck IV <kallisti5@unixzen.com>
Cc: Brian Paul <brianp@vmware.com>
Acked-by: Jose Fonseca <jfonseca@vmware.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>

2beab3c0

automake: add missing egl files to the tarball · d80bc650

Emil Velikov authored 9 years ago


Namely the Haiku EGL driver backend and the SConscript for the dri2 EGL
driver backend.

Cc: Alexander von Gluck IV <kallisti5@unixzen.com>
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 5dc573e5)

d80bc650

Mar 13, 2015
- docs: Add sha256 sums for the 10.5.1 release · 2abba086
  Emil Velikov authored 9 years ago
  
  Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
  2abba086
- Add release notes for the 10.5.1 release · 11c0ff60
  Emil Velikov authored 9 years ago
  
  Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
  mesa-10.5.1
  
  11c0ff60
- Update version to 10.5.1 · 0f32ac39
  Emil Velikov authored 9 years ago
  
  Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
  0f32ac39
Mar 12, 2015

freedreno/ir3: fix failed assert in grouping · ce13666f

Rob Clark authored 9 years ago and

Emil Velikov committed 9 years ago


Turns out there are scenarios where we need to insert mov's in "front"
of an input.  Triggered by shaders like:

  VERT
  DCL IN[0]
  DCL IN[1]
  DCL OUT[0], POSITION
  DCL OUT[1], GENERIC[9]
  DCL SAMP[0]
  DCL TEMP[0], LOCAL
    0: MOV TEMP[0].xy, IN[1].xyyy
    1: MOV TEMP[0].w, IN[1].wwww
    2: TXF TEMP[0], TEMP[0], SAMP[0], 1D_ARRAY
    3: MOV OUT[1], TEMP[0]
    4: MOV OUT[0], IN[0]
    5: END

Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit 27648efa)

ce13666f

freedreno/ir3: handle flat bypass for a4xx · 065a24bd

Rob Clark authored 9 years ago and

Emil Velikov committed 9 years ago


We may not need this for later a4xx patchlevels, but we do at least need
this for patchlevel 0.  Bypass bary.f for fetching varyings when flat
shading is needed (rather than configure via cmdstream).  This requires
a special dummy bary.f w/ (ei) flag to signal to scheduler when all
varyings are consumed.  And requires shader variants based on rasterizer
flatshade state to handle TGSI_INTERPOLATE_COLOR.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit e9f2abe3)

065a24bd

freedreno/ir3: add support for memory (cat6) instructions · 1dec8bbb

Rob Clark authored 9 years ago and

Emil Velikov committed 9 years ago


Scheduled basically the same as texture (cat5) instructions, using (sy)
flag for synchronization.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit 9d732d31)

1dec8bbb

freedreno/ir3: fix up cat6 instruction encodings · af4d1096

Rob Clark authored 9 years ago and

Emil Velikov committed 9 years ago


I think there is at least one more sub-encoding, but these two should be
enough to cover the common load/store instructions.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit 20b50a07)

af4d1096

freedreno/a4xx: aniso filtering · 645d7f46

Rob Clark authored 9 years ago and

Emil Velikov committed 9 years ago


Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit dd70e786)

645d7f46

freedreno: update generated headers · 80c4ba0c

Rob Clark authored 9 years ago and

Emil Velikov committed 9 years ago


Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit c70097ae)

80c4ba0c

freedreno/a4xx: set PC_PRIM_VTX_CNTL.VAROUT properly · aca5fdae

Rob Clark authored 9 years ago and

Emil Velikov committed 9 years ago


Fixes xonotic, some webgl stuff, and really pretty much anything with
more than 4 varyings.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit 51e33574)

aca5fdae

freedreno: update generated headers · 7abc57b6

Rob Clark authored 9 years ago and

Emil Velikov committed 9 years ago


Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit fb1301e4)

Conflicts:
	src/gallium/drivers/freedreno/a3xx/a3xx.xml.h

7abc57b6

freedreno/a4xx: bit of cleanup · 20ea65be

Rob Clark authored 9 years ago and

Emil Velikov committed 9 years ago


Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit bdf02348)

20ea65be

freedreno/a2xx: fix increment in assert · 38777e13

Rob Clark authored 9 years ago and

Emil Velikov committed 9 years ago

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88883


Signed-off-by: Rob Clark <robclark@freedesktop.org>
(cherry picked from commit 68552266)

38777e13

Mar 11, 2015

i965: Fix out-of-bounds accesses into pull_constant_loc array · 4de2f250

Iago Toral authored 9 years ago and

Emil Velikov committed 9 years ago

The piglit test glsl-fs-uniform-array-loop-unroll.shader_test was designed
to do an out-of-bounds access into a uniform array to make sure that we
handle that situation gracefully inside the driver. However, as Ken describes
in bug 79202, Valgrind reports that this leads to an out-of-bounds access
in fs_visitor::demote_pull_constants().

Before accessing the pull_constant_loc array we should make sure that
the uniform we are trying to access is valid.
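
The kind of guard described above might look like the following. It is a minimal sketch with a hypothetical helper name, not the code from the patch, and the real driver decides for itself what to do with an invalid index.

```python
def lookup_pull_constant_loc(pull_constant_loc, uniform):
    """Return the pull-constant slot for `uniform`, or None if out of range.

    The explicit range check matters in Python too: a negative index
    would otherwise silently wrap to the end of the list.
    """
    if uniform < 0 or uniform >= len(pull_constant_loc):
        return None
    return pull_constant_loc[uniform]


# Hypothetical table for a shader with three uniforms.
locs = [-1, 0, 1]
valid = lookup_pull_constant_loc(locs, 2)
past_end = lookup_pull_constant_loc(locs, 3)
negative = lookup_pull_constant_loc(locs, -1)
```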

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79202


Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 6ac1bc90)
Nominated-by: Matt Turner <mattst88@gmail.com>

4de2f250

i965/fs: Don't issue FB writes for bound but unwritten color targets. · fbd06fe6

Kenneth Graunke authored 9 years ago and

Emil Velikov committed 9 years ago

We used to loop over all color attachments, and emit FB writes for each
one, even if the shader didn't write to a corresponding output variable.
Those color attachments would be filled with garbage (undefined values).

Football Manager binds a framebuffer with 4 color attachments, but draws
to it using a shader that only writes to gl_FragData[0..2].  This meant
that color attachment 3 would be filled with garbage, resulting in
rendering artifacts.  Now we skip writing to it, fixing rendering.

Writes to gl_FragColor initialize outputs[0..nr_color_regions-1] to
GRFs, while writes to gl_FragData[i] initialize outputs[i].
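
The selection rule can be sketched like this. The function names are hypothetical, `None` stands in for an output that was never assigned a register, and the string `"grf"` is just a placeholder for an allocated one.

```python
def outputs_for_frag_data(written_indices, nr_color_regions):
    """Model a shader writing gl_FragData[i] for the given indices only."""
    return ["grf" if i in written_indices else None
            for i in range(nr_color_regions)]


def outputs_for_frag_color(nr_color_regions):
    """Model a shader writing gl_FragColor: every color region gets it."""
    return ["grf"] * nr_color_regions


def color_targets_to_write(outputs, nr_color_regions):
    """Return the color attachments that should receive an FB write."""
    return [i for i in range(nr_color_regions) if outputs[i] is not None]


# Four attachments bound, shader writes gl_FragData[0..2] only.
frag_data_targets = color_targets_to_write(
    outputs_for_frag_data({0, 1, 2}, 4), 4)

# Same framebuffer, shader writes gl_FragColor instead.
frag_color_targets = color_targets_to_write(outputs_for_frag_color(4), 4)
```

In the Football Manager case this skips attachment 3, whereas a gl_FragColor shader still writes all four.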

Thanks to Jason Ekstrand for tracking this down.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86747


Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit e95969cd)

Conflicts:
	src/mesa/drivers/dri/i965/brw_fs_visitor.cpp

fbd06fe6

i965/fs: Make emit_shader_time_end() insert before EOT. · c232d765

Kenneth Graunke authored 9 years ago and

Emil Velikov committed 9 years ago


Previously, we emitted the shader-time epilogue from emit_fb_writes(),
during the middle of looping through color regions (or emit_urb_writes
for the VS).  This is duplicated several times and rather awkward.

I need to fix a bug in our FB write handling, and it will be a lot
easier if we move emit_shader_time_end() out of there.

Now, we simply emit FB writes/URB writes, and subsequently have
emit_shader_time_end() insert instructions before the final SEND with
EOT.  Not only is this simpler, it's actually a slight improvement:
we now include the MOVs to set up the final FB write payload in our
shader-time measurements.

Note that INTEL_DEBUG=shader_time only exists on Gen7+, and uses
send-from-GRF.  (In the past, we might have hit trouble where both
attempt to use MRFs for messages; that's not a problem now.)

v2: Rebase on v3 of the previous patch and other shader_time fixes.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> [v1]
Acked-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 4ebeb715)

Conflicts:
	src/mesa/drivers/dri/i965/brw_fs.cpp

c232d765

i965/fs: Make get_timestamp() pass back the MOV rather than emitting it. · 0d625e1a

Kenneth Graunke authored 9 years ago and

Emil Velikov committed 9 years ago


This makes another part of the INTEL_DEBUG=shader_time code emittable
at arbitrary locations, rather than just at the end of the instruction
stream.

v2: Don't lose smear!  Caught by Topi Pohjolainen.
v3: Don't set smear on the destination of the MOV.  Thanks Topi!

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit e43af8d0)

0d625e1a

i965/fs: Make emit_shader_time_write return rather than emit. · e9e18265

Kenneth Graunke authored 9 years ago and

Emil Velikov committed 9 years ago


Instead of emit_shader_time_write, we now do emit(SHADER_TIME_ADD(...)).
The advantage is that we can also insert a shader time write at an
arbitrary location in the instruction stream, rather than being
restricted to emitting at the end.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit bea854c7)

e9e18265

i965/fs: Set smear on shader_time diff register. · 82ef4994

Kenneth Graunke authored 9 years ago and

Emil Velikov committed 9 years ago

The ADD(diff, diff, fs_reg(-2u)) instruction reads diff, which is a
width 1 register.  We need to read it as <0,1,0> with a subreg of 0,
which is what smear accomplishes.

Fixes assertion:
brw_eu_emit.c:285: validate_reg: Assertion `hstride == 0' failed.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86974


Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit f1adc45d)

Conflicts:
	src/mesa/drivers/dri/i965/brw_fs.cpp

82ef4994

i965/fs: Set force_writemask_all on shader_time instructions. · c3fc8b28

Kenneth Graunke authored 9 years ago and

Emil Velikov committed 9 years ago

These computations don't have anything to do with the currently
executing channels, so they should use force_writemask_all.

This fixes assert failures.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86974


Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit ef9cc7d0)

Conflicts:
	src/mesa/drivers/dri/i965/brw_fs.cpp

c3fc8b28
