 06 Jun, 2019 27 commits


Erik Faye-Lund authored
HTML has the <p>-tag for this purpose. It adds some margins, but that just makes this read better, IMO.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>

Erik Faye-Lund authored
This reads better if we include the asterisk in the code-block, as it's part of the function-reference, even though it's not technically speaking code. But as the <code>-tag isn't purely for code, this should be fine.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>

Erik Faye-Lund authored
Looks like I missed a few cases when I recently added more code-tags here. So let's add these cases as well.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>

Erik Faye-Lund authored
When rewriting 20c56e18 after review, I accidentally dropped the "at" here. Sorry for that, and let's fix it up!
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: 20c56e18 ("docs: use proper links instead of code-tags")
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>

Gurchetan Singh authored
AHARDWAREBUFFER_FORMAT_Y8Cb8Cr8_420 is an implementation-defined flexible YUV format. Most of the time, it's NV12 or YV12. On Intel, NV12 is preferred since it can be used by the display engine. This API adds a dependency between gralloc and buffer consumers, unfortunately. Right now, the code seems to work for i915 gralloc, but not cros_gralloc. Add a preprocessor flag to fix this.
TEST=android.graphics.cts.MediaVulkanGpuTest#testMediaImportAndRendering
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>

Connor Abbott authored
While we're here, copy the comment explaining this from radeonsi.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>

Connor Abbott authored
When e9d935ed added force_dcc_off(), we forced it off for any preloaded image descriptor which had stores associated with them, since the same preloaded descriptors were used for loads and stores. However, when the preloading was removed in 16be87c9, the existing logic was kept despite it not being necessary anymore. The comment above force_dcc_off() only mentions stores, so only force DCC off for stores.
Cc: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

Gert Wollny authored
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>

Gert Wollny authored
The old copy didn't include EXT_clip_control, so update it.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>

Gert Wollny authored
On hosts that support it, this makes the following piglits pass: arb_clip_control*
v2: sync flag with host
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com> (v1)
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>

Charmaine Lee authored
This fixes the missing rebind when the can_pre_flush bit is not set and the vertex buffers are the same as those that have already been sent.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Signed-off-by: Charmaine Lee <charmainel@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>

Deepak Rawat authored
Depending on whether the code is compiled with frame-pointer or not, the temporary memory location used for the bp parameter in these macros is referenced relative to the stack pointer or the frame pointer. Hence we can never reference that parameter when we've modified either the stack pointer or the frame pointer, because then the compiler would generate an incorrect stack reference. Fix this by pushing the temporary memory parameter on a known location on the stack before modifying the stack and frame pointers. Also, in case of failure the RPCI channel is not closed, which leads to vmx running out of channels.
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Deepak Rawat <drawat@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>

Samuel Pitoiset authored
This might fix initial subpass transitions when multiview is used. Noticed while implementing sample locations during layout transitions.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

Nataraj Deshpande authored
The assert should check the returned isl_fmt value rather than the format variable.
Fixes: f1654fa7 "anv/android: support creating images from external format"
Signed-off-by: Nataraj Deshpande <nataraj.deshpande@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>

Iago Toral authored
We were not accounting for small immediates in the B mux, so the scheduler was interpreting these as regular register file accesses, which could lead to additional (incorrect) write-read dependencies.
Shader-db changes:
total instructions in shared programs: 9163664 -> 9137263 (-0.29%)
instructions in affected programs: 3931035 -> 3904634 (-0.67%)
helped: 12457
HURT: 2563
total maxtemps in shared programs: 1325787 -> 1325597 (-0.01%)
maxtemps in affected programs: 5746 -> 5556 (-3.31%)
helped: 186
HURT: 16
helped stats (abs) min: 1 max: 4 x̄: 1.12 x̃: 1
helped stats (rel) min: 1.45% max: 22.22% x̄: 4.42% x̃: 3.28%
HURT stats (abs) min: 1 max: 3 x̄: 1.12 x̃: 1
HURT stats (rel) min: 2.86% max: 10.00% x̄: 5.76% x̃: 5.88%
95% mean confidence interval for maxtemps value: -1.04 -0.84
95% mean confidence interval for maxtemps %change: -4.16% -3.07%
Maxtemps are helped.
Reviewed-by: Eric Anholt <eric@anholt.net>

Vasily Khoruzhick authored
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>

Vasily Khoruzhick authored
A program may need no regalloc at all, e.g. when it consists of a single discard op.
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>

Jason Ekstrand authored
If we insert a NULL key, it will appear to succeed but will mess up entry counting. Similar errors can occur if someone accidentally inserts the deleted key. The latter is highly unlikely but technically possible, so we should guard against it too.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
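
A minimal sketch of the kind of guard described above, assuming an open-addressing table in which NULL marks an empty slot and a sentinel pointer marks a removed one. All names here (struct table, DELETED_KEY, key_is_usable, table_insert) are hypothetical and are not taken from the actual patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE 8u /* hypothetical fixed capacity, must be a power of two */

/* Hypothetical sentinel: its address marks a slot whose entry was removed. */
static const char deleted_key_value = 0;
static const void *const DELETED_KEY = &deleted_key_value;

struct entry {
    const void *key; /* NULL means the slot was never used */
    void *data;
};

struct table {
    struct entry slots[TABLE_SIZE];
    unsigned entries;         /* live entries */
    unsigned deleted_entries; /* slots holding DELETED_KEY */
};

/* A key is usable only if it cannot be mistaken for a slot marker. */
static int key_is_usable(const void *key)
{
    return key != NULL && key != DELETED_KEY;
}

static struct entry *table_insert(struct table *t, const void *key, void *data)
{
    /* The guard: storing NULL or DELETED_KEY would make the slot look free
     * while the entry count still goes up. */
    assert(key_is_usable(key));

    uint32_t start = (uint32_t)((uintptr_t)key >> 3) & (TABLE_SIZE - 1);
    struct entry *free_slot = NULL;

    for (unsigned i = 0; i < TABLE_SIZE; i++) {
        struct entry *e = &t->slots[(start + i) & (TABLE_SIZE - 1)];
        if (e->key == key) {
            e->data = data; /* key already present: replace the data */
            return e;
        }
        if (e->key == DELETED_KEY) {
            if (free_slot == NULL)
                free_slot = e; /* remember the first reusable slot */
            continue;
        }
        if (e->key == NULL) {
            if (free_slot == NULL)
                free_slot = e;
            break; /* an empty slot ends the probe sequence */
        }
    }

    if (free_slot == NULL)
        return NULL; /* table is full; this sketch does not grow it */
    if (free_slot->key == DELETED_KEY)
        t->deleted_entries--;
    free_slot->key = key;
    free_slot->data = data;
    t->entries++;
    return free_slot;
}
```

With the assert in place, a bad key fails loudly at the call site instead of silently skewing the entry counters.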

Jason Ekstrand authored
If we insert a NULL key, it will appear to succeed but will mess up entry counting. Similar errors can occur if someone accidentally inserts the deleted key. The latter is highly unlikely but technically possible, so we should guard against it too.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>

Jason Ekstrand authored
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

Jason Ekstrand authored
Fixes: 8410cf66 "nir/propagate_invariant: Skip unknown vars"
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>

Ian Romanick authored
If the 2nd and 3rd source are both Boolean values, we can potentially avoid a resolve by only resolving the result of the b32csel.
No changes on any Gen6+ Intel platform.
v2: Use ?: instead of cast from bool to unsigned. Suggested by Caio.
Iron Lake
total instructions in shared programs: 8142729 -> 8142677 (<.01%)
instructions in affected programs: 12890 -> 12838 (-0.40%)
helped: 26
HURT: 0
helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
helped stats (rel) min: 0.25% max: 0.74% x̄: 0.45% x̃: 0.38%
95% mean confidence interval for instructions value: -2.00 -2.00
95% mean confidence interval for instructions %change: -0.52% -0.39%
Instructions are helped.
total cycles in shared programs: 188549632 -> 188549394 (<.01%)
cycles in affected programs: 60754 -> 60516 (-0.39%)
helped: 25
HURT: 1
helped stats (abs) min: 2 max: 26 x̄: 9.92 x̃: 8
helped stats (rel) min: 0.07% max: 2.23% x̄: 0.59% x̃: 0.27%
HURT stats (abs) min: 10 max: 10 x̄: 10.00 x̃: 10
HURT stats (rel) min: 0.70% max: 0.70% x̄: 0.70% x̃: 0.70%
95% mean confidence interval for cycles value: -12.91 -5.40
95% mean confidence interval for cycles %change: -0.84% -0.23%
Cycles are helped.
GM45
total instructions in shared programs: 5013119 -> 5013093 (<.01%)
instructions in affected programs: 6764 -> 6738 (-0.38%)
helped: 13
HURT: 0
helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
helped stats (rel) min: 0.24% max: 0.68% x̄: 0.43% x̃: 0.36%
95% mean confidence interval for instructions value: -2.00 -2.00
95% mean confidence interval for instructions %change: -0.52% -0.34%
Instructions are helped.
total cycles in shared programs: 128977804 -> 128977700 (<.01%)
cycles in affected programs: 37738 -> 37634 (-0.28%)
helped: 13
HURT: 0
helped stats (abs) min: 8 max: 8 x̄: 8.00 x̃: 8
helped stats (rel) min: 0.18% max: 0.46% x̄: 0.30% x̃: 0.26%
95% mean confidence interval for cycles value: -8.00 -8.00
95% mean confidence interval for cycles %change: -0.36% -0.24%
Cycles are helped.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>

Ian Romanick authored
Previously we would blindly emit a sequence like:
    mov(1) f0.1<1>UW g1.14<0,1,0>UW
    ...
    cmp.l.f0(16) g7<1>F g5<8,8,1>F 0x41700000F /* 15F */
    (+f0.1) cmp.z.f0.1(16) null<1>D g7<8,8,1>D 0D
The first move sets the flags based on the initial execution mask. Later discard sequences contain a predicated compare that can only remove more SIMD channels. Often the only user of the result from the first compare is the second compare. Instead, generate a sequence like
    mov(1) f0.1<1>UW g1.14<0,1,0>UW
    ...
    cmp.l.f0(16) g7<1>F g5<8,8,1>F 0x41700000F /* 15F */
    (+f0.1) cmp.ge.f0.1(8) null<1>F g5<8,8,1>F 0x41700000F /* 15F */
If the results stored in g7 and f0.0 are not used, the comparison will be eliminated. This removes an instruction and potentially reduces register pressure.
v2: Major rewrite of the commit message (including fixing the assembly code). Suggested by Matt.
All Gen8+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 17224434 -> 17198659 (-0.15%)
instructions in affected programs: 2908125 -> 2882350 (-0.89%)
helped: 18891
HURT: 5
helped stats (abs) min: 1 max: 12 x̄: 1.38 x̃: 1
helped stats (rel) min: 0.03% max: 25.00% x̄: 1.76% x̃: 1.02%
HURT stats (abs) min: 9 max: 105 x̄: 51.40 x̃: 35
HURT stats (rel) min: 0.43% max: 4.92% x̄: 2.34% x̃: 1.56%
95% mean confidence interval for instructions value: -1.39 -1.34
95% mean confidence interval for instructions %change: -1.79% -1.73%
Instructions are helped.
total cycles in shared programs: 361468458 -> 361170679 (-0.08%)
cycles in affected programs: 38470116 -> 38172337 (-0.77%)
helped: 16202
HURT: 1456
helped stats (abs) min: 1 max: 4473 x̄: 26.24 x̃: 18
helped stats (rel) min: <.01% max: 28.44% x̄: 2.90% x̃: 2.18%
HURT stats (abs) min: 1 max: 5982 x̄: 87.51 x̃: 28
HURT stats (rel) min: <.01% max: 51.29% x̄: 5.48% x̃: 1.64%
95% mean confidence interval for cycles value: -18.24 -15.49
95% mean confidence interval for cycles %change: -2.26% -2.14%
Cycles are helped.
total spills in shared programs: 12147 -> 12176 (0.24%)
spills in affected programs: 175 -> 204 (16.57%)
helped: 8
HURT: 5
total fills in shared programs: 25262 -> 25292 (0.12%)
fills in affected programs: 269 -> 299 (11.15%)
helped: 8
HURT: 5
Haswell
total instructions in shared programs: 13530316 -> 13502647 (-0.20%)
instructions in affected programs: 2507824 -> 2480155 (-1.10%)
helped: 18859
HURT: 10
helped stats (abs) min: 1 max: 12 x̄: 1.48 x̃: 1
helped stats (rel) min: 0.03% max: 27.78% x̄: 2.38% x̃: 1.41%
HURT stats (abs) min: 5 max: 39 x̄: 25.70 x̃: 31
HURT stats (rel) min: 0.22% max: 1.66% x̄: 1.09% x̃: 1.31%
95% mean confidence interval for instructions value: -1.49 -1.44
95% mean confidence interval for instructions %change: -2.42% -2.34%
Instructions are helped.
total cycles in shared programs: 377865412 -> 377639034 (-0.06%)
cycles in affected programs: 40169572 -> 39943194 (-0.56%)
helped: 15550
HURT: 1938
helped stats (abs) min: 1 max: 2482 x̄: 25.67 x̃: 18
helped stats (rel) min: <.01% max: 37.77% x̄: 3.00% x̃: 2.25%
HURT stats (abs) min: 1 max: 4862 x̄: 89.17 x̃: 35
HURT stats (rel) min: <.01% max: 67.67% x̄: 6.16% x̃: 2.75%
95% mean confidence interval for cycles value: -14.42 -11.47
95% mean confidence interval for cycles %change: -2.05% -1.91%
Cycles are helped.
total spills in shared programs: 26769 -> 26814 (0.17%)
spills in affected programs: 826 -> 871 (5.45%)
helped: 9
HURT: 10
total fills in shared programs: 38383 -> 38425 (0.11%)
fills in affected programs: 834 -> 876 (5.04%)
helped: 9
HURT: 10
LOST: 5
GAINED: 10
Ivy Bridge
total instructions in shared programs: 12079250 -> 12044139 (-0.29%)
instructions in affected programs: 2409680 -> 2374569 (-1.46%)
helped: 16135
HURT: 0
helped stats (abs) min: 1 max: 23 x̄: 2.18 x̃: 2
helped stats (rel) min: 0.07% max: 37.50% x̄: 2.72% x̃: 1.68%
95% mean confidence interval for instructions value: -2.21 -2.14
95% mean confidence interval for instructions %change: -2.76% -2.67%
Instructions are helped.
total cycles in shared programs: 180116747 -> 179900405 (-0.12%)
cycles in affected programs: 25439823 -> 25223481 (-0.85%)
helped: 13817
HURT: 1499
helped stats (abs) min: 1 max: 1886 x̄: 26.40 x̃: 18
helped stats (rel) min: <.01% max: 38.84% x̄: 2.57% x̃: 1.97%
HURT stats (abs) min: 1 max: 3684 x̄: 98.99 x̃: 52
HURT stats (rel) min: <.01% max: 97.01% x̄: 6.37% x̃: 3.42%
95% mean confidence interval for cycles value: -15.68 -12.57
95% mean confidence interval for cycles %change: -1.77% -1.63%
Cycles are helped.
LOST: 8
GAINED: 10
Sandy Bridge
total instructions in shared programs: 10878990 -> 10863659 (-0.14%)
instructions in affected programs: 1806702 -> 1791371 (-0.85%)
helped: 13023
HURT: 0
helped stats (abs) min: 1 max: 5 x̄: 1.18 x̃: 1
helped stats (rel) min: 0.07% max: 13.79% x̄: 1.65% x̃: 1.10%
95% mean confidence interval for instructions value: -1.18 -1.17
95% mean confidence interval for instructions %change: -1.68% -1.62%
Instructions are helped.
total cycles in shared programs: 154082878 -> 153862810 (-0.14%)
cycles in affected programs: 20199374 -> 19979306 (-1.09%)
helped: 12048
HURT: 510
helped stats (abs) min: 1 max: 323 x̄: 20.57 x̃: 18
helped stats (rel) min: 0.03% max: 17.78% x̄: 2.05% x̃: 1.52%
HURT stats (abs) min: 1 max: 448 x̄: 54.39 x̃: 16
HURT stats (rel) min: 0.02% max: 37.98% x̄: 4.13% x̃: 1.17%
95% mean confidence interval for cycles value: -17.97 -17.08
95% mean confidence interval for cycles %change: -1.84% -1.75%
Cycles are helped.
LOST: 1
GAINED: 0
Iron Lake
total instructions in shared programs: 8155075 -> 8142729 (-0.15%)
instructions in affected programs: 949495 -> 937149 (-1.30%)
helped: 5810
HURT: 0
helped stats (abs) min: 1 max: 8 x̄: 2.12 x̃: 2
helped stats (rel) min: 0.10% max: 16.67% x̄: 2.53% x̃: 1.85%
95% mean confidence interval for instructions value: -2.14 -2.11
95% mean confidence interval for instructions %change: -2.59% -2.48%
Instructions are helped.
total cycles in shared programs: 188584610 -> 188549632 (-0.02%)
cycles in affected programs: 17274446 -> 17239468 (-0.20%)
helped: 3881
HURT: 90
helped stats (abs) min: 2 max: 168 x̄: 9.08 x̃: 6
helped stats (rel) min: <.01% max: 23.53% x̄: 0.83% x̃: 0.30%
HURT stats (abs) min: 2 max: 10 x̄: 2.80 x̃: 2
HURT stats (rel) min: <.01% max: 0.60% x̄: 0.10% x̃: 0.07%
95% mean confidence interval for cycles value: -9.35 -8.27
95% mean confidence interval for cycles %change: -0.85% -0.77%
Cycles are helped.
GM45
total instructions in shared programs: 5019308 -> 5013119 (-0.12%)
instructions in affected programs: 489028 -> 482839 (-1.27%)
helped: 2912
HURT: 0
helped stats (abs) min: 1 max: 8 x̄: 2.13 x̃: 2
helped stats (rel) min: 0.10% max: 16.67% x̄: 2.46% x̃: 1.81%
95% mean confidence interval for instructions value: -2.14 -2.11
95% mean confidence interval for instructions %change: -2.54% -2.39%
Instructions are helped.
total cycles in shared programs: 129002592 -> 128977804 (-0.02%)
cycles in affected programs: 12669152 -> 12644364 (-0.20%)
helped: 2759
HURT: 37
helped stats (abs) min: 2 max: 168 x̄: 9.03 x̃: 4
helped stats (rel) min: <.01% max: 21.43% x̄: 0.75% x̃: 0.31%
HURT stats (abs) min: 2 max: 10 x̄: 3.62 x̃: 4
HURT stats (rel) min: <.01% max: 0.41% x̄: 0.10% x̃: 0.04%
95% mean confidence interval for cycles value: -9.53 -8.20
95% mean confidence interval for cycles %change: -0.79% -0.70%
Cycles are helped.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>

Ian Romanick authored
This is the same as the need_dest parameter to prepare_alu_destination_and_sources. This allows us to not change the register that is expected to hold a result if an instruction is re-emitted. This is particularly a problem if the re-emitted instruction is a partial write. A later patch will use this feature.
No shader-db changes on any Intel platform.
v2: Don't do the Boolean resolve when there is no destination. If the ALU instruction didn't write a register, there's nothing to resolve. This replaces an earlier patch "intel/fs: Allocate dummy destination register when need_dest is false".
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>

Ian Romanick authored
This also helps a later patch (intel/fs: Improve discard_if code generation) on about 200 shaders.
v2: Document that other instruction sequences are also valid in subtract_merge_with_compare_intervening_mismatch_flag_write. Suggested by Caio.
All Intel platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 17224438 -> 17224434 (<.01%)
instructions in affected programs: 296 -> 292 (-1.35%)
helped: 4
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.99% max: 1.92% x̄: 1.43% x̃: 1.40%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %change: -2.04% -0.81%
Instructions are helped.
total cycles in shared programs: 361468455 -> 361468458 (<.01%)
cycles in affected programs: 2862 -> 2865 (0.10%)
helped: 2
HURT: 2
helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
helped stats (rel) min: 0.24% max: 0.39% x̄: 0.31% x̃: 0.31%
HURT stats (abs) min: 3 max: 4 x̄: 3.50 x̃: 3
HURT stats (rel) min: 0.32% max: 0.70% x̄: 0.51% x̃: 0.51%
95% mean confidence interval for cycles value: -4.34 5.84
95% mean confidence interval for cycles %change: -0.70% 0.90%
Inconclusive result (value mean confidence interval includes 0).
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>

Ian Romanick authored
There were two errors. First, the pass could propagate conditional modifiers from an instruction that writes one flag register to an instruction that writes a different flag register. For example,
    cmp.nz.f0.0(16) null:F, vgrf6:F, vgrf5:F
    cmp.nz.f0.1(16) null:F, vgrf6:F, vgrf5:F
could become
    cmp.nz.f0.0(16) null:F, vgrf6:F, vgrf5:F
Second, if an instruction that writes f0.1 has its condition propagated, the modified instruction will incorrectly write flag f0.0. For example,
    linterp(16) vgrf6:F, g2:F, attr0:F
    cmp.z.f0.1(16) null:F, vgrf6:F, vgrf5:F
    (f0.1) discard_jump(16) (null):UD
could become
    linterp.z.f0.0(16) vgrf6:F, g2:F, attr0:F
    (f0.1) discard_jump(16) (null):UD
None of these cases will occur currently. The only time we use f0.1 is for generating discard intrinsics. In all those cases, we generate a sequence like:
    cmp.nz.f0.0(16) vgrf7:F, vgrf6:F, vgrf5:F
    (+f0.1) cmp.z(16) null:D, vgrf7:D, 0d
    (f0.1) discard_jump(16) (null):UD
Due to the mixed types and incompatible conditions, this sequence would never see any cmod propagation. The next patch will change this.
No shader-db changes on any Intel platform.
v2: Fix typo in comment in test case subtract_delete_compare_other_flag. Noticed by Caio.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
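
The two errors can be illustrated with a small stand-alone model. This is only a sketch: struct inst_sketch and both helper functions are hypothetical and much simpler than the real instruction representation, which the message does not show.

```c
#include <stdbool.h>

/* Hypothetical, heavily simplified model of an instruction. */
struct inst_sketch {
    int cond_mod;    /* 0 = no conditional modifier, otherwise a condition id */
    int flag_subreg; /* 0 = f0.0, 1 = f0.1 */
};

/* First error: a compare may only be dropped in favour of an earlier
 * instruction if that instruction sets the same condition on the SAME flag
 * register. Comparing the condition alone is not enough. */
static bool can_delete_compare(const struct inst_sketch *earlier,
                               const struct inst_sketch *cmp)
{
    return earlier->cond_mod != 0 &&
           earlier->cond_mod == cmp->cond_mod &&
           earlier->flag_subreg == cmp->flag_subreg;
}

/* Second error: when the condition is moved onto the producing instruction,
 * the flag register must move with it, otherwise the producer writes f0.0
 * while the consumer still reads f0.1. */
static void propagate_cmod(struct inst_sketch *producer,
                           const struct inst_sketch *cmp)
{
    producer->cond_mod = cmp->cond_mod;
    producer->flag_subreg = cmp->flag_subreg;
}
```

In the first example from the message, the two compares share the condition nz but write f0.0 and f0.1, so can_delete_compare() rejects the merge. In the second, propagate_cmod() leaves linterp writing f0.1, which is the flag the discard_jump reads.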

Ian Romanick authored
Tests like this should have been added in 4467040c ("i965/fs: Propagate conditional modifiers from not instructions").
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>

 05 Jun, 2019 13 commits


Kenneth Graunke authored
BLORP now handles this so there's no reason to fall back.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>

Kenneth Graunke authored
This patch makes blorp_blit handle SINT<->UINT blit value clamping. After reading the source's integer data (which is expanded to 32-bit), we either IMAX with 0 (for SINT -> UINT, to clamp negative numbers) or UMIN with (1 << 31) - 1 (for UINT -> SINT, to clamp positive numbers outside of the representable range).
Such blits are not allowed by the OpenGL or Vulkan APIs directly:
The Vulkan 1.1 spec for vkCmdBlitImage says:
"Integer formats can only be converted to other integer formats with the same signedness."
The GL 4.5 spec for glBlitFramebuffer says:
"An INVALID_OPERATION error is generated if format conversions are not supported, which occurs under any of the following conditions: [...]
* The read buffer contains unsigned integer values and any draw buffer does not contain unsigned integer values.
* The read buffer contains signed integer values and any draw buffer does not contain signed integer values."
However, they are useful for other operations, such as texture upload and download, which typically are implemented via blorp_blit(). i965 has code to fall back in this case (which the next commit will delete), and Gallium expects blit() to handle this case for texture upload.
Fixes the following tests on iris:
- GTF-GL46.gtf32.GL3Tests.packed_pixels.packed_pixels
- GTF-GL46.gtf32.GL3Tests.packed_pixels.packed_pixels_pbo
- GTF-GL46.gtf32.GL3Tests.packed_pixels.packed_pixels_pixelstore
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
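
The clamping arithmetic described above can be sketched on the CPU in plain C. This is an illustration of the math only, not the shader code the patch generates; the function names are hypothetical.

```c
#include <stdint.h>

/* SINT -> UINT: IMAX with 0, so negative values clamp to 0. */
static uint32_t clamp_sint_to_uint(int32_t src)
{
    int32_t clamped = src > 0 ? src : 0;
    return (uint32_t)clamped;
}

/* UINT -> SINT: UMIN with (1 << 31) - 1, so values above the largest
 * representable signed value clamp to that value. */
static int32_t clamp_uint_to_sint(uint32_t src)
{
    const uint32_t max_sint = (UINT32_C(1) << 31) - 1;
    uint32_t clamped = src < max_sint ? src : max_sint;
    return (int32_t)clamped;
}
```

Both directions operate on the 32-bit expanded value, which matches the message's note that the source data is expanded to 32-bit before clamping.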

Caio Marcelo de Oliveira Filho authored
This lets deref optimizations apply to globals before lowering them.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>

Kenneth Graunke authored
Both GLSL IR and NIR perform the same mod -> floor lowering for 32-bit types. But nir_lower_double_ops is slightly more defensive against lowered drcp precision loss, and handles mod(x, x) = 0 directly. This works well...assuming nir_lower_double_ops actually gets an fmod op to lower in the first place.
The previous patches enabled NIR-based lowering for the remaining drivers, so we can stop using the GLSL IR lowering when using NIR.
Fixes KHR-GL45.gpu_shader_fp64.builtin.mod_dvec[234] on iris.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
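
For reference, the mod -> floor lowering is the identity mod(x, y) = x - y * floor(x / y). The following is a scalar sketch of that identity with the mod(x, x) = 0 case handled up front, as one reading of the message above. It is not the NIR pass itself; floor_sketch and mod_via_floor are hypothetical names, and floor_sketch is only valid for values that fit in a long long.

```c
/* Hypothetical stand-in for floor(), valid only while v fits in a long long. */
static double floor_sketch(double v)
{
    double t = (double)(long long)v; /* the cast truncates toward zero */
    return t > v ? t - 1.0 : t;      /* step down for negative fractions */
}

/* mod(x, y) = x - y * floor(x / y), with mod(x, x) = 0 returned directly so
 * that an imprecise division cannot produce a nonzero result. */
static double mod_via_floor(double x, double y)
{
    if (x == y)
        return 0.0;
    return x - y * floor_sketch(x / y);
}
```

For a positive y the result lies in [0, y), so mod_via_floor(-1.0, 3.0) yields 2.0 rather than -1.0.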

Kenneth Graunke authored
Currently, st/mesa is always calling the GLSL IR lower_instructions() pass with MOD_TO_FLOOR set, so mod operations will be lowered before ever reaching NIR. This enables the same lowering at the NIR level, which will let me shut off the GLSL IR path for NIR-based drivers.
The AMD NIR backend also has code to handle fmod, so we could potentially skip this and still be fine. I don't have an opinion on that.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

Kenneth Graunke authored
Currently, st/mesa is always calling the GLSL IR lower_instructions() pass with MOD_TO_FLOOR set, so mod operations will be lowered before ever reaching NIR. This enables the same lowering at the NIR level, which will let me shut off the GLSL IR path for NIR-based drivers.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Eric Anholt <eric@anholt.net>

Kenneth Graunke authored
Currently, st/mesa is always calling the GLSL IR lower_instructions() pass with MOD_TO_FLOOR set, so mod operations will be lowered before ever reaching NIR. This enables the same lowering at the NIR level, which will let me shut off the GLSL IR path for NIR-based drivers.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Eric Anholt <eric@anholt.net>

Kenneth Graunke authored
We originally had a single lower_fmod option. In commit 2ab2d2e5, Sam split 32 and 64-bit lowering into separate flags, with the rationale that some drivers might want different options there. This left 16-bit unhandled, so Iago added a lower_fmod16 option in commit ca31df6f.
Now that lower_fmod64 is gone (in favor of nir_lower_doubles and nir_lower_dmod), we recombine lower_fmod16 and lower_fmod32 into a single lower_fmod flag again. I'm not aware of any hardware which needs lowering for one bitsize and not the other.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

Kenneth Graunke authored
nir_lower_doubles offers a wide variety of fp64 lowering, including lowering fmod@64. The version there also better handles imprecisions due to lowered frcp@64. Let's consolidate on one version.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

Kenneth Graunke authored
I don't think panfrost actually does doubles yet, but it at least claims to support PIPE_CAP_DOUBLES, so at least pretend to switch to the new lowering.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

Kenneth Graunke authored
We currently have two duplicate mechanisms for lowering fmod@64. One is a nir_opt_algebraic rule keyed off of options->lower_fmod64, and the other is nir_lower_doubles, which offers a full gamut of fp64 lowering. The latter works slightly better in some corner cases, so I'm trying to eliminate lower_fmod64 and drop the redundancy.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

Kenneth Graunke authored
Neither freedreno nor nv50 expose PIPE_CAP_DOUBLES, so there's no fmod64 to be lowered.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

Dylan Baker authored
