- 18 Nov, 2020 1 commit
-
-
Marek Olšák authored
This removes some overhead from tc_draw_vbo and increases the maximum number of draws per batch from 153 to 192 in u_threaded_context. Reviewed-by:
Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <!7441>
-
- 26 Jun, 2020 1 commit
-
-
Connor Abbott authored
This was already done correctly for the indirect variants, and turnip was setting the correct value, but it seems freedreno missed the change in the non-indirect variant. Also, fix a misspelling of "indices" and add a type to INDX_SIZE. Part-of: <!5644>
-
- 11 Jun, 2019 1 commit
-
-
Eduardo Lima Mitev authored
The number of elements to draw should not be affected by the offset. A similar fix was submitted for a6xx at 79180a05.
Fixes these dEQP tests on a5xx:
dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawelements_separate_grid_500x500_drawcount_8
dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawelements_separate_grid_500x500_drawcount_2500
dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawarrays_separate_grid_500x500_drawcount_2500
dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawarrays_combined_grid_500x500_drawcount_2500
dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawelements_combined_grid_500x500_drawcount_8
dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawelements_combined_grid_500x500_drawcount_2500
Reviewed-by:
Rob Clark <robdclark@gmail.com>
-
- 19 Jun, 2018 1 commit
-
-
Rob Clark authored
The scratch registers move again in a6xx.. so for post-a4xx let's just move this into the backend, and move the one place it used to be needed in core into fd5_emit_ib(). For a6xx we will do similar, calling emit_marker6() from fd6_emit_ib(). Signed-off-by:
Rob Clark <robdclark@gmail.com>
-
- 03 Dec, 2017 1 commit
-
-
Rob Clark authored
Signed-off-by:
Rob Clark <robdclark@gmail.com>
-
- 25 Nov, 2017 1 commit
-
-
Ilia Mirkin authored
Signed-off-by:
Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by:
Rob Clark <robdclark@gmail.com>
-
- 14 Nov, 2017 1 commit
-
-
Rob Clark authored
A couple of failures in piglit tests w/ TF or gl_VertexID + indirect draws. OTOH it passes all the deqp tests (although they don't test those combinations). I suspect this could be fixed by a firmware update, but I don't think there is much we can do in mesa for that. Signed-off-by:
Rob Clark <robdclark@gmail.com>
-
- 14 May, 2017 1 commit
-
-
Rob Clark authored
My fault for not having time to test Marek's patches while they were on list. Fixes: 330d0607 ("gallium: remove pipe_index_buffer and set_index_buffer") Signed-off-by:
Rob Clark <robdclark@gmail.com>
-
- 10 May, 2017 1 commit
-
-
Marek Olšák authored
pipe_draw_info::indexed is replaced with index_size. index_size == 0 means non-indexed. Instead of pipe_index_buffer::offset, pipe_draw_info::start is used. For indexed indirect draws, pipe_draw_info::start is added to the indirect start. This is the only case when "start" affects indirect draws. pipe_draw_info::index is a union. Use either index::resource or index::user depending on the value of pipe_draw_info::has_user_indices. v2: fixes for nine, svga
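The rules in this message can be sketched in C as follows. This is a minimal illustration, not the gallium header: `struct sketch_draw_info` and the `sketch_*` helpers are hypothetical names, and the struct carries only the fields the message mentions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for pipe_draw_info (assumption: only the fields
 * named in the commit message are modelled here). */
struct sketch_draw_info {
    unsigned char index_size;   /* 0 means non-indexed, else bytes per index */
    bool has_user_indices;      /* selects which union member is valid */
    unsigned start;
    union {
        const void *resource;   /* stand-in for a buffer resource pointer */
        const void *user;       /* user-memory index array */
    } index;
};

/* index_size == 0 replaces the old "indexed" flag. */
static bool sketch_is_indexed(const struct sketch_draw_info *info)
{
    return info->index_size != 0;
}

/* Pick the union member according to has_user_indices. */
static const void *sketch_index_source(const struct sketch_draw_info *info)
{
    assert(sketch_is_indexed(info));
    return info->has_user_indices ? info->index.user : info->index.resource;
}

/* "start" only affects indirect draws when the draw is indexed. */
static unsigned sketch_indirect_start(const struct sketch_draw_info *info,
                                      unsigned indirect_start)
{
    return sketch_is_indexed(info) ? indirect_start + info->start
                                   : indirect_start;
}
```

A driver following this scheme would call something like `sketch_is_indexed()` where it previously tested the `indexed` flag.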
-
- 06 Dec, 2016 1 commit
-
-
Rob Clark authored
gpuaddr of idx buffer is now two dwords (64b). Signed-off-by:
Rob Clark <robdclark@gmail.com>
-
- 30 Nov, 2016 1 commit
-
-
Rob Clark authored
Signed-off-by:
Rob Clark <robdclark@gmail.com>
-
- 27 Nov, 2016 1 commit
-
-
Rob Clark authored
Signed-off-by:
Rob Clark <robdclark@gmail.com>
-
- 30 Jul, 2016 2 commits
-
-
Rob Clark authored
This is also used in gmem code, which executes from the "bottom half" (ie. from the flush_queue worker thread), so it cannot be in fd_context. Signed-off-by:
Rob Clark <robdclark@gmail.com>
-
Rob Clark authored
To flush batches out of order, the gmem code needs to not depend on state from fd_context (since that may apply to a more recent batch). So this all moves into batch. The one exception is the gmem/pipe/tile state itself. But this is only used from gmem code (and batches are flushed serially). The alternative would be having to re-calculate GMEM layout on every batch, even if the dimensions of the render targets are the same. Note: This opens up the possibility of pushing gmem/submit into a helper thread. Signed-off-by:
Rob Clark <robdclark@gmail.com>
-
- 02 Jul, 2016 1 commit
-
-
Rob Clark authored
This will be useful in a following patch. Signed-off-by:
Rob Clark <robdclark@gmail.com>
-
- 02 Jun, 2016 1 commit
-
-
Rob Clark authored
a4xx has its own enum, different from a2xx/a3xx. Spotted by coverity: CID 1362458, 1362459 Signed-off-by:
Rob Clark <robclark@freedesktop.org>
-
- 13 Mar, 2016 1 commit
-
-
Rob Clark authored
No need to open-code this. Signed-off-by:
Rob Clark <robclark@freedesktop.org>
-
- 18 Nov, 2015 1 commit
-
-
Rob Clark authored
point_size_per_vertex is always TRUE for GLES, causing us to configure the hw as if gl_PointSize was written, even if it was not. Which makes for grumpy hw. Signed-off-by:
Rob Clark <robclark@freedesktop.org>
-
- 12 Aug, 2015 1 commit
-
-
Rob Clark authored
a4xx needs similar treatment as 995f55a6. Also fix up a few point-size and vpsrepl issues and drop the fix_blit_fp() hack previously needed for mem2gmem. Signed-off-by:
Rob Clark <robclark@freedesktop.org>
-
- 24 Feb, 2015 1 commit
-
-
Rob Clark authored
a4xx has its own draw packet, so it needs an equivalent update to what a3xx already got. Signed-off-by:
Rob Clark <robclark@freedesktop.org>
-
- 02 Dec, 2014 1 commit
-
-
Rob Clark authored
Signed-off-by:
Rob Clark <robclark@freedesktop.org>
-
- 15 Nov, 2014 1 commit
-
-
Rob Clark authored
Very initial support. Basic stuff working (es2gears, es2tri, and maybe about half of glmark2). Expect broken stuff. Still missing: mem->gmem (restore), queries, mipmaps (blob segfaults!), hw binning, etc. Signed-off-by:
Rob Clark <robclark@freedesktop.org>
-
- 15 Oct, 2014 1 commit
-
-
Rob Clark authored
Manual LTO Signed-off-by:
Rob Clark <robclark@freedesktop.org>
-
- 25 Jul, 2014 1 commit
-
-
Rob Clark authored
It seems like for the most part, different behaviors, workarounds, etc, should be conditional on GPU patch revision (ie. a320.0 vs a320.2) rather than GPU id (a320 vs a330). Signed-off-by:
Rob Clark <robclark@freedesktop.org>
-
- 01 Feb, 2014 1 commit
-
-
Rob Clark authored
Updates to non-banked registers, CP_LOAD_STATE, etc, need a WFI if there is potentially pending rendering. Track this better, and add fd_wfi() calls everywhere that might potentially need CP_WAIT_FOR_IDLE. Signed-off-by:
Rob Clark <robclark@freedesktop.org>
-
- 08 Jan, 2014 2 commits
-
-
Rob Clark authored
Since we now have the cmdstream patch mechanism needed for hw binning, might as well also use it for RB_RENDER_CONTROL updates. This avoids the need to use RMW (and associated WFI) to update RB_RENDER_CONTROL. Signed-off-by:
Rob Clark <robclark@freedesktop.org>
-
Rob Clark authored
The binning pass sorts vertices into which bins/tiles they apply to. The visibility information generated during the binning pass can be used to speed up the rendering pass by filtering out vertices which do not apply to the current tile.
See: https://github.com/freedreno/freedreno/wiki/Adreno-tiling#optimized-approach
This brings a significant fps boost. A rough assortment of tests (supertuxkart, etracer, tremulous, glmark2 'build' test, etc) seems to yield a ~35-45% fps improvement.
For now, to be conservative, the binning pass is not enabled yet by default. To enable it use:
FD_MESA_DEBUG=binning
So far I haven't found anything that breaks with binning enabled, but I'd like a bit more testing before I enable it as default. Signed-off-by:
Rob Clark <robclark@freedesktop.org>
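The idea of binning can be sketched in software. On Adreno the hardware produces the visibility information itself; the code below is only a simplified CPU-side illustration. It assumes a primitive is represented by its bounding box, and `struct sketch_grid`, `sketch_bin_mask()` and `sketch_visible_in_tile()` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tile grid: tiles_x * tiles_y tiles of tile_w x tile_h pixels. */
struct sketch_grid {
    unsigned tile_w, tile_h;
    unsigned tiles_x, tiles_y;
};

/* "Binning": return a bitmask with one bit per tile touched by the
 * inclusive pixel bounding box (x0,y0)-(x1,y1).  Bit index is
 * ty * tiles_x + tx.  Assumes at most 32 tiles. */
static uint32_t sketch_bin_mask(const struct sketch_grid *g,
                                unsigned x0, unsigned y0,
                                unsigned x1, unsigned y1)
{
    assert(g->tiles_x * g->tiles_y <= 32);
    assert(x0 <= x1 && y0 <= y1);

    unsigned tx0 = x0 / g->tile_w, tx1 = x1 / g->tile_w;
    unsigned ty0 = y0 / g->tile_h, ty1 = y1 / g->tile_h;

    /* Clamp to the grid; a box entirely outside yields an empty mask. */
    if (tx1 >= g->tiles_x) tx1 = g->tiles_x - 1;
    if (ty1 >= g->tiles_y) ty1 = g->tiles_y - 1;

    uint32_t mask = 0;
    for (unsigned ty = ty0; ty <= ty1; ty++)
        for (unsigned tx = tx0; tx <= tx1; tx++)
            mask |= UINT32_C(1) << (ty * g->tiles_x + tx);
    return mask;
}

/* "Rendering": skip primitives whose mask does not include this tile. */
static bool sketch_visible_in_tile(uint32_t mask, unsigned tile)
{
    return (mask >> tile) & 1u;
}
```

During the per-tile rendering pass, a primitive for which `sketch_visible_in_tile()` returns false would be filtered out, which is where the saving comes from.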
-
- 26 Dec, 2013 1 commit
-
-
Rob Clark authored
Using RMW on banked context registers is not safe. The value read could be the wrong one. So if there has been a DRAW_IDX launched, the RMW must be preceded by a WAIT_FOR_IDLE to ensure the read part of RMW sees the correct value. To avoid unnecessary WFI's, keep track if there is a need for WFI, and only emit one if needed. Furthermore, keep track if we even need to update the register in the first place. And to cut down on the amount of RMW to avoid excessive WFI's, at the tiling/GMEM level we can always overwrite RB_RENDER_CONTROL, as the state at beginning of draw/clear cmds (which we IB to) is always undefined. In the draw/clear commands, we always still use RMW (with WFI if needed), but only if the register value actually changes. (At points where the current value cannot be known, the saved value is reset to ~0, which includes bits outside of RBRC_DRAW_STATE, so there never is chance for confusion.) Signed-off-by:
Rob Clark <robclark@freedesktop.org>
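The bookkeeping described above can be sketched as a small state machine. This is an illustration under assumptions, not the driver code: `struct sketch_ctx`, the `sketch_*` functions and the `SKETCH_DRAW_STATE_MASK` value are hypothetical, and counters stand in for actually emitting packets.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mask for the bits being updated (assumed value). */
#define SKETCH_DRAW_STATE_MASK 0x00000007u

struct sketch_ctx {
    bool needs_wfi;        /* a draw has been launched since the last WFI */
    uint32_t saved_rbrc;   /* last value written, ~0 when unknown */
    unsigned wfi_count;    /* stand-in for emitted WAIT_FOR_IDLE packets */
    unsigned rmw_count;    /* stand-in for emitted RMW packets */
};

/* Current register value cannot be known: ~0 has bits outside the mask,
 * so it never compares equal to a masked value. */
static void sketch_reset(struct sketch_ctx *c)
{
    c->saved_rbrc = ~0u;
}

/* Launching a draw makes a later RMW unsafe until a WFI is emitted. */
static void sketch_draw(struct sketch_ctx *c)
{
    c->needs_wfi = true;
}

/* Emit a WFI only if one is actually pending. */
static void sketch_wfi(struct sketch_ctx *c)
{
    if (c->needs_wfi) {
        c->wfi_count++;
        c->needs_wfi = false;
    }
}

/* Update the register via RMW, but only when the value changes. */
static void sketch_set_draw_state(struct sketch_ctx *c, uint32_t val)
{
    val &= SKETCH_DRAW_STATE_MASK;
    if (c->saved_rbrc == val)
        return;            /* no update needed, so no WFI either */
    sketch_wfi(c);
    c->rmw_count++;
    c->saved_rbrc = val;
}
```

The point of the two flags is that a redundant state change costs nothing, and a real state change costs a WFI only when a draw is in flight.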
-
- 14 Dec, 2013 1 commit
-
-
Rob Clark authored
Fixes gpu lockups in supertuxkart. Signed-off-by:
Rob Clark <robclark@freedesktop.org>
-
- 14 Sep, 2013 2 commits
-
-
Rob Clark authored
Emit markers by writing to scratch registers in order to "triangulate" gpu lockup position from post-mortem register dump. By comparing register values in post-mortem dump to command-stream, it is possible to narrow down which DRAW_INDX caused the lockup. Signed-off-by:
Rob Clark <robclark@freedesktop.org>
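The marker idea can be illustrated with a short C sketch. It models the command stream as an array and the scratch register as a plain integer, which is an assumption made for illustration; `struct sketch_stream`, `sketch_emit_draw()` and `sketch_find_draw()` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MAX_DRAWS 16

/* Hypothetical record of the marker value written before each draw. */
struct sketch_stream {
    uint32_t next_marker;
    uint32_t marker_for_draw[SKETCH_MAX_DRAWS];
    unsigned num_draws;
};

/* Before each draw, "write" a fresh marker value to the scratch register
 * and remember which draw it belongs to. */
static bool sketch_emit_draw(struct sketch_stream *s)
{
    if (s->num_draws >= SKETCH_MAX_DRAWS)
        return false;
    s->marker_for_draw[s->num_draws++] = ++s->next_marker;
    return true;
}

/* Post-mortem: map the scratch register value from the dump back to the
 * draw that was in flight.  Returns -1 if no draw matches. */
static int sketch_find_draw(const struct sketch_stream *s, uint32_t scratch)
{
    assert(s->num_draws <= SKETCH_MAX_DRAWS);
    for (unsigned i = 0; i < s->num_draws; i++)
        if (s->marker_for_draw[i] == scratch)
            return (int)i;
    return -1;
}
```

Given the scratch value read from a register dump, `sketch_find_draw()` returns the index of the draw that wrote it.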
-
Rob Clark authored
Have a single helper that all draws come through.. mainly for a convenient debug and instrumentation point. Signed-off-by:
Rob Clark <robclark@freedesktop.org>
-
- 08 Jun, 2013 1 commit
-
-
Rob Clark authored
Split the parts that are specific to adreno a2xx series GPUs from the parts that will be in common with a3xx, so that a3xx support can be added more cleanly. Signed-off-by:
Rob Clark <robclark@freedesktop.org>
-
- 12 Mar, 2013 1 commit
-
-
Rob Clark authored
Currently works on a220. Others in the a2xx family look pretty similar and should be pretty straightforward to support with the same driver. The a3xx has a new shader ISA, and while many registers appear similar, the register addresses have been completely shuffled around. I am not sure yet whether it is best to support with the same driver, but different compiler, or whether it should be split into a different driver.
v1: original
v2: build file updates from review comments, and remove GPL licensed header files from msm kernel
v3: smarter temp/pred register assignment, fix clear and depth/stencil format issues, resource_transfer fixes, scissor fixes
Signed-off-by:
Rob Clark <robdclark@gmail.com>
-
- 12 Nov, 2012 1 commit
-
-
Eric Anholt authored
Mesa's chaining hash table for object names is slow, and this should be much faster. I namespaced the functions under _mesa_*, to avoid visibility troubles that we may have had before with hash_table_* functions. v2: Move .c file to main/, const a few things, clean up loop conditions, add/extend some comments. Reviewed-by:
Brian Paul <brianp@vmware.com> Reviewed-by:
Chad Versace <chad.versace@linux.intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
-
- 23 May, 2012 1 commit
-
-
Eric Anholt authored
Reviewed-by:
Ian Romanick <ian.d.romanick@intel.com> Reviewed-by:
Chad Versace <chad.versace@linux.intel.com>
-
- 15 May, 2012 1 commit
-
-
Paul Berry authored
This patch groups together the parameters used by the HiZ functions into a new data structure, brw_hiz_resolve_params, rather than passing each parameter individually between the HiZ functions. This data structure is a subclass of brw_blorp_params, which represents the parameters of a general-purpose blit or resolve operation. A future patch will add another subclass for blits. In addition, this patch generalizes the (width, height) parameters to a full rect (x0, y0, x1, y1), since blitting operations will need to be able to operate on arbitrary rectangles. Also, it renames several of the HiZ functions to reflect the expanded role they will serve. v2: Rename brw_hiz_resolve_params to brw_hiz_op_params. Move gen{6,7}_blorp_exec() functions back into gen{6,7}_blorp.h. Reviewed-by:
Chad Versace <chad.versace@linux.intel.com>
-
- 10 May, 2012 2 commits
-
-
Paul Berry authored
This patch renames the gen6_hiz.h and gen7_hiz.h files to correspond to the renames of the corresponding .cpp files (see previous commit). Reviewed-by:
Chad Versace <chad.versace@linux.intel.com>
-
Paul Berry authored
These declarations are necessary to allow C++ code to call C code without causing unresolved symbols (which would make the driver fail to load). Reviewed-by:
Chad Versace <chad.versace@linux.intel.com>
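The usual form of such declarations is the linkage guard below. This is a generic sketch of the idiom, not the contents of the renamed headers; `sketch_c_helper()` is a hypothetical function.

```c
#include <assert.h>

/* Header part: when included from C++, declare the function with C
 * linkage so the C++ compiler does not mangle its name.  When included
 * from C, the guard disappears. */
#ifdef __cplusplus
extern "C" {
#endif

int sketch_c_helper(int layer);   /* hypothetical function implemented in C */

#ifdef __cplusplus
}
#endif

/* C implementation part (would normally live in a separate .c file). */
int sketch_c_helper(int layer)
{
    assert(layer >= 0);
    return layer * 2;
}
```

Without the guard, a C++ caller would reference a mangled symbol that the C object file does not define, giving the unresolved symbol the message describes.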
-
- 07 Feb, 2012 1 commit
-
-
Chad Versace authored
The HiZ op was implemented as a meta-op. This patch reimplements it by emitting a special HiZ batch. This fixes several known bugs, and likely a lot of undiscovered ones too.

==== Why the HiZ meta-op needed to die ====

The HiZ op was implemented as a meta-op, which caused lots of trouble. All other meta-ops occur as a result of some GL call (for example, glClear and glGenerateMipmap), but the HiZ meta-op was special. It was called in places that Mesa (in particular, the vbo and swrast modules) did not expect---and were not prepared for---state changes to occur (for example: glDraw; glCallList; within glBegin/End blocks; and within swrast_prepare_render as a result of intel_miptree_map).

In an attempt to work around these unexpected state changes, I added two hooks in i965:

- A hook for glDraw, located in brw_predraw_resolve_buffers (which is called in the glDraw path). This hook detected if a predraw resolve meta-op had occurred, and would hackishly repropagate some GL state if necessary. This ensured that the meta-op state changes would not interfere with the vbo module's subsequent execution of glDraw.
- A hook for glBegin, implemented by brwPrepareExecBegin. This hook resolved all buffers before entering a glBegin/End block, thus preventing an infinitely recurring call to vbo_exec_FlushVertices. The vbo module calls vbo_exec_FlushVertices to flush its vertex queue in response to GL state changes.

Unfortunately, these hooks were not sufficient. The meta-op state changes still interacted badly with glPopAttrib (as discovered in bug 44927) and with swrast rendering (as discovered by debugging gen6's swrast fallback for glBitmap). I expect there are more undiscovered bugs. Rather than play whack-a-mole in a minefield, the sane approach is to replace the HiZ meta-op with something safer.

==== How it was killed ====

This patch consists of several logical components:

1. Rewrite the HiZ op by replacing function gen6_resolve_slice with gen6_hiz_exec and gen7_hiz_exec. The new functions do not call a meta-op, but instead manually construct and emit a batch to "draw" the HiZ op's rectangle primitive. The new functions alter no GL state.
2. Add fields to brw_context::hiz for the new HiZ op.
3. Emit a workaround flush when toggling 3DSTATE_VS.VsFunctionEnable.
4. Kill all dead HiZ code:
   - the function gen6_resolve_slice
   - the dirty flag BRW_NEW_HIZ
   - the dead fields in brw_context::hiz
   - the state packet manipulation triggered by the now removed brw_context::hiz::op
   - the meta-op workaround in brw_predraw_resolve_buffers (discussed above)
   - the meta-op workaround brwPrepareExecBegin (discussed above)

Note: This is a candidate for the 8.0 branch. Reviewed-by:
Eric Anholt <eric@anholt.net> Reviewed-by:
Kenneth Graunke <kenneth@whitecape.org> Acked-by:
Paul Berry <stereotype441@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=43327 Reported-by: xunx.fang@intel.com Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=44927 Reported-by: chao.a.chen@intel.com Signed-off-by:
Chad Versace <chad.versace@linux.intel.com>
-
- 22 Nov, 2011 1 commit
-
-
Chad Versace authored
Now that intel_renderbuffer::region has been replaced with a miptree, the HiZ functions' region parameter must be replaced with a miptree parameter. Change the return type from bool to void. Rename the 'depth' parameter to 'layer', because it will correspond to irb->mt_layer. Reviewed-by:
Eric Anholt <eric@anholt.net> Signed-off-by:
Chad Versace <chad.versace@linux.intel.com>
-