tu: implement sysmem vs gmem autotuner (!12128) · Merge requests · Mesa / mesa

Danylo Piliaiev requested to merge Danil/mesa:turnip/perf/sysmem-vs-gmem-autotune into main Jul 29, 2021

The implementation is separate from Freedreno's because it has to support multithreading.

In Vulkan, an application may fill command buffers from many threads and expects no locking to occur. We do introduce the possibility of locking at renderpass end; however, assuming the application doesn't have a huge number of slightly different renderpasses, there should be minimal to no contention.
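The locking pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code from this MR: the `rp_history` struct, the fixed-size table, and `rp_history_get` are hypothetical names, and keying the history by a 64-bit renderpass hash is an assumption. The point is only that a mutex is taken once per renderpass end, to find or create the shared entry for that renderpass.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_HISTORIES 64 /* hypothetical cap on distinct renderpasses */

/* Hypothetical per-renderpass record shared by all recording threads. */
struct rp_history {
   uint64_t key;         /* assumed: hash of the renderpass parameters */
   uint32_t num_results; /* number of results accumulated so far */
};

static struct rp_history histories[MAX_HISTORIES];
static size_t num_histories;
static pthread_mutex_t histories_lock = PTHREAD_MUTEX_INITIALIZER;

/* Called once at renderpass end. Returns the entry for `key`, creating it
 * if it does not exist yet, or NULL when the table is full. */
static struct rp_history *
rp_history_get(uint64_t key)
{
   struct rp_history *found = NULL;

   pthread_mutex_lock(&histories_lock);
   for (size_t i = 0; i < num_histories; i++) {
      if (histories[i].key == key) {
         found = &histories[i];
         break;
      }
   }
   if (!found && num_histories < MAX_HISTORIES) {
      found = &histories[num_histories++];
      found->key = key;
      found->num_results = 0;
   }
   pthread_mutex_unlock(&histories_lock);

   return found;
}
```

In this sketch the critical section is a short lookup entered once per renderpass, so with a modest number of distinct renderpasses threads should rarely wait on each other.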

Other assumptions are:

The application doesn't create one-time-submit command buffers only to hold them indefinitely without submission.
The application submits command buffers soon after their creation.

Breaking these assumptions may lead to some decrease in performance or to the autotuner turning itself off.

The heuristic is too simplistic at the moment. We should account for loads/stores/clears/resolves, especially with a low draw call count and ~fb_size samples passed. In D3D11 games we are seeing many renderpasses like:

color attachment load
single fullscreen draw
color attachment store
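As a rough illustration of the renderpass parameters mentioned above, the sketch below flags renderpasses that match the load / single fullscreen draw / store pattern. It is not part of this MR: the `rp_params` struct, its fields, the function name, and the 1/8 tolerance around fb_size are all hypothetical. It only classifies the pattern and deliberately does not decide between sysmem and gmem.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-renderpass parameters; none of these names come from
 * the MR. */
struct rp_params {
   uint32_t drawcall_count;
   uint64_t samples_passed;
   uint32_t fb_width;
   uint32_t fb_height;
   bool loads_color;  /* color attachment is loaded at renderpass start */
   bool stores_color; /* color attachment is stored at renderpass end */
};

/* Returns true for a renderpass that loads color, issues exactly one draw
 * covering roughly the whole framebuffer, and stores color. */
static bool
rp_is_fullscreen_pass(const struct rp_params *rp)
{
   uint64_t fb_size = (uint64_t)rp->fb_width * rp->fb_height;
   uint64_t slack = fb_size / 8; /* arbitrary tolerance, an assumption */

   bool near_fb_size = rp->samples_passed + slack >= fb_size &&
                       rp->samples_passed <= fb_size + slack;

   return rp->drawcall_count == 1 && rp->loads_color &&
          rp->stores_color && near_fb_size;
}
```

A classifier like this could be used to count such renderpasses separately when comparing sysmem and gmem timings.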

To make a good heuristic we would have to run a bunch of traces with and without forced sysmem and gather statistics on how sysmem vs gmem performance depends on the renderpass parameters we can collect.

This would be my next step.

Edited Aug 03, 2021 by Danylo Piliaiev
