1. 11 May, 2020 25 commits
  2. 10 May, 2020 2 commits
  3. 09 May, 2020 13 commits
    • freedreno: android: add adreno-pm4-pack.xml.h generation to android build · a92a483f
      maurossi authored
      Fixes the following build errors:
      
      In file included from external/mesa/src/gallium/drivers/freedreno/a6xx/fd6_blitter.c:40:
      external/mesa/src/gallium/drivers/freedreno/a6xx/fd6_pack.h:42:10: fatal error: 'adreno-pm4-pack.xml.h' file not found
               ^~~~~~~~~~~~~~~~~~~~~~~
      1 error generated.
      
      In file included from external/mesa/src/gallium/drivers/freedreno/a6xx/fd6_blend.c:36:
      external/mesa/src/gallium/drivers/freedreno/a6xx/fd6_pack.h:42:10: fatal error: 'adreno-pm4-pack.xml.h' file not found
               ^~~~~~~~~~~~~~~~~~~~~~~
      1 error generated.
      
      In file included from external/mesa/src/gallium/drivers/freedreno/a6xx/fd6_const.c:26:
      external/mesa/src/gallium/drivers/freedreno/a6xx/fd6_pack.h:42:10: fatal error: 'adreno-pm4-pack.xml.h' file not found
               ^~~~~~~~~~~~~~~~~~~~~~~
      1 error generated.
      
      Fixes: ee293160 ("freedreno/a6xx: add OUT_PKT()")
      Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
      Part-of: <mesa/mesa!4973>
    • freedreno/drm: android: add libfreedreno_registers static dependency · 5dc3b22d
      maurossi authored
      The dependency is required to get the necessary generated headers.

      Fixes the following build error:
      
      In file included from external/mesa/src/freedreno/drm/msm_bo.c:27:
      In file included from external/mesa/src/freedreno/drm/msm_priv.h:30:
      In file included from external/mesa/src/freedreno/drm/freedreno_priv.h:51:
      external/mesa/src/freedreno/drm/freedreno_ringbuffer.h:35:10: fatal error: 'adreno_common.xml.h' file not found
      #include "adreno_common.xml.h"
               ^~~~~~~~~~~~~~~~~~~~~
      1 error generated.
      
      Fixes: 6c688ae8 ("freedreno: Deduplicate ringbuffer macros with computerator/fdperf")
      Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
      Part-of: <!4973>
    • lima/ppir: rework select conditions · e622e010
      Erico Nunes authored
      
      
      This is yet another simple optimization that attempts to save the
      insertion of an unnecessary mov for a large number of cases.
      If the node outputting the condition for select satisfies a few
      requirements (which are common in the case of comparison conditions),
      it can just be changed to pipeline output and used directly.
      In case of difficult corner cases, just fall back to the mov as before.
      
      The sel_cond op is removed as the scheduler can be smart enough to place
      nodes that output to ^fmul in the ALU_SCL_MUL slot, and as there can be
      alu ops other than just mov.
      Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
      Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
      Part-of: <mesa/mesa!4632>
    • lima/ppir: add fallback mov option for const scheduler · a0c58867
      Erico Nunes authored
      
      
      It turns out that with more aggressive combining, there can be cases
      where the available const slots are not enough for one instruction.
      In particular, fcsel can take up to two consts, and a previous alu slot,
      such as a comparison condition, might require an additional const.
      So add a fallback for it like for uniforms.
      Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
      Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
      Part-of: <mesa/mesa!4632>
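
The slot-exhaustion check described in this commit can be pictured roughly as follows. This is only a sketch, not the lima code: `NUM_CONST_SLOTS`, `struct instr_consts` and `try_place_consts` are hypothetical names, and the limit of two const slots per instruction is an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption for this sketch: an instruction offers two const slots. */
#define NUM_CONST_SLOTS 2

/* Hypothetical summary of the instruction currently being scheduled. */
struct instr_consts {
   int used_slots;   /* const slots already claimed by earlier alu nodes */
};

/*
 * Try to place `needed` additional consts into the instruction.
 * Returns true if they fit. Returns false when there is no room, in which
 * case the caller would take the fallback path: load the const in a
 * separate instruction and read it through a mov.
 */
static bool try_place_consts(struct instr_consts *instr, int needed)
{
   assert(needed >= 0);
   if (instr->used_slots + needed > NUM_CONST_SLOTS)
      return false;              /* not enough slots: fallback mov */
   instr->used_slots += needed;  /* claim the slots */
   return true;
}
```

With one slot already taken by a comparison condition, a request for two more consts (the fcsel case from the message) is rejected and leaves the slot count unchanged, which is the situation where the fallback applies.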
    • lima/ppir: rework store output · 8c476407
      Erico Nunes authored
      
      
      In many cases, it is possible to avoid creating a mov for the store
      output node.
      Additionally, nodes other than alu, such as load varying, can be valid
      store output nodes too.
      
      This is another small optimization, but helps a vast majority of
      programs by 1 instruction.
      Shaders with discard easily become complicated to handle properly.
      Some example issues: ppir has to rely on instruction ordering; or a
      node with ssa output could be required both before a discard_if (as a
      condition) and after it (as the instruction with the 'stop' bit set).
      So don't try to handle them here.
      Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
      Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
      Part-of: <mesa/mesa!4632>
    • lima/ppir: rework emit nir to ppir · 570f1420
      Erico Nunes authored
      
      
      The previous code assumed that a ppir node would be created for each nir
      instr and used that to add it to the list of nodes and verify success.
      This didn't make much sense anymore since some emit paths create
      multiple nodes anyway, and this didn't allow for an emit call to not
      create any new ppir node while still returning success.
      Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
      Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
      Part-of: <mesa/mesa!4632>
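
One way to read the change above is that success is decoupled from "a node was returned". The sketch below illustrates that idea with made-up types; `struct block`, `emit_fn`, the `emit_*` callbacks and `emit_block` are hypothetical and do not mirror the actual ppir functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum instr_type { INSTR_ALU, INSTR_NOP, INSTR_TEX, INSTR_TYPE_COUNT };

/* Hypothetical block that only tracks how many nodes were appended. */
struct block { int num_nodes; };

/* Each emit callback appends as many nodes as it needs (possibly none)
 * and reports success separately from node creation. */
typedef bool (*emit_fn)(struct block *b);

static bool emit_alu(struct block *b) { b->num_nodes += 1; return true; }
static bool emit_nop(struct block *b) { (void)b; return true; } /* no node */
static bool emit_tex(struct block *b) { b->num_nodes += 2; return true; }

static const emit_fn emit_table[INSTR_TYPE_COUNT] = {
   [INSTR_ALU] = emit_alu,
   [INSTR_NOP] = emit_nop,
   [INSTR_TEX] = emit_tex,
};

/* Emit every instruction; stop at the first callback that fails. */
static bool emit_block(struct block *b, const enum instr_type *instrs, size_t n)
{
   for (size_t i = 0; i < n; i++) {
      assert(instrs[i] < INSTR_TYPE_COUNT);
      if (!emit_table[instrs[i]](b))
         return false;
   }
   return true;
}
```

Because the callbacks return a status rather than a node pointer, an emit path that creates zero nodes and one that creates several are both ordinary successful cases.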
    • lima/ppir: remove unused clone functions · 6b21b771
      Erico Nunes authored
      
      
      With the previous refactors moving these lowering steps to a nir pass,
      these are no longer needed.
      Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
      Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
      Part-of: <mesa/mesa!4535>
    • lima/ppir: duplicate consts in nir · 8c415713
      Erico Nunes authored
      
      
      Move the duplicate consts step to a nir pass.
      This makes the nir representation closer to what ppir will have in the
      result.
      Additionally, it handles the case where a const is used multiple times
      by a single node (which can happen in instructions like fcsel). The new
      implementation will only emit a single load const for that case.
      Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
      Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
      Part-of: <mesa/mesa!4535>
    • lima/ppir: duplicate intrinsics in nir · 5e6c3861
      Erico Nunes authored
      
      
      Move the duplicate uniform and varying steps to a nir pass, along with
      some changes in the duplicating strategy.
      
      Node duplication is now done per user of the varying/uniform. This is
      inspired by what the offline shader compiler seems to usually do, and as
      usual aims to reduce register pressure and better utilize the ld_uni and
      ld_var instruction slots.
      It is worth noting that due to a bug/feature, ppir was already
      duplicating uniforms per successor in ppir_node_add_src even if the
      comment indicated it was meant to be per-block.
      Additionally, ppir was duplicating load uniform nodes twice for nodes
      that use the same uniform in more than one source, resulting in one
      unnecessary (and unpipelineable) load. This new implementation in nir
      only creates one load in that case.
      Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
      Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
      Part-of: <mesa/mesa!4535>
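
The per-user duplication rule, including the "only one load when a user reads the same uniform in several sources" case, can be modelled with a small counting sketch. The data layout here (`struct user`, `src_uniform`, `MAX_SRCS`) is hypothetical and stands in for the nir structures, which this example does not use.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SRCS 3

/* Hypothetical user of uniforms: for each source, the index of the
 * uniform it reads, or -1 if that source is not a uniform. */
struct user {
   int num_srcs;
   int src_uniform[MAX_SRCS];
};

/* Loads needed for one user: one per distinct uniform it reads, so a
 * uniform referenced by two sources of the same user is loaded once. */
static int loads_for_user(const struct user *u)
{
   int count = 0;
   assert(u->num_srcs >= 0 && u->num_srcs <= MAX_SRCS);
   for (int i = 0; i < u->num_srcs; i++) {
      if (u->src_uniform[i] < 0)
         continue;
      int seen = 0;
      for (int j = 0; j < i; j++) {
         if (u->src_uniform[j] == u->src_uniform[i])
            seen = 1;
      }
      if (!seen)
         count++;
   }
   return count;
}

/* Per-user duplication: loads are never shared between users. */
static int total_loads(const struct user *users, size_t n)
{
   int total = 0;
   for (size_t i = 0; i < n; i++)
      total += loads_for_user(&users[i]);
   return total;
}
```

For three users shaped like `add(u0, u0)`, `mul(u0, u1)` and a mov of a non-uniform value, this model yields one, two and zero loads respectively.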
    • lima/ppir: combine varying loads in node_to_instr · 09003ba0
      Erico Nunes authored
      
      
      Varying loads with a single successor have a high potential to be
      combined with their successor node, as ppir does for uniforms, rather
      than being placed in a separate instruction.
      Even if ppir becomes capable of combining instructions in a separate
      step, combining varying loads during node_to_instr is trivial enough
      that it seems to be worth doing it in this stage, and this benefits
      pretty much every program that uses varyings.
      Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
      Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
      Part-of: <mesa/mesa!4535>
    • lima/ppir: do not assume single src for pipeline outputs · c6a3987f
      Erico Nunes authored
      
      
      Even if a node has pipeline output and a single successor, it is still
      valid for that successor to have multiple references to that pipeline
      node. A trivial example is add(u.x,u.y) where u is a uniform.
      It is even possible for this to occur with consts as operands of fcsel.
      So remove uses of ppir_node_get_src_for_pred as that would assume a
      single src in the node that uses the pipeline.
      Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
      Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
      Part-of: <mesa/mesa!4535>
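
The `add(u.x,u.y)` case can be illustrated by visiting every source of the successor instead of stopping at the first match. This is a sketch with invented types (`struct src`, `struct node`, `set_pipeline_srcs`); it is not the code that replaced `ppir_node_get_src_for_pred`.

```c
#include <assert.h>

#define MAX_SRCS 3

/* Hypothetical source operand: which predecessor node it reads from and
 * whether it has been switched to read the pipeline register. */
struct src {
   int pred_id;
   int reads_pipeline;
};

struct node {
   int num_srcs;
   struct src srcs[MAX_SRCS];
};

/* Mark every source of `succ` that reads from `pred_id` as a pipeline
 * read and return how many sources were updated. A lookup that returns
 * only the first matching source would miss the second operand of
 * add(u.x, u.y). */
static int set_pipeline_srcs(struct node *succ, int pred_id)
{
   int updated = 0;
   assert(succ->num_srcs >= 0 && succ->num_srcs <= MAX_SRCS);
   for (int i = 0; i < succ->num_srcs; i++) {
      if (succ->srcs[i].pred_id == pred_id) {
         succ->srcs[i].reads_pipeline = 1;
         updated++;
      }
    }
   return updated;
}
```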
    • lima/ppir: fix lod bias register codegen · 741aa343
      Erico Nunes authored
      
      
      The lod bias register is correctly run through the entire compilation
      process, but in the end its allocated register value was never being
      added to the instruction.
      It seems that most programs were lucky enough that lod bias was assigned
      register 0.x so that things worked anyway.
      Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
      Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
      Part-of: <mesa/mesa!4535>
    • lima/ppir: introduce liveness internal live set · cef1c736
      Erico Nunes authored
      
      
      The current solution for handling registers that live and die within a
      single instruction does not handle all cases. In particular, these
      intra-instruction registers also conflict with registers that are
      part of the live_in set.
      Unfortunately, adding them to the live_in set is not an easy solution as
      that would cause them to be propagated upwards. So, add a separate set
      to handle these registers in the particular instructions, without
      propagating them.
      Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
      Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
      Part-of: <mesa/mesa!4535>
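
The idea of a separate, non-propagated set can be sketched with register bitmasks. The struct and function names below are hypothetical, and the dataflow step is the textbook backward liveness equation rather than the ppir implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-instruction liveness record; one bit per register. */
struct instr_liveness {
   uint32_t defs;           /* registers written by the instruction */
   uint32_t uses;           /* registers read from earlier instructions */
   uint32_t live_internal;  /* registers written and last read inside
                             * this same instruction */
   uint32_t live_in;
   uint32_t live_out;
};

/* Textbook backward step: live_in = uses | (live_out & ~defs).
 * live_internal is deliberately left out, so it is never propagated
 * upwards to earlier instructions. */
static void compute_live_in(struct instr_liveness *l)
{
   /* An internal register is, by definition, not read from outside. */
   assert((l->uses & l->live_internal) == 0);
   l->live_in = l->uses | (l->live_out & ~l->defs);
}

/* Registers that must not share a physical register while this
 * instruction executes: the internal ones conflict with live_in. */
static uint32_t conflict_set(const struct instr_liveness *l)
{
   return l->live_in | l->live_internal;
}
```

The internal register takes part in the conflict set of its own instruction but does not appear in `live_in`, so the instruction above it never sees it.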