1. 13 May, 2020 1 commit
  2. 09 May, 2020 2 commits
    • lima/ppir: duplicate consts in nir · 8c415713
      Erico Nunes authored
      Move the duplicate consts step to a nir pass.
      This makes the nir representation closer to what ppir will have in the
      result.
      Additionally, it handles the case where a const is used multiple times
      by a single node (which can happen in instructions like fcsel). The new
      implementation will only emit a single load const for that case.
      Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
      Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
      Part-of: <mesa/mesa!4535>
    • lima/ppir: duplicate intrinsics in nir · 5e6c3861
      Erico Nunes authored
      Move the duplicate uniform and varying steps to a nir pass, along with
      some changes in the duplicating strategy.
      
      Node duplication is now done per user of the varying/uniform. This is
      inspired by what the offline shader compiler seems to usually do, and as
      usual aims to reduce register pressure and better utilize the ld_uni and
      ld_var instruction slots.
      It is worth noting that due to a bug/feature, ppir was already
      duplicating uniforms per successor in ppir_node_add_src even if the
      comment indicated it was meant to be per-block.
      Additionally, ppir was duplicating load uniform nodes twice for nodes
      that use the same uniform in more than one source, resulting in one
      unnecessary (and unpipelineable) load. This new implementation in nir
      only creates one load in that case.
      Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
      Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
      Part-of: <!4535>
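The per-user duplication described in the commit above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the lima pass: `toy_user`, `toy_duplicate_loads_per_user` and the fixed `MAX_SRCS` limit are hypothetical names invented for illustration. It shows the two properties the message mentions: every user gets its own load, and a user that reads the same uniform through several sources still gets only one load.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SRCS 3

/* Hypothetical toy IR: a user node reads up to MAX_SRCS sources, each
 * identified by the index of the uniform it originally came from. */
struct toy_user {
    int num_srcs;
    int uniform_idx[MAX_SRCS]; /* which uniform each source reads */
    int load_id[MAX_SRCS];     /* output: which load node feeds each source */
};

/* Assign a private load to every user, reusing a single load when the
 * same user reads the same uniform in more than one source.
 * Returns the total number of load nodes created. */
static int toy_duplicate_loads_per_user(struct toy_user *users, size_t num_users)
{
    int next_load_id = 0;

    for (size_t u = 0; u < num_users; u++) {
        struct toy_user *user = &users[u];
        assert(user->num_srcs >= 0 && user->num_srcs <= MAX_SRCS);

        for (int s = 0; s < user->num_srcs; s++) {
            int reuse = -1;

            /* Look for an earlier source of this user reading the same uniform. */
            for (int p = 0; p < s; p++) {
                if (user->uniform_idx[p] == user->uniform_idx[s]) {
                    reuse = user->load_id[p];
                    break;
                }
            }
            user->load_id[s] = (reuse >= 0) ? reuse : next_load_id++;
        }
    }
    return next_load_id;
}
```

With one user reading uniforms {0, 0, 1} and a second user reading uniform {0}, the sketch creates three loads: the first user shares one load for both reads of uniform 0, and the second user gets its own copy.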
  3. 20 Mar, 2020 1 commit
  4. 16 Mar, 2020 1 commit
  5. 17 Feb, 2020 2 commits
  6. 28 Jan, 2020 1 commit
  7. 11 Dec, 2019 1 commit
  8. 07 Nov, 2019 1 commit
    • lima: fix nir shader memory leak · d939f5d4
      Erico Nunes authored
      Fix memory leak on allocation for nir shader, reported by valgrind.
      
      3,502 (480 direct, 3,022 indirect) bytes in 1 blocks are definitely lost in loss record 77 of 84
         at 0x48483F8: malloc (in /usr/lib/valgrind/vgpreload_memcheck-arm64-linux.so)
         by 0x5750817: ralloc_size (ralloc.c:119)
         by 0x5750977: rzalloc_size (ralloc.c:151)
         by 0x575C173: nir_shader_create (nir.c:45)
         by 0x5763ACB: nir_shader_clone (nir_clone.c:728)
         by 0x55D5003: st_create_fp_variant (st_program.c:1242)
         by 0x55D789F: st_get_fp_variant (st_program.c:1522)
         by 0x55D789F: st_get_fp_variant (st_program.c:1507)
         by 0x56400C3: st_update_fp (st_atom_shader.c:163)
         by 0x563D333: st_validate_state (st_atom.c:261)
         by 0x55D07CB: prepare_draw (st_draw.c:132)
         by 0x55D08DF: st_draw_vbo (st_draw.c:184)
         by 0x55576CB: _mesa_draw_arrays (draw.c:374)
         by 0x55576CB: _mesa_draw_arrays (draw.c:351)
      Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
      Reviewed-by: Qiang Yu <yuq825@gmail.com>
  9. 06 Nov, 2019 1 commit
  10. 27 Sep, 2019 1 commit
    • lima/ppir: add NIR pass to split varying loads · 6dd0ad66
      Vasily Khoruzhick authored
      NIR may emit a single intrinsic to load several packed varyings,
      but that's suboptimal for the Utgard PP for several reasons:
      - varyings that are used as sampler inputs can be passed using
        a pipeline register with increased precision
      - we have a small number of regs, so using a vec4 reg for storing
        two vec2 varyings increases reg pressure.
      
      Add NIR pass to split a single load into several loads and utilize
      it in lima.
      Reviewed-by: Qiang Yu <yuq825@gmail.com>
      Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
  11. 23 Sep, 2019 1 commit
    • lima: implement BO cache · d2147787
      Vasily Khoruzhick authored
      Allocating BOs is expensive, so we should avoid doing that by caching
      freed BOs.
      
      The BO cache is modelled after the one in the v3d driver and works as follows:
      
      - in lima_bo_create(), check whether the cache holds a matching BO and
        return it if so; otherwise allocate a new BO.
      - in lima_bo_unreference() (renamed from lima_bo_free()), put the BO in
        the cache instead of freeing it, and remove all stale BOs from the cache.
      Reviewed-by: Qiang Yu <yuq825@gmail.com>
      Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
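The create/unreference flow described in the commit above can be sketched as follows. This is a minimal illustration, not the lima or v3d code: the `toy_` names, the fixed slot count, the exact-size match and the logical clock used to decide staleness are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_CACHE_SLOTS 8
#define TOY_STALE_AGE   2 /* assumed: cached BOs older than this are evicted */

/* Hypothetical stand-in for a buffer object. */
struct toy_bo {
    size_t size;
    int refcnt;
    unsigned freed_at; /* logical time at which the BO entered the cache */
};

struct toy_bo_cache {
    struct toy_bo *slots[TOY_CACHE_SLOTS];
    unsigned now;    /* logical clock, advanced by the caller */
    unsigned allocs; /* count of real allocations, for illustration only */
};

/* Return a cached BO of matching size if there is one, else allocate. */
static struct toy_bo *toy_bo_create(struct toy_bo_cache *cache, size_t size)
{
    for (int i = 0; i < TOY_CACHE_SLOTS; i++) {
        struct toy_bo *bo = cache->slots[i];
        if (bo && bo->size == size) {
            cache->slots[i] = NULL;
            bo->refcnt = 1;
            return bo;
        }
    }

    struct toy_bo *bo = calloc(1, sizeof(*bo));
    if (!bo)
        return NULL;
    bo->size = size;
    bo->refcnt = 1;
    cache->allocs++;
    return bo;
}

/* Drop a reference; on the last one, evict stale entries and park the BO
 * in the cache instead of freeing it. */
static void toy_bo_unreference(struct toy_bo_cache *cache, struct toy_bo *bo)
{
    assert(bo && bo->refcnt > 0);
    if (--bo->refcnt > 0)
        return;

    /* Remove all stale BOs from the cache. */
    for (int i = 0; i < TOY_CACHE_SLOTS; i++) {
        struct toy_bo *old = cache->slots[i];
        if (old && cache->now - old->freed_at > TOY_STALE_AGE) {
            free(old);
            cache->slots[i] = NULL;
        }
    }

    /* Park the BO in the first free slot; free it if the cache is full. */
    bo->freed_at = cache->now;
    for (int i = 0; i < TOY_CACHE_SLOTS; i++) {
        if (!cache->slots[i]) {
            cache->slots[i] = bo;
            return;
        }
    }
    free(bo);
}
```

In this sketch, creating a 4096-byte BO, unreferencing it and creating another 4096-byte BO returns the same object without a second allocation. A real driver would also have to check that a cached BO is no longer in use by the GPU before handing it out, which the sketch leaves out.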
  12. 09 Sep, 2019 2 commits
  13. 06 Sep, 2019 3 commits
  14. 25 Aug, 2019 3 commits
  15. 06 Aug, 2019 2 commits
  16. 05 Aug, 2019 2 commits
  17. 04 Aug, 2019 1 commit
  18. 31 Jul, 2019 4 commits
    • lima: enable lower_bitops in ppir · 82bf5a8a
      Erico Nunes authored
      The mali pp doesn't support integers and some nir_algebraic
      optimizations may result in ops that are not easily lowerable to floats,
      so disable optimizations resulting in bitops.
      Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
      Reviewed-by: Jonathan Marek <jonathan@marek.ca>
    • nir/algebraic: rename lower_bitshift to lower_bitops · b3676a65
      Erico Nunes authored
      Optimizations that insert bitshift or bitwise operations should not be
      applied on GPUs that don't support integer operations.
      The .lower_bitshift option could be used to control the bitshift-related
      ones, but one bitwise optimization was not covered by it.
      Since only lima and freedreno use this option, and the use case is that
      no bit operations are wanted, let's rename it to .lower_bitops and use
      it to control all bitops-related optimizations.
      Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
      Reviewed-by: Jonathan Marek <jonathan@marek.ca>
    • lima/ppir: lower fdot in nir_opt_algebraic · 99c956fb
      Erico Nunes authored
      Now that we have fsum in nir, we can move fdot lowering there.
      This helps reduce ppir complexity and enables the lowered ops to be part
      of other nir optimizations in the optimization loop.
      Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
      Reviewed-by: Qiang Yu <yuq825@gmail.com>
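The fdot lowering mentioned in the commit above amounts to replacing a dot product with a component-wise multiply followed by a horizontal sum. The sketch below only demonstrates that arithmetic identity in plain C; `toy_vec4`, `toy_fmul`, `toy_fsum` and `toy_fdot_lowered` are hypothetical helpers, not NIR functions.

```c
#include <assert.h>

/* Hypothetical 4-component float vector standing in for a NIR vec4 value. */
struct toy_vec4 {
    float c[4];
};

/* Component-wise multiply, as fmul does on a vector. */
static struct toy_vec4 toy_fmul(struct toy_vec4 a, struct toy_vec4 b)
{
    struct toy_vec4 r;
    for (int i = 0; i < 4; i++)
        r.c[i] = a.c[i] * b.c[i];
    return r;
}

/* Horizontal add of the first n components, as fsum does. */
static float toy_fsum(struct toy_vec4 v, int n)
{
    assert(n >= 1 && n <= 4);
    float sum = 0.0f;
    for (int i = 0; i < n; i++)
        sum += v.c[i];
    return sum;
}

/* fdot over n components expressed as fsum(fmul(a, b)). */
static float toy_fdot_lowered(struct toy_vec4 a, struct toy_vec4 b, int n)
{
    return toy_fsum(toy_fmul(a, b), n);
}
```

Once the dot product is split this way, the multiply and the sum are ordinary ops that other passes in the optimization loop can combine or simplify, which is the benefit the commit message points to.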
    • lima/ppir: lower texture projection · d2901de0
      Erico Nunes authored
      Lower texture projection in ppir using nir_lower_tex.
      This will insert a mul with the coordinate division before the load
      varying.
      
      Even though the lima pp supports projection in the load varying
      instruction while loading the coordinates (from a register or a
      varying), it requires that both the coordinates and projector be
      components in a single register.
      nir currently handles them in separate ssa, and attempting to merge them
      manually may end up in worse code than just doing the coordinate
      division manually. So for now let's just lower the projection to add
      support for it in lima.
      In the future, an optimization pass may be implemented in lima to ensure
      that both coords and projector come in the same register, then this
      lowering may be disabled and in this case lima may use the built-in
      projection and save the mul instruction from lowering.
      Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
      Reviewed-by: Qiang Yu <yuq825@gmail.com>
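The effect of lowering the projection, as described in the commit above, can be shown with a small arithmetic sketch. It assumes the coordinate division is expressed as a reciprocal of the projector followed by a multiply; `toy_vec2` and `toy_project_coords` are hypothetical names, and the code does not reproduce the actual NIR lowering.

```c
#include <assert.h>

/* Hypothetical 2D texture coordinate. */
struct toy_vec2 {
    float x, y;
};

/* Projective lookup lowered to explicit math: compute the reciprocal of
 * the projector once, then multiply each coordinate by it. */
static struct toy_vec2 toy_project_coords(struct toy_vec2 coord, float projector)
{
    assert(projector != 0.0f);
    float inv = 1.0f / projector;
    struct toy_vec2 r = { coord.x * inv, coord.y * inv };
    return r;
}
```

The extra multiply here is the cost that the built-in projection of the load varying instruction would avoid if the coordinates and projector could be guaranteed to sit in the same register.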
  19. 28 Jul, 2019 1 commit
  20. 24 Jul, 2019 1 commit
  21. 18 Jul, 2019 1 commit
  22. 01 Jul, 2019 1 commit
  23. 24 Jun, 2019 1 commit
  24. 16 Jun, 2019 1 commit
  25. 31 May, 2019 2 commits
  26. 20 May, 2019 1 commit
  27. 10 May, 2019 1 commit