aco: Optimize spilling (!11925)

Merged. Tony Wasserka requested to merge neobrain/mesa:opt_spiller into main on Jul 16, 2021.

Roughly doubles the speed of ACO's spiller, cutting the time spent spilling by 40-50% depending on the shader. There is more potential to unlock here using C++17's std::pmr::monotonic_buffer_resource, which I'll submit in a separate MR.
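For readers unfamiliar with the C++17 facility mentioned above: the standard library type is std::pmr::monotonic_buffer_resource from <memory_resource>. The following is a minimal sketch of how it serves allocations from a fixed buffer. It is not code from this MR or the follow-up; the function name `sum_of_squares` and the 4096-byte buffer size are assumptions made for illustration.

```cpp
#include <array>
#include <cstddef>
#include <memory_resource>
#include <vector>

// Hypothetical helper, not part of ACO: sums squares using a scratch vector
// whose storage comes from a stack buffer instead of the global heap.
int sum_of_squares(int n)
{
   std::array<std::byte, 4096> stack_buffer;

   // Allocates by bumping a pointer inside stack_buffer. Once the buffer is
   // exhausted it falls back to the upstream resource (the default resource).
   std::pmr::monotonic_buffer_resource arena(stack_buffer.data(), stack_buffer.size());

   // The vector requests all of its storage from the arena.
   std::pmr::vector<int> scratch(&arena);
   for (int i = 1; i <= n; ++i)
      scratch.push_back(i * i);

   int sum = 0;
   for (int v : scratch)
      sum += v;

   // Individual deallocations are no-ops for a monotonic resource; everything
   // is released at once when `arena` is destroyed.
   return sum;
}
```

Because deallocation is a no-op, this kind of resource suits short-lived scratch data that is discarded as a whole, which is the usage pattern the allocation-reduction commits in this MR are working around by hand.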

Summary of the changes made:

  • Massively reduces overall number of allocations
    • std::vector<CONTAINER>-like objects are now never shrunk, so the contained containers don't need to re-allocate their object storage all the time
    • Two scratch containers have been moved to spill_ctx so they don't need to re-allocate object storage for each block
    • Copying containers is avoided unless necessary
  • Replaces std::map with more efficient containers
    • std::unordered_map for faster lookup
    • std::vector for efficient copying and faster iteration when slow element addition/removal is acceptable
  • Avoids redundant map lookups by combining element checking and access into a single map::find call
  • Applies 71aab960 to the spiller
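
Two of the patterns summarized above can be sketched in isolation. The snippet below is an illustration only, not code from the MR: `NextUseMap`, `Context`, `process_block` and the other names are hypothetical stand-ins, and the element types are assumed to be plain integers.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a next-use table; not ACO's actual type.
using NextUseMap = std::unordered_map<uint32_t, uint32_t>;

// Two lookups: count() searches the map, then operator[] searches it again.
uint32_t distance_two_lookups(NextUseMap& map, uint32_t id, uint32_t fallback)
{
   if (map.count(id))
      return map[id];
   return fallback;
}

// One lookup: the iterator returned by find() serves both the existence
// check and the element access.
uint32_t distance_one_lookup(const NextUseMap& map, uint32_t id, uint32_t fallback)
{
   auto it = map.find(id);
   return it != map.end() ? it->second : fallback;
}

// Hypothetical context object owning a scratch buffer that outlives each block.
struct Context {
   std::vector<uint32_t> scratch;
};

// clear() destroys the elements but keeps the vector's capacity, so later
// calls can reuse the storage allocated by earlier ones.
std::size_t process_block(Context& ctx, const std::vector<uint32_t>& ids)
{
   ctx.scratch.clear();
   for (uint32_t id : ids) {
      if (id % 2 == 0) // arbitrary filter, only to give the scratch buffer work
         ctx.scratch.push_back(id);
   }
   return ctx.scratch.size();
}
```

Declaring the scratch vector inside `process_block` instead would allocate and free its storage on every call, which is the per-block cost that moving such containers into a longer-lived context avoids.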

Valgrind benchmark for one example shader (the spiller previously accounted for 80% of the total aco_compile_shader runtime, now only 62%):

| Commit | aco::spill runtime (cycles) | Difference to previous (cycles) | Relative change (normalized) | Cumulative change |
|---|---|---|---|---|
| (baseline) | 503861690 | | | |
| aco/spill: Avoid unneeded copies when iterating over maps | 503057792 | 803,898 | 0.2% | 0.2% |
| aco: Use std::vector for the underlying container of std::stack | 502243562 | 814,230 | 0.2% | 0.3% |
| aco/spill: Remove unused container | 502209832 | 33,730 | 0.0% | 0.3% |
| aco/spill: Replace map[] with map::insert | 500808314 | 1,401,518 | 0.3% | 0.6% |
| aco/spill: Avoid copying next_use maps more often than needed | 486598328 | 14,209,986 | 2.8% | 3.4% |
| aco/spill: Persist memory allocations of local next use maps | 434403228 | 52,195,100 | 10.4% | 13.8% |
| aco/spill: Avoid destroying local next use maps over-eagerly | 408126850 | 26,276,378 | 5.2% | 19.0% |
| aco/spill: Replace vector<map> with vector<vector> for local_next_use | 354467946 | 53,658,904 | 10.6% | 29.6% |
| aco/spill: Prefer unordered_map over map for next use distances | 327301820 | 27,166,126 | 5.4% | 35.0% |
| aco/spill: Avoid copying current_spills when not needed | 322136920 | 5,164,900 | 1.0% | 36.1% |
| aco/spill: Reduce redundant std::map lookups | 321065650 | 1,071,270 | 0.2% | 36.3% |
| aco/spill: Replace an std::map to booleans with std::set | 320742138 | 323,512 | 0.1% | 36.3% |
| aco/spill: Store remat list in an std::unordered_map instead of std::map | 318450040 | 2,292,098 | 0.5% | 36.8% |
| aco/spill: Change worklist to a single integer | 313510223 | 4,939,817 | 1.0% | 37.8% |
| aco/spill: Reduce allocations in next_uses_per_block | 276328251 | 37,181,972 | 7.4% | 45.2% |
| aco/spill: Clarify use of long-lived references by adding const | 276317478 | 10,773 | 0.0% | 45.2% |
| aco/spill: Use unordered_map for spills_exit | 269587489 | 6,729,989 | 1.3% | 46.5% |
| aco/spill: Use std::unordered_map for spills_entry | 251309451 | 18,278,038 | 3.6% | 50.1% |
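
The percentage columns are consistent with both being expressed relative to the baseline runtime rather than to the previous row. The sketch below reproduces that arithmetic; the function names are hypothetical and the formulas are inferred from the numbers above, not stated in the MR.

```cpp
#include <cstdint>

// Cycles saved by one commit, as a percentage of the baseline runtime
// (inferred meaning of the "Relative change (normalized)" column).
double relative_change_percent(uint64_t baseline, uint64_t previous, uint64_t current)
{
   return 100.0 * static_cast<double>(previous - current) / static_cast<double>(baseline);
}

// Total cycles saved so far, as a percentage of the baseline runtime
// (inferred meaning of the "Cumulative change" column).
double cumulative_change_percent(uint64_t baseline, uint64_t current)
{
   return 100.0 * static_cast<double>(baseline - current) / static_cast<double>(baseline);
}
```

For example, the "Avoid copying next_use maps more often than needed" row gives 14,209,986 / 503,861,690, or about 2.8%, and the final row gives (503,861,690 - 251,309,451) / 503,861,690, or about 50.1%.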
Edited Jul 21, 2021 by Tony Wasserka
Source branch: opt_spiller