mesa/mesa · Merge request !15580

util: Remove util_cpu_detect

Merged. Alyssa Rosenzweig requested to merge alyssa/mesa:cpu2 into main on Mar 25, 2022.

util_cpu_detect is an anti-pattern: it relies on callers high up in the call chain initializing a local implementation detail. As a real example, I added:

...a Mali compiler unit test
...that called bi_imm_f16() to construct an FP16 immediate
...that calls _mesa_float_to_half internally
...that calls util_get_cpu_caps internally, but only on x86_64!
...that relies on util_cpu_detect having been called before.

As a consequence, this unit test:

...crashes on x86_64 with USE_X86_64_ASM set
...passes on every other architecture
...works on my local arm64 workstation and on my test board
...failed CI which runs on x86_64
...needed to have a random util_cpu_detect() call sprinkled in.

This is a bad design decision. It pollutes the tree with magic, it causes mysterious CI failures especially for non-x86_64 developers, and it is not justified by a micro-optimization.
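The failure mode described above can be sketched in a few lines of C. This is a minimal illustration, not Mesa's code: every name here (`demo_caps`, `demo_cpu_detect`, `demo_get_caps`, `demo_convert`) is hypothetical, and the `-1` return value merely stands in for the crash seen in the real unit test.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical capability table; the fields are illustrative only. */
struct demo_caps {
   bool detected;      /* set once demo_cpu_detect() has run */
   bool has_fast_path; /* stands in for a CPU feature bit */
};

static struct demo_caps demo_caps_global; /* zero-initialized */

/* Must be called by "someone" high up before the getter is used. */
static void demo_cpu_detect(void)
{
   demo_caps_global.has_fast_path = true;
   demo_caps_global.detected = true;
}

/* Getter that trusts its callers to have run detection already. */
static const struct demo_caps *demo_get_caps(void)
{
   return &demo_caps_global;
}

/* Deep helper. Inputs are non-negative, so -1 is an unambiguous marker
 * for "detection never ran", where a real build might crash instead. */
static int demo_convert(int x)
{
   assert(x >= 0);

   const struct demo_caps *caps = demo_get_caps();
   if (!caps->detected)
      return -1;

   /* Both paths compute the same result; only the route differs. */
   return caps->has_fast_path ? x * 2 : x + x;
}
```

A caller that reaches `demo_convert()` without first calling `demo_cpu_detect()` gets the failure marker, even though nothing at the call site hints that global initialization is required.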

Instead, let's call util_cpu_detect directly from util_get_cpu_caps, avoiding the footgun where it fails to be called. This cleans up Mesa's design, simplifies the tree, and avoids a class of (possibly platform-specific) failures. To mitigate the added overhead, inline util_cpu_detect into util_get_cpu_caps now that it has a single caller.
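The general shape of this fix, a getter that triggers detection itself on first use, might look like the sketch below. It is not the code in this merge request: all names are hypothetical, detection is a placeholder, and a hand-rolled once guard built on C11 `<stdatomic.h>` stands in for whatever call-once primitive the real change uses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a CPU capability table. */
struct lazy_caps {
   bool has_fast_path;
   int nr_cpus;
};

enum { LAZY_UNINIT, LAZY_BUSY, LAZY_READY };

static struct lazy_caps lazy_caps_global;
static atomic_int lazy_state = LAZY_UNINIT;
static int lazy_detect_runs; /* how many times detection actually ran */

/* Placeholder detection; a real version would query the hardware. */
static void lazy_detect(void)
{
   lazy_caps_global.has_fast_path = false;
   lazy_caps_global.nr_cpus = 1;
   lazy_detect_runs++;
}

/* Slow path: exactly one thread runs detection, the others wait for it. */
static void lazy_detect_once(void)
{
   int expected = LAZY_UNINIT;
   if (atomic_compare_exchange_strong(&lazy_state, &expected, LAZY_BUSY)) {
      lazy_detect();
      atomic_store_explicit(&lazy_state, LAZY_READY, memory_order_release);
   } else {
      while (atomic_load_explicit(&lazy_state, memory_order_acquire) != LAZY_READY)
         ; /* spin until the winning thread publishes the table */
   }
}

/* Getter: the happy path is one atomic load plus one branch. */
static inline const struct lazy_caps *lazy_get_caps(void)
{
   if (atomic_load_explicit(&lazy_state, memory_order_acquire) != LAZY_READY)
      lazy_detect_once();

   assert(atomic_load(&lazy_state) == LAZY_READY);
   return &lazy_caps_global;
}
```

With this shape, no caller needs to know that detection exists; the first call to the getter pays for it and later calls only pay for the check.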

In principle, this adds only a single check of overhead to the happy path (for the call once). In practice, this overhead might be worse than a single load+branch due to multi-threading. If this is an issue, the CPU caps data structure could be duplicated per thread (thread_local) and populated independently by each thread to avoid that. Nevertheless, given the overwhelming design problems with the status quo, this change is required; if you need additional optimization, the onus is on you to do so in a non-invasive way and to provide real data justifying the change.
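The per-thread alternative mentioned above could be sketched as follows, assuming a compiler with C11 `_Thread_local` support. This variant is not part of the merge request; the names are hypothetical and detection is again a placeholder. Each thread owns its copy of the table, so the check needs no atomics, at the cost of running detection once per thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-thread capability table. */
struct tl_caps {
   bool detected; /* false until this thread has run detection */
   int nr_cpus;
};

/* One zero-initialized copy per thread. */
static _Thread_local struct tl_caps tl_caps_local;

/* Placeholder detection that fills the calling thread's copy. */
static void tl_detect(struct tl_caps *caps)
{
   caps->nr_cpus = 1;
   caps->detected = true;
}

/* Getter: a plain load and branch, with no cross-thread synchronization. */
static inline const struct tl_caps *tl_get_caps(void)
{
   if (!tl_caps_local.detected)
      tl_detect(&tl_caps_local);

   assert(tl_caps_local.detected);
   return &tl_caps_local;
}
```

Whether this is actually faster than a shared table depends on how expensive detection is and how many threads query it, which is the kind of question the measurements requested above would answer.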

Bifrost shader-db runtime on my Apple M1 (bare-metal Linux) is hurt by <0.5%.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
