  • #803

Closed
Open
Created Sep 18, 2019 by Bugzilla Migration User @bugzilla-migration

i965/fs generates slow code for vector comparisons

Submitted by Matt Turner @mattst88

Assigned to Ian Romanick

Link to original bug (#77456)

Description

Created attachment 97371 t.shader_test

The fragment shader runs in scalar mode, so to do vec4 comparisons we generate one compare per component and join the results together using ands or ors, depending on the comparison.

INTEL_DEBUG=fs,no16 bin/shader_runner t.shader_test -auto

generates:

      cmp.e.f0(8)     g3<1>D          g2.3<0,1,0>F    g2.7<0,1,0>F
      cmp.e.f0(8)     g4<1>D          g2.2<0,1,0>F    g2.6<0,1,0>F
      cmp.e.f0(8)     g5<1>D          g2.1<0,1,0>F    g2.5<0,1,0>F
      cmp.e.f0(8)     g6<1>D          g2<0,1,0>F      g2.4<0,1,0>F
      and(8)          g7<1>D          g5<8,8,1>D      g6<8,8,1>D
      and(8)          g8<1>D          g4<8,8,1>D      g7<8,8,1>D
      and(8)          g9<1>D          g3<8,8,1>D      g8<8,8,1>D
      and.ne.f0(8)    null            g9<8,8,1>D      1D
...
(+f0) sel ...

We could have just predicated all but the first cmp instruction and skipped the and instructions completely:

      cmp.e.f0(8)     g3<1>D          g2.3<0,1,0>F    g2.7<0,1,0>F
(+f0) cmp.e.f0(8)     g4<1>D          g2.2<0,1,0>F    g2.6<0,1,0>F
(+f0) cmp.e.f0(8)     g5<1>D          g2.1<0,1,0>F    g2.5<0,1,0>F
(+f0) cmp.e.f0(8)     g6<1>D          g2<0,1,0>F      g2.4<0,1,0>F
...
(+f0) sel ...

I think a similar thing can be done for !=, where the join operation is or.
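
To sanity-check the idea, here is a rough Python model of the flag register. It is only a sketch and not code from the i965 backend. It assumes that a predicated cmp updates f0 only on channels where the predicate is enabled and leaves the other channels' flag bits untouched. For the != case it further assumes that the later compares would use the inverted predicate (-f0), so they run only on channels whose flag is still clear. All function names are made up for illustration, and the destination registers (g3..g6) are not modeled.

```python
import operator
import random

CHANNELS = 8    # SIMD8, matching the (8) execution size in the listings
COMPONENTS = 4  # vec4


def and_join_equal(a, b):
    """Model of the first listing: four unpredicated compares joined with and."""
    result = [True] * CHANNELS
    for c in range(COMPONENTS):
        bits = [a[ch][c] == b[ch][c] for ch in range(CHANNELS)]
        result = [x and y for x, y in zip(result, bits)]
    return result


def predicated_chain(a, b, compare, run_when_flag):
    """Model of a chain of flag-writing compares.

    The first compare is unpredicated. Each later compare runs only on
    channels whose current flag bit equals run_when_flag (assumed model of
    (+f0) / (-f0) predication); other channels keep their old flag bit.
    """
    flag = [compare(a[ch][0], b[ch][0]) for ch in range(CHANNELS)]
    for c in range(1, COMPONENTS):
        flag = [
            compare(a[ch][c], b[ch][c]) if flag[ch] == run_when_flag else flag[ch]
            for ch in range(CHANNELS)
        ]
    return flag


def predicated_equal(a, b):
    # (+f0) cmp.e.f0: keep comparing only while every earlier component matched.
    return predicated_chain(a, b, operator.eq, run_when_flag=True)


def predicated_not_equal(a, b):
    # Assumed (-f0) cmp.ne.f0: keep comparing only while no mismatch was found.
    return predicated_chain(a, b, operator.ne, run_when_flag=False)


# Compare both strategies against the plain all()/any() definitions
# on random inputs drawn from a small value set so matches are common.
rng = random.Random(0)
for _ in range(1000):
    a = [tuple(rng.choice((0.0, 1.0)) for _ in range(COMPONENTS)) for _ in range(CHANNELS)]
    b = [tuple(rng.choice((0.0, 1.0)) for _ in range(COMPONENTS)) for _ in range(CHANNELS)]
    all_equal = [all(x == y for x, y in zip(va, vb)) for va, vb in zip(a, b)]
    any_differs = [any(x != y for x, y in zip(va, vb)) for va, vb in zip(a, b)]
    assert and_join_equal(a, b) == all_equal
    assert predicated_equal(a, b) == all_equal
    assert predicated_not_equal(a, b) == any_differs
```

Under these assumptions the predicated chain leaves the same per-channel result in the flag as the and/or join, which is why the and instructions and the final and.ne.f0 would become unnecessary.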

Attachment 97371, "t.shader_test":

Blocking

  • Bug 77547
Edited May 19, 2022 by Ian Romanick