intel: Relax scheduling dependencies for ARF access.
We were treating them as scheduling barriers, but since they weren't
flagged by is_scheduling_barrier(), each access would add deps from the
beginning to the end of the block, for some awful O(n^2) behavior.
Reduces runtime of
from 9.9s to 2.0s on my SKL system, and should let us take the test group
off the skip list in Chrome OS due to timeouts.
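
To illustrate the cost described above, here is a minimal sketch of a simplified model. It is not the scheduler's actual code: the Inst struct, its fields, and count_barrier_deps() are hypothetical names made up for this example. It assumes barrier-style deps are added by walking outwards from an instruction and stopping at the first neighbour that reports itself as a barrier.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified instruction; not the driver's real data structure.
struct Inst {
    bool adds_barrier_deps;  // this access gets barrier-style deps added
    bool is_barrier;         // what an is_scheduling_barrier()-style check reports
};

// Count the dependency edges created when every instruction with
// adds_barrier_deps links to its neighbours in both directions, stopping
// after the first neighbour that reports is_barrier.
std::size_t count_barrier_deps(const std::vector<Inst> &block)
{
    std::size_t edges = 0;
    for (std::size_t i = 0; i < block.size(); i++) {
        if (!block[i].adds_barrier_deps)
            continue;
        // Walk towards the start of the block.
        for (std::size_t j = i; j-- > 0;) {
            edges++;
            if (block[j].is_barrier)
                break;
        }
        // Walk towards the end of the block.
        for (std::size_t j = i + 1; j < block.size(); j++) {
            edges++;
            if (block[j].is_barrier)
                break;
        }
    }
    return edges;
}
```

In this model, a block of 100 such accesses that do not report as barriers yields 100 * 99 = 9900 edges, while the same block with every access reporting as a barrier yields 2 * 99 = 198, since each walk stops at the nearest neighbour.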
shader-db:
total instructions in shared programs: 9043717 -> 9043725 (<.01%)
instructions in affected programs: 948 -> 956 (0.84%)
total cycles in shared programs: 403834893 -> 403766806 (-0.02%)
cycles in affected programs: 103629173 -> 103561086 (-0.07%)
total spills in shared programs: 4036 -> 4037 (0.02%)
spills in affected programs: 28 -> 29 (3.57%)
total fills in shared programs: 3221 -> 3224 (0.09%)
fills in affected programs: 89 -> 92 (3.37%)
Closes: #4648