aux/pb: add a tolerance for reclaim failure

Mike Blumenkrantz requested to merge zmike/mesa:pb-reclaim into main

originally, a slab attempts to reclaim a single bo: the one at the head of its reclaim list. this has two possible outcomes:

  • the bo is reclaimed
  • the bo is not reclaimed

if the bo is reclaimed, great.

if the bo is not reclaimed, it remains at the head of the list until it can be reclaimed. this means that any bo with a "long" work queue which makes it into a slab will effectively kill the entire slab. in a benchmarking scenario, this can occur in rapid succession, and every slab will get 1-2 suballocations before it reaches a bo that blocks long enough for a new slab to be needed.

the inevitable result of this scenario is that all memory is depleted almost instantly, all because pb assumes that if the first bo in the reclaim list isn't ready, none of the others can be ready either.
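the difference can be sketched in a few lines of C. this is a minimal illustration under stated assumptions, not the code from this merge request: `struct entry`, `reclaim_head_only`, `reclaim_with_tolerance`, and the `max_failures` parameter are all hypothetical names, and the reclaim list is modeled as a plain array rather than pb's real list type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a suballocated bo on a reclaim list.
 * This is NOT Mesa's real structure. */
struct entry {
   bool idle;      /* true once the GPU has finished with this bo */
   bool reclaimed; /* set when the bo has been handed back to its slab */
};

/* Behavior described above: give up as soon as the first pending
 * entry is still busy, even if later entries are already idle. */
static size_t
reclaim_head_only(struct entry *list, size_t count)
{
   assert(list != NULL || count == 0);
   size_t reclaimed = 0;

   for (size_t i = 0; i < count; i++) {
      if (list[i].reclaimed)
         continue;
      if (!list[i].idle)
         break; /* busy head blocks everything behind it */
      list[i].reclaimed = true;
      reclaimed++;
   }
   return reclaimed;
}

/* Tolerant variant (assumed shape): skip busy entries and keep
 * scanning until max_failures busy entries have been seen. */
static size_t
reclaim_with_tolerance(struct entry *list, size_t count, unsigned max_failures)
{
   assert(list != NULL || count == 0);
   size_t reclaimed = 0;
   unsigned failures = 0;

   for (size_t i = 0; i < count; i++) {
      if (list[i].reclaimed)
         continue;
      if (list[i].idle) {
         list[i].reclaimed = true;
         reclaimed++;
      } else if (++failures >= max_failures) {
         break; /* too many busy entries in a row of scanning: stop */
      }
   }
   return reclaimed;
}
```

with one busy bo at the head followed by three idle ones, the head-only version reclaims nothing, while the tolerant version with `max_failures` of 2 skips the busy bo and reclaims the other three. the cap keeps the scan bounded when most of the list is still busy.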

for drivers like radeonsi, this happens to be a fine assumption

for drivers like zink, this is entirely not workable and explodes the gpu

Cc: mesa-stable
