anv/allocator: try harder to memset() before anv_gem_close()
The big comment above the memset() talks about how having that memset() before the anv_gem_close() call makes us safer. I can't see a reason why gcc wouldn't be allowed to actually swap the order of those two things, so add a memory barrier in order to try to enforce our intention.
I'm not entirely sure what race conditions we're trying to prevent here, especially since we have the mutex around everything. But still, let's make the code do what the comment tries to say it does.
Signed-off-by: Paulo Zanoni
Cc: @jekstrand
Merge request reports
Activity
2053 2053 * any data they may write in this BO. 2054 2054 */ 2055 2055 memset(bo, 0, sizeof(*bo)); 2056 __asm__ volatile("" ::: "memory"); My understanding is that while the asm is just a compiler barrier, __sync_synchronize() is a a hardware barrier, affects every thread and emits instructions. It seems to me that all we need here is a compiler barrier since all we need is to guarantee the order of memset and anv_gem_close.
Regarding the ARM comment, I genuinely don't know, but the asm code I wrote there looks pretty portable :). Also, why would someone want to compile anv on ARM? (this is a genuine question, not a rethorical one)
I should also point that the asm memory barrier is also used by gallium/drivers/swr and is called _ReadWriteBarrier(). Does that run on ARM?
But your last point that the syscall acts as a memory barrier is a great observation, I didn't think about it before. I think it does remove the need for the barrier, making this patch useless. Perhaps if we drop the decide to go without the barrier we may just want to add a comment to prevent the next Paulo from getting confused here.
What would you prefer we do?
Thanks for the review.
Regarding the ARM comment, I genuinely don't know, but the asm code I wrote there looks pretty portable :). Also, why would someone want to compile anv on ARM? (this is a genuine question, not a rethorical one)
We're about to start shipping discrete parts which people can plug into whatever they want. I'm not too worried about most archs like BE PPC but, given the way things are trending, someone plugging one into an ARM desktop sounds like a thing that very much could happen in future.
But your last point that the syscall acts as a memory barrier is a great observation, I didn't think about it before. I think it does remove the need for the barrier, making this patch useless. Perhaps if we drop the decide to go without the barrier we may just want to add a comment to prevent the next Paulo from getting confused here.
Yeah, I think a comment is probably sufficient here.
Compiler can reorder operations only if it can prove that there's no dependency between those operations. In this case compiler can't know whether system call will access the contents of bo (directly or indirectly - doesn't matter), so it can't do any reordering. It's easy to make a mistake and just assume that a function call to another module is a compiler barrier (compiler cannot possibly see what's in another file, right? ;), but introduction of LTOs changed that...
added ANV label