st_framebuffer might leak under certain circumstances
Submitted by Yong Zhang
Assigned to mes..@..op.org
Link to original bug (#108382)
Description
By design, a framebuffer can be marked as obsolete by GL calls (such as eglDestroySurface) at any time, but it is only released, by calls such as eglMakeCurrent, once it is no longer in use.
st_manager uses two structures to achieve this. One is the hash table stfbi_ht, which tracks the st_framebuffer_iface of every active (non-obsolete) st_framebuffer. The other is the linked list winsys_buffers, which holds all st_framebuffers currently in use.
Consider the following call sequence (assuming we are using the dri backend):
    EGLSurface surf1 = eglCreateWindowSurface(...);
    eglMakeCurrent(..., surf1, surf1, ...);
    // do rendering....
    eglDestroySurface(..., surf1);
    EGLSurface surf2 = eglCreateWindowSurface(...);
    eglMakeCurrent(..., surf2, surf2, ...);
When the first eglMakeCurrent is called, st_api_make_current is called and an st_framebuffer is created whose iface points to the actual dri_drawable of surf1. The st_framebuffer is added to winsys_buffers and the iface is added to stfbi_ht. Then st_framebuffers_purge is called; it traverses winsys_buffers, looks up each iface in stfbi_ht, and finds that all framebuffers are active.
When eglDestroySurface is called, st_api_destroy_drawable is called, and only the st_framebuffer_iface (which is actually the dri_drawable of surf1) is removed from stfbi_ht. The st_framebuffer itself stays in winsys_buffers.
When the second eglMakeCurrent is called, st_api_make_current is called and an st_framebuffer is created whose iface points to the actual dri_drawable of surf2. The st_framebuffer is added to winsys_buffers and the iface is added to stfbi_ht. Then st_framebuffers_purge is called; it traverses winsys_buffers and finds that the st_framebuffer->iface of surf1 is not in stfbi_ht, which means surf1 is obsolete. Finally, that st_framebuffer is released and removed from winsys_buffers.
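The bookkeeping described above can be modeled in a few lines of C. This is only a simplified sketch, not Mesa's code: fixed-size arrays stand in for stfbi_ht and winsys_buffers, and the struct names and every model_* function are hypothetical names introduced for illustration. It also assumes each make-current call always creates a new framebuffer, as in the sequence above.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_ENTRIES 8

/* Hypothetical stand-ins for the Mesa structs; only what the sketch needs. */
struct iface { int surface_id; };
struct framebuffer { const struct iface *iface; bool live; };

/* Stand-in for stfbi_ht: the set of active iface addresses. */
static const struct iface *active_ifaces[MAX_ENTRIES];
/* Stand-in for winsys_buffers: every framebuffer not yet released. */
static struct framebuffer buffers[MAX_ENTRIES];

static bool iface_is_active(const struct iface *iface)
{
    for (size_t i = 0; i < MAX_ENTRIES; i++)
        if (active_ifaces[i] != NULL && active_ifaces[i] == iface)
            return true;
    return false;
}

/* Models st_framebuffers_purge: release every framebuffer whose iface is
 * no longer in the active set. Returns how many were released. */
static int model_purge(void)
{
    int released = 0;
    for (size_t i = 0; i < MAX_ENTRIES; i++) {
        if (buffers[i].live && !iface_is_active(buffers[i].iface)) {
            buffers[i].live = false;
            released++;
        }
    }
    return released;
}

/* Models the make-current path: track a new framebuffer, register its
 * iface, then purge. Returns the purge count, or -1 if a table is full. */
static int model_make_current(const struct iface *iface)
{
    size_t b = 0, a = 0;
    while (b < MAX_ENTRIES && buffers[b].live) b++;
    while (a < MAX_ENTRIES && active_ifaces[a] != NULL) a++;
    if (b == MAX_ENTRIES || a == MAX_ENTRIES)
        return -1;
    buffers[b].iface = iface;
    buffers[b].live = true;
    active_ifaces[a] = iface;
    return model_purge();
}

/* Models st_api_destroy_drawable: only the active-set entry is removed. */
static void model_destroy_drawable(const struct iface *iface)
{
    for (size_t i = 0; i < MAX_ENTRIES; i++)
        if (active_ifaces[i] == iface)
            active_ifaces[i] = NULL;
}

/* Number of framebuffers still held in the winsys_buffers stand-in. */
static int live_framebuffers(void)
{
    int n = 0;
    for (size_t i = 0; i < MAX_ENTRIES; i++)
        n += buffers[i].live ? 1 : 0;
    return n;
}
```

With two distinct iface objects, replaying the sequence (make current on surf1, destroy surf1, make current on surf2) makes the second model_make_current call release exactly one framebuffer, leaving one live framebuffer for surf2.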
This mechanism depends on an assumption: the ifaces (the dri_drawables of surf1 and surf2) are allocated at different memory locations, so their addresses can be used as hash keys. The assumption holds in most cases, but on Android x86, due to a different malloc implementation, the dri_drawables of surf1 and surf2 are very likely to be allocated at the same address. This breaks st_framebuffers_purge: the iface address of the stale st_framebuffer is found in stfbi_ht again, so the stale st_framebuffer never gets freed.
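The effect of address reuse on a pointer-keyed lookup can be shown in isolation. The sketch below is an illustration under stated assumptions, not Mesa's code: a two-element static array stands in for the allocator so the reuse is deterministic, a single pointer stands in for stfbi_ht, and all names (slots, active_key, stale_framebuffer_survives) are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a dri_drawable / st_framebuffer_iface. */
struct iface { int surface_id; };

/* Two fixed storage slots stand in for the allocator, so the sketch can
 * choose deterministically whether surf2 lands on surf1's old address. */
static struct iface slots[2];

/* One-entry stand-in for stfbi_ht: the address of the active iface. */
static const void *active_key;

/* A framebuffer record remembers only the address of its iface. */
struct framebuffer { const void *iface_key; };

/* The purge check: a framebuffer counts as active if its key is found. */
static bool looks_active(const struct framebuffer *fb)
{
    return active_key != NULL && fb->iface_key == active_key;
}

/* Replays the call sequence and reports whether the framebuffer created
 * for surf1 would be kept (that is, leaked) by the final purge check. */
static bool stale_framebuffer_survives(bool allocator_reuses_address)
{
    struct iface *surf1 = &slots[0];
    surf1->surface_id = 1;
    struct framebuffer fb1 = { surf1 };
    active_key = surf1;               /* first eglMakeCurrent */

    active_key = NULL;                /* eglDestroySurface removes the key */

    /* The new drawable either reuses surf1's storage or gets its own. */
    struct iface *surf2 = allocator_reuses_address ? &slots[0] : &slots[1];
    surf2->surface_id = 2;
    active_key = surf2;               /* second eglMakeCurrent */

    bool kept = looks_active(&fb1);   /* purge check on surf1's framebuffer */
    active_key = NULL;                /* reset the model for the next call */
    return kept;
}
```

In this model, stale_framebuffer_survives(false) returns false because the stale key no longer matches anything, while stale_framebuffer_survives(true) returns true: the key registered for surf2 is indistinguishable from the one surf1 used, so the old framebuffer looks active and is kept.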
Test environment: Android x86 7.1.2-r36, AMD Radeon RX560
How to reproduce: keep switching any app between foreground and background. Watching /sys/kernel/debug/dri/0/amdgpu_gem_info shows GEM usage increasing.
Version: 18.2