Segfault in pushbuf_kref on nv50 when rendering from multiple threads
Submitted by Gabriele Svelto
Assigned to Nouveau Project
Created attachment 118838 kernel log
I've encountered an easily reproducible segfault in the Firefox OS emulator while hacking on that operating system. The Firefox OS emulator is a fork of the Android emulator, which is in turn a fork of qemu. In both cases the graphics part is untouched, so it might be possible to reproduce the same issue in qemu, though I haven't had time to try it.
Here's my full STR:
- Build the Firefox OS emulator using the emulator-x86-kk target device ( git clone https://github.com/mozilla-b2g/B2G.git ; cd B2G ; ./config.sh emulator-x86-kk ; ./build.sh )
- Launch it from the tree using the run-emulator.sh script
- Once Firefox OS has started, quickly click on any application and keep clicking on buttons, input boxes, etc. The segfault normally happens within a matter of seconds
I've reproduced the bug on both Fedora 22 and Gentoo, so it doesn't look distro-specific. These are the version numbers taken from my Gentoo installation:
- xf86-video-nouveau 1.0.11
- libdrm 2.4.59
- mesa 10.3.7
- xorg-server 1.16.4
- kernel 4.0.5
I've captured a stack trace of the segfault with gdb:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xc3dfeb40 (LWP 9387)]
0xf689a323 in pushbuf_kref () from /usr/lib32/libdrm_nouveau.so.2
(gdb) bt
#0 0xf689a323 in pushbuf_kref () from /usr/lib32/libdrm_nouveau.so.2
#1 0xf689ab9f in pushbuf_validate () from /usr/lib32/libdrm_nouveau.so.2
#2 0xf6ce47e8 in nv50_state_validate () from /usr/lib32/dri/nouveau_dri.so
#3 0xf6cf0a49 in nv50_draw_vbo () from /usr/lib32/dri/nouveau_dri.so
#4 0xf6b3846d in cso_draw_vbo () from /usr/lib32/dri/nouveau_dri.so
#5 0xf6a5f29e in st_draw_vbo () from /usr/lib32/dri/nouveau_dri.so
#6 0xf6a30cd3 in vbo_draw_arrays () from /usr/lib32/dri/nouveau_dri.so
#7 0xf6a30f37 in vbo_exec_DrawArrays () from /usr/lib32/dri/nouveau_dri.so
#8 0xf72ca52b in glDrawArrays (mode=4, first=0, count=6) at sdk/emulator/opengl/host/libs/Translator/GLES_V2/GLESv2Imp.cpp:576
#9 0xf74b9965 in gl2_decoder_context_t::decode (this=0xc3dfdfd4, buf=0xc47ff008, len=5452, stream=0xc6400768) at out/host/linux-x86/obj/STATIC_LIBRARIES/libGLESv2_dec_intermediates/gl2_dec.cpp:565
#10 0xf74b662c in RenderThread::Main (this=0xc6400788) at sdk/emulator/opengl/host/libs/libOpenglRender/RenderThread.cpp:128
#11 0xf74cdc3d in osUtils::Thread::thread_main (p_arg=0xc6400788) at sdk/emulator/opengl/shared/OpenglOsUtils/osThreadUnix.cpp:83
#12 0xf7f9711f in start_thread () from /lib32/libpthread.so.0
#13 0xf7d5f79e in clone () from /lib32/libc.so.6
I'm attaching the kernel log and the X log. Those may be "polluted" by other stuff, as my machine has been running for some time since I hit the bug. I'll try to provide cleaner ones captured right after I next hit it. If more detailed information is needed (e.g. a backtrace with finer-grained debug information), I can provide it given some time to gather it.
Attachment 118838, "kernel log":