mesa + glthread: many optimizations

Marek Olšák requested to merge mareko/mesa:mesa-opts-glthread-series3 into master

A reviewed subset merged here: !4466 (merged), !4758 (merged)

This is a big change for both Mesa and glthread performance.

Mesa:

  • Removal of NullBufferObj and _mesa_is_bufferobj (a 5% performance improvement in "torcs")
  • Faster VAO initialization
  • Faster glPush/PopClientAttrib
  • State change improvements
  • Dynamic VAOs skip most validation and don't compute interleaved arrays (st/mesa has a separate codepath for this)
  • vbo_context is inlined in gl_context for faster glBegin/End
  • New feature: Ability to create struct gl_buffer_object and map it from any thread (for glthread)
  • New glInternal* functions that take struct gl_buffer_object * to execute a buffer copy and set vertex and index buffers (for glthread)

glthread:

  • glBufferSubData is asynchronous for any size (implemented as a buffer upload in the main thread + glInternalBufferSubDataCopy in the driver thread)
  • non-VBO vertices and indices are uploaded for all non-Indirect non-IBM draws
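
The asynchronous glBufferSubData path described above can be pictured with a small model. This is only a sketch under stated assumptions, not Mesa's code: `struct buffer`, `struct cmd_queue`, `async_buffer_sub_data` and `execute_queue` are hypothetical names, the "threads" are simulated by calling the two halves in sequence, and a plain `memcpy` stands in for whatever glInternalBufferSubDataCopy actually does in the driver thread.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a buffer object: just a block of bytes. */
struct buffer {
    unsigned char *data;
    size_t size;
};

/* One queued copy, playing the role of the internal buffer-copy call. */
struct copy_cmd {
    struct buffer *src;   /* staging buffer, owned by the command */
    struct buffer *dst;
    size_t dst_offset;
    size_t size;
};

#define MAX_CMDS 16

struct cmd_queue {
    struct copy_cmd cmds[MAX_CMDS];
    size_t count;
};

static struct buffer *buffer_create(size_t size)
{
    struct buffer *b = malloc(sizeof(*b));
    if (!b)
        return NULL;
    b->data = calloc(size ? size : 1, 1);   /* zero-initialized storage */
    if (!b->data) {
        free(b);
        return NULL;
    }
    b->size = size;
    return b;
}

static void buffer_destroy(struct buffer *b)
{
    if (b) {
        free(b->data);
        free(b);
    }
}

/* Application-thread half: snapshot the user data into a staging buffer
 * and queue a copy. Returns 0 on success, -1 on a bad range, a full
 * queue, or allocation failure. The caller may reuse user_data right
 * after this returns. */
static int async_buffer_sub_data(struct cmd_queue *q, struct buffer *dst,
                                 size_t offset, size_t size,
                                 const void *user_data)
{
    if (offset > dst->size || size > dst->size - offset ||
        q->count == MAX_CMDS)
        return -1;

    struct buffer *staging = buffer_create(size);
    if (!staging)
        return -1;

    memcpy(staging->data, user_data, size);
    q->cmds[q->count++] = (struct copy_cmd){ staging, dst, offset, size };
    return 0;
}

/* Driver-thread half: run the queued copies in order, then free staging. */
static void execute_queue(struct cmd_queue *q)
{
    for (size_t i = 0; i < q->count; i++) {
        struct copy_cmd *c = &q->cmds[i];
        memcpy(c->dst->data + c->dst_offset, c->src->data, c->size);
        buffer_destroy(c->src);
    }
    q->count = 0;
}
```

The point of the staging copy is that the application never waits for the driver thread: the destination buffer only changes when the queued copy runs, while the user pointer is free to be overwritten immediately.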

glthread now performs well with apps that use non-VBO data, and scales better with apps that call glBufferSubData heavily.

Mesa driver overhead still matters when the Mesa thread is the busiest thread. Enable the gallium thread for better driver overhead distribution.

Performance improvement of this MR in the game "torcs":

  • +16% by default vs master
  • +40% after enabling glthread vs master

glthread requires these CAPs for user data uploads:

  • PIPE_CAP_MAP_UNSYNCHRONIZED_THREAD_SAFE
  • PIPE_CAP_ALLOW_MAPPED_BUFFERS_DURING_EXECUTION
  • PIPE_CAP_SIGNED_VERTEX_BUFFER_OFFSET
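
The requirement above amounts to an all-or-nothing check: user data uploads are only usable when every listed CAP is reported. A minimal sketch of that check follows, assuming a mock screen that stores its capabilities in a table. `struct mock_screen`, `enum mock_cap`, `mock_get_param` and `can_upload_user_data` are hypothetical names invented for this example; they only mirror the CAP names listed above and are not the gallium interface.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical capability IDs mirroring the three CAPs listed above. */
enum mock_cap {
    MOCK_CAP_MAP_UNSYNCHRONIZED_THREAD_SAFE,
    MOCK_CAP_ALLOW_MAPPED_BUFFERS_DURING_EXECUTION,
    MOCK_CAP_SIGNED_VERTEX_BUFFER_OFFSET,
    MOCK_CAP_COUNT
};

/* Hypothetical screen: a nonzero entry means the capability is supported. */
struct mock_screen {
    int caps[MOCK_CAP_COUNT];
};

static int mock_get_param(const struct mock_screen *screen, enum mock_cap cap)
{
    return screen->caps[cap];
}

/* Returns true only if every required capability is reported. */
static bool can_upload_user_data(const struct mock_screen *screen)
{
    static const enum mock_cap required[] = {
        MOCK_CAP_MAP_UNSYNCHRONIZED_THREAD_SAFE,
        MOCK_CAP_ALLOW_MAPPED_BUFFERS_DURING_EXECUTION,
        MOCK_CAP_SIGNED_VERTEX_BUFFER_OFFSET,
    };

    for (size_t i = 0; i < sizeof(required) / sizeof(required[0]); i++) {
        if (!mock_get_param(screen, required[i]))
            return false;
    }
    return true;
}
```

A screen that lacks even one of the three would fall back to the path without user data uploads in this model.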