info: Code size and complexity increase

With the recent addition of a category threshold check before debug statements are actually evaluated, code size and complexity have increased at every debug statement.

Previously a debug line would:

Just check if the level of that debug statement was equal to, or below, _gst_debug_min (i.e. the most verbose level activated for any category)
- This would end up being a locally stored variable in any compiled function (i.e. only loaded once)
- The check is small/efficient (a simple compare/jmp): 5 instruction bytes (on amd64) and one branch
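
To make the shape of that old check concrete, here is a minimal standalone sketch. It is not the GStreamer macro: the `Level` enum, `debug_min`, `emit_log()` and `SKETCH_LOG` are hypothetical stand-ins, with `debug_min` playing the role of _gst_debug_min.

```c
/* Simplified, hypothetical levels; not the real GstDebugLevel values. */
typedef enum {
  LEVEL_NONE = 0, LEVEL_ERROR, LEVEL_WARNING, LEVEL_INFO, LEVEL_DEBUG, LEVEL_LOG
} Level;

/* Stand-in for _gst_debug_min: the most verbose level enabled anywhere. */
static Level debug_min = LEVEL_NONE;

/* Counts how many statements got past the check. */
static int emitted = 0;

/* Stand-in for the out-of-line logging call. */
static void emit_log(Level level, const char *msg) {
  (void) level;
  (void) msg;
  emitted++;
}

/* Old-style check: a single compare against one global, then the call. */
#define SKETCH_LOG(level, msg) \
  do { \
    if ((level) <= debug_min) \
      emit_log((level), (msg)); \
  } while (0)

/* Runs one ERROR and one LOG statement and reports how many were emitted. */
static int run_old_check(Level global_min) {
  debug_min = global_min;
  emitted = 0;
  SKETCH_LOG(LEVEL_ERROR, "error");
  SKETCH_LOG(LEVEL_LOG, "log");
  return emitted;
}
```

With the global at `LEVEL_WARNING`, only the ERROR statement passes; the compiler is free to keep `debug_min` in a register across several statements in one function.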

Since !403 (merged) the following happens in addition:

Load the category (extra load)
Call gst_debug_category_get_threshold() (which is never inlined)
- That call does an atomic load \o/
Finally come back and check the level
That results in an increase of 30 instruction bytes (on amd64), an extra function call and yet-another-branch
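
A rough model of the extra work per statement could look like the sketch below. Everything here is an assumption for illustration: `Category`, `category_get_threshold()` and `SKETCH_CAT_LOG` are hypothetical names, C11 atomics stand in for the GLib ones, and the order of the two tests is a guess rather than what !403 actually emits.

```c
#include <stdatomic.h>

/* Simplified, hypothetical levels; not the real GstDebugLevel values. */
typedef enum {
  LEVEL_NONE = 0, LEVEL_ERROR, LEVEL_WARNING, LEVEL_INFO, LEVEL_DEBUG, LEVEL_LOG
} Level;

/* Hypothetical category holding only an atomically accessed threshold. */
typedef struct {
  atomic_int threshold;
} Category;

static Level debug_min = LEVEL_NONE;  /* stand-in for _gst_debug_min */
static int emitted = 0;

static void emit_log(Category *cat, Level level, const char *msg) {
  (void) cat;
  (void) level;
  (void) msg;
  emitted++;
}

/* Models the accessor: an atomic load of the per-category threshold. */
static Level category_get_threshold(Category *cat) {
  return (Level) atomic_load(&cat->threshold);
}

/* New-style check: global compare plus a per-category load and compare. */
#define SKETCH_CAT_LOG(cat, level, msg) \
  do { \
    if ((level) <= debug_min && (level) <= category_get_threshold(cat)) \
      emit_log((cat), (level), (msg)); \
  } while (0)

/* Runs one ERROR and one LOG statement and reports how many were emitted. */
static int run_new_check(Level global_min, Level cat_threshold) {
  Category cat;
  atomic_init(&cat.threshold, (int) cat_threshold);
  debug_min = global_min;
  emitted = 0;
  SKETCH_CAT_LOG(&cat, LEVEL_ERROR, "error");
  SKETCH_CAT_LOG(&cat, LEVEL_LOG, "log");
  return emitted;
}
```

The second condition is what every call site now carries: a pointer load, a call, an atomic load and one more branch, even for statements that were already cheap to skip.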

This is increasing both:

The size of all code (risk of not being able to load most code in cache)
The number of branches (risk of overloading branch prediction in cpu)

I tried locally replacing the call to gst_debug_category_get_threshold() with a direct atomic_int_get, which reduces the code size slightly but still results in extra branches.

Proposal

Revert !403 (merged)
For debug statements where evaluating arguments is potentially expensive, add guards so that the arguments are only evaluated if that particular category's threshold would let the statement be output.

A good example is how the debugging is handled in gst-plugins-good/gst/isomp4/qtdemux_dump.c:

#ifndef GST_DISABLE_GST_DEBUG
  /* Only traverse/dump if we know it will be outputted in the end */
  if (qtdemux_debug->threshold < GST_LEVEL_LOG)
    return TRUE;

  g_node_traverse (node, G_PRE_ORDER, G_TRAVERSE_ALL, -1,
      qtdemux_node_dump_foreach, qtdemux);
#endif
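
The same idea applied to an expensive argument rather than a traversal might look like this sketch. It is standalone and uses made-up names (`Category`, `category_enabled()`, `expensive_summary()`, `dump_if_enabled()`), and it reads the threshold with a C11 atomic load, which is an assumption; the qtdemux snippet above reads the field directly.

```c
#include <stdatomic.h>
#include <stdio.h>

/* Simplified, hypothetical levels; not the real GstDebugLevel values. */
typedef enum {
  LEVEL_NONE = 0, LEVEL_ERROR, LEVEL_WARNING, LEVEL_INFO, LEVEL_DEBUG, LEVEL_LOG
} Level;

/* Hypothetical category holding only its threshold. */
typedef struct {
  atomic_int threshold;
} Category;

/* Counts how often the costly helper actually ran. */
static int expensive_calls = 0;

/* Stand-in for costly argument evaluation, e.g. formatting a whole tree. */
static int expensive_summary(char *buf, size_t len) {
  expensive_calls++;
  return snprintf(buf, len, "node count=%d", 42);
}

/* Nonzero when a statement at `level` would be output for `cat`. */
static int category_enabled(Category *cat, Level level) {
  return (int) level <= atomic_load(&cat->threshold);
}

/* Guarded call site: skip the costly work unless LOG output is enabled. */
static int dump_if_enabled(Category *cat) {
  char buf[64];

  if (!category_enabled(cat, LEVEL_LOG))
    return 0;

  expensive_summary(buf, sizeof buf);
  /* A real caller would hand buf to its debug statement here. */
  return 1;
}

/* Reports how many times the costly helper ran for a given threshold. */
static int count_expensive_calls(Level threshold) {
  Category cat;
  atomic_init(&cat.threshold, (int) threshold);
  expensive_calls = 0;
  dump_if_enabled(&cat);
  return expensive_calls;
}
```

Only the few call sites with costly arguments pay for the per-category check; all other debug statements keep the single global compare.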

Edited May 26, 2020 by Edward Hervey