core/db: add db custom backend (!212) · Merge requests · gfx-ci / CI Bug Log

Piotr Kira requested to merge uf_materialized_view into master Jul 23, 2024

This patch improves performance of matching filters. Most of the additional info is in commit messages, here is a copy-paste info of the most important commit with benchmarks.

    Creating IssueFilters involves filtering UnknownFailures in Database.
    It's one of the most common thing that real users (non bots) do on
    CIBugLog. It should be fast but sometimes it's not, due to multiple
    joins that are done on Database side. In case of simple filter
    "status_name = 'fail' AND dmesg ~= 'error'" PostgreSQL first filter
    results by dmesg, and later joins TextStatus table and filters it.
    Anyway, the point is that it's complicated and not that fast.

    It's really tricky problem, one solution that I had and tested with
    success is using MaterializedViews.
    https://www.postgresql.org/docs/current/rules-materializedviews.html

    I've created materialized view for UnknownFailures which contains only
    post merge failures (non-temporary) and reduces joins as much as
    possible and sensible.

    Here are benchmarks I've done to vouch for merging that patch.

    *Notes:*

    1. I've run benchmark on production size database, which is huge, and
    contained about 3250 unknown failures.
    2. I've copied generated SQL query by CIBugLog, and used it with
       PostgreSQL's EXPLAIN ANALYZE to get timings.

    Case 1 (simple query):
      Query: status_name = 'fail'
      Timings:
        Before: 1.29s, 1.23s, 1.32s, 1.23s, 1.27s
        After: 5ms, 8ms, 7ms, 7ms, 5ms, 4ms

    Case 2 (dmesg regex query):
      Query: status_name = 'fail' AND dmesg ~= 'error'
      Timings:
        Before: 3.8s, 3.8s, 3.7s
        After: 800ms, 700ms, 720ms

    Case 3 (advanced query):
      Query: status_name = 'fail' AND dmesg ~= 'error' AND runconfig_tag =
      'xe' AND machine_tag IS IN ['BMG', 'DG2', 'ADL'] AND test_name ~=
      'igt@kms*'
      Timings:
        Before: 4s, 3.9s, 3.9s, 4.2s, 3.9s
        After: 920ms, 990ms, 875ms, 880ms

    Summary:

    As we can see materialized view based queries perform much faster, due
    to reduced number of joins, especially nested ones. In the case 2 and 3
    materialized view performs about 4times faster which would be the case
    of typical issue filter. In the best case scenario (case 1) materialized
    view queries perform about 240times faster, 1200ms vs 5ms. But the
    noticeable difference for real person would be about 1s which is not as
    breath-taking.

    Cons:

    Unfortunatelly there are some cons to that patch:

    1. Unknown failures materialized view with it's indexes takes some
       space, in my case of 3250 failures it takes about 150MB with Indexes.
       For 3250 failures it's really big number but we have to keep in mind
       that these failures contain usually pretty big dmesg/stderr log, in
       many cases 10k+ lines.
    2. Materialized views have to be refreshed when data changes and we want
       it to be up-to-date. In this case it's refreshed in two situations:
        * new unknown failures are uploaded
        * unknown failures are deleted (assigned to known failures)

Admin message

core/db: add db custom backend

Merge request reports