    intel-ci: add a pre-merge blacklist to reduce the testing queue · 3e686098
    Martin Roukala authored and Petri Latvala committed
    
    
    When I arrived at the office on Monday morning, the reported queue
    size was ~100 hours. This defeats the point of pre-merge testing and
    vastly exceeds our target of ~6 hours.
    
    A lot of work is still needed to reduce testing time, but this patch
    reduces the reported run time by 15-30% depending on the platform:
    
     - shard-skl: 23.9 -> 18.2 minutes (18.5%)
     - shard-kbl: 21.2 -> 16.2 minutes (20%)
     - shard-apl: 25.9 -> 18.5 minutes (24.3%)
     - shard-glk: 24.7 -> 17.6 minutes (24.8%)
     - shard-icl: 25.1 -> 16.7 minutes (28.7%)
     - shard-tgl: 28.2 -> 19.6 minutes (26.4%)
    
    The reported runtime is much lower than the actual wall-clock time
    because of:
    
     - Unaccounted time spent outside of the IGT subtests (exec(), fixtures)
     - Unaccounted time spent in suspend (monotonic clock, 20s / suspend)
     - Boot time / extra reboots between shards to work around kernel failures
     - Intel GFX CI shard scheduling overhead
     - More?
    
    Tomi and Petri are working on reducing these overheads in two ways:
    detecting the bad conditions and rebooting the machine only when they
    occur, rather than between every single shard, and increasing the size
    of the shard test lists to reduce the per-shard CI overhead.
    
    Because of these overheads, the actual savings are much smaller as a
    percentage, but they still add up over the tens of executions we do
    per week:
    
     - shard-skl: ~58 -> ~52 minutes
     - shard-kbl: ~50 -> ~45 minutes
     - shard-apl: ~53 -> ~46 minutes
     - shard-glk: ~38 -> ~31 minutes
     - shard-icl: ~47 -> ~39 minutes
     - shard-tgl: ~60 -> ~51 minutes
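
As a rough check on how the actual savings compare with the reported ones, the relative reduction can be recomputed from the approximate wall-clock figures listed above. This is only a sketch: the helper name `saving_percent` is made up for illustration, and the inputs are the rounded "~" values, so the results are approximate.

```python
# Approximate wall-clock minutes per shard run as (before, after),
# copied from the list above. The values are rounded estimates.
actual_minutes = {
    "shard-skl": (58, 52),
    "shard-kbl": (50, 45),
    "shard-apl": (53, 46),
    "shard-glk": (38, 31),
    "shard-icl": (47, 39),
    "shard-tgl": (60, 51),
}


def saving_percent(before, after):
    """Return the relative reduction from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before


for shard, (before, after) in actual_minutes.items():
    print(f"{shard}: {saving_percent(before, after):.1f}% shorter")
```

With these inputs the actual savings come out at roughly 10% to 18% per shard run.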
    
    More work needed, but we'll get there :)
    
    v2:
     - Avoid using | in the regular expressions (Petri Latvala)
     - Update the description for igt@gem_pwrite@big-.* (Chris Wilson)
     - Drop igt@sw_sync@sync_expired_merge (fixed by Chris Wilson)
     - Drop igt@gem_eio@kms (fixed by Chris Wilson)
     - Drop igt@perf@gen12-mi-rpc as it is a serious kernel bug (Chris Wilson)
     - Add links to issues tracking this for all blacklisted items
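
To illustrate the first v2 item, here is a minimal sketch of keeping one regular expression per entry instead of joining alternatives with `|`. It assumes a test is excluded when any entry matches its full name; the real runner may match differently. The second pattern and the helper `is_blacklisted` are hypothetical.

```python
import re

# Illustrative exclusion list: one regular expression per entry, no '|'.
blacklist = [
    r"igt@gem_pwrite@big-.*",      # pattern named in the v2 notes
    r"igt@example_test@slow-.*",   # made-up entry, for illustration only
]


def is_blacklisted(test_name, patterns=blacklist):
    """Return True if any single pattern matches the whole test name."""
    return any(re.fullmatch(pattern, test_name) for pattern in patterns)


print(is_blacklisted("igt@gem_pwrite@big-cpu"))
print(is_blacklisted("igt@gem_pwrite@basic"))
```

Keeping each expression on its own entry also leaves room for a per-entry comment or issue link.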
    
    NOTICE: The above numbers have not been edited for the v2 since
            blacklisting or improving the runtime dramatically yields the
            same results, and only igt@perf@gen12-mi-rpc is back to being
            slow.
    
    v3 (Petri):
     - Install the blacklist files
    
    Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
    Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
    Signed-off-by: Petri Latvala <petri.latvala@intel.com>