    panfrost: Fix fencing · 29f938a0
    Boris Brezillon authored
    Commit 64d6f56a ("panfrost: Allocate syncobjs in panfrost_flush")
    aimed at optimizing the fencing logic but it looks like it also broke the
    fence-based synchronization in subtle ways.
    
    Indeed, now that the fence only waits on a single syncobj, we're not
    guaranteed that all jobs queued in panfrost_flush_all_batches() will
    be done when the fence is signaled, because jobs at the top level
    (those stored in the batches hashmap) have no inter-dependencies.
    
    Commit 9e397956 ("panfrost: signal syncobj if nothing is going to
    be flushed") made this even more apparent by signaling the fence right
    away if nothing was left to be drawn in the current context, thus
    ignoring any of the batches left to be flushed in the ->batches map.
    
    If we want to keep relying on the existing kernel APIs, there's clearly no
    ideal solution here. We can either go back to the original fencing
    mechanism where each fence contained an array of syncobjs to be tested
    or serialize jobs that have no explicit dependencies so we know the last
    submitted job will also be the last one to return. The original approach
    has proven to add quite a significant overhead (caused by the amount of
    ioctls and the time spent in kernel space to gather dma fences attached
    to those syncobjs and test them). So let's go for the simple solution
    where we have a single syncobj bound to the context which we update to
    point to the last job out_sync every time we submit a top-level job.
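
    As an illustration only, the toy model below (not Panfrost code) shows
    why a fence bound to the last out_sync is only safe once top-level jobs
    are serialized. The toy_* names, the "duration" field and the assumption
    that independent jobs all start at time 0 are hypothetical
    simplifications.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model, not driver code: a job is reduced to its run time. */
struct toy_job {
    unsigned duration; /* time the job needs once it starts */
    unsigned done_at;  /* completion time, filled in by toy_submit_all() */
};

/*
 * Simulate submitting n top-level jobs at time 0.
 * serialize == 0: jobs have no inter-dependencies and all start at 0.
 * serialize != 0: each job waits for the previously submitted one.
 * Returns the completion time of the LAST SUBMITTED job, i.e. the moment
 * a fence tracking only the last out_sync would be signaled.
 */
unsigned toy_submit_all(struct toy_job *jobs, size_t n, int serialize)
{
    unsigned prev_done = 0;

    assert(jobs != NULL && n > 0);
    for (size_t i = 0; i < n; i++) {
        unsigned start = serialize ? prev_done : 0;

        jobs[i].done_at = start + jobs[i].duration;
        prev_done = jobs[i].done_at;
    }
    return jobs[n - 1].done_at;
}

/* Time at which every job in the array has completed. */
unsigned toy_all_done(const struct toy_job *jobs, size_t n)
{
    unsigned latest = 0;

    for (size_t i = 0; i < n; i++)
        if (jobs[i].done_at > latest)
            latest = jobs[i].done_at;
    return latest;
}
```

    With durations 5, 1 and 2, the unserialized run signals the fence at
    time 2 while the whole set only finishes at time 5. Serialized, both
    values are 8, so the last out_sync covers every job.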
    
    This approach implies reworking the way we create fences since we
    need to capture the syncobj state at the time the fence is created.
    Unfortunately, there's no SYNCOBJ_CLONE ioctl, which forces us to
    export/create/import a fence so we have a new object that's not
    subject to changes done to the context syncobj.
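
    The snapshot requirement can be sketched with a toy model (again not
    driver code; all toy_* names are hypothetical). A syncobj is modeled as
    a mutable container pointing at a fence, and toy_snapshot() stands in
    for the export/create/import sequence, which in libdrm terms would
    presumably map to drmSyncobjExportSyncFile(), drmSyncobjCreate() and
    drmSyncobjImportSyncFile().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy stand-in for a dma fence: signaled once its job has completed. */
struct toy_fence {
    bool signaled;
};

/* Toy stand-in for a syncobj: a container whose fence can be replaced. */
struct toy_syncobj {
    struct toy_fence *fence;
};

/* On every top-level submit, the context syncobj tracks the newest job. */
void toy_ctx_update(struct toy_syncobj *ctx_sync, struct toy_fence *out_fence)
{
    assert(ctx_sync != NULL);
    ctx_sync->fence = out_fence;
}

/*
 * Emulates export + create + import: returns a NEW container holding the
 * fence ctx_sync points at right now. Later toy_ctx_update() calls do not
 * affect the copy. The caller frees the result; NULL on allocation failure.
 */
struct toy_syncobj *toy_snapshot(const struct toy_syncobj *ctx_sync)
{
    struct toy_syncobj *copy;

    assert(ctx_sync != NULL);
    copy = malloc(sizeof(*copy));
    if (copy)
        copy->fence = ctx_sync->fence;
    return copy;
}

/* A container with no fence attached is treated as not signaled. */
bool toy_is_signaled(const struct toy_syncobj *sync)
{
    assert(sync != NULL);
    return sync->fence != NULL && sync->fence->signaled;
}
```

    A snapshot taken after job A was submitted keeps reporting A's state,
    even if the context syncobj has since been repointed to job B. A fence
    that merely aliased the context syncobj would instead end up waiting on
    B.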
    
    If we want to further optimize the logic, we should probably explore
    some of these options:
    
    1/ Add array-based SYNCOBJ ioctls (SYNCOBJ_{CREATE,DESTROY,CLONE}_ARRAY)
       so we can mitigate the cost of ioctls when we need to manipulate
       arrays of syncobjs
    2/ Support synchronization jobs. That is, jobs that have a NULL job chain
       but an array of sync_in and a sync_out to allow creating
       synchronization points
    3/ Add syncobj aggregators so we only have to wait on one syncobj from
       userspace. The syncobj aggregator would wait for all sub syncobjs to
       be signaled before signaling the top-level one.
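
    Option 3/ does not exist in the kernel today. The sketch below only
    illustrates the intended semantics with a hypothetical toy_aggregator
    type, where each sub syncobj is reduced to a boolean "signaled" flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical aggregator: a view over the state of its sub syncobjs. */
struct toy_aggregator {
    const bool *subs; /* signaled flag of each sub syncobj */
    size_t count;     /* number of entries in subs */
};

/*
 * The top-level object is signaled only when every sub syncobj is.
 * An empty aggregator is considered signaled (nothing to wait for).
 */
bool toy_aggregator_signaled(const struct toy_aggregator *agg)
{
    assert(agg != NULL);
    for (size_t i = 0; i < agg->count; i++)
        if (!agg->subs[i])
            return false;
    return true;
}
```

    Userspace would then wait on a single object instead of testing each
    syncobj in turn.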
    
    Fixes: 64d6f56a ("panfrost: Allocate syncobjs in panfrost_flush")
    Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
    Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
    Part-of: <!7831>