Opened Aug 08, 2020 by Izumi Tsutsui@tsutsui

1bpp server performance regression

Please reconsider the following change, which causes a serious performance regression on 1bpp (monochrome) servers when tiling small patterns:
commit e572bcc7

fb: Remove even/odd tile slow-pathing

Again, clearly meant to be a fast path, but this turns out not to be the case.

Details

NetBSD still supports several monochrome framebuffers such as Sun3 and Omron LUNA. After the update to Xorg 1.20.5 in the NetBSD tree, I noticed extreme slowness when filling the root window with the root_weave bitmap once the screen saver was activated.

  • Xorg 1.10 based Xsun server on Sun3/60 bwtwo:
    https://twitter.com/tsutsuii/status/1289451828036300800
    -> Drawing is too fast to time by eye.

  • Xorg 1.20 based Xsun server on Sun3/60 bwtwo:
    https://twitter.com/tsutsuii/status/1289437204654075907
    -> It takes >10 seconds to fill the root window.

  • Xorg 1.18 based Xsun server on Sun3/60 bwtwo:
    https://twitter.com/tsutsuii/status/1291000288862560256
    -> Same as 1.20.

  • Xorg 1.20 server + xf86-video-wsfb driver on LUNA using single plane:
    https://twitter.com/tsutsuii/status/1291772031525179392
    -> Also >10 seconds even on the xf86-video-wsfb driver.

After some investigation, it turns out that the above change to fb/fbtile.c causes this regression:
e572bcc7

I'm not sure how the commit log arrived at "not to be the case", but the "fast path" of the removed fbEvenTile() function was only called if FbEvenTile(tileWidth) was true:
https://gitlab.freedesktop.org/xorg/xserver/-/blob/836bb27726441e048bb300664343a136bc596a5b/fb/fbtile.c#L145

void
fbTile(FbBits * dst,
       FbStride dstStride,
       int dstX,
       int width,
       int height,
       FbBits * tile,
       FbStride tileStride,
       int tileWidth,
       int tileHeight, int alu, FbBits pm, int bpp, int xRot, int yRot)
{
    if (FbEvenTile(tileWidth))
        fbEvenTile(dst, dstStride, dstX, width, height,
                   tile, tileStride, tileHeight, alu, pm, xRot, yRot);

FbEvenTile() is defined in fb/fb.h:
https://gitlab.freedesktop.org/xorg/xserver/-/blob/e572bcc7f4236b7e0f23ab762f225b3bce37db59/fb/fb.h#L543

/*
 * Accelerated tiles are power of 2 width <= FB_UNIT
 */
#define FbEvenTile(w)       ((w) <= FB_UNIT && FbPowerOfTwo(w))

FB_UNIT is 32 here, so the "fast path" is activated only if the tileWidth argument is a power of two no larger than 32 (i.e. 1, 2, 4, 8, 16, or 32).
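
To make the condition concrete, here is a minimal standalone sketch of the check. FB_UNIT is fixed at 32 as stated in this report, the FbPowerOfTwo() body is an assumed stand-in (it is not quoted in this report), and takes_even_tile_path() is a hypothetical helper added only for illustration.

```c
#include <stdbool.h>

/* Assumption: FB_UNIT is 32, as stated in this report. */
#define FB_UNIT 32

/* Assumed stand-in for the fb.h power-of-two test; not quoted in the report. */
#define FbPowerOfTwo(w)     (((w) & ((w) - 1)) == 0)

/* Same definition as in the fb.h excerpt above. */
#define FbEvenTile(w)       ((w) <= FB_UNIT && FbPowerOfTwo(w))

/* Hypothetical helper: does a tile of this width (in bits) take the fast path? */
static bool
takes_even_tile_path(int width_bits)
{
    return FbEvenTile(width_bits);
}
```

Under these assumptions, widths of 1, 2, 4, 8, 16, and 32 bits qualify, while widths such as 24 or 64 bits do not.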

The main caller of fbTile() is fbFill() with FillTiled op in fb/fbfill.c:
https://gitlab.freedesktop.org/xorg/xserver/-/blob/7430fdb689678b98ac63f5a8dad13719bac777e0/fb/fbfill.c#L164

        fbTile(dst + (y + dstYoff) * dstStride,
               dstStride,
               (x + dstXoff) * dstBpp,
               width * dstBpp, height,
               tile,
               tileStride,
               tileWidth * tileBpp,
               tileHeight,
               pGC->alu,
               pPriv->pm,
               dstBpp,
               (pGC->patOrg.x + pDrawable->x + dstXoff) * dstBpp,
               pGC->patOrg.y + pDrawable->y - y);

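The tileWidth argument of fbTile() is measured in bits (tile width in pixels multiplied by bpp), so on 32bpp servers the "fast path" fbEvenTile() is effectively never called: only a 1-pixel-wide tile would qualify.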

On the other hand, a 1bpp server uses it for 32x32 or smaller bitmaps whose width is a power of two.
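
The effect of the bpp scaling can be sketched as follows. This is an illustration only, not code from the server: is_even_tile() and fill_uses_even_tile() are hypothetical helpers, FB_UNIT is assumed to be 32, and the multiplication mirrors the tileWidth * tileBpp argument in the fbFill() excerpt above.

```c
#include <stdbool.h>

/* Assumption: FB_UNIT is 32, as stated in this report. */
#define FB_UNIT 32

/* Hypothetical equivalent of FbEvenTile(): power-of-two width <= FB_UNIT. */
static bool
is_even_tile(int width_bits)
{
    return width_bits <= FB_UNIT && (width_bits & (width_bits - 1)) == 0;
}

/*
 * Hypothetical helper: fbFill() passes tileWidth * tileBpp to fbTile(),
 * so the width that gets tested is in bits, not in pixels.
 */
static bool
fill_uses_even_tile(int tile_width_pixels, int tile_bpp)
{
    return is_even_tile(tile_width_pixels * tile_bpp);
}
```

For a 4-pixel-wide pattern, fill_uses_even_tile(4, 1) is true (4 bits), while fill_uses_even_tile(4, 32) is false (128 bits). At 32bpp only a 1-pixel-wide tile passes the check.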

Reverting the above "fb: Remove even/odd tile slow-pathing" change restores the previous speed when filling root_weave and other patterns on the Xorg 1bpp server:

  • Patched Xsun server on Sun3/60 bwtwo:
    https://twitter.com/tsutsuii/status/1291061688762957826

  • Patched Xorg + xf86-video-wsfb server on LUNA:
    https://twitter.com/tsutsuii/status/1291773410964463617

Edited Aug 09, 2020 by Izumi Tsutsui
Reference: xorg/xserver#1056