staging: Add ext-placement protocol for window positioning in "zones" (v2)

Hello everyone!

This is a new attempt to resolve the issues clients designed for stacking window managers are facing when they want to set their own window positions in a specific order. Please check out !247 for a rationale with examples, and #72 for the (old) original feature request.

Prior approaches

The initial protocol proposal which used absolute monitor-based coordinates was NACK'ed by Weston (and GNOME later), so it is in the ext namespace now (see !247). Before pursuing that protocol for inclusion in staging under the ext namespace, I would really like to find a more universally applicable solution for this problem though - maybe we can even improve the design instead of doing what X11 did.

So, a different attempt was born, using relative positioning, where windows position themselves in relation to other windows: !249 (closed). This got a lot of good feedback that I had to think about, with a bunch of corner-case issues as well as more fundamental problems raised. At the same time I was also exposed to a few more styles of application that people would like to have supported. Ultimately, after trying to implement a bit of that protocol, I came to the conclusion that, as a compromise solution, it would just make absolutely everybody unhappy:

  • Application porters from other OSes and X11 would be unhappy as there would be no way to really map existing behavior to the Wayland world
  • App developers would be unhappy as it limits their application's design to what paradigms the compositor supports with regards to window placement (e.g. "top", "left" anchor semantics - what if the app wanted to place two windows centered at the top?)
  • Compositor developers would be unhappy as they would need to implement a lot of style logic and alignment paradigms in their compositor, while still having no idea what the application was actually trying to achieve.

The protocol proved to be pretty good at implementing a GIMP-ish clone, but fell apart as soon as I tried to do anything more complex. I don't think the idea is dead overall, placing windows in relation to each other is probably still useful, but I think this new protocol may be a better solution overall.

Introducing "zones"

This explanation has been changed to more closely match the current proposal and give a better introduction, as it has been heavily altered during review (and will for sure continue to be edited, so please read the patch to be up to date!).

The new protocol in this MR introduces the concept of a "zone", a new per-client coordinate system, provided by the compositor and attached to one output, in which it can place its windows. The client can only know window positions relative to the zone the compositor has assigned to it, and a zone can be a defined rectangle with fixed dimensions, or an infinite space without any limits.

A zone can be reshaped by the compositor at any time, but must always include all windows of the client that were assigned to it. Every client may only have one zone, and can share the zone with trusted other processes by sharing its handle with them. That way, any external process that is contributing a window can create its window in relation to the other windows by sharing the same coordinate system (as long as the clients trust each other to exchange tokens).

A zone is a per-client entity, and clients must not assume it reflects any real object like the monitor geometry.
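To make the idea a bit more concrete, here is a minimal sketch of how a client could model a zone on its side. This is not the protocol's API: the `Zone` class, its fields and the `contains` helper are hypothetical names, and the sketch assumes that an open-ended zone is unbounded only in the positive direction from its origin.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Zone:
    """Hypothetical client-side view of a zone; not part of the protocol."""
    output: str            # name of the output the zone is attached to
    width: Optional[int]   # None models an unbounded horizontal extent
    height: Optional[int]  # None models an unbounded vertical extent

    def contains(self, x: int, y: int, w: int, h: int) -> bool:
        """Return True if a w*h window at zone-local (x, y) lies inside the zone."""
        if x < 0 or y < 0:
            return False
        fits_x = self.width is None or x + w <= self.width
        fits_y = self.height is None or y + h <= self.height
        return fits_x and fits_y


# A fixed-size zone and a zone that is open-ended horizontally.
finite = Zone(output="DP-1", width=1920, height=1040)
scrolling = Zone(output="DP-1", width=None, height=1040)

print(finite.contains(100, 100, 800, 600))      # True
print(finite.contains(1500, 100, 800, 600))     # False: sticks out on the right
print(scrolling.contains(5000, 100, 800, 600))  # True: no horizontal limit
```

All coordinates here are zone-local; nothing in the model refers to the monitor geometry, which matches the rule that clients must not assume a zone reflects any real object.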

[ I am not sold on the "zone" name - it's better than the previous name, "workspace", as that term was overloaded with meaning already, but if you have a better idea than "zone", please let me know! "Window group" was another option, but that also already has different specific connotations. ]

So, here is what this could look like with different configurations:

[Figure: multiwin-zones-example]

Simplest Case: In this case, where there is only a single monitor and a stacking window manager, the client would just receive a zone rectangle that encompasses the usable screen area, i.e. any space that is not restricted for usage by windows due to shell elements being in the way. The green rectangle is the zone, with the yellow star being its coordinate origin (0, 0).

Simple Multi-Monitor Case: In this case we have two monitors, which may have different resolutions. Any zone is always attached to one output, so in order for the client to control window placement, it must create a second zone on the second monitor and handle the transition of a window between monitors (see below). A compositor can prevent zones from being created on specific outputs, in which case the client could not move windows to the respective output (and windows can only be moved there manually by the user or by the compositor).

Multiple Client Zones Case: This is an interesting new idea where the compositor could determine that the current desktop is already too cluttered and carve out a smaller box of it for the new application to place its windows in. The application will then try to fill that designated space. If there are multiple multi-window apps, they may get multiple zone rectangles. I expect this to be useful for ultrawide or other extremely large displays (and potentially for tilers as well, although those might simply not want to implement this protocol at all).

Infinite Zone Case: In case we have a scrolling compositor, we may have a finite height but infinite width (or vice versa). In that case, the compositor can communicate that fact to the client by leaving the zone open in one direction. That way, the application can stretch out a bit and use the extra space if it needs it. Of course, such a compositor may also just restrict the zone to 1x the current monitor geometry instead. In every case, the client cannot expect a position request to be followed exactly, so the compositor is allowed to rein in protocol abuse such as an app placing a window insanely far away from the user's current position.
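As a rough illustration of "the compositor has the final say", a compositor could clamp a requested position like this. This is only a sketch of one possible policy: `clamp_axis`, `place` and the `max_far` limit are hypothetical and not defined anywhere in the proposal.

```python
from typing import Optional, Tuple


def clamp_axis(pos: int, size: int, extent: Optional[int], max_far: int) -> int:
    """Clamp one coordinate of a requested position.

    extent is the zone size on this axis, or None if the zone is open-ended.
    max_far is a hypothetical compositor policy limit for open-ended axes.
    """
    upper = max_far if extent is None else max(extent - size, 0)
    return min(max(pos, 0), upper)


def place(request: Tuple[int, int], size: Tuple[int, int],
          zone: Tuple[Optional[int], Optional[int]],
          max_far: int = 10_000) -> Tuple[int, int]:
    """Return the position the compositor actually applies."""
    return (clamp_axis(request[0], size[0], zone[0], max_far),
            clamp_axis(request[1], size[1], zone[1], max_far))


# Zone with infinite width and a finite height of 1040.
print(place((500, 200), (800, 600), (None, 1040)))        # (500, 200)
print(place((9_000_000, 900), (800, 600), (None, 1040)))  # (10000, 440)
```

A sane request passes through unchanged, while the absurd one is pulled back on both axes. A real compositor would probably clamp relative to the user's current viewport rather than to a fixed constant.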

Window movement "edge cases"

[Figure: multiwin-zones-move]

In the first example, the user moves a window out of the current monitor onto a second monitor, but the application still wants to position other windows relative to it / needs it in a zone. For that to work, the following steps take place:

  • The compositor emits a zone_left event to notify the client that a window has left its assigned zone and the zone association is broken
  • The client could ignore this if it does not care about positioning anymore, but in this case it creates a new zone for the respective output via get_zone
  • If it received a valid zone, it associates the just moved window with the zone on the second monitor via get_position. This will also give it the position of the moved window relative to its new zone.
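The three steps above could look roughly like this from the client's side. This is only a toy model, not a real protocol binding: `FakeCompositor`, `Client`, the string handles, and the assumption that the client knows which output the window ended up on are all made up for illustration. Only the names zone_left, get_zone and get_position come from the proposal.

```python
from typing import Dict, Optional, Tuple


class FakeCompositor:
    """Toy stand-in for a compositor; every method signature is an assumption."""

    def __init__(self) -> None:
        # Zone origin of each output in the compositor's private global space.
        self.zone_origin: Dict[str, Tuple[int, int]] = {
            "left": (0, 0), "right": (1920, 0)}
        # Window positions in that private space, never exposed to clients.
        self.window_global: Dict[str, Tuple[int, int]] = {}

    def get_zone(self, output: str) -> Optional[str]:
        """Return a zone handle for the output, or None if refused."""
        return output if output in self.zone_origin else None

    def get_position(self, zone: str, window: str) -> Tuple[int, int]:
        """Associate the window with the zone and report its zone-local position."""
        ox, oy = self.zone_origin[zone]
        gx, gy = self.window_global[window]
        return (gx - ox, gy - oy)


class Client:
    def __init__(self, compositor: FakeCompositor) -> None:
        self.compositor = compositor
        self.window_zone: Dict[str, Optional[str]] = {}
        self.positions: Dict[str, Tuple[int, int]] = {}

    def on_zone_left(self, window: str, new_output: str) -> None:
        """Handle zone_left: the old association is gone, try to get a new one."""
        self.window_zone[window] = None
        zone = self.compositor.get_zone(new_output)
        if zone is None:
            return  # compositor refused a zone on that output
        self.window_zone[window] = zone
        self.positions[window] = self.compositor.get_position(zone, window)


comp = FakeCompositor()
client = Client(comp)
comp.window_global["toolbox"] = (2000, 150)  # user dragged it to the right output
client.on_zone_left("toolbox", "right")
print(client.positions["toolbox"])  # (80, 150)
```

The client only ever sees (80, 150), the position relative to the zone on the second output. The global coordinates stay private to the compositor.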

In the second example a window is moved out of the zone on the same monitor, but the client still wants to know the window's position. In this case, the client can request get_position using its zone on the output and thereby make the compositor extend the zone to once again encompass all of the application's windows. If the zone cannot be extended (if it already hits the top-left window border and the window is moved outside of that boundary by the user), the positions returned by position events in response to get_position might be negative.
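The arithmetic behind this second example can be sketched as follows. This is an assumption about how a compositor might do it, not something the protocol prescribes: `extend_zone`, `local_positions` and the `bounds` rectangle (standing in for the area the compositor is willing to grow the zone into) are hypothetical.

```python
from typing import Dict, Tuple

Rect = Tuple[int, int, int, int]  # x, y, width, height in compositor-private space


def extend_zone(zone: Rect, window: Rect, bounds: Rect) -> Rect:
    """Grow zone to cover window, but never beyond bounds."""
    zx, zy, zw, zh = zone
    wx, wy, ww, wh = window
    bx, by, bw, bh = bounds
    left = max(min(zx, wx), bx)
    top = max(min(zy, wy), by)
    right = min(max(zx + zw, wx + ww), bx + bw)
    bottom = min(max(zy + zh, wy + wh), by + bh)
    return (left, top, right - left, bottom - top)


def local_positions(zone: Rect, windows: Dict[str, Rect]) -> Dict[str, Tuple[int, int]]:
    """Translate private window rectangles into zone-local positions."""
    zx, zy = zone[0], zone[1]
    return {name: (r[0] - zx, r[1] - zy) for name, r in windows.items()}


bounds = (0, 0, 1920, 1080)
zone = (400, 200, 1000, 600)
windows = {"main": (500, 300, 400, 300), "tools": (100, 250, 200, 300)}

# "tools" was dragged out of the zone to the left: the zone grows to cover it.
zone = extend_zone(zone, windows["tools"], bounds)
print(zone)                            # (100, 200, 1300, 600)
print(local_positions(zone, windows))  # {'main': (400, 100), 'tools': (0, 50)}

# Dragged further, past the left bound: the zone cannot follow anymore.
windows["tools"] = (-50, 250, 200, 300)
zone = extend_zone(zone, windows["tools"], bounds)
print(zone)                            # (0, 200, 1400, 600)
print(local_positions(zone, windows))  # {'main': (500, 100), 'tools': (-50, 50)}
```

Note that "main" never moved on screen, yet its zone-local position went from (100, 100) in the original zone to (400, 100) and then (500, 100), because the zone origin moved left each time. Once the zone hits the bound, "tools" ends up with a negative x coordinate.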

Advantages

The advantages of this protocol are:

  • No global coordinate system
  • Multi-process GUI applications can easily cooperate in window placement across their processes
  • No limitations on the window layouts applications can come up with (Clients can easily construct their initial window layout and do not require the compositor to make assumptions about it)
  • Relatively easy to port existing applications to the new protocol from Windows/macOS/X11
  • Context for compositors: They now know the explicit layout of windows a client has created, and that these windows belong together, so they can decide to e.g. allow the user to move them as a cluster between virtual zones, or represent them as one in a tiling WM and potentially only expand them if selected.
  • Clients have a better idea about the usable space available to them and might make much better placement decisions than on X11
  • Still, compositors have the final say about window placement and application hints are only strong recommendations

Disadvantages

  • The protocol is a lot more complex now, and the compositor will have to juggle more coordinate systems
  • If the user moves a window outside of the compositor's selected zone bounds, the compositor needs to adapt the zone size, so the client does not receive invalid window locations and so the zone still encompasses all the client's windows. This means every single window in the zone changes its position (from the client's POV) if the window was moved out of the left or top side of the zone, which is a bit messy. But applications should be able to handle this, as they already have to deal with similar cases on X11.
  • Within a zone, the compositor still does not have a lot of context of what the windows actually do - but I do not think there is any viable scalable solution to convey that information.

Open Questions

There are a few things that I was wondering about, where feedback would be helpful:

  • What if a zone spans two monitors with different scaling factors? How should the coordinate system be organized in that case? (This has to have some solution on Wayland for windows already, and the compositor could always include some smart logic to morph coordinates appropriately, but we should probably mention this in the protocol if there is a good solution for it)
  • Should there be a request to explicitly add a window to a zone, or is the current implicit method acceptable?
  • Should we emit a position event for all windows as soon as a zone geometry has changed? Or should we just wait for the app to request new positions? Most apps do not move windows around after the initial setup, so most would likely not care about that change.
  • Is a rectangular zone enough? I could make it a polygon, but that would IMHO be insane overkill for currently nonexistent future compositors. What happens with more complex multi-monitor geometries (like L-shapes, etc)? Would that lead to bad window placement?
  • Should we let the client request some properties for its new zone, e.g. "must not be spread over multiple monitors" or "should be at least WxH big"?

...and there is probably a lot more that I haven't thought of yet ;-)

Please let me know what you think! I am aware that this protocol will be just as controversial as the other one, but I do think this one would be a better compromise than the previous relative alignment approach.

Edited by Matthias Klumpp