A security context implementation for authenticating and/or authorizing clients
There are Wayland protocol extensions that need to be privileged: available to only very specific clients that are known to be trusted. The ability to take screenshots is an example of such privilege.
There are a couple of use cases for reliably identifying a client:
authorizing a set of clients for privileged actions at system installation time, so that they will just work
application-specific permissions model (e.g. Android), where applications can ask for permissions for privileged actions. The granted permissions must be stored somewhere, and the returning client must be reliably identified as the same one that was previously given permission.
Traditionally client authentication has been side-stepped completely by pre-creating an already authorized client connection and passing that to the client process via the WAYLAND_SOCKET environment variable. Weston implements this for a few special cases. The disadvantage of this is that the client process must be prepared and spawned by the compositor directly.
Another possible method is inspecting the client process' executable path, and using that as the client's identity. This involves fetching the process ID from the unix socket connection, and then looking at e.g. /proc/$pid/exe. This approach has several disadvantages:
The process ID from the socket is the process that created the connection, which is not necessarily the client actually running.
Inspecting /proc/$pid/exe is said to be easily spoofable.
The executable file must be immutable by the user, otherwise a malicious program running as the user can simply replace the executable. (Applies to WAYLAND_SOCKET method as well.)
Attempting to isolate a unix user essentially from itself may be an ill-defined problem to begin with. Here we should concentrate on the Wayland connection only, and assume that e.g. ptracing other clients or even the compositor is already made impossible.
One idea I have heard is that all untrusted applications should be executed inside jails or containers, where the application identity is tied to the specific container. Everything outside of containers would be implicitly trusted. Clients from containers could be identified by e.g. coming through a specific per-container Wayland listening socket, or through a Wayland proxy.
Another idea I have heard is inspecting the security labels from the Wayland connection file descriptor. I wonder if that is something that could have an implementation in Weston upstream. Libwayland-server already exposes the fd (wl_client_get_fd()).
If the implementation must be specific to a certain kernel security framework, the framework should be one that is easily usable in major distributions. If the method lends itself to containers as well, all the better. Requiring the application executable to be immutable to the user is acceptable.
The goal is that when an application is installed from a distribution's package manager, it will be reliably identifiable as a Wayland client: no other program can successfully claim to be that application without gaining elevated privileges in the operating system.
What options are there, and how would they be used?
Security can be addressed reasonably well by locking down access to secured globals. Any deeper than that is going to be hard to figure out and probably unnecessary.
I've mostly settled on the WAYLAND_SOCKET-based approach, though I'm also interested in using the file descriptor if it can be done portably.
I'd like to standardize a way of declaring required permissions at install time, so that distros can package pre-sandboxed applications (e.g. in a chroot) with a permissions manifest installed to a trusted location, so that an application which requires elevated permissions can be configured by installing the package - the assumption being that packages from the repositories are trusted and by installing them the user consented. Leave it up to the distros to extend their package managers to explicitly clarify certain permissions and such.
I want to address this entirely with Wayland protocols or files in a standard format on disk - not with dbus. This is a dealbreaker from the wlroots perspective, and we'll end up making a competing standard if we can't agree on this.
We needn't solve every problem now, so long as we don't paint ourselves into a corner.
(1) A mechanism for authentication of clients (ie: is this client the program it says it is)
(2) A mechanism for authorisation of clients (ie: should this client be able to do X)
(3) A mechanism for distributing authorisation policy (ie: this program should be allowed to do X)
I think (1) is just a matter for each compositor; there's no platform-independent way of doing this - you'll need to support AppArmor on Ubuntu & derivatives, SELinux on Fedora and derivatives, SMACK on SuSEishes. There's presumably some way to do this on *BSD, but they're going to be even more different. Flatpak and Snap have two different mechanisms, and so on.
I guess in an ideal world this would be done as a library that compositors can use - possibly even folded into libwayland, if it turns out shells have a sufficiently common idea of what “what program is this client” means, but for a first cut just doing it in Weston is fine. This doesn't need discussion outside of Weston developers.
(2) has some sub-components:
dynamic permissions (ie: a protocol extension for a client to request a privileged global); this obviously needs to go through wayland-protocols
a libweston-side API for compositors to deal with authorisation
The former obviously requires discussion in a different venue, the latter is purely Weston-internal.
(3) seems pretty out of scope for the Weston project. The xdg mailing list is probably the best place for this? It's not quite dead!
I agree with @RAOF's categorization here. When I wrote this issue, I was primarily interested in point 1 and even just a subset of it: how to identify a returning client as the same app as before, not necessarily to even guarantee the client is who it says it is. I would include point 2 only if it is somehow directly resulting from point 1, but it might be a good idea to keep the two separate.
Everything else mentioned is necessary too, but not the focus of this issue. Once a client can be identified, everything else on top of that will be fairly straightforward, I believe.
This is specifically a Weston issue to investigate the problem space, not seeking a common solution over all Wayland. Yet.
Another possible method is inspecting the client process' executable path, and using that as the client's identity. This involves fetching the process ID from the unix socket connection, and then looking at e.g. /proc/$pid/exe.
There is an implementation of a compositor checking client credentials via SO_PEERCRED (as used by wl_client_get_credentials), a simple client connecting to it, and a malicious client that tricks the compositor into identifying it as the regular client.
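For illustration, the credential check being discussed can be sketched roughly as follows (a minimal sketch, not the implementation referenced above; the function name is mine and error handling is trimmed):

```c
#define _GNU_SOURCE /* for struct ucred */
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

/* Sketch: resolve a connected unix-socket peer to an executable path.
 * The pid comes from SO_PEERCRED (what wl_client_get_credentials uses);
 * the readlink() is the racy part: the pid may already be reused. */
int peer_exe_path(int sock_fd, char *buf, size_t len)
{
    struct ucred cred;
    socklen_t cred_len = sizeof(cred);
    if (getsockopt(sock_fd, SOL_SOCKET, SO_PEERCRED, &cred, &cred_len) < 0)
        return -1;

    char link[64];
    snprintf(link, sizeof(link), "/proc/%d/exe", cred.pid);
    ssize_t n = readlink(link, buf, len - 1);
    if (n < 0)
        return -1;
    buf[n] = '\0';
    return 0;
}
```

The readlink() step is precisely where the PID re-use attack bites: between connect() and the lookup, the pid can come to refer to a different executable.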
The malicious client can easily be modified to perform a PID re-use attack. This works pretty well due to the predictable PID allocation and the low maximum PID value:
> cat /proc/sys/kernel/pid_max
32768
SO_PEERCRED gives the credentials at socket or connect time. Note that there is another mechanism called SCM_CREDENTIALS which is similar but gives the credentials at sendmsg time. This one is also vulnerable to races and PID re-use.
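For reference, reading SCM_CREDENTIALS looks roughly like this (a sketch assuming Linux, with SO_PASSCRED enabled on the receiving socket so the kernel attaches credentials to each message; the helper name is mine):

```c
#define _GNU_SOURCE /* for struct ucred, SCM_CREDENTIALS */
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Sketch: receive one byte plus the sender's SCM_CREDENTIALS ancillary data.
 * Requires setsockopt(sock_fd, SOL_SOCKET, SO_PASSCRED, ...) beforehand.
 * Like SO_PEERCRED, the returned pid is not a security token. */
int recv_peer_cred(int sock_fd, struct ucred *out)
{
    char data;
    union {
        char buf[CMSG_SPACE(sizeof(struct ucred))];
        struct cmsghdr align;
    } ctrl;
    struct iovec iov = { .iov_base = &data, .iov_len = 1 };
    struct msghdr msg = {
        .msg_iov = &iov,
        .msg_iovlen = 1,
        .msg_control = ctrl.buf,
        .msg_controllen = sizeof(ctrl.buf),
    };
    if (recvmsg(sock_fd, &msg, 0) < 0)
        return -1;

    for (struct cmsghdr *c = CMSG_FIRSTHDR(&msg); c; c = CMSG_NXTHDR(&msg, c)) {
        if (c->cmsg_level == SOL_SOCKET && c->cmsg_type == SCM_CREDENTIALS) {
            memcpy(out, CMSG_DATA(c), sizeof(*out));
            return 0;
        }
    }
    return -1;
}
```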
One idea I have heard is that all untrusted applications should be executed inside jails or containers
This requires the compositor (either the real compositor or a proxy) to be aware of these jails/containers. Privileged globals are exposed to clients outside the jail, and filtered for clients inside jails. The main question remains: who is responsible for setting up the jail/container and for deciding its Wayland permissions?
If this is the real compositor, then this is pretty similar to the WAYLAND_SOCKET way of doing things.
If this is a separate process (ala Flatpak) then:
Either this process should implement a Wayland proxy and do access control on its own
Or this process should have a way to manage permissions for a Wayland socket, e.g. via a protocol. The protocol could for instance provide a request to create a new Wayland socket with only a set of globals exposed.
Thinking about all of this, I've been wondering if there was another way to make wl_client_get_credentials work. The reason why it fails is that PIDs are not security tokens. However we still have UID and GID.
This approach assumes all clients have restricted access to globals, and that only clients that are authenticated gain access to more globals.
One way would be to have one user/group per privileged app, and one suid binary per app which opens the Wayland connection. For instance /usr/bin/wl-screenshooter could be an executable owned by a wl-screenshooter user with the suid bit set. The compositor could then call wl_client_get_credentials on the socket, notice the connection is owned by wl-screenshooter, and expose more globals. This is not great because it requires a lot of users, and each client needs a special setup.
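The compositor-side check for this scheme could be sketched as follows (hypothetical; the uid of the dedicated wl-screenshooter account would be looked up once, e.g. with getpwnam()):

```c
#define _GNU_SOURCE /* for struct ucred */
#include <sys/socket.h>
#include <sys/types.h>

/* Sketch: accept a connection as the privileged app only if the peer's uid
 * matches the app's dedicated account. Unlike the pid, the uid reported by
 * SO_PEERCRED is an actual security boundary enforced by the kernel. */
int peer_has_uid(int sock_fd, uid_t expected)
{
    struct ucred cred;
    socklen_t len = sizeof(cred);
    if (getsockopt(sock_fd, SOL_SOCKET, SO_PEERCRED, &cred, &len) < 0)
        return 0;
    return cred.uid == expected;
}
```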
So instead, we could have a single suid helper that opens privileged connections. Disclaimer: it sounds like a hack. Let's call it /usr/bin/wl-privileged.
/usr/bin/wl-privileged is a suid executable owned by a wl-privileged user. It takes an executable name as an argument.
When a regular client that needs additional permissions is started, it exec's /usr/bin/wl-privileged, passing its own executable name as the argument. For instance, /usr/bin/wl-screenshooter would exec /usr/bin/wl-privileged /usr/bin/wl-screenshooter.
/usr/bin/wl-privileged gets started as user wl-privileged. /usr/bin/wl-privileged opens a Wayland connection and sets it up for the executable passed as an argument. The compositor can identify the /usr/bin/wl-privileged client because its socket has UID set to wl-privileged. Setting up the new connection could be done with the protocol described in the previous section (with a "request to create a new Wayland socket with a set of globals exposed"). In our case the new Wayland socket has all unprivileged globals plus the screenshooter global.
Then /usr/bin/wl-privileged sets WAYLAND_SOCKET and exec's the executable name passed as an argument:
execl(argv[1], argv[1], (char *)NULL);
At this point /usr/bin/wl-screenshooter gets spawned again, with an existing privileged Wayland socket.
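The hand-off step of this hypothetical helper might look like the following sketch (the function name is mine; the earlier negotiation of the restricted global set is assumed to have produced the connected fd):

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical sketch of the last step of /usr/bin/wl-privileged: once the
 * restricted connection to the compositor exists, make its fd survive exec
 * and advertise it through WAYLAND_SOCKET; this is the same hand-off a
 * fork-and-exec'ing compositor performs for clients it spawns itself. */
int export_wayland_socket(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags < 0 || fcntl(fd, F_SETFD, flags & ~FD_CLOEXEC) < 0)
        return -1; /* the fd must not be closed across exec */

    char buf[16];
    snprintf(buf, sizeof(buf), "%d", fd);
    return setenv("WAYLAND_SOCKET", buf, 1);
}
```

After this, the helper exec's the target, and libwayland-client picks the connection up from WAYLAND_SOCKET.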
Android uses per-app UIDs. I suspect it might not be feasible for the desktop though.
If you have SUID apps or wrapper, how do you make sure they cannot be launched or abused by a malicious app, but they can be launched by whoever is supposed to use them? Where shall the security policy live?
Btw. I think I've heard about "process fd" which might be a token securely referencing a process. Maybe that could be something interesting?
If you have SUID apps or wrapper, how do you make sure they cannot be launched or abused by a malicious app, but they can be launched by whoever is supposed to use them?
Because the SUID helper re-executes the binary. So the SUID helper knows exactly which client is going to be started (just like the compositor when it forks-and-exec's).
The idea is that the SUID helper can provide the same security level as a fork-and-exec'ing compositor, while preserving the process context (aka. "where the process is started": environment variables, umask, inside a service manager, inside a jail/container, etc).
Where shall the security policy live?
These are @RAOF's questions (2) and (3). The decision can be made either by the helper or by the compositor. The first mechanism I'm targeting is a collection of policy files living in /etc that describe which protocols each privileged client has access to.
I think I've heard about "process fd" which might be a token securely referencing a process. Maybe that could be something interesting?
Indeed, that would be interesting. I wonder how process FDs work with e.g. exec. I can't find anything online though, do you have links?
No, I have no information about "process fd", I just thought I saw it mentioned somewhere.
Evaluating your proposal would take a lot more time than I can spare.
I just hope you would leverage https://github.com/mupuf/libwsm as much as you can. I think they spent a lot of time designing everything else, assuming reliable client identification already worked.
Evaluating your proposal would take a lot more time than I can spare.
Yeah, no worries.
I just hope you would leverage https://github.com/mupuf/libwsm as much as you can. I think they spent a lot of time designing everything else, assuming reliable client identification already worked.
(As a side note, they use the unreliable /proc/<pid>/exe method)
The bad news is that I think this proposal is too complicated and the integration with multiple backends is unnecessary. The good news is that I'm already discussing with @mupuf and we'll work together on this.
It doesn't seem like pidfds would solve the issue, as the call to pidfd_open is still racy, and the goal is to support processes that the compositor doesn't fork/clone.
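For completeness, obtaining a pidfd looks like this (a sketch using the raw syscall, available on Linux 5.3+; note the caveat above still applies when the pid comes from SO_PEERCRED):

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <unistd.h>

/* Sketch: obtain a pidfd (Linux >= 5.3). A pidfd is a stable handle to a
 * process, so holding one avoids pid re-use from that point on; the race is
 * in going from a numeric pid (e.g. from SO_PEERCRED) to the pidfd, because
 * that pid may already belong to a different process. */
int open_pidfd(pid_t pid)
{
    return (int)syscall(SYS_pidfd_open, pid, 0);
}
```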
What if the client itself, maybe even without any wrappers, were to send its own pidfd to the compositor for authentication? Would it be possible for the compositor to verify that the client indeed sent its own pidfd and is not referring to some other process?
The malicious client could do this in a child process and then exec a privileged client. The compositor would then read /proc/<pid>/exe and grant elevated privileges to the malicious connection.
So the malicious client would keep the Wayland connection fd open, exec a legit client, and then race with the legit client, trying to win the race where the compositor first checks the executable and authorizes, and then the malicious program submits Wayland requests before the legit client submits its first request. I see.
Maybe that could be prevented by a little bit of hand-shake:
client sends pidfd
compositor checks pidfd and sends an authorized cookie
the client sends the cookie back
compositor authorizes the connection
Would be really nice if we could leverage SCM_CREDENTIALS too...
Interpreted programs are really hard to work with. Even SELinux is struggling with it, although it should be doable to do a domain transition upon opening certain files. Not easy to write for sure!
We have been chatting with Simon over a ton of tea and we are gravitating around the idea that the kernel should store all the information related to the client process at the time connect() is called. This information would be stored in the peer cred struct.
We are now trying to figure out what should be there without having to duplicate all the information about the process there... With LSM, we could use the process' ID at the time of the call, but without it I guess we would be stuck with just the path to the executable or an fd to the executable.
Maybe it is time for a simple LSM module that would just provide a reliable identification for processes?
then the malicious program submits Wayland requests before the legit client submits its first request.
No need to win a race with the legit client to submit the first request. The legit client doesn't need to know about the Wayland connection opened by the malicious one. The malicious client could even close it before exec.
Maybe that could be prevented by a little bit of hand-shake
That doesn't help:
Malicious client creates a Wayland connection and forks: there's now a parent process and a child process. Both have a FD to the connection.
Malicious client creates pidfd in a child process, sends it and exec's a privileged process
Compositor checks pidfd: it's the child process' (which is now a privileged executable), so the check succeeds. It sends a cookie.
The malicious client still has a handle to the Wayland connection FD in the parent process, gets the cookie, replies back.
Compositor authorizes malicious connection
Would be really nice if we could leverage SCM_CREDENTIALS too...
SCM_CREDENTIALS doesn't really solve anything. It's still racy.
Btw. any method that relies on the executable path is not going to work with interpreted programs.
Correction: any method that relies on /proc/<pid>/exe is not going to work with interpreted programs.
Making the compositor or a helper process exec a path is still going to work properly.
For what it's worth the “privileged client” idea is roughly how Unity8 did(/does) it, although rather more draconian (all applications needed to be launched by the ubuntu-app-launch daemon, which then vouched for display server connections)
I believe focusing on the case where every client which is not from a container (flatpak, snap, …) has full privileges might be easier, and arguably more important, than integration with LSMs. The reason is that this is a solved problem for dbus (https://github.com/flatpak/xdg-dbus-proxy). The whole filtering part is something we might or might not want, but the interesting bit is that the proxy sets the credentials for the connection, which solves the authentication problem (not too different from @emersion's idea and the ubuntu-app-launch daemon @RAOF mentioned).
With that in place another question is if we want a protocol to request globals or if we simply only ever expose globals to the containerized apps which are container-aware (i.e. don't expose sensitive information without requesting it first and make all sensitive requests fail-able when unauthorized).
I've discussed a little bit more with @mupuf about using containers/jails vs. ad-hoc filtering.
10:02 <jadahl> security can be outsourced to sandboxing mechanisms, leaving the wayland protocol out of it
10:03 <jadahl> without sandboxes, security is just for show anyway
10:04 <mvlad> yeah, pretty much reached the same conclusion
That makes sense. However the fact that by default everything is exposed is a little bit annoying. So we'd like to keep exploring the "no sandbox" approach. It could be useful on a hardened system (where e.g. LD_* env variables are not implemented and only root-owned executables can run).
That said, this approach would likely require new kernel API to replace the hacks I've described earlier. Also, if a CLI client has access to a privileged protocol, other executables could just invoke it instead of accessing the Wayland interface directly (e.g. a malicious executable could invoke a CLI screenshooting tool to capture the screen). One other thing is that such a hardened system probably won't be the default anytime soon.
Assuming a containers/jails approach, we could make the sandboxing mechanism create a Wayland socket inside the sandbox and proxy/forward it to the outer Wayland compositor. If possible, an approach that doesn't involve a real proxy is preferred.
I'll first expand on the unveil(2)-like protocol I've suggested above. The idea is to restrict the set of globals accessible on a Wayland connection. The sandboxing mechanism could accept connections from the inner Wayland socket, send the connection FDs to the outer Wayland compositor and restrict the set of globals accessible. Very rough protocol draft:
<protocol name="unveil">
  <interface name="unveil_manager">
    <request name="create_client_from_fd">
      <description>
        Create a new unveil_client given an opened socket file descriptor.
      </description>
      <arg name="id" type="new_id" interface="unveil_client"/>
      <arg name="socket" type="fd"/>
    </request>
  </interface>
  <interface name="unveil_client">
    <description>
      Unveil clients are clients which only have access to a subset of all
      available globals.
    </description>
    <request name="unveil">
      <description>
        Opt-in for access to a global.
      </description>
      <arg name="interface" type="string"/>
    </request>
    <request name="commit">
      <description>
        This request finalizes the unveil. The compositor should call
        wl_client_create and make sure all globals that haven't been opted-in
        are filtered out.
      </description>
    </request>
  </interface>
</protocol>
An alternative would be to move the policy inside the compositor, with a protocol similar to:
<protocol name="sandbox">
  <interface name="sandbox">
    <request name="create_client">
      <description>
        Create a new sandbox_client given an opened socket file descriptor.
      </description>
      <arg name="id" type="new_id" interface="sandbox_client"/>
      <arg name="socket" type="fd"/>
    </request>
  </interface>
  <interface name="sandbox_client">
    <description>
      The compositor might decide to restrict the globals to which a sandboxed
      client has access.
    </description>
    <request name="set_executable_path">
      <description>
        TODO: may replace with something else, e.g. app_id.
      </description>
      <arg name="path" type="string"/>
    </request>
    <request name="commit">
      <description>
        This request finalizes the sandbox client. The compositor should call
        wl_client_create and may filter out some globals for this connection.
      </description>
    </request>
  </interface>
</protocol>
I think I prefer the first approach, because the policy can be chosen by the sandboxing mechanism. However the first proposal only allows for static policies, while the second proposal allows for both static and dynamic policies.
Note: by "static policy" I mean that the permissions of a given client are decided at connect time and are frozen. By "dynamic policy" I mean that the permissions of a given client can change over time, e.g. when the user acknowledges a permission dialog.
With that in place another question is if we want a protocol to request globals or if we simply only ever expose globals to the containerized apps which are container-aware (i.e. don't expose sensitive information without requesting it first and make all sensitive requests fail-able when unauthorized).
That's a good question. For wlroots privileged protocols we made sure to always have a way to reject client requests, although the client may not be able to figure out that access has been denied (e.g. it would just receive a generic failure event).
If a compositor wants dynamic policies I think the globals filtering approach won't work well. For instance dynamically revoking access to a global is very tricky.
I also think the first approach is better because I also believe that we should make protocols container-aware (for the lack of a better word) so that filtering is only required for the few protocols which are not container-aware (yet).
The compositor probably should not remove all globals by default and then let the client unveil each and every one that's supposed to be visible in the container, since that would require the container solution to keep up with new protocols. Rather, the compositor should unveil all unprivileged and container-aware protocols to start with.
Another thing the protocol should probably do is authenticate the new client by setting the identity triple (container technology, technology-specific app ID, container ID).
I also think the first approach is better because I also believe that we should make protocols container-aware (for the lack of a better word) so that filtering is only required for the few protocols which are not container-aware (yet).
Hmm, I think there's some misunderstanding here. I'm not suggesting the unveil protocol as a temporary solution -- but as a final solution. In the context of software repositories where each package is reviewed by a maintainer (read: distribution repos), this seems like a good fit. In the context of software repositories where anyone can push an app (read: Flathub and friends, see also: Android), this is a bad fit.
let the compositor unveil all unprivileged and container-aware protocols to start with
The point of the unveil protocol is to have this policy outside of the compositor.
If the compositor begins to do policy decisions, the sandbox protocol is probably a better idea.
Another thing the protocol should probably do is authenticate the new client by setting the identity triple (container technology, technology-specific app ID, container ID).
What is the use-case for this? FWIW Mutter only extracts the app_id from the Flatpak metadata.
Seems like it. You were talking about inside and outside of sandboxing mechanisms which is why I assumed flatpak, snap etc.
I'm not suggesting the unveil protocol as a temporary solution -- but as a final solution
In your solution, the only security policy is exposing or hiding the global, and once it is exposed it just works without further policy decisions?
The point of the unveil protocol is to have this policy outside of the compositor.
Mh, I don't think I've seen a rationale for why the compositor should not be in charge of policy decisions, especially because it can outsource them to another process if it wants to, and it has more information related to the protocols than any other process.
If the compositor begins to do policy decisions, the sandbox protocol is probably a better idea.
Which sandbox idea are you talking about? This thread really gets confusing (and I'm at fault for it, too).
What is the use-case for this? FWIW Mutter only extracts the app_id from the Flatpak metadata.
The container technology is required since the app id of e.g. a flatpak app and a snap app could be the same. The container id is required since you can run multiple instances of an app possibly with different versions.
In your solution, the only security policy is exposing or hiding the global, and once it is exposed it just works without further policy decisions?
Yes. This is what I call "static policy".
Mh, I don't think I've seen a rationale for why the compositor should not be in charge of policy decisions, especially because it can outsource them to another process if it wants to, and it has more information related to the protocols than any other process.
Not having the policy in the compositor allows the policy to live in the sandboxing mechanism (e.g. Flatpak). This has several advantages:
The sandboxing mechanism already makes policy decisions (which files does the app have access to? which IPC mechanisms? and so on).
The compositor would need to support all sandboxing mechanisms if it makes policy decisions. Compositors will need to add support for Flatpak, snap, FreeBSD jails, and so on.
Which sandbox idea are you talking about? This thread really gets confusing (and I'm at fault for it, too).
The "sandboxing protocol" is the second protocol outline I've posted in the parent comment.
Yeah, the whole discussion is a little bit hard to follow.
The container technology is required since the app id of e.g. a flatpak app and a snap app could be the same. The container id is required since you can run multiple instances of an app possibly with different versions.
As said above, this requires the compositor to support each of those sandboxing mechanisms.
The sandboxing mechanism already makes policy decisions (which files does the app have access to? which IPC mechanisms? and so on).
I can only talk about flatpak and yes, flatpak does make policy decisions but that's mostly for backwards compatibility where there is no sandbox-aware portal solution yet. Those portals are in charge of actual security policy then and ideally flatpak will get rid of what you describe as static policy.
Adding more static policy where we could have dynamic policy is a step backwards in my opinion.
The compositor would need to support all sandboxing mechanisms if it makes policy decisions. Compositors will need to add support for Flatpak, snap, FreeBSD jails, and so on.
I don't think that's true. Each sandboxing mechanism would only be responsible for authenticating a connection and the compositor doesn't have to care which mechanism authenticated the connection.
All of that obviously assumes that the compositor only exposes protocols which are unprivileged or can handle authorization failure.
I can only talk about flatpak and yes, flatpak does make policy decisions but that's mostly for backwards compatibility where there is no sandbox-aware portal solution yet. Those portals are in charge of actual security policy then and ideally flatpak will get rid of what you describe as static policy.
As I've understood, Flatpak also does policy decisions via xdg-desktop-portal, and this part won't go away.
Adding more static policy where we could have dynamic policy is a step backwards in my opinion.
Depends what your goal is. For Flatpak, sure, a dynamic policy is better.
Each sandboxing mechanism would only be responsible for authenticating a connection and the compositor doesn't have to care which mechanism authenticated the connection.
Then why do you need to tell the compositor about the sandboxing mechanism at all?
As I've understood, Flatpak also does policy decisions via xdg-desktop-portal, and this part won't go away.
Just checked again and you're right, I somehow was under the impression that the backend was in charge of that but that's wrong.
Then why do you need to tell the compositor about the sandboxing mechanism at all?
It's part of the identity of the container. If you define the identity as the application id and you have two sandboxing mechanisms, each having an application with the same application id, the identity is ambiguous.
I think both proposed sandboxing interfaces are insufficient. In particular, because they both apply only before connection setup, they don't allow requesting privileges at the time of use - you can't have an “AppFoo requests permission to capture the screen: Allow, Allow this time, Reject” dialog pop up when the user presses the “take screenshot” button; you either have permission at application startup or you never will.
This is not a huge problem for a dedicated screenshot tool, because no one is going to start one of those up without wanting to take a screenshot. It's a bit more of a problem for, say, the GIMP, where the vast majority of its functionality does not require screenshot privileges, and many users could complete their workflow without ever touching the code that requires screenshotting.
Android has been progressively walking back from its initial “app must request all permissions it might possibly use at install time” model because it's not a very good model; we should not bake that model into our compositors.
Supporting this necessarily requires the compositor to be able to query the sandbox at runtime, so how about:
<protocol name="sandbox">
  <interface name="sandbox">
    <request name="create_client">
      <description>
        Connect a sandboxed Wayland client and associate it with a
        sandbox_client.
      </description>
      <arg name="id" type="new_id" interface="sandbox_client"/>
      <arg name="socket" type="fd"/>
    </request>
  </interface>
  <interface name="sandbox_client">
    <description>
      The sandbox_client interface allows the compositor to query the sandbox
      whether the client should have access to specific interfaces.
    </description>
    <event name="access_request">
      <description>
        The sandboxed client is requesting access to a privileged global.
      </description>
      <arg name="serial" type="uint32_t"/>
      <arg name="requested_global" type="string"/>
    </event>
    <request name="access_grant">
      <description>
        Tell the compositor to allow this client access to the specified
        global interface.
      </description>
      <arg name="serial" type="uint32_t" allow-none="true"/>
      <!-- I don't think we support allow-none for non-pointer-types? -->
      <arg name="requested_global" type="string"/>
    </request>
    <!-- And a destructor, quit event, etc -->
  </interface>
</protocol>
With a cut down unveil interface, but used by clients to request that globals be published:
```xml
<protocol name="unveil">
  <interface name="unveil_manager">
    <request name="unveil">
      <description>
        Request access to a global.
      </description>
      <arg name="interface" type="string"/>
    </request>
    <!-- Probably a good idea to have an event for a denial, etc. -->
  </interface>
</protocol>
```
There's obviously the bootstrapping problem that we've now got a privileged sandbox interface, but assuming that we're only trying to sandbox clients which are otherwise sandboxed it should be fine.
> I think both proposed sandboxing interfaces are insufficient. Particularly: because they both apply only before connection setup they don't allow requesting privileges at the time of use - you can't have an “AppFoo requests permission to capture the screen: Allow, Allow this time, Reject” dialog pop up when the user presses the “take screenshot” button; you either have permission at application startup or you never will.
So, let's reply per protocol:
For the first one, the unveil protocol: this is intentional, it's a static policy.
For the second one, the sandbox protocol: it does support dynamic policies (i.e. Android-style authorization dialogs). All the sandbox protocol does is attach app identification to the connection. The idea is that all privileged globals are exposed, but since those always have a way to gracefully deny access (with a "failed" event, for instance), the compositor can still apply its policy. For instance, with the screenshot protocol, the first time the app takes a screenshot the compositor can pop up a dialog, and send "failed" if the request is denied.
> Android has been progressively walking back from its initial “app must request all permissions it might possibly use at install time” model because it's not a very good model; we should not bake that model into our compositors.
As said above, I still believe that in the context of Linux distribution repositories, static policies are fine. Dynamic policies are required when untrusted publishers can upload their apps to a "store" (Flathub).
> Supporting this necessarily requires the compositor to be able to query the sandbox at runtime
I'm not a fan of this. Dynamically creating and destroying globals is going to be painful. We need a way to revoke access too, without crashing the client because of a race.
Revoking access is important for "allow by default, but allow a quick undo" policies. For instance, a compositor might allow clients to e.g. grab the keyboard by default, but could show a button (and expose a special keybinding) to let the user deny the request. This is one of @mupuf's findings: authorization dialogs aren't always a good idea.
> With a cut down unveil interface, but used by clients to request that globals be published:
I'm clearly a bit confused about the proposals, as I thought (a) everyone agreed that attempting to enforce policy for applications not in a sandbox was a hopeless task, and (b) that we wanted the sandbox to be in charge of policy.
> I'm clearly a bit confused about the proposals, as I thought (a) everyone agreed that attempting to enforce policy for applications not in a sandbox was a hopeless task
My proposals don't try to enforce the policy for apps outside of the sandbox. It's just that there isn't just Flatpak and snap. Sandboxed apps could be distributed via your distribution's package manager, too.
> and (b) that we wanted the sandbox to be in charge of policy.
Well, I agree with that. However as I've understood it, @swick has a different opinion.
Also note that we already have a privileged protocol in wayland-protocols: keyboard-shortcuts-inhibit. Mutter already takes policy decisions for it (currently it opens an authorization dialog).
To be honest I'm not sure the mailing list archives will show the discussion in a more readable way. But maybe everything we're talking about here is irrelevant to the original issue posted by @pq.
> To be honest I'm not sure the mailing list archives will show the discussion in a more readable way. But maybe everything we're talking about here is irrelevant to the original issue posted by @pq.
Maybe a new GitLab issue? I concur that the mailing list is not an excellent forum for this sort of discussion.
- Is per-protocol/global granularity enough, or do we need something more precise?
  - If we need something more precise, we need to come up with a standard set of descriptors for the things to allow/deny.
- Do we want to enforce rules only inside sandboxes, or do we want to enforce rules for all clients?
  - It's difficult to enforce rules outside of sandboxes, because of races with PIDs (a new kernel API is needed to avoid these).
- Is a static policy enough, or do we want to allow a dynamic policy? (i.e. the user can allow screenshots via a dialog)
- Should the compositor decide on the policy, or should an external program decide?
  - If we want the compositor to decide on the policy, it will need to know details about the client (executable path? root FS? OS-specific info like Linux namespaces?)
Once these are answered, I think it won't be complicated to come up with a plan.
Unless we want to rely on OS-specific security mechanisms (selinux labels via SO_PASSSEC, the possible future pidfd support in Linux), we need to have a mechanism to identify clients at the point the client socket is created. The accept() return in the server is too late to evaluate properties of the PID that opened the socket, and the properties universally maintained by the OS (UID and GID) are not sufficient to make useful decisions about graphical clients outside environments like Android where UID is significant. That means we need something else like the sandbox socket submission protocol.
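To make the timing problem concrete, here is a minimal Python sketch (names hypothetical) of what accept()-time inspection actually yields on Linux: SO_PEERCRED hands the server a pid/uid/gid triple, but the pid may refer to a process that has since exec'd or exited, so only the UID/GID are safe to base decisions on.

```python
import os
import socket
import struct

def peer_credentials(conn):
    """Return the peer's (pid, uid, gid) via SO_PEERCRED (Linux-specific).

    This is what accept()-time inspection gives us: by the time the
    compositor acts on the pid, the process may already have exec'd or
    exited, so the pid alone is not a trustworthy identity.
    """
    fmt = "3i"  # struct ucred: pid_t, uid_t, gid_t
    data = conn.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED,
                           struct.calcsize(fmt))
    return struct.unpack(fmt, data)

# A socketpair stands in for an accepted compositor/client connection.
server_end, client_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
pid, uid, gid = peer_credentials(server_end)
```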
Enforcing rules outside of sandboxes is possible to do securely, you just can't base the policy on PIDs or anything derived from them. Launching processes directly from the compositor is one useful method, and there are other OS-specific methods (security context, systems that don't run user-provided executables).
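The compositor-launch method can be sketched in Python using a socketpair and the WAYLAND_SOCKET convention described at the top of this issue; the inline child program here is a stand-in for a real client. Because the compositor creates the connection itself, the peer's identity is known by construction and no PID-based lookup is needed.

```python
import os
import socket
import subprocess
import sys

# The compositor pre-creates the connection, so the peer's identity is
# established by construction rather than looked up after the fact.
compositor_end, client_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

# Stand-in "client": reads WAYLAND_SOCKET, wraps the fd, says hello.
child = subprocess.Popen(
    [sys.executable, "-c",
     "import os, socket; "
     "s = socket.socket(fileno=int(os.environ['WAYLAND_SOCKET'])); "
     "s.sendall(b'hello')"],
    env={**os.environ, "WAYLAND_SOCKET": str(client_end.fileno())},
    pass_fds=(client_end.fileno(),),
)
client_end.close()  # the child owns its copy of the fd now
greeting = compositor_end.recv(5)
child.wait()
```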
Even if it has no security benefit, it can still be useful to enforce rules outside sandboxes to deny unwanted behaviours of non-malicious but badly-behaved programs (for example, X applications that grab input). Keyboard shortcut inhibit might be a good example for this.
Static policies can be implemented before dynamic policies. In many cases, a dynamic policy can be added later by causing individual interface functions to either fail or become no-ops. I expect that dynamic policies are going to be wanted for things like screenshots. Even when dynamic policies are fully implemented, I would expect some interfaces to still be controlled via static policy, although a dynamic policy that always denies (or allows) without prompting the user is really the same thing in the end.
Dynamic policy will be more useful if it includes some kind of per-interface data support, possibly bidirectional: it would be helpful if the "screenshot allow" dialog had the option to restrict screenshots to a specific output or region of the screen, to later change or revoke that grant, and possibly to indicate when the permission is being used (notification icon).
Authentication of clients from a sandbox is likely to need the help of an external program, especially if we want to avoid having the sandbox itself proxy the connection. There's also the case of programs like waypipe that could choose to pass along information about remote clients that the compositor cannot see.
Allowing sandbox programs to make permission requests when creating a client is likely to be required for the static policy case. A compositor could choose to trust sandboxing programs to only request permissions the user approves of (this seems to be flatpak's model).
Relying on an external program to authenticate does not imply accepting its policy decisions. I would suggest adding a mandatory or highly-recommended ID for the application (for example, flatpak might pass "flatpak:com.example.app") in addition to any other restrictions (like mandatory app_id name). Then, either the compositor or an external program can decide what accesses to grant "flatpak:com.example.app", which may involve showing that ID to the user on the first use of a dynamic permission.
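As a toy illustration of that scheme (the app ID and policy table here are hypothetical, though zwp_keyboard_shortcuts_inhibit_manager_v1 is the real global from the keyboard-shortcuts-inhibit protocol mentioned below):

```python
# Toy static policy table keyed by the sandbox-provided application ID.
# The ID format "flatpak:com.example.app" follows the suggestion above;
# the granted set is a made-up example.
STATIC_POLICY = {
    "flatpak:com.example.app": {"zwp_keyboard_shortcuts_inhibit_manager_v1"},
}

def is_allowed(app_id, interface):
    """Would this (already authenticated) app be granted this global?"""
    return interface in STATIC_POLICY.get(app_id, set())

allowed = is_allowed("flatpak:com.example.app",
                     "zwp_keyboard_shortcuts_inhibit_manager_v1")
denied = is_allowed("flatpak:com.example.app",
                    "hypothetical_screenshot_manager")
```

The point is that authentication (who is this?) and authorization (what may they do?) stay separate: the same ID could equally be fed to an external agent instead of this in-compositor table.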
No matter the policy and protocol design, we still need to identify the process on the other end of the connection (that's true even for non-security-related things like resource tracking and scheduling).
We could pull an xdg-dbus-proxy kind of design, where we have a trusted proxy per sandboxed client and the compositor sees the credentials of the trusted proxy instead. But running another process and pushing all the data through userspace again is really not ideal.
One thing that was proposed on the kernel mailing list was proving that a client is a particular client by sending the compositor an fd which other clients do not have access to. Flatpak could, for example, create a file per app instance in the host mount namespace and only mount it in the sandbox of that instance. The client can then authenticate itself by sending the fd to the compositor.
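A rough Python sketch of that fd-based proof, with a temporary file standing in for the hypothetical per-instance token: the client passes an fd over the socket with SCM_RIGHTS, and the compositor checks that it refers to the exact inode it expects.

```python
import os
import socket
import tempfile

# Stand-in for the per-app-instance file that only this sandbox can open.
token = tempfile.NamedTemporaryFile()
expected = os.stat(token.name)

compositor, client = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

# Client side: prove identity by passing an fd for the token (SCM_RIGHTS).
fd = os.open(token.name, os.O_RDONLY)
socket.send_fds(client, [b"auth"], [fd])

# Compositor side: receive the fd and verify it names the expected inode.
msg, fds, _flags, _addr = socket.recv_fds(compositor, 16, 1)
received = os.fstat(fds[0])
authenticated = (received.st_dev, received.st_ino) == (expected.st_dev,
                                                       expected.st_ino)
```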
The problem with that approach is that it is not backwards compatible. The compositor cannot distinguish between a client which doesn't know how to authenticate itself and a client which chooses not to. Applications which used to work would suddenly stop working.
For similar reasons the approach doesn't help for resource tracking and scheduling.
I think we should ask for a SO_PEERCRED equivalent for pidfd again and explain why other ideas fall short.
One question asked was how exactly the pidfd would be used in the compositor. Probing the pid's procfs for the exe name and similar things indeed seems like a bad idea. What we can rely on, however, is cgroups. Systemd already resolves from PID to cgroup to service, and there is a standard way to name the service of an application. Both KDE and GNOME are already in the process of supporting that for resource tracking. If we extend sockets with SO_PEERPIDFD and hold the fd open while calling systemd's GetUnitByPID, we get race-free access to the application ID and instance. Obviously non-sandboxed apps can still escape all of that, but I don't think that's unexpected or a blocker.
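The cgroup step can be sketched without D-Bus by parsing /proc/&lt;pid&gt;/cgroup directly. The unit naming convention assumed here ("app-&lt;launcher&gt;-&lt;app_id&gt;-&lt;random&gt;.scope") follows systemd's desktop-environment recommendations, and the sample path is hypothetical.

```python
import re

def app_id_from_cgroup(cgroup_line):
    """Map a cgroup v2 /proc/<pid>/cgroup entry to an application ID.

    Relies on systemd's conventional unit naming for desktop apps:
    "app-<launcher>-<app_id>-<random>.scope". Returns None if the
    process is not in such a scope.
    """
    # cgroup v2 lines look like "0::/user.slice/.../app-....scope"
    path = cgroup_line.strip().split(":", 2)[2]
    unit = path.rsplit("/", 1)[-1]
    m = re.match(r"app-[^-]+-(?P<app_id>.+)-\d+\.scope$", unit)
    return m.group("app_id") if m else None

sample = ("0::/user.slice/user-1000.slice/user@1000.service/app.slice/"
          "app-gnome-org.gnome.Terminal-1234.scope")
resolved = app_id_from_cgroup(sample)
```

In a real compositor this parse would only be trusted while an open pidfd (SO_PEERPIDFD) pins the process identity, as described above.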
> Before that can happen, you need a concept of identity: you can't say anything about what a client can or cannot do unless you can say which client it is. D-Bus, X11 and other AF_UNIX protocols have the same problem.
and it continues with a review of the current status of the different technologies.