
WIP: Support tapping with additional fingers while others are already on the touchpad

satrmb requested to merge satrmb/libinput:hold-and-tap into main

Closes #469

This is a draft implementation of a feature some users like me would like to have - power-users who use a touchpad for one reason or another (say, working on the go), and would like to get the absolute most out of it.

In particular, the equivalent of clicking with a mouse button while continuously moving the cursor or while holding another mouse button down is only possible by using separate buttons below the touchpad (whether actual mechanical buttons or emulated ones on a clickpad configured to use the "buttonareas" click method). Purely with tapping it's impossible, and if the touchpad is a clickpad configured to use the "clickfinger" or even "none" method ("clickfinger" is libinput's default for Apple clickpads!) the user may be out of luck too (clickfinger allows more-or-less continuous motion across a click at least).

There are other reasons to want this too, like migrating from macOS, which allegedly provides this feature already; see the feature requests for more info (469 and its duplicate both say so).


Implementing this in a way such that it feels seamless is a very hard task. The original requester in 469 would have liked anything to be doable on a touchpad while fingers are already down, leading to multiple concurrent gestures. I called "scope creep" on that though; renovating the tapping code is a big enough task, so that's what I'm focusing on here.

Edit: it seems I got carried away explaining the details of my changes, such that it's more confusing and overwhelming than helpful. I'm keeping the old text around (with light edits), but tucked away below to keep the reader's scrollwheel healthy. It might still be useful as an alternative to the commit messages if a reader finds those to be not understandable enough.

Let me try a much shorter summary of the changes instead:

  • Split the tapping state machine into two, a tapping state machine and a dragging one. This allows me to re-use the tap recognition provided by the tapping state machine in more places than just the start of the more or less linear state chain of the dragging one. The end effect is that I can allow use of the MR's main feature in the middle of a drag with just a few lines of code. (2 commits, which I would like to squash into 1 because the first doesn't make any sense without the second)
  • A handful of misc changes (mostly infrastructure-related) that are meant to prepare for the main feature in one way or another, but are still separate enough that they could stand on their own. (3 commits)
  • Implementing the main feature. With the tap recognition contained in its own state machine, this is mostly as simple as restarting it from the TOUCH state in a number of situations. This is locked behind a new setting because it's fundamentally incompatible with a few mitigations for usability issues we had in the past. (2 commits; boilerplate for the setting, followed by the feature itself)
  • More misc changes, with a focus on things that would impact users who turn on this new feature. Particularly important are the first two: the first reinterprets taps that are aborted halfway through the finger release as taps with fewer fingers (the remaining fingers become "holding" fingers); the second relaxes tap-and-drag starting such that "holding" fingers can be released between the tap and the drag without disqualifying it from tap-and-drag. (5 commits)
Old long-winded explanation:

With the aforementioned limit on scope in place, the first thing that comes to mind is: we want taps to be able to happen not only from a state where any number of fingers are down, but also in the middle of a tap-and-drag. All that in addition to the existing ability to tap from a neutral state, from a state where a tap has been performed just a little bit earlier (i.e. a doubletap; if the fingers are not released after this fresh touch, it becomes a tap-and-drag), or from an empty touchpad waiting for a drag-lock continuation (a tap here would terminate drag-lock). Very duplicative, isn't it? How about separating out tap detection for easier maintenance?

That is how the first part of this MR came about. For review purposes, I split it into two commits: one merely swaps around functions, renames some stuff (functions and enum constants), and adds some unused dummy functions. The diff of that one is a gigantic mess because the diff algorithm latches onto identical lines in totally unrelated places (empty or just a closing curly brace, for the most part). The actual code changes are confined to the second commit, and with the first one giving diff something reasonable to latch onto, the changes should be somewhat understandable.

The basic design is that the half of the state machine reachable without going through TAPPED becomes the state machine that recognizes taps. I'm referring to that one as the tapping state machine from here on. The other half becomes the dragging state machine, which receives an IDLE state to stand in for the half that has been cut off, along with its own BUTTON state reserved for clickpad button presses. The tap detection left in the dragging state machine (doubletap and draglock-ending) is stripped out; instead, it receives new events corresponding to 1-, 2-, and 3-finger taps. The tapping state machine sends these where it used to emit mouse clicks, which are now entirely governed by the dragging state machine.
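A minimal sketch of that split, reduced to one-finger taps. All names here (`sk_tap_handle`, `SK_EVENT_TAP_1`, and so on) are made up for illustration and are not the identifiers used in libinput or in this MR; the real state machines have many more states and events.

```c
#include <assert.h>

/* Hypothetical, heavily reduced state sets. */
enum sk_tap_state { SK_TAP_IDLE, SK_TAP_TOUCH, SK_TAP_HOLD };
enum sk_drag_state { SK_DRAG_IDLE, SK_DRAG_TAPPED };

enum sk_event {
	SK_EVENT_TOUCH,
	SK_EVENT_RELEASE,
	SK_EVENT_TIMEOUT,
	SK_EVENT_TAP_1, /* only sent by the tapping machine to the dragging one */
};

struct sk_tp {
	enum sk_tap_state tap;
	enum sk_drag_state drag;
	int presses;  /* button presses emitted so far */
	int releases; /* button releases emitted so far */
};

/* The dragging machine owns all button events. */
static void sk_drag_handle(struct sk_tp *tp, enum sk_event e)
{
	switch (tp->drag) {
	case SK_DRAG_IDLE:
		if (e == SK_EVENT_TAP_1) {
			tp->presses++;
			tp->drag = SK_DRAG_TAPPED;
		}
		break;
	case SK_DRAG_TAPPED:
		if (e == SK_EVENT_TIMEOUT) {
			tp->releases++;
			tp->drag = SK_DRAG_IDLE;
		}
		break;
	}
}

/* The tapping machine only recognizes taps and reports them. */
static void sk_tap_handle(struct sk_tp *tp, enum sk_event e)
{
	switch (tp->tap) {
	case SK_TAP_IDLE:
		if (e == SK_EVENT_TOUCH)
			tp->tap = SK_TAP_TOUCH;
		break;
	case SK_TAP_TOUCH:
		if (e == SK_EVENT_RELEASE) {
			tp->tap = SK_TAP_IDLE;
			sk_drag_handle(tp, SK_EVENT_TAP_1);
		} else if (e == SK_EVENT_TIMEOUT) {
			tp->tap = SK_TAP_HOLD;
		}
		break;
	case SK_TAP_HOLD:
		if (e == SK_EVENT_RELEASE)
			tp->tap = SK_TAP_IDLE;
		break;
	}
}

/* Every hardware event goes to both machines, tapping machine first. */
static void sk_handle_event(struct sk_tp *tp, enum sk_event e)
{
	assert(e != SK_EVENT_TAP_1); /* taps are internal events */
	sk_tap_handle(tp, e);
	sk_drag_handle(tp, e);
}
```

In this sketch a quick touch and release yields one button press via the tap event, and the later timeout releases the button; a touch that times out before the release produces no click at all.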

All other events are handled by both state machines. This is necessary, because e.g. a timeout after tap + finger back down needs to not only invalidate a tap in the tapping state machine, but also advance the dragging state machine from DRAGGING_OR_DOUBLETAP to a plain DRAGGING state. Note also that in the exact same scenario the tapping state machine will not transition into the HOLD, TOUCH_2_HOLD or TOUCH_3_HOLD states like it would without the previous tap: that part of #499 just doesn't make sense when the timeout converts the tap into a drag.

By the way: !568 (merged), which was filed and merged after this MR was made, reduced or eliminated the need for certain pieces of handling in the tapping state machine, like the check for an active drag before transitioning to HOLD and friends. However, as discussed in that MR's preceding issue, it is just a stop-gap and may get reverted at some point, if it gets in the way of another feature. For that reason I'm keeping some pieces of essentially dead code in this MR, just so it's easier to not miss some corner case if / when it does get reverted.

About the order of the two state machines in the event handling sequence: I chose to send events to the tapping state machine first, then the dragging state machine. That has a slight effect on the drag-checks mentioned in the previous paragraph, and it allows me to send events from one state machine to the other by simply calling the latter's event handler, while also having the dragging state machine receive taps before the events that caused them. This in turn simplifies some things in the dragging state machine, particularly when it comes to the other part of 499: three-finger taps that aren't fully released, but generate a mouse click anyway upon a timeout, motion, re-touch, or clickpad click. If that other event was sent before the tap, I'd have to handle the taps in the corresponding target states, making a mess of them. If the tap comes first, I can just handle that in the place where it actually makes sense, then let the other event get handled in the state reached via the tap (it conceptually comes after the incomplete tap anyway). The three-finger taps currently cannot occur during a drag because the drag ends when the third finger comes into contact, but that behaviour is one I'm dropping in a later commit.

All in all, this state machine split should have absolutely no effect on the behaviour observable from the outside (excluding debug logging), and I'm quite confident that is true.


Along the way I found a number of interesting tidbits. Some are significant enough to be addressed in separate commits (see below), and some are just unnecessary lines of code I'm silently removing; the thumb detection is small enough to slip into the commit splitting the state machine up, yet special enough to need a paragraph or two written about it:

The state machine has the ability to react to touches turning into thumbs via its THUMB event. Some thumbs don't even reach the state machine: if a touch is a thumb already when it begins, it's completely ignored - no TOUCH event, and no THUMB event to undo the TOUCH event's effects. However, the functions called to check for thumbs in the fresh touch vs. old touch cases are different: tp_thumb_ignored_for_tap is used on touch begin, tp_thumb_ignored is used for touches that actually generate THUMB events. That struck me as odd.

Another oddity is that the only states that actually handled THUMB events were TOUCH and HOLD, i.e. when only a single finger is down and no drag is in progress. I've preserved these conditions throughout all commits, but not having the prerequisite experience with the thumb code, I don't know if the conditions are any good, so I'm just bringing it to your attention.


The next commit is a fix for a bug I uncovered. Technically unrelated, but it came to my attention during the split; and for merge conflict reasons (touching the code of each and every state) I hesitate to submit it separately.

Imagine a palm triggering a clickpad's button, then releasing it again without lifting the palm off the touchpad (this probably produces a left-click, which is expected). Next, a finger is put down and lifted, which should be recognized as a tap and therefore produce another left-click - but it's ignored: bug alert! Another tap after that works fine.

The reason is that merely releasing the clickpad button doesn't produce any events in the tapping state machine (or state machines, since there are two now). We need a finger release or palm release to exit the DEAD state.

Another consequence is that you can trigger the clickpad button, hold it down with a palm, then release an actual finger ... and be in a state accepting taps even though the clickpad button is still down.

I'm solving this by introducing the missing event, and splitting the DEAD state into two parts (DEAD and BUTTON) to remember whether the button is currently down. I'm also throwing out the PALM_UP event: it only triggered the check whether it was time to leave DEAD for IDLE, but that check reads a counter variable that excludes palms anyway, so eliminating the bug that allowed the counter to be in the state required to leave DEAD makes it useless. Palms are supposed to be completely ignored anyway, so letting a palm release have an effect wasn't even that good of an idea.
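The fix can be pictured with a small sketch along these lines. It only knows one-finger taps, the names are hypothetical, and palms are modelled by simply never sending any event for them, so treat it as an illustration of the DEAD/BUTTON split rather than the code in the commit.

```c
#include <assert.h>

enum sk_state { SK_IDLE, SK_TOUCH, SK_DEAD, SK_BUTTON };
enum sk_event {
	SK_EV_TOUCH,
	SK_EV_RELEASE,
	SK_EV_BUTTON_DOWN,
	SK_EV_BUTTON_UP, /* the event that was missing before */
};

struct sk_tap {
	enum sk_state state;
	int nfingers_down; /* non-palm touches only */
	int taps;          /* recognized taps */
};

static void sk_handle(struct sk_tap *t, enum sk_event e)
{
	if (e == SK_EV_TOUCH) {
		t->nfingers_down++;
	} else if (e == SK_EV_RELEASE) {
		assert(t->nfingers_down > 0);
		t->nfingers_down--;
	}

	switch (t->state) {
	case SK_IDLE:
		if (e == SK_EV_TOUCH)
			t->state = SK_TOUCH;
		else if (e == SK_EV_BUTTON_DOWN)
			t->state = SK_BUTTON;
		break;
	case SK_TOUCH:
		if (e == SK_EV_RELEASE) {
			t->taps++;
			t->state = SK_IDLE;
		} else if (e == SK_EV_BUTTON_DOWN) {
			t->state = SK_BUTTON;
		} else if (e == SK_EV_TOUCH) {
			t->state = SK_DEAD; /* sketch: no multi-finger taps */
		}
		break;
	case SK_DEAD:
		if (e == SK_EV_BUTTON_DOWN)
			t->state = SK_BUTTON;
		else if (e == SK_EV_RELEASE && t->nfingers_down == 0)
			t->state = SK_IDLE;
		break;
	case SK_BUTTON:
		/* Finger releases do not leave BUTTON; only the button does. */
		if (e == SK_EV_BUTTON_UP)
			t->state = t->nfingers_down == 0 ? SK_IDLE : SK_DEAD;
		break;
	}
}
```

With the button release as an event of its own, a palm-triggered click returns straight to IDLE so the next tap counts, and lifting a finger while the button is still held no longer makes the state machine accept taps.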


The commit after that is dropping the two-finger limit during dragging. It would limit tapping during a drag to just a single finger, because the other one is occupied by the drag. That's too limiting for my taste, and I have yet to see a good argument for such a limit anyway.


Next up is a change that may be somewhat controversial. I'm concentrating much of the DEAD-marking of individual touches in a single place. So far, so good - but I'm specifically excluding the transitions toward the HOLD group of tapping states via TIMEOUT, because the fingers present in these cases can become part of a multi-finger tap again by adding a new finger (499 strikes once more). In a subsequent commit this individual finger state will be used for this MR's main feature, to tell apart new touches that can tap and old touches that can't.

In the current situation this change doesn't even make a difference, because the way I implemented multi-finger tap-and-drag means the only line of code (apart from debug logging) that paid attention to the difference between an individual finger's TOUCH and DEAD states was removed. Yes, that means the issue the check was presumably meant to prevent (a mouse click from a finger swap) did appear together with that feature. Well ... if just one of the two halves of 499 (timed-out but stationary touches can be "revived" for taps by adding another finger; 3-finger taps emit a click even when not fully released) was dropped, we wouldn't have this problem.


Now, finally, on to the main feature. I'm spending two commits on it: the first one is just boilerplate for the setting, the second one is the implementation of the feature proper. I know libinput is meant to have as few settings as possible, so have this as a reason for hiding this behind a setting: #382 (closed). The short unintended touches in that issue would clearly have to be interpreted as taps with one additional finger, leading to left-clicks with my new feature on. We don't want that issue's reporter to suffer unnecessarily, so he needs to be able to opt out of this. Combining this potential for accidental taps with the target audience I have in mind for this feature (power-users, i.e. people who know how to configure stuff), it's likely better to have this feature off by default.

The name I picked is "hold and tap", modelled after the venerable "tap and drag" option. I think it's fairly descriptive, though the "hold" part might be interpreted as "hold still" when I only mean "hold the finger(s) down"; movement is acceptable, as long as it's not in the new touches. As usual, it's a programmer's name for a feature, which means it can likely be improved. Feel free to do so, I'm not overly attached to this name.

The implementation is quite straightforward, for the most part. The dragging state machine converts taps into mouse clicks, as usual (now in DRAGGING too, which would have been a bug before; and in IDLE it prevents taps from starting a drag if there are still holding fingers down). The tapping state machine now only reacts to thumbs, palms, and releases if they were marked as TOUCH (except in the HOLD group, which is only reachable when hold-and-tap is disabled, meaning the state accurately represents the number of non-ignored touches, all of which are TOUCH anyway; and also except DEAD which by definition has no touches marked as TOUCH).

A new finger down when we aren't on the IDLE -> TOUCH -> TOUCH_2 -> TOUCH_3 track (and we aren't locked into BUTTON by the clickpad button) is treated as the start of a new tap, starting over in TOUCH with any existing touches considered holding fingers. That means I have to kill off the half of 499 dealing with revived old touches at least for the case with hold-and-tap on (can't revive old touches with a new one as that's a new tap by itself). If hold-and-tap is off, 499 gets to survive in its entirety, which is why the HOLD group of states still exists (they would be folded into DEAD otherwise). I still think 499 should die: we haven't reached the last of the headaches it causes yet.
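Roughly, the restart logic looks like the following sketch. To stay short it replaces the per-finger-count states with two counters and a `dead` flag, folds the HOLD group into the dead case, and uses invented names throughout (`sk_touch_down`, `SK_MARK_TOUCH`, ...), so it shows the idea rather than the actual implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SK_MAX_TOUCHES 5

/* Hypothetical per-touch marks: TOUCH may still take part in a tap,
 * DEAD is a holding finger that no longer can. */
enum sk_mark { SK_MARK_NONE, SK_MARK_TOUCH, SK_MARK_DEAD };

struct sk_tap {
	bool hold_and_tap; /* the new setting */
	bool dead;         /* tap attempt disqualified, e.g. by a timeout */
	int tapping_down;  /* touches marked TOUCH that are still down */
	int tap_size;      /* fingers that joined the current tap attempt */
	int last_tap;      /* finger count of the most recent tap, 0 if none */
	enum sk_mark marks[SK_MAX_TOUCHES];
};

/* Turn all tapping fingers into holding fingers. */
static void sk_kill_attempt(struct sk_tap *t)
{
	for (size_t i = 0; i < SK_MAX_TOUCHES; i++)
		if (t->marks[i] == SK_MARK_TOUCH)
			t->marks[i] = SK_MARK_DEAD;
	t->tapping_down = 0;
	t->tap_size = 0;
	t->dead = true;
}

static void sk_timeout(struct sk_tap *t)
{
	if (t->tap_size > 0)
		sk_kill_attempt(t);
}

static void sk_touch_down(struct sk_tap *t, size_t slot)
{
	assert(slot < SK_MAX_TOUCHES && t->marks[slot] == SK_MARK_NONE);

	/* A finger arriving after the tap started releasing is off the track. */
	if (t->tapping_down < t->tap_size)
		sk_kill_attempt(t);

	if (t->dead) {
		if (!t->hold_and_tap) {
			t->marks[slot] = SK_MARK_DEAD;
			return;
		}
		t->dead = false; /* start over; old touches stay DEAD */
	}
	t->marks[slot] = SK_MARK_TOUCH;
	t->tapping_down++;
	t->tap_size++;
}

static void sk_release(struct sk_tap *t, size_t slot)
{
	assert(slot < SK_MAX_TOUCHES && t->marks[slot] != SK_MARK_NONE);
	bool was_tapping = t->marks[slot] == SK_MARK_TOUCH;
	t->marks[slot] = SK_MARK_NONE;

	if (was_tapping) {
		if (--t->tapping_down == 0) {
			t->last_tap = t->tap_size; /* tap complete */
			t->tap_size = 0;
		}
		return;
	}
	/* Holding fingers are ignored, except that an empty touchpad resets. */
	for (size_t i = 0; i < SK_MAX_TOUCHES; i++)
		if (t->marks[i] != SK_MARK_NONE)
			return;
	t->dead = false;
}
```

With the setting on, a finger that has timed out becomes a holding finger and a second finger can still tap next to it; with the setting off, the same sequence produces nothing.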


The remaining commits contain various enhancements associated with hold-and-tap. The first of them deals with how partially released taps are interpreted: I think they are taps with however many fingers were released, the remaining touches are holding fingers that just happened to be put down in close temporal proximity to the tapping fingers. This replaces the half of 499 that was still left in the "hold-and-tap is enabled" case.

This means we have to look at taps with more than three fingers as well, so we can tell apart e.g. four-finger taps (which do nothing, except maybe end drag-lock because we established in !463 (merged) that any tap shall end drag-lock) from three-finger taps that came with a new holding finger. At least we can aggregate them all into one new branch of tapping states, because after four releases we know that we're dealing with a tap beyond three fingers, which all have the same result - we can stay on that branch with a fifth or sixth finger down. We do have to recover the number of tapping fingers on a palm detection, so we know whether we have to switch to the three-finger-tap branch or stay in the four-plus-finger one, but that's rare enough that I simply count them in a loop (encased in a helper function).
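The counting helper could look something like this sketch. The struct layout, the mark names and `sk_branch_for` are assumptions made for the example, and it only counts fingers that are still down, i.e. it assumes no finger of the tap has been lifted yet when the palm is detected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum sk_mark { SK_MARK_NONE, SK_MARK_TOUCH, SK_MARK_DEAD };
enum sk_branch {
	SK_BRANCH_NONE,
	SK_BRANCH_1,
	SK_BRANCH_2,
	SK_BRANCH_3,
	SK_BRANCH_4_PLUS, /* all taps beyond three fingers share one branch */
};

struct sk_touch {
	enum sk_mark mark;
	bool is_palm;
};

/* Count the touches that still take part in the tap, skipping palms
 * and holding fingers. */
static unsigned int sk_count_tapping(const struct sk_touch *touches, size_t n)
{
	unsigned int count = 0;

	assert(touches != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (touches[i].mark == SK_MARK_TOUCH && !touches[i].is_palm)
			count++;
	return count;
}

/* Pick the branch of tapping states matching a finger count. */
static enum sk_branch sk_branch_for(unsigned int count)
{
	switch (count) {
	case 0: return SK_BRANCH_NONE;
	case 1: return SK_BRANCH_1;
	case 2: return SK_BRANCH_2;
	case 3: return SK_BRANCH_3;
	default: return SK_BRANCH_4_PLUS;
	}
}
```

After a palm is detected among four tapping fingers, recounting gives three and the state machine would switch to the three-finger branch; with five tapping fingers it would stay in the four-plus branch.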

I also allowed the no-hold-and-tap case into this four-plus-fingers branch. If we already have the branch, it's probably worth trading another new state (for the HOLD group) and a few additional checks in the other states of that branch for the ability to deal with 3-finger taps with an interfering palm that isn't detected on touch begin, and for the ability to end drag-lock with many-finger taps.


Next commit: A bit of leniency for tap-and-drag. When hold-and-tap is used, the tap may now start a drag anyway, on the condition that the holding fingers are released along with the tapping ones. You could already sort-of do that before, but only if you lifted the holding fingers before the tap was complete (i.e. the last finger up had to be a tapping finger). If a holding finger goes up just a frame too late, you only get a short click from the tap. I found that frustrating and inconsistent, because this could be quite useful: move the cursor onto an object you want to drag around, put down another finger, lift both, then put one down again and continue moving to actually drag.

In the case of starting from no active drag, the code changes are quite subtle: I'm moving the check for an empty touchpad from the IDLE -> TAPPED transition to the TAPPED -> DRAGGING_OR_DOUBLETAP one.

But why not allow this from a drag, too? That would allow quickly switching the mouse button held down in the drag. (If the tap has the same number of fingers as the one starting the drag, this boils down to something like a poor man's drag-lock; I'm allowing it for consistency, and who knows, maybe someone who has regular drag-lock disabled will find it useful). I make this work with a new dragging state, DRAGGING_OR_TAPPED (with 9 variants for the three possible finger counts at the start of the drag, times three possible finger counts in the new tap). The new state is reached with a tap in DRAGGING. If all fingers are released (or turn into palms), it transitions to the variant of TAPPED corresponding to the new tap's finger count. Any other event finishes the tap by emitting the mouse click up, then returns to DRAGGING.
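The transitions around the new state can be sketched as below. Instead of nine enum constants the sketch keeps the two finger counts in integer fields, button events are reduced to a counter, and all names are hypothetical.

```c
#include <assert.h>

enum sk_kind { SK_DRAGGING, SK_DRAGGING_OR_TAPPED, SK_TAPPED };
enum sk_event { SK_EV_TAP, SK_EV_ALL_RELEASED, SK_EV_OTHER };

struct sk_drag {
	enum sk_kind kind;
	int drag_fingers;  /* 1..3, finger count of the tap that started the drag */
	int tap_fingers;   /* 1..3, finger count of the newest tap */
	int taps_finished; /* taps that ended as plain clicks during the drag */
};

/* Index 0..8 of the DRAGGING_OR_TAPPED variant, in case one enum
 * constant per combination is preferred over two integer fields. */
static int sk_variant_index(int drag_fingers, int tap_fingers)
{
	assert(drag_fingers >= 1 && drag_fingers <= 3);
	assert(tap_fingers >= 1 && tap_fingers <= 3);
	return (drag_fingers - 1) * 3 + (tap_fingers - 1);
}

/* nfingers is only looked at for SK_EV_TAP. */
static void sk_drag_handle(struct sk_drag *d, enum sk_event e, int nfingers)
{
	switch (d->kind) {
	case SK_DRAGGING:
		if (e == SK_EV_TAP) {
			assert(nfingers >= 1 && nfingers <= 3);
			d->tap_fingers = nfingers;
			d->kind = SK_DRAGGING_OR_TAPPED;
		}
		break;
	case SK_DRAGGING_OR_TAPPED:
		if (e == SK_EV_ALL_RELEASED) {
			/* Continue in the TAPPED variant for tap_fingers. */
			d->kind = SK_TAPPED;
		} else {
			/* Stands in for the tap's button release. */
			d->taps_finished++;
			d->kind = SK_DRAGGING;
		}
		break;
	case SK_TAPPED:
		break; /* continues along the usual tap-and-drag chain */
	}
}
```

A two-finger tap during a one-finger drag lands in variant index 1 here; any event other than a full release turns it into a plain click and the drag goes on.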


Clicking on a clickpad with the click method set to "clickfinger" is quite similar to a tap. For that reason I'm adding a commit enhancing it. Apart from making it respect the tapping button map (why the hell didn't it do so before!?), this is focused on hold-and-tap: if only some fingers were recently put down when the clickpad is pushed, it's probably a safe assumption that the user means to click with that many fingers (i.e. it's almost a tap with that many fingers, except that the downward momentum of the fingers pushed the clickpad down hard enough to trigger a click; and the fingers may move or linger for longer than allowed in a tap).

This one might be somewhat controversial too.


In some situations, a mouse click in the middle of a continuous motion can be useful. For instance, someone playing an immersive 3D game (think Minecraft, or maybe a shooter if the user is crazy enough to try playing that with the inherent latency of tapping) may use the motion to track a target, and the click then translates to an attack command. Or you are just using a browser and want to open an entire list of links in background tabs (using middle-clicks on each).

The penultimate commit in this MR aims to allow just that. Even though the main feature commit already prevented MOTION events coming from holding fingers, simply being in a state qualifying for a tap prevented the gesture handling code from running, thereby preventing cursor motion (along with scrolling and all the other gestures).

We have a system for selectively filtering out motion from individual fingers: pinning. I'm revamping it to be able to pin some fingers but not others (previously it was only used to pin all fingers simultaneously on a clickpad button press, then unpin them one by one when they moved far enough), to forcibly unpin select or all fingers (i.e. not only depending on motion), and to specify a pinning reason (a finger might be pinned for more than one reason at the same time; the pinned state is simply a bitset of pinning reasons). With pinning tuned like that, it can outright replace the old motion filtering scheme (which skipped the entire gesture handling code if the tapping or button code requested it).
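As an illustration of pinning with reasons, here is a minimal sketch. The reason names, the struct fields and the choice that a large enough motion clears every reason at once are assumptions of the example; only the 1.5mm pinning threshold is the value mentioned in this description.

```c
#include <assert.h>
#include <stdbool.h>

#define SK_PIN_THRESHOLD_MM 1.5

/* Hypothetical pinning reasons, one bit each. */
enum sk_pin_reason {
	SK_PIN_BUTTON = 1u << 0, /* clickpad button was pressed */
	SK_PIN_TAP = 1u << 1,    /* touch may still become a tap */
};

struct sk_touch {
	unsigned int pinned;       /* bitset of reasons; 0 means not pinned */
	double pin_x_mm, pin_y_mm; /* where the touch was first pinned */
};

/* Pin one touch for one reason; the location is kept from the first pin. */
static void sk_pin(struct sk_touch *t, enum sk_pin_reason why,
		   double x_mm, double y_mm)
{
	assert(why != 0);
	if (t->pinned == 0) {
		t->pin_x_mm = x_mm;
		t->pin_y_mm = y_mm;
	}
	t->pinned |= (unsigned int)why;
}

/* Forcibly drop some reasons, independent of motion. */
static void sk_unpin(struct sk_touch *t, unsigned int reasons)
{
	t->pinned &= ~reasons;
}

/* Returns true if motion to (x_mm, y_mm) should reach gesture handling.
 * Moving past the threshold unpins the touch (assumption: all reasons). */
static bool sk_motion_allowed(struct sk_touch *t, double x_mm, double y_mm)
{
	double dx, dy;

	if (t->pinned == 0)
		return true;

	dx = x_mm - t->pin_x_mm;
	dy = y_mm - t->pin_y_mm;
	if (dx * dx + dy * dy < SK_PIN_THRESHOLD_MM * SK_PIN_THRESHOLD_MM)
		return false;

	t->pinned = 0;
	return true;
}
```

Because each touch carries its own bitset, a holding finger can stay unpinned and keep moving the cursor while a freshly added finger is pinned until it either completes a tap or moves too far.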

Since pinning comes with a small-movement threshold out of the box, I'm also replacing the tapping code's motion check with a simple "is it pinned?" question. (By the way, the thresholds were inconsistent: 1.3mm in the tapping code vs. 1.5mm for pinning.) Unfortunately that has a side effect: because the HOLD group of tapping states reacts to motion, we have to pin the touches, otherwise high-resolution touchpads report any minuscule finger movement as a MOTION event. That in turn means that (with hold-and-tap off at least, because enabling hold-and-tap disables 499, which is the whole reason the HOLD states exist) simply waiting for the timeout to expire doesn't make the movement-preventing barrier go away, and that barrier was a noticeable obstruction in my tests. The old motion filtering was simply not active in the HOLD states, so the cursor was allowed to jitter even though a larger movement would cause a state change. Tying all of it to the pinned state doesn't allow us to go down that middle path. We could restore the motion check to what it was before this commit (or maybe convert it to use the pinning location and threshold), or just drop 499 and the HOLD states with it.


The final commit is just a tiny change. I found that a quickly moving touch causes fresh touches to be marked as thumbs. With hold-and-tap such fresh touches may well be intentional though, particularly because hold-and-tap enables two-handed operation (moving the cursor with one hand, adding fingers to tap with the other), which can be a reason for large speed differences between the hands. Therefore hold-and-tap should disable speed-based thumb detection, which would hurt more than it helps here. Loading so many valid interaction possibilities into event sequences implies that the user moves his/her hands consciously anyway, so we don't really need that much accident protection. As I wrote above, a hold-and-tap user should be a power-user.


Other thoughts:

If we assume !568 (merged) is reverted: As a consequence of using the same state machine for all instances of taps, combined with removing the two-finger limit during a drag, the tap ending drag-lock can now be a 499-style three-finger tap that doesn't have all fingers lifted. That tap does end drag-lock, and the remaining fingers can then do ... whatever. Move the cursor, scroll, or even re-use for another three-finger tap that actually produces a mouse click. I don't consider this a bug per se, but it's definitely quirky behaviour. If desired, I can prevent it by shutting off that part of 499 too if a drag is active. With hold-and-tap on, this behaviour is even common across all incomplete taps, and due to the "I know what I'm doing" premise of hold-and-tap, that should even be a feature.

The headaches related to 499 are not the only reason this MR is WIP. It's very much unfinished: all automated tests for hold-and-tap are missing, and believe me when I say that's a lot of tests waiting to be written; some independent manual testing wouldn't be a bad idea with changes as groundbreaking as this; and I have half a mind to strip down the tap-recognizing portion of the state machine further to merge all states in the three groups I like to call HOLD, RELEASE, and TOUCH (I'm sure you can tell which states they encompass, at least if I tell you that they don't overlap) into one state per group, relying just on a new counter variable or two for the fingers put down / lifted. A handful of if statements could take care of what little difference they have (such as 499's incomplete tap part only applying to three-finger, not two-finger taps), and it'd reduce the code volume that needs to be maintained going forward. (I'm especially thinking of the TOUCH_4_PLUS branch I added ... it's not exactly small, but it'd just vanish.)

So, that's it. Feedback of any kind is welcome from anyone - questions, suggestions, criticism; at code level or conceptual; all of it is fair game (as long as it's constructive, of course, but that ought to be obvious around here).

Edited by satrmb
