Semantic markers for prompts

mentioned in issue #3

The FinalTerm/iterm2 shell integration control sequences are rather limited as they don't provide a good way to specify options. I'd prefer a more extensible feature. As I mention in https://github.com/xtermjs/xterm.js/issues/576#issuecomment-455823697 there are at least 3 different syntaxes in use. I think the best would be a syntax that can take named options in JSON syntax.

I agree that having a way to handle nested shells would be nice. For example, Julia uses the finalterm codes. If there were a way to relate siblings then the tree structure would be exposed. I like the idea of a random ID.

It would also be useful to expose metadata (e.g., current directory) in the start-prompt code.

mentioned in issue bidi#8

An example of a useful option: A shell can specify that you can move the cursor with arrow keys. In that case, the terminal can translate a mouse click in the current input field to an arrow-key sequence to move the cursor to the clicked-on spot. (Terminal.app does this if you press the Option key. It would be more natural if a regular no-modifier left click would do the same, but that requires the terminal to be told that doing so is safe and requested.)
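As a rough illustration of the translation described above, here is a minimal sketch of what a terminal might do once it has been told that arrow keys are safe. It assumes a single-line input field in which every character occupies exactly one column, and it assumes normal cursor-key mode, where left and right are sent as ESC [ D and ESC [ C. The function name and parameters are hypothetical.

```python
# Sketch only: single-line input, one column per character (assumption).
LEFT = "\x1b[D"   # cursor-left key in normal cursor-key mode
RIGHT = "\x1b[C"  # cursor-right key in normal cursor-key mode


def click_to_arrow_keys(cursor_col, click_col):
    """Return the key sequence that moves the cursor from cursor_col to click_col."""
    delta = click_col - cursor_col
    key = RIGHT if delta > 0 else LEFT
    return key * abs(delta)
```

The terminal would send the returned string to the application as if the user had typed it. Multi-line input fields and wide characters would need more bookkeeping than this.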

Note that some shells and input libraries allow cursor motion in a multi-line input field, so that's another option it would be useful for the terminal to know.

A limitation of the FinalTerm/iterm2 escape sequences is that they don't seem to support secondary prompts (like sh's PS2, for continuation lines). At least neither the documentation nor the sample shell integration scripts cover this. (Extraterm doesn't seem to handle secondary prompts, either, from what I can tell from the extraterm-commands scripts.) I think it is important to handle multi-line input commands.

If a shell wants to allow mouse clicks to move the cursor, it can simply turn on mouse support and use OSC 52 for clipboarding.

Turning on mouse support in a shell is a lot more complicated - and it requires actual native support in the shell (or input library). It's a lot simpler to just send an escape sequence "please translate mouse clicks to arrow-key sequences". Especially since the latter can be done by just tweaking a prompt string. As long as a shell or other REPL allows customization of the prompt string, and you use DomTerm, you can move the cursor with an intuitive mouse click.

Another problem with turning on application mouse support is that it also disables the terminal's other mouse support: middle-button paste; context menus; selections; hyperlinks. An application can implement most of that support, but it's a lot of work, and it will not be the same as the terminal's native support (for good or ill).

Simpler but a lot more broken. What if the input widget is in vi mode, where arrow keys don't move the cursor? What if the text contains Unicode symbols that have different widths? The input library now has to match the wcswidth() implementation of the terminal to get cursor movement right.

Supporting mice should be done at the application level, not via arbitrary hacks.

There is a difference between full-screen applications (emacs, vi, mc) that typically use the alternate buffer, vs line-oriented console applications (most shells and language REPLs) that typically use the normal buffer. The former typically should and do enable mouse support. The latter do not have any mouse (or clipboard or selection) support. The feature I'm talking about is for typical shell/console/REPL applications. You may say these should enable mouse support - but very few do, because it's a lot of work, hard to get right, has some awkward downsides, and not really what they're about.

"What if the input widget is in vi mode, where arrow keys dont move the cursor?" Then either fix your widget to handle arrow keys (vi handles them), or don't send the escape sequence to enable mouse click translation.

"What if the text contains unicode symbols that have different widths, the input library now has to match the wcswidth() implementation of the terminal to get cursor movement right." And how does application-based mouse handling avoid this issue?

"Supporting mice should be done at the application level, not via arbitrary hacks." Perhaps, but commonly used shells and REPLs, don't support mice.

If an application wants to support mice, it should support mice, not expect its host environment to implement poorly thought out hacks for it. What does its being full screen or not have to do with anything?

So you are saying your hack will break for vi mode.

Application-based mouse handling means the application moves the cursor to whatever cell is clicked on, without needing to compute wcswidth(). So your hack will break for non-ASCII text.

Have you ever stopped to think that the reason shells don't support mice is that there is no real need to do so? People that use awkward interfaces like mice as part of editing text are perhaps not that big of a constituency for shells anyway.

"So you are saying your hack will break for vi mode."

No. Please read more carefully. To my knowledge "vi mode" supports arrow keys as well as "emacs mode" does - but it of course depends on the input library used. (There is no universal "vi mode".) I see no reason "vi mode" should cause any problems. If for some strange reason vi mode for some input library doesn't handle arrow keys (which seems like something that should be fixed in itself), then you disable my "hack" (which is opt-in, based on escape sequences sent) - but that doesn't "break" it.

"Application based mouse handling means the application to move the cursor to whatever cell is clicked on, without needing to compute wcswidth()."

If the application and terminal don't agree on wcswidth, then the application can't correctly position double-width characters. Period. Terminal-based mouse handling doesn't change that.

In any case, mouse-click translation is not the focus of this issue. Rather it is an example of why a shell integration specification should be at least somewhat extensible, to the extent of allowing named options that can be passed in the begin-prompt or end-prompt escape sequence.

No, in vi mode, arrow keys are commonly bound to different functions; for instance, I have them bound to indent/de-indent. Which means hacks that assume their function will break. The point is they are a fragile bandaid that should not be implemented when a much better and more robust solution exists.

No, applications can simply print out double-width characters without knowing their width; they do it all the time.

If it is an example, it is a surpassingly bad one.

"The point is they are a fragile bandaid that should not be implemented when a much better and more robust solution exists."

I don't know of any better solution using xterm-style mouse reporting, which breaks selections and context menus. How would an application implement selection of text that was written by a no-longer-running program? I don't remember if xterm has a command to request the screen contents, but even if it does it would be complicated at best.

I agree xterm-style mouse reporting might be more robust if there were a way to either: (1) restrict it to left and middle clicks; or (2) not disable the default terminal actions. But that requires extensions to the protocol. It would still require a lot of work for lots of shells and REPLs.

"No applications can simply print out double width characters withoutknowing their width, they do it all the time."

Wrong in the case of an editable input field, such as using GNU readline (which is the kind of application I'm talking about). The input editor has to know when a line wraps around, for example.

"If it is an example, it is a surpassingly bad one."

Yes, we get it: Your way of doing anything is the only good way, and you're smarter and more knowledgeable than anyone else.

Let me make a concrete proposal for a convention for encoding options in a possible escape sequence. I've earlier suggested JSON, but there is resistance to that. In that case, I think the iterm2 style of escape sequences with options may be a good route:

"\e]" number option * "\007"

where option is:

";" identifier

or:

";" identifier "=" value

and value is any sequence of characters that contains neither a semi-colon nor a control character. Complex values with control characters or semi-colons can be encoded as base64. It might be reasonable to allow value to be a quoted string with C-style escapes, but it's not strictly needed.
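To make the grammar above concrete, here is a minimal parsing sketch. It is an illustration under stated assumptions, not part of any existing terminal: the function name is hypothetical, and the identifier syntax (letters, digits, underscore, hyphen) is my own assumption since the grammar does not pin it down. A bare option is returned with the value None, and the split is on the first "=" only, so base64 padding in a value survives.

```python
import re

ESC, BEL = "\x1b", "\x07"
# Assumed identifier syntax; the grammar above does not define it.
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_-]*")


def parse_osc(seq):
    """Parse ESC ] number (';' identifier ['=' value])* BEL into (number, options)."""
    if not (seq.startswith(ESC + "]") and seq.endswith(BEL)):
        raise ValueError("not an OSC sequence terminated by BEL")
    number, *raw_options = seq[2:-1].split(";")
    if not (number.isascii() and number.isdigit()):
        raise ValueError("missing numeric OSC code")
    options = []
    for raw in raw_options:
        # Split on the first '=' only, so values may themselves contain '='.
        name, sep, value = raw.partition("=")
        if not _IDENT.fullmatch(name):
            raise ValueError(f"bad option name: {name!r}")
        if any(ord(ch) < 0x20 or ord(ch) == 0x7F for ch in value):
            raise ValueError("control character in option value")
        options.append((name, value if sep else None))
    return int(number), options
```

For example, parse_osc("\x1b]133;A;aid=42\x07") yields the code 133 with a bare option A and a named option aid. Values are returned as raw strings; decoding base64 would be left to whoever interprets a given option.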

"I agree xterm-style mouse reporting might be more robust if there were a way to either: (1) restrict it to left and middle clicks; or (2) not disable the default terminal actions. But that requires extensions to the protocol"

You are asking for a whole new protocol here and trying to justify it with an example that is better served with a different solution.

"Wrong in the case of an editable input field, such as using GNU readline (which is the kind of application I'm talking about). The input editor has to know when a line wraps around, for example."

So your hack is only intended to work for "editable input fields". Once again, fragile and poorly thought out solution.

"Yes, we get it: Your way of doing anything is the only good way, and you're smarter and more knowledgeable than anyone else."

I'm certainly smarter and more knowledgeable than you. And kindly leave the personal attacks out of your posts. If you are arguing a losing position, just bow out.

Oh and let me clarify, I have nothing against a protocol to mark prompts, and IMO all protocols should be made extensible. I simply object to using it to hack mouse support.

Besides the lack of support for options, I see two other problems with the iterm2 shell integration protocol:

Lack of support for multi-line inputs, each with their own prompt (like shell's PS2). The protocol could be extended by allowing more than one pair of A (FTCS_PROMPT) and B (FTCS_COMMAND_START) markers (one pair for each line) before the C (FTCS_COMMAND_EXECUTED) command.
The iterm2 protocol requires markers for start of output (FTCS_COMMAND_EXECUTED) and end of command (FTCS_COMMAND_FINISHED). Some shells/REPLs don't have a user callback/hook that can emit these commands, even if they have a user-settable prompt string. So it would be useful to have a variant of FTCS_COMMAND_START that is implicitly "closed" by end of line: The user-entered command is everything between FTCS_COMMAND_START and end-of-line. Similarly, FTCS_COMMAND_FINISHED could be implied by a new FTCS_PROMPT. Nested shells can be handled by an optional identifier (such as the pid of the shell): If a new prompt has the same key as a previous command, it implicitly marks that command as finished; if it's a new key, a new nested command is created.

One suggestion is to add some new subcommands:

OSC "133;E;k=" key other_options "\007"

This is start of command and start of prompt. If key matches a previous command, automatically finish (close) those commands first.

OSC "133;F" options "\007"

Start of secondary prompt (a continuation input line).

OSC "133;G" options "\007"

Start of user command (input), implicitly terminated by end-of-line.
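As a sketch of how a shell could use these subcommands when prompt strings are the only available hook, the helper below builds the three sequences with the letters as suggested in this comment. The function names, the default prompt texts, and the use of a dict for options are all hypothetical; values containing a semi-colon or a control character are rejected rather than encoded.

```python
ESC, BEL = "\x1b", "\x07"


def osc133(subcommand, options=None):
    """Build ESC ] 133 ; subcommand (; name[=value])* BEL."""
    parts = ["133", subcommand]
    for name, value in (options or {}).items():
        if value is None:
            parts.append(name)  # bare option without a value
            continue
        text = str(value)
        if ";" in text or any(ord(ch) < 0x20 or ord(ch) == 0x7F for ch in text):
            raise ValueError(f"option {name!r} needs encoding (e.g. base64) first")
        parts.append(f"{name}={text}")
    return ESC + "]" + ";".join(parts) + BEL


def primary_prompt(key, prompt="$ "):
    """E: start of command and prompt, then G: start of user input."""
    return osc133("E", {"k": key}) + prompt + osc133("G")


def secondary_prompt(prompt="> "):
    """F: start of a continuation-line prompt, then G: start of user input."""
    return osc133("F") + prompt + osc133("G")
```

A shell would put the result of primary_prompt in its PS1 equivalent and secondary_prompt in its PS2 equivalent, for instance with the shell's process id as the key.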

I'm considering writing a concrete proposal, based on my more recent messages on this issue: Basically an extension of the iterm2/finalterm protocol, with support for options, multi-line inputs, and implicit input/command end (so it can be used if prompt strings are the only available "hooks"). Any comments (especially from @gnachman) before I do so?

Lack of support for multi-line inputs, each with their own prompt (like shell's PS2). The protocol could be extended by allowing more than one pair of A (FTCS_PROMPT) and B (FTCS_COMMAND_START) markers (one pair for each line) before the C (FTCS_COMMAND_EXECUTED) command.

This is a good idea. I'd want to prove it's possible in bash, fish, and zsh first, though. Getting bash to do this stuff is a nightmare, which is why I've never attempted to do this.

The iterm2 protocol requires markers for start of output (FTCS_COMMAND_EXECUTED) and end of command (FTCS_COMMAND_FINISHED). Some shells/REPLs don't have a user callback/hook that can emit these commands, even if they have a user-settable prompt string. So it would be useful to have a variant of FTCS_COMMAND_START that is implicitly "closed" by end of line: The user-entered command is everything between FTCS_COMMAND_START and end-of-line. Similarly, FTCS_COMMAND_FINISHED could be implied by a new FTCS_PROMPT. Nested shells can be handled by an optional identifier (such as the pid of the shell): If a new prompt has the same key as a previous command, it implicitly marks that command as finished; if it's a new key, a new nested command is created.

This is the main source of pain in bash. An implicit EOL close might work, but you'd want to test the heck out of it with the various kinds of strange prompts that people use: right-hand side prompts, multi-line prompts, plus what happens when you ^C a command, etc. There are a lot of edge cases that are hard to find.

That notwithstanding, nesting would be great. Opting in to features sounds interesting but I can't think of a solid use case for it offhand.

For your E code, what exactly is the key? Is it a way to tell which nested prompt you're talking to?

"For your E code, what exactly is the key?"

The E option would be used instead of A. The key could be any semi-unique string that identifies the "shell" instance - for example the process-id of the shell. The key can be empty for simplicity, though that would not handle nesting properly. (It might make sense to just use A with options instead of a new E code, but that seems likely to be ambiguous if the key is empty.)

"Is it a way to tell which nested prompt you're talking to?"

Yes. Plus it's a way to handle the case when you don't have explicit command termination (no easy reliable way to generate D codes). Before starting a new command, it ends (closes) any existing command with the same key. It creates a nested command if there is no existing open command with the same key.
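The close-or-nest rule can be sketched from the terminal's side as follows. This is only an illustration: the class and method names are hypothetical, and it assumes that closing a command also closes any commands nested inside it, which the description above does not spell out.

```python
class CommandTracker:
    """Tracks open commands by key, outermost first."""

    def __init__(self):
        self.open_keys = []

    def start_command(self, key):
        """Handle a start-of-command marker; return the keys implicitly closed."""
        closed = []
        if key in self.open_keys:
            idx = self.open_keys.index(key)
            # Assumption: commands nested inside the matching one are closed too,
            # innermost first.
            closed = self.open_keys[idx:][::-1]
            del self.open_keys[idx:]
        # Either a fresh command replacing the closed one, or a new nested command.
        self.open_keys.append(key)
        return closed
```

With keys such as process ids, a prompt from an inner shell (new key) nests under the outer shell's running command, and the next prompt from the outer shell closes both.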

Here is my first draft:

https://gitlab.com/PerBothner/terminal-specifications/blob/master/proposals/semantic-prompts.md

I took a peek at the iTerm2 sources and it looks like it supports extra undocumented sequences OSC 133;E, OSC 133;F, OSC 133;G, and OSC 133;H. The OSC 133;E and OSC 133;F sequences in my proposal would therefore conflict, so they probably need different letters.

It is also unclear if the extended OSC 133;D with options would cause a problem; I can't tell from reading the code. Perhaps that should have a new letter as well.

What do you think, @gnachman (and others)?

I made some updates to my draft. Specifically, I believe I've fixed incompatibilities with iTerm2.

I've continued to polish my proposal and its implementation (in the shell-integration branch of DomTerm).

I've written a (draft) blog article with a few screenshots. Suggestions appreciated.

This looks like a generally good improvement on shell integration. My main comments are:

Make sure it's actually possible to implement this on the currently popular shells (bash, zsh, and fish; and tcsh if you feel generous)
Avoid unnecessarily long values in control sequences, like click-move. If the user presses a key while echo is on, it can show up in the middle of a control sequence and break things. Shorter control sequences are less likely to have problems.
Consider not using OSC. Some terminals don't handle unrecognized OSCs well. I wish I had used some private bit of CSI instead.

I have fish and bash working pretty well, but still doing some fine-tuning. I think it may make sense to tweak the recommended protocol so the shell sends a separate "start of prompt" escape after the "start of command" escape. The reason is to better handle repaint-on-resize: The repaint command will re-send PS1 which should include the "start of prompt" escape, but it seems to be more robust if the repaint doesn't re-send the "start of command" escapes. This assumes you have a "pre-command" hook that can send the "start of command" (only).
"If the user presses a key while echo is on ..". Not sure when this would happen. You mean if input editing (a la readline) is disabled? Readline would of course disable tty echo.
Any suggestions? Which terminals don't handle unrecognized OSCs well? Do any of them include xterm in their TERM name? The related question: When should a shell enable these escape sequences? If operating systems start supporting shell integration in the default setup, what should we recommend? In the short term, this support will have to be enabled by each user in their .bashrc (or .config/fish or whatever) but perhaps once multiple terminals support it we could suggest builtin support in shells and/or system rc default files.

I have fish and bash working pretty well, but still doing some fine-tuning. I think it may make sense to tweak the recommended protocol so the shell sends a separate "start of prompt" escape after the "start of command" escape. The reason is to better handle repaint-on-resize: The repaint command will re-send PS1 which should include the "start of prompt" escape, but it seems to be more robust if the repaint doesn't re-send the "start of command" escapes. This assumes you have a "pre-command" hook that can send the "start of command" (only).

We're talking about a situation where this gets painted after a resize:

[start of prompt]Prompt[end of prompt]Command user is typing

Eventually the user hits enter, and you are left with

[start of prompt]Prompt[end of prompt]Command user is typing [beginning of output]Command output…

So I'm a bit confused because both [start of prompt] and [end of prompt] should be part of PS1 and will just magically work. Not sure about right-hand side prompts though, I've never tried to deal with them. Multi-line prompts would work fine too, I expect. You'll want a way for the user to specify which line of the prompt they consider the first, because sometimes users have prompts that begin with a blank line.

"If the user presses a key while echo is on ..". Not sure when this would happen. You mean if input editing (a la readline) is disabled? Readline would of course disable tty echo.

The scenario is this: echo is on. PS1 begins with \e]133;lorem ipsum dolor sit amet\e\\. The shell does the equivalent of write(0, "\e]133;lorem ipsum dolor sit amet\e\\"). write returns 4 because a buffer somewhere in the kernel is full. The user presses X. The TTY driver adds X to the output buffer. The shell then calls write(0, "3;lorem ipsum dolor sit amet\e\\"). The terminal emulator receives \e]13X3;lorem ipsum dolor sit amet\e\\ and the world is sad.
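The interleaving in this scenario can be reproduced with a toy model. This is a simulation sketch only: it does not touch a real TTY, the function name is hypothetical, and the short first write is modelled simply as a split point in the string. The echoed key lands wherever the first write stopped.

```python
def interleave(data, first_write_len, echoed_key):
    """Model a short first write, an echoed keypress, then the rest of the data."""
    first = data[:first_write_len]   # bytes accepted by the first write()
    rest = data[first_write_len:]    # bytes sent by the follow-up write()
    return first + echoed_key + rest


sequence = "\x1b]133;lorem ipsum dolor sit amet\x1b\\"
received = interleave(sequence, 4, "X")
```

Here received no longer starts with a well-formed OSC 133 introducer, so the terminal cannot recognize the marker. The shorter the sequence, the smaller the window in which a stray byte can land inside it.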

Any suggestions? Which terminals don't handle unrecognized OSCs well? Do any of them include xterm in their TERM name? The related question: When should a shell enable these escape sequences? If operating systems start supporting shell integration in the default setup, what should we recommend? In the short term, this support will have to be enabled by each user in their .bashrc (or .config/fish or whatever) but perhaps once multiple terminals support it we could suggest builtin support in shells and/or system rc default files.

VTE added a special case for this one, so you should be safe there. Last I checked SecureCRT still had trouble with unrecognized OSCs.

"We're talking about a situation where this gets painted after a resize: ..."

The problem is how does the terminal distinguish between the start of a new command vs repainting the existing command? True, the shell will delete the old prompt+input before repainting, but the problem is the shell doesn't know how many lines are being displayed on the terminal and where the cursor is: Due to race conditions, by the time the redisplay gets to the terminal, it may be wider or narrower than the shell expects. There may be another window-change notification on the way while the terminal is processing an already-obsolete repaint.

Many terminals will automatically re-flow lines when the width changes, which adds further complication: The terminal's local reflow duplicates and may confuse the re-display done by the shell. A partial solution is for the terminal to suppress re-flow for the active input line(s), but it doesn't fully solve the race condition: The re-display output may use fewer or more lines than the shell expects because the window width may have changed again in the meanwhile. So you end up with artifacts: Lines from previous output getting incorrectly erased, or parts of the input line displaying multiple times.

Getting this working robustly is tricky. Having the shell write a "start of command" escape that is not part of the prompt sequences makes it easier. (Though I'm not sure I have it quite right yet.)

"The scenario is this: echo is on. ..."

But this would only happen if shell input editing is disabled, right? (That is a mode we should avoid breaking, of course.) Or do you mean a race condition where switching to raw mode hasn't fully taken effect in the tty driver, but the prompt has been partially written?

Regardless, I agree it's better to use shorter escape sequences.

"this would only happen if shell input editing is disabled, right?"

I can't reproduce it often enough to debug, but I guess it's some kind of race where echo was not disabled in time. Come to think of it, this may be during PROMPT_COMMAND and not in the prompt.

I've made some updates to my proposal. It also includes scripts that seem to work pretty well for bash, fish, and zsh.

This seems to work pretty well - best with fish GitHub master (along with 2 not-yet merged pull requests, noted in the proposal).

One issue where I need help is compatibility with iTerm2 "in the wild". The script uses a mix of "new" and "old" escape sequences that is likely to confuse an unmodified iTerm2 - most obviously the use of the OSC 133;Z escape sequence where iTerm2 uses OSC 133;D. The only reason to not use the latter is I'd like to pass in an aid option. How would sending OSC 133;D;_exit-code_;aid=_identifier_ work - would iTerm2 be OK with that, or would the trailing ;aid=_identifier_ cause problems? (I can't test it since I don't have a Mac.)

Looking at the iTerm2 source code, it looks like OSC 133;D;_exit-code_;aid=_identifier_ would be ok.

I removed OSC 133;Z in favor of just using OSC 133;D, with some other tweaks to not break things on legacy iTerm2.

It would be great if someone could test the scripts (which I moved to the newproposals/prompts-data directory) on vanilla iTerm2.

Even better if someone could try implementing it on some other terminal besides DomTerm. While iTerm2 should have basic support, it would be nice to add cursor motion from mouse clicks. Also, window resize can be tricky.

mentioned in merge request !6

I created a merge request !6.

No feedback? Nobody cares about shell integration?
