bidi issues
https://gitlab.freedesktop.org/terminal-wg/bidi/-/issues
2019-02-07T11:55:23Z

https://gitlab.freedesktop.org/terminal-wg/bidi/-/issues/1
Bidi mode (BDSM)
2019-02-07T11:55:23Z mintty

Page: The escape sequences
Section: Implicit vs. explicit
It would be helpful to describe bidi enabling more clearly in the mode sequences:
* Implicit mode (enable bidi): CSI 8 h
* Explicit mode (disable bidi): CSI 8 l
because that’s the terminal point of view.
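A minimal sketch of how an application might build the two mode sequences listed above, assuming the 7-bit form of CSI (`ESC [`). The helper name `bdsm_sequence` is hypothetical and not part of any terminal's API.

```python
CSI = "\x1b["  # 7-bit Control Sequence Introducer: ESC followed by '['


def bdsm_sequence(implicit: bool) -> str:
    """Return the mode-8 sequence from the list above.

    implicit=True  -> CSI 8 h (implicit mode, bidi enabled)
    implicit=False -> CSI 8 l (explicit mode, bidi disabled)
    """
    return f"{CSI}8{'h' if implicit else 'l'}"


if __name__ == "__main__":
    # Show the sequences in escaped form instead of sending them to the terminal.
    print(repr(bdsm_sequence(True)))
    print(repr(bdsm_sequence(False)))
```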
As I understand it, these are global modes, i.e. the whole display is immediately affected (?).
Mintty implements these sequences.
It also has another (private mode) sequence to disable bidi on the current line only.
This may be helpful for an application to draw menus and other boxes but leave bidi lines outside their area untouched.

https://gitlab.freedesktop.org/terminal-wg/bidi/-/issues/2
Clarification of directional mode (SCP)
2019-02-18T10:32:58Z mintty

From ECMA-48 it is not completely clear whether this is something like RLO
or more like a preset paragraph embedding level.
As RLO control exists in Unicode, the latter would be more useful,
so that assumption makes more sense.
Mintty experimentally implements as follows:
* SCP 0 selects mintty default behaviour; this means each line is handled like a separate paragraph, UBA rule P2 is applied initially.
* SCP 1 invokes the bidi algorithm with paragraph embedding level 0, rule P2 is skipped.
* SCP 2 invokes the bidi algorithm with paragraph embedding level 1, rule P2 is skipped.

https://gitlab.freedesktop.org/terminal-wg/bidi/-/issues/3
Application of directional mode (SCP)
2019-02-07T12:33:04Z mintty

ECMA-48 is a bit fuzzy, or inconsistent, about how to apply SCP.
It shall be applied to the current line immediately, but apparently not to another line when moving to it (e.g. with CUP)?
If that’s the correct interpretation, some clarification would help.
In the current mintty implementation, it is furthermore interpreted like this:
SCP is applied to the current line (as specified) so it is immediately updated on the screen; also SCP is set as a cursor property (dual behaviour);
SCP mode is retained until another SCP is sent (except for terminal reset (soft/hard) which resets to default).
However, if the cursor moves (linefeed, CUP etc.), the SCP mode is *not* applied immediately, but only when a character is actually written to the current position (this was more straightforward to implement).
As an exception (also because it was more straightforward to implement), writing only a combining character (e.g. directly after CUP) would not change the line property yet. Also, as this is a cursor property, it would be restored with a Restore Cursor operation (DECRC and related controls).
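The interpretation described so far can be modelled as a small state machine. This is only a sketch of the behaviour as worded in this report, not mintty's actual code. The class and method names are hypothetical, `unicodedata.combining()` is used as a stand-in for "is a combining character", and it is assumed that restoring the cursor does not by itself update any line.

```python
import unicodedata


class ScpModel:
    """Toy model: SCP is both a line property and a cursor property."""

    def __init__(self, rows: int = 4) -> None:
        self.line_scp = [0] * rows   # per-line directional mode
        self.row = 0                 # current cursor row
        self.cursor_scp = 0          # SCP value carried by the cursor
        self.saved = (0, 0)          # (row, cursor_scp) kept by save_cursor

    def scp(self, value: int) -> None:
        # Applied to the current line immediately and remembered on the cursor.
        self.cursor_scp = value
        self.line_scp[self.row] = value

    def move_to(self, row: int) -> None:
        # Cursor movement alone does not touch the target line's property.
        self.row = row

    def write(self, ch: str) -> None:
        # Only a non-combining character applies the cursor's SCP to the line.
        if unicodedata.combining(ch) == 0:
            self.line_scp[self.row] = self.cursor_scp

    def save_cursor(self) -> None:
        self.saved = (self.row, self.cursor_scp)

    def restore_cursor(self) -> None:
        # Assumption: the restored value is applied only on the next write.
        self.row, self.cursor_scp = self.saved

    def reset(self) -> None:
        # Soft or hard reset returns the cursor property to the default.
        self.cursor_scp = 0
```

For example, after `scp(2)` on row 0 and `move_to(1)`, row 1 keeps its old property until a base character is written there.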
With this interpretation, I think it is not necessary to define or implement any special handling for other escape sequences (IL, EL, SU, ED, ...).

https://gitlab.freedesktop.org/terminal-wg/bidi/-/issues/4
Arabic joining
2019-02-22T14:28:58Z mintty

In Arabic shaping, I was told it's a strict requirement to join LAM and ALEF (and some combined variations) into a LAM/ALEF ligature.
Mlterm does this and appends following text directly to the ligature, for a seamless text flow.
Mintty also applies the glyph joining but leaves a space after it. While the result is ugly, it is consistent with the character-based locale-related width properties, which I think a terminal should maintain (also for emojis as handled by mintty).
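The difference between the two behaviours can be illustrated with a toy cell layout. This is a simplified sketch under stated assumptions: the function name `layout_cells` is hypothetical, the set of ALEF variants is only a guess at the "combined variations" mentioned above, and combining marks, wide characters and visual reordering are ignored.

```python
LAM = "\u0644"
# Assumed ALEF variants: with madda above, with hamza above, with hamza below, plain.
ALEFS = {"\u0622", "\u0623", "\u0625", "\u0627"}


def layout_cells(text: str, keep_width: bool) -> list[str]:
    """Assign characters (in logical order) to terminal cells.

    keep_width=True  : the ligature is followed by an empty cell, so the two
                       characters still occupy two cells (the mintty behaviour).
    keep_width=False : following text is appended directly after the ligature
                       (the mlterm behaviour).
    """
    cells = []
    i = 0
    while i < len(text):
        if text[i] == LAM and i + 1 < len(text) and text[i + 1] in ALEFS:
            cells.append(text[i:i + 2])   # both characters drawn as one glyph
            if keep_width:
                cells.append("")          # padding cell keeps the width at 2
            i += 2
        else:
            cells.append(text[i])
            i += 1
    return cells
```

With `keep_width=True`, the number of cells always equals the number of input characters in this model, which is the locale-consistent property argued for above.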
There should be a statement about this in the recommendation, maybe even a switchable mode. The preference or default should be the locale-consistent behaviour, I think.

https://gitlab.freedesktop.org/terminal-wg/bidi/-/issues/5
Autodetection
2019-02-18T11:41:36Z mintty

Chapter "The basic modes" section "Autodetecting the direction" says
"UAX #9 doesn’t specify an exact algorithm for autodetecting the paragraph direction."
That's not quite correct; it does, it just adds as an exception
"Whenever a higher-level protocol specifies the paragraph level, rules P2 and P3 may be overridden."
Chapter "The escape sequences", section "Other output modes", presents "a few weak arguments for picking the disabled state as the default".
However, UBA rules P2 and P3, i.e. autodetection, are clearly the default suggested by UBA.
I think this is a strong point to define autodetection to be enabled by default.
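For reference, rules P2 and P3 can be sketched in a few lines. This is a minimal illustration, assuming the input is a single paragraph and relying on Python's `unicodedata` for the bidi classes. The function name and the `fallback` parameter are hypothetical; UBA itself yields level 0 when no strong character is found.

```python
import unicodedata

ISOLATE_INITIATORS = {"LRI", "RLI", "FSI"}


def paragraph_level(text: str, fallback: int = 0) -> int:
    """Return the paragraph embedding level per UBA rules P2 and P3.

    P2: find the first character of class L, R or AL, skipping anything
        between an isolate initiator and its matching PDI (or the end of
        the paragraph if there is no matching PDI).
    P3: R or AL gives level 1, L gives level 0.
    """
    depth = 0  # number of currently open isolates
    for ch in text:
        cls = unicodedata.bidirectional(ch)
        if cls in ISOLATE_INITIATORS:
            depth += 1
        elif cls == "PDI":
            if depth > 0:
                depth -= 1
        elif depth == 0:
            if cls == "L":
                return 0
            if cls in ("R", "AL"):
                return 1
    return fallback  # no strong character outside isolates
```

A higher-level protocol that specifies the paragraph level, as the quoted exception allows, would simply bypass this function.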
In any case, I think this is an essential parameter to influence the actual handling, so it should not be hidden under "Other output modes" in the description; also as I've suggested elsewhere I think a DEC private mode does not properly reflect the importance of this setting.
In the test file, please add an explicit escape to disable autodetection initially, as test evaluation depends on this setting.

https://gitlab.freedesktop.org/terminal-wg/bidi/-/issues/6
SCP with new parameters for autodetection
2019-02-23T11:15:13Z Egmont Koblinger

From https://gitlab.freedesktop.org/terminal-wg/bidi/issues/2#note_116279:
We could think of adding a new (third) parameter to SCP to specify the autodetection mode(s), rather than separate DECSET(s). I generally like this idea.
One problematic area though is to distinguish "set the default direction" (currently SCP 0 or SCP without a parameter) from "leave the default direction intact" (when there'd only be a third parameter to the redesigned SCP).

https://gitlab.freedesktop.org/terminal-wg/bidi/-/issues/7
Rework shaping
2019-02-22T14:58:02Z Egmont Koblinger

As per Richard's feedback on the Unicode mailing list ([Jan](http://www.unicode.org/mail-arch/unicode-ml/y2019-m01/), [Feb](http://www.unicode.org/mail-arch/unicode-ml/y2019-m02/) '19 – there's not a concrete mail to link to):
Shaping shouldn't be done using presentation form characters (in explicit mode). They have pretty limited capabilities, and even though aren't formally deprecated/discouraged, they should be avoided.
Shaping needs to be done by the emulator, on the characters it displays, even in explicit mode.
This approach can't account for the neighboring character if it's outside of the terminal's area designated for displaying that particular word (i.e. either completely outside of the terminal's area, or this area is used for something else, like a sidebar of the application or a vertical tmux split etc.). If required, a future extension could specify the invisible "neighbor" characters for each cell: a character (or at least its basic shaping-related properties) to imagine to be on the left for shaping purposes, and another one to imagine to be on the right.
There might also be a need for break points (e.g. where two semantically different fields touch each other on the UI), we should see if ZWNJ is suitable here (and where exactly it should be put in explicit LTR / explicit RTL modes), and whether ZWNJ is needed at all or the more generic "imaginary left/right neighbors" would make it unnecessary.
Anyway, these are for future extensions based on real life demand. In the first round, terminals should just perform shaping on the visual contents.

https://gitlab.freedesktop.org/terminal-wg/bidi/-/issues/8
Paragraph dir autodetection on bigger scope
2019-02-23T11:03:14Z Egmont Koblinger

As per Eli's feedback on the Unicode mailing list ([Jan](http://www.unicode.org/mail-arch/unicode-ml/y2019-m01/), [Feb](http://www.unicode.org/mail-arch/unicode-ml/y2019-m02/) '19 – there's not a concrete mail to link to):
Many text files are formatted in a way that they use a certain margin (let's say 72 characters) (using explicit newline characters, obviously), and chunks of human-perceived paragraphs (typically 3–15 lines or so) are delimited by empty lines (two consecutive newline characters). Examples include most of the well-known license files, or [TUTORIAL.he](http://git.savannah.gnu.org/cgit/emacs.git/plain/etc/tutorials/TUTORIAL.he) as contained within Emacs's source tree, or Markdown files...
In order to have the best possible automatic behavior when `cat`'ing such a file, the paragraph direction should be autodetected only once for each such emptyline-delimited segment; plus before and after shell prompts, which depends on [Semantic markers for prompts](https://gitlab.freedesktop.org/terminal-wg/specifications/issues/4).
Now, there's a huge confusion around the terminology. Such a file can of course be `cat`'ed on a terminal of 60 columns. Lines of the text file don't map to lines of the terminal, they map to "paragraphs" of the terminal (as both our specification and the Unicode BiDi Algorithm define the term "paragraph"). And similarly, freaking confusingly, "human-perceived paragraphs" (emptyline-delimited segments) of such text files don't map to "paragraphs" of the terminal and the UBA. So perhaps let's call the emptyline-delimited and prompt-delimited human-perceived paragraphs "segments".
So, for each "segment", one single "paragraph direction" would be autodetected. Then this autodetected value would be applied on all the "paragraphs" (text file's lines) of this "segment".
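A rough sketch of this per-segment autodetection, under simplifying assumptions: segments are delimited by empty lines only (shell prompts are not considered), isolates are not skipped, and the direction is taken from the first strong character found anywhere in the segment. The function names are hypothetical.

```python
import unicodedata


def first_strong(line: str):
    """Return "LTR" or "RTL" for the first strong character, else None."""
    for ch in line:
        cls = unicodedata.bidirectional(ch)
        if cls == "L":
            return "LTR"
        if cls in ("R", "AL"):
            return "RTL"
    return None


def segment_directions(lines: list[str]) -> list:
    """Return one direction per text-file line ("paragraph").

    All lines of one emptyline-delimited segment share the same value.
    Empty lines and segments without any strong character get None.
    """
    result = [None] * len(lines)
    start = 0
    for end in range(len(lines) + 1):
        if end == len(lines) or lines[end].strip() == "":
            # lines[start:end] is one segment: detect once, apply to all.
            direction = None
            for line in lines[start:end]:
                direction = first_strong(line)
                if direction is not None:
                    break
            for i in range(start, end):
                result[i] = direction
            start = end + 1
    return result
```

In this model, an English line inside a Hebrew-led segment still gets RTL, which is the point of detecting once per segment instead of once per line.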
(In the meantime, the idea of defining a "paragraph" as the emptyline-delimited parts, and running UBA on this as a whole, an idea that I present as a possible future extension, is utterly broken, as I argue in one of the mails on the Unicode list, and should be dropped, or let's say superseded by this new mode.)
In this new mode, the fallback paragraph direction would matter less often than in the per-paragraph autodetection mode. There'd still be cases where it makes a difference, though. Emacs then uses the previous section's direction (or LTR at the very top). Not sure if we should also do so; or maybe even say that the shell prompt is a hard break where we shouldn't look back any further, while at empty lines it's fair to go back.
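The Emacs-style fallback mentioned here could look as follows. This is a sketch only: it assumes one autodetected value per segment is already available, with `None` meaning that no strong character was found, and the function name is hypothetical. A prompt acting as a hard break is not modelled.

```python
def resolve_with_fallback(detected: list, default: str = "LTR") -> list[str]:
    """Fill in undetected segment directions from the previous segment.

    A segment whose direction is None inherits the resolved direction of
    the segment before it; the very first one falls back to `default`.
    """
    resolved = []
    previous = default
    for direction in detected:
        if direction is None:
            direction = previous
        resolved.append(direction)
        previous = direction
    return resolved
```

Treating a shell prompt as a hard break would amount to resetting `previous` to `default` at each prompt.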
---
Note however that we're talking about a field where there's no clear definition, and pretty much all implementations differ. E.g. the aforelinked TUTORIAL.he shows up in three different ways in Emacs, Firefox and Chromium. (The contents of the file, with regard to necessary BiDi control chars around embedded English terms, were also built up with Emacs's rendering in mind, so in that sense the file is Emacs-specific.) I don't think it can reasonably be expected of a legacy world like terminal emulation to suddenly do something better than the mixture that these other apps do.
It might make sense to say that Emacs's is the most reasonable approach; however, it's not a strict rule. And let's keep in mind that the terminal emulator has to account for many vastly different use cases; `cat`'ing text files formatted in this particular way is just one of them. However, a utility that emits most of its output in one particular language resembles this use case quite closely.
With these in mind, and the yet unresolved dependency on shell prompt detection, I'm really uncertain if we should define such a mode, let alone make it the default. On the other hand, probably this mode would provide the best out-of-the-box user experience, so it sounds reasonable to make it the default.
But once we can detect the shell prompt, isn't it even better to go bigger by yet another step, and autodetect the directionality on the utility's entire output as one, not caring about empty vs. nonempty lines?
---
Yet another thing to ponder is for terminal emulators to add options to their right-click menu that retrospectively alter the given paragraph's direction. So the user does a "cat file", notices its bad paragraph direction, right-clicks, picks "RTL", and it's repaired! Again something that should work on a larger scope, presumably a utility's entire output, and thus depends on the semantic prompt feature.
---
Another implication of this new mode is that in VTE we'd practically have to switch to fullscreen repaints all the time; aiming for any smaller scope (which would mean "segments") just significantly overcomplicates everything (having to detect whenever an empty line becomes nonempty or the other way around, or other crazy sorts of optimizations) for marginal benefits (still repainting a lot).

https://gitlab.freedesktop.org/terminal-wg/bidi/-/issues/9
"The escape sequences" -> "DCSM"
2019-05-28T12:17:54Z Egmont Koblinger

Add such a brief subsection stating that this escape sequence has been revoked by this spec, DCSM forced to DATA (SET state).

https://gitlab.freedesktop.org/terminal-wg/bidi/-/issues/10
DECSTBM and soft/hard wrap
2019-06-29T22:06:45Z Christian Persch

A thought I had while reviewing the vte bidi branch is that DECSTBM might set lines to hard wrap at the boundaries directly upon /entering/ restricted scrolling mode, instead of this only happening later when actual restricted scrolling occurs.
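
One possible reading of this idea, as a toy model rather than VTE code: each row carries a flag saying whether it soft-wraps into the next row, and setting the scrolling region clears that flag on the two rows that end at a region boundary. The function name, the 0-based inclusive row numbering and the flag representation are all assumptions made for illustration.

```python
def enter_scroll_region(soft_wrapped: list[bool], top: int, bottom: int) -> list[bool]:
    """Return new wrap flags after setting the scrolling region [top, bottom].

    soft_wrapped[i] is True when row i continues on row i + 1.
    Rows are 0-based and the region bounds are inclusive (assumption).
    """
    flags = list(soft_wrapped)
    if top > 0:
        flags[top - 1] = False      # row above the region no longer wraps into it
    if bottom < len(flags) - 1:
        flags[bottom] = False       # last region row no longer wraps out of it
    return flags
```

With five soft-wrapped rows and a region covering rows 1 to 3, rows 0 and 3 become hard-wrapped immediately, before any scrolling happens.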