Text selection is broken in RTL scripts
Submitted by Germán Poo-Caamaño
Assigned to poppler-bugs
Created attachment 82000 PDF Test case
I am reporting this against poppler-glib, though it might be a general issue with how poppler handles PDFs that contain RTL scripts. As reported in GNOME Bugzilla:
"When I try to select text written in an RTL script (Hebrew, in my case) the selection algorithm treats it like LTR text. This means that if I start a selection in the midlle of a line in an RTL paragraph and move the cursor to the middle of the next line, the selection spans to the right on the first line and from the left up to the cursor in the next line. This is okay for LTR, but in RTL scripts it should work exactly the opposite. The expected behaviour is just like the GtkTextView widget's."
I also tested it with poppler-glib-demo. It is reproducible when selecting multiple lines. One of the comments has a clue about the problem: https://bugzilla.gnome.org/show_bug.cgi?id=326083#c5
"PDF stores Hebrew internally as visual Hebrew (LTR), so Envice will have to detect Hebrew text and work around this. It seems like Envice Evince 0.5.2 Using poppler 0.5.1 (splash) correctly reverses Hebrew when pasting to gedit, but does not do the selection properly (like Yaron noted)."
Attachment 82000, "PDF Test case":