glib: mismatch between find text results coordinates and their corresponding utf8 characters in text
The bug happens in many PDFs (but not all; it depends on the text). I could even reproduce it with one of the poppler test PDF files, see below.
How to reproduce:
- Open searchAcrossLines.pdf in Evince.
- Search for "cubo" text, one single match should appear on the first page.
- Notice that the matched text shown in bold in the sidebar is wrong: instead of cubo (the matched text), it shows bo M in bold.
The problem is that Evince, having the graphical coordinates of the matched text, is unable to correctly locate it in the text from
I tracked the bug down to my commit d6cccfb8. The reason is that the logic change made there (respecting
spaceAfter property) in the
TextSelectionDumper::getText() code must also be mirrored in the similar code of
poppler_page_get_text_attributes_for_area() functions, because those are used together by Evince to map between graphical coordinates of text and their corresponding position in the utf8 text from
So, I'm sending an MR with the mentioned fix and with a glib test that catches this bug.
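
To illustrate the kind of index drift described above, here is a minimal toy model in C++. It is not poppler code: the `Word` struct and the `buildText`, `buildLayout` and `charsOwnedBy` helpers are hypothetical names made up for this sketch, and the word list is invented to mimic the "cubo" case. It only assumes that the text and the per-character layout are consumed index by index, so both must agree on when a separator is emitted.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a word on the page; not a poppler type.
struct Word {
    std::string text;
    bool spaceAfter; // true if a space separates this word from the next one
};

// Builds the page text, emitting a separator only when spaceAfter is set.
std::string buildText(const std::vector<Word> &words) {
    std::string out;
    for (const Word &w : words) {
        out += w.text;
        if (w.spaceAfter) {
            out += ' ';
        }
    }
    return out;
}

// Builds one layout entry per character: the index of the owning word,
// or -1 for a separator. With respectSpaceAfter == false a separator is
// emitted after every word except the last one (the out-of-sync rule).
std::vector<int> buildLayout(const std::vector<Word> &words, bool respectSpaceAfter) {
    std::vector<int> owner;
    for (std::size_t i = 0; i < words.size(); ++i) {
        owner.insert(owner.end(), words[i].text.size(), static_cast<int>(i));
        const bool isLast = (i + 1 == words.size());
        const bool emitSeparator = respectSpaceAfter ? words[i].spaceAfter : !isLast;
        if (emitSeparator) {
            owner.push_back(-1);
        }
    }
    return owner;
}

// Returns the text characters whose layout entry belongs to word `target`,
// i.e. what a viewer would highlight when it trusts index-to-index mapping.
std::string charsOwnedBy(const std::string &text, const std::vector<int> &owner, int target) {
    std::string out;
    for (std::size_t i = 0; i < text.size() && i < owner.size(); ++i) {
        if (owner[i] == target) {
            out += text[i];
        }
    }
    return out;
}
```

With the words `{"ab", false}, {"cd", false}, {"cubo", true}, {"Mx", false}`, `buildText` gives `abcdcubo Mx`. Looking up word 2 through `buildLayout(words, false)` yields `bo M`, because two extra separators shifted the layout by two positions, while `buildLayout(words, true)` yields `cubo`.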