Searching for two words only works in single lines with some pdf files
Summary
Searching for two words only works in single lines with some pdf files
Description
I found that when searching for two (or more) words, Evince does not show results where the first word is at the end of a line and the second is at the beginning of the next line.
This definitely happens with files exported from LibreOffice, yet the same files can be searched correctly in Okular and Qoppa PDF Studio.
I attached an example PDF. Try searching in it for:
take steps
refused protection
Solution
The problem is that the search code used by Poppler's glib binding, TextPage::findText(), currently does not support matching across two lines when the second line falls in the next paragraph. PDF files exported from LibreOffice documents with line spacing > 1.5 are interpreted by Poppler as having each line in its own paragraph (because of the line spacing).
Regardless of whether Poppler's paragraph detection code could be improved, an obvious fix is to make TextPage::findText() also match from the last line of a paragraph to the first line of the next paragraph, which is what the submitted MR does.
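To illustrate the idea, here is a minimal Python sketch of the matching behaviour described above. It is not Poppler's code and does not reflect the actual MR: the page model (a list of paragraphs, each a list of line strings), the function name find_phrase and the cross_paragraphs flag are all assumptions made for illustration. Matching is a simple case-insensitive substring search, which is a simplification.

```python
from typing import List, Tuple

# Hypothetical page model: a page is a list of paragraphs,
# and each paragraph is a list of text lines.
Paragraph = List[str]


def find_phrase(paragraphs: List[Paragraph], phrase: str,
                cross_paragraphs: bool = True) -> List[Tuple[int, int]]:
    """Return (paragraph index, line index) of each line where a match starts.

    A match may continue onto the following line. If cross_paragraphs is
    False, the following line is only considered when it belongs to the
    same paragraph (the behaviour described as the bug).
    """
    needle = " ".join(phrase.lower().split())
    if not needle:
        return []

    # Flatten to (paragraph index, line index, normalized text).
    lines = [(p, l, " ".join(text.lower().split()))
             for p, para in enumerate(paragraphs)
             for l, text in enumerate(para)]

    hits = []
    for i, (p, l, text) in enumerate(lines):
        candidate = text
        if i + 1 < len(lines):
            next_p, _, next_text = lines[i + 1]
            if cross_paragraphs or next_p == p:
                # Treat the line break as a single space.
                candidate = text + " " + next_text
        start = candidate.find(needle)
        # Only count matches that begin inside the current line, so a
        # match lying entirely in the next line is not reported twice.
        while start != -1 and start < len(text):
            hits.append((p, l))
            start = candidate.find(needle, start + 1)
    return hits


# Each line is its own paragraph, as with the LibreOffice export above.
page = [["He refused to take"],
        ["steps to secure refused"],
        ["protection for them"]]
```

With this sample page, find_phrase(page, "take steps") finds the match starting in the first line, while find_phrase(page, "take steps", cross_paragraphs=False) finds nothing, which mirrors the behaviour reported in this issue.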