Text node broken into multiple parts
In this PDF file, a text node is broken into two parts and a character is left out.
This string is found in the first invoice line:
0,89MM TRÅD - 6 MASKER PR. TOMME
The string is broken into two parts and the Å is left out:
0,89MM TR
D - 6 MASKER PR. TOMME
The same happens with this text node: NSH BINDETRÅD-VINDSELTRD 0,55
Command used:
pdftohtml -s -i -nodrm -xml test.pdf out.xml
Version: poppler 0.72
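To make the split easier to spot in the XML output, here is a minimal Python sketch that lists every line made up of more than one text node. It assumes the output uses `<page number="...">` elements containing `<text top="..." left="...">` elements, and that fragments of one line share the same `top` value. The function name `text_nodes_by_line` and the sample XML are hypothetical; the sample is not taken from the real out.xml.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def text_nodes_by_line(xml_source):
    """Return (page_number, top, [fragments]) for each line that is
    made up of more than one <text> node (hypothetical helper)."""
    root = ET.fromstring(xml_source)
    split_lines = []
    for page in root.iter("page"):
        lines = defaultdict(list)
        for node in page.iter("text"):
            # itertext() also collects text nested in <b>/<i> children.
            fragment = "".join(node.itertext())
            left = int(node.get("left", "0"))
            lines[node.get("top")].append((left, fragment))
        for top, fragments in lines.items():
            if len(fragments) > 1:
                fragments.sort()  # order fragments left to right
                split_lines.append(
                    (page.get("number"), top, [text for _, text in fragments])
                )
    return split_lines


# Hypothetical excerpt shaped like the output described in this report.
SAMPLE = """<pdf2xml>
<page number="1">
<text top="100" left="50" width="60" height="12" font="0">0,89MM TR</text>
<text top="100" left="118" width="150" height="12" font="0">D - 6 MASKER PR. TOMME</text>
<text top="120" left="50" width="40" height="12" font="0">1,00</text>
</page>
</pdf2xml>"""

for page, top, fragments in text_nodes_by_line(SAMPLE):
    print(page, top, fragments)
```

For the real file, the contents of out.xml can be read and passed to the function in place of SAMPLE. Lines that legitimately consist of several columns will also be listed, so the result is only a starting point for finding the broken nodes.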