pdftohtml: fake-bold and drop-shadow text is duplicated
Submitted by Jason Crain
Assigned to poppler-bugs
Link to original bug (#101807)
Description
Running pdftohtml on the PDF from bug #101770 (https://bugs.freedesktop.org/attachment.cgi?id=132659) produces duplicated and jumbled characters.
Some PDFs draw text multiple times to emulate bold text or drop shadows. The main TextOutputDev goes to a lot of trouble to remove this duplicated text. pdftohtml should do this too.
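To illustrate the kind of filtering meant here, below is a minimal standalone sketch. It is not taken from TextOutputDev or pdftohtml: the `Glyph` struct, the `dropOverprintedGlyphs` function, and the tolerance expressed as a fraction of the font size are all hypothetical names and assumptions. The idea is to drop a glyph when an already-kept glyph has the same code point and nearly the same origin.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical glyph record; pdftohtml's real string/character types differ.
struct Glyph {
    unsigned int code; // Unicode code point
    double x, y;       // glyph origin in page coordinates
    double fontSize;   // font size in the same units
};

// Return the glyphs with overprinted copies removed. A glyph is treated as a
// duplicate when an already-kept glyph has the same code point and its origin
// lies within `tolerance * fontSize` on both axes (assumed heuristic).
std::vector<Glyph> dropOverprintedGlyphs(const std::vector<Glyph> &glyphs, double tolerance)
{
    assert(tolerance >= 0.0);
    std::vector<Glyph> kept;
    for (const Glyph &g : glyphs) {
        const double limit = tolerance * g.fontSize;
        bool duplicate = false;
        for (const Glyph &k : kept) {
            if (k.code == g.code && std::fabs(k.x - g.x) <= limit && std::fabs(k.y - g.y) <= limit) {
                duplicate = true;
                break;
            }
        }
        if (!duplicate) {
            kept.push_back(g);
        }
    }
    return kept;
}
```

The scan is quadratic in the number of glyphs, which keeps the sketch short. A real fix would probably compare only against nearby characters or words, and would need to pick a tolerance that catches fake-bold offsets without merging legitimately repeated letters.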