pdftotext -htmlmeta should quote text content
Submitted by Jean-Francois Dockes
Assigned to poppler-bugs
Special HTML characters (<>&"') inside the main text or PDF metadata (e.g. the title) are not escaped in the HTML output, which can result in invalid HTML.
This is trivial to reproduce, but, if you need a sample doc, just ask...
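
For illustration, the escaping requested above could look roughly like the following. This is a minimal sketch in Python, not poppler code: the names escape_html and meta_tag are hypothetical, and the exact tag layout that pdftotext emits is assumed here, not taken from its source.

```python
# Hypothetical helpers (not part of poppler): escape the five characters
# named in the report before writing text or metadata into HTML.
_HTML_ESCAPES = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#39;",
}


def escape_html(text: str) -> str:
    """Return text with the HTML special characters replaced by entities."""
    # Each input character is mapped exactly once, so the "&" of an
    # entity produced here is never escaped a second time.
    return "".join(_HTML_ESCAPES.get(ch, ch) for ch in text)


def meta_tag(name: str, content: str) -> str:
    """Build a <meta> element (assumed layout) with escaped attribute values."""
    return f'<meta name="{escape_html(name)}" content="{escape_html(content)}"/>'


if __name__ == "__main__":
    # A title containing quotes, an ampersand and angle brackets.
    print(meta_tag("title", 'A "B" & <C>'))
```

The same function would need to be applied to the body text as well as to the metadata values, since both end up in the HTML output.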