pdftohtml: single-page HTML files using data-urls, stdout-capable

Greg Knight requested to merge lyngvi/poppler:dataurls into master

These changes add the command-line argument -dataurls to the pdftohtml utility. When used with -s, it lets a user write a single HTML file while preserving the images from the PDF. Images are embedded in the HTML as data URLs (RFC 2397). The automatic suppression of the -stdout flag when -s is given was removed, since writing to stdout works well with -s and makes sense for this application.
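To illustrate what a data URL looks like, here is a minimal sketch, not the code from this branch: it base64-encodes a byte buffer and prefixes it with the `data:` scheme and a MIME type, following RFC 2397. The names `base64Encode` and `makeDataUrl` are hypothetical and chosen only for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Encode bytes with the standard base64 alphabet (RFC 4648), using '=' padding.
static std::string base64Encode(const std::vector<std::uint8_t> &data)
{
    static const char alphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    out.reserve(((data.size() + 2) / 3) * 4);

    std::size_t i = 0;
    // Full 3-byte groups become 4 output characters each.
    for (; i + 2 < data.size(); i += 3) {
        const std::uint32_t n = (std::uint32_t(data[i]) << 16)
                              | (std::uint32_t(data[i + 1]) << 8)
                              | std::uint32_t(data[i + 2]);
        out += alphabet[(n >> 18) & 63];
        out += alphabet[(n >> 12) & 63];
        out += alphabet[(n >> 6) & 63];
        out += alphabet[n & 63];
    }

    // A trailing group of 1 or 2 bytes is padded with '='.
    const std::size_t rem = data.size() - i;
    if (rem == 1) {
        const std::uint32_t n = std::uint32_t(data[i]) << 16;
        out += alphabet[(n >> 18) & 63];
        out += alphabet[(n >> 12) & 63];
        out += "==";
    } else if (rem == 2) {
        const std::uint32_t n = (std::uint32_t(data[i]) << 16)
                              | (std::uint32_t(data[i + 1]) << 8);
        out += alphabet[(n >> 18) & 63];
        out += alphabet[(n >> 12) & 63];
        out += alphabet[(n >> 6) & 63];
        out += '=';
    }
    return out;
}

// Hypothetical helper: build "data:<mime>;base64,<payload>" for use in an
// <img src="..."> attribute.
static std::string makeDataUrl(const std::string &mimeType,
                               const std::vector<std::uint8_t> &bytes)
{
    return "data:" + mimeType + ";base64," + base64Encode(bytes);
}
```

An image written this way needs no companion file on disk, which is what makes a single-file (or stdout) HTML output self-contained.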

To avoid extensive rework of ImgWriter and its friends, I am using the glibc-specific fopencookie function. Preprocessor guards prevent this feature from being enabled on Android or MinGW, where fopencookie is not available.
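For readers unfamiliar with fopencookie, the following is a rough sketch of the general technique, not the code in this branch: it wraps a std::string in a FILE* so that code which only knows how to write to a FILE* ends up writing into memory. The helper names `stringWrite` and `openStringStream` are hypothetical, and the `__GLIBC__` check is an assumption standing in for whatever guard the branch actually uses.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE // fopencookie is a GNU extension
#endif
#include <cstdio>
#include <string>
#include <sys/types.h>

#if defined(__GLIBC__) // assumed guard; the branch may test different macros

// Write callback: append the bytes to the std::string passed as the cookie.
static ssize_t stringWrite(void *cookie, const char *buf, size_t size)
{
    static_cast<std::string *>(cookie)->append(buf, size);
    return static_cast<ssize_t>(size);
}

// Hypothetical helper: return a write-only FILE* whose output lands in *sink.
// The caller must fclose() the stream and keep *sink alive until then.
static FILE *openStringStream(std::string *sink)
{
    cookie_io_functions_t io = {}; // read, seek and close stay null
    io.write = stringWrite;
    return fopencookie(sink, "w", io);
}

#endif // __GLIBC__
```

Because stdio buffers its output, the sink is only guaranteed to hold everything after fflush() or fclose() has been called on the stream.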

I tested against 0.70.1 and master; the current master (8315a1234) is affected by a bug that is fixed in my branch goostring-fromint-fix (see the other pull request).

Please let me know if any further information is needed.

