pdfimages can't extract PNGs losslessly
When pdfimages is used to extract images in a supposedly lossless format (e.g. PNG), it performs a destructive color space conversion, making it impossible to recover the original pixel data.
Paradoxically, this means that pdfimages fares better at recovering JPEGs (a lossy format) than PNGs, as the JPEG data from a DCTDecode-filtered stream is reproduced as is.
I think pdfimages should either:
- Embed the color profile of the image object in the output PNG (i.e. an ICCBased color profile can simply be embedded in the PNG as an iCCP chunk).
- Warn the user that a destructive color space conversion will take place.
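
To illustrate the first option, here is a minimal Python sketch of how an ICC profile could be carried in a PNG as an iCCP chunk. It is not how pdfimages works today, and the helper names (make_chunk, embed_icc_profile) and the default profile name are hypothetical. It assumes the profile bytes have already been taken from the PDF's ICCBased stream, that the PNG holds the unconverted pixel data, and that the PNG does not already contain an iCCP or sRGB chunk.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Serialize one PNG chunk: length, type, data, CRC-32 over type + data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def embed_icc_profile(png: bytes, icc_profile: bytes, name: str = "ICC profile") -> bytes:
    """Return a copy of `png` with an iCCP chunk inserted right after IHDR.

    Assumption: the input has no existing iCCP or sRGB chunk.
    """
    if not png.startswith(PNG_SIGNATURE) or png[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    encoded_name = name.encode("latin-1")
    if not 1 <= len(encoded_name) <= 79:
        raise ValueError("profile name must be 1 to 79 bytes")

    # IHDR is always the first chunk and always has 13 data bytes:
    # 8 (signature) + 4 (length) + 4 (type) + 13 (data) + 4 (CRC) = 33.
    ihdr_end = len(PNG_SIGNATURE) + 4 + 4 + 13 + 4

    # iCCP payload: name, NUL separator, compression method 0 (deflate),
    # then the zlib-compressed profile.
    payload = encoded_name + b"\x00" + b"\x00" + zlib.compress(icc_profile)

    # iCCP must precede PLTE and IDAT, so placing it directly after IHDR is safe.
    return png[:ihdr_end] + make_chunk(b"iCCP", payload) + png[ihdr_end:]
```

The pixel data itself is left untouched, so a color-managed viewer can interpret the original sample values through the embedded profile instead of relying on a conversion done at extraction time.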