[pdftoppm] Inconsistent PPM-root page numbering/output file naming
When extracting a page with the following command:
pdftoppm -f 1 -l 1 -cropbox -png myfile.pdf myfile
the program is not consistent in how it numbers the output file. If myfile.pdf has fewer than 10 pages, the output file is myfile-1.png, but if it has 10 or more pages, the output file is myfile-01.png.
I believe this is a bug, since it makes it hard to access the resulting files programmatically. In my opinion it would make more sense if the page number in the output file name were not prefixed with 0 for pages 1 to 9.
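Until the naming is consistent, a caller can avoid guessing the padding by matching the page number with any number of leading zeroes. The following is a minimal sketch of that workaround, not part of pdftoppm; the helper name find_page_file is hypothetical, and it assumes the output follows the root-number.ext pattern shown above.

```python
import re
from pathlib import Path
from typing import Optional


def find_page_file(directory: str, root: str, page: int, ext: str = "png") -> Optional[Path]:
    """Return the output file for `page`, whatever zero-padding was used.

    Hypothetical helper: assumes files are named <root>-<page>.<ext>,
    where <page> may carry leading zeroes.
    """
    # "0*" accepts myfile-1.png, myfile-01.png, myfile-001.png, ...
    pattern = re.compile(rf"{re.escape(root)}-0*{page}\.{re.escape(ext)}")
    for candidate in sorted(Path(directory).iterdir()):
        if pattern.fullmatch(candidate.name):
            return candidate
    return None
```

For example, find_page_file(".", "myfile", 1) would return the first page's file whether it was written as myfile-1.png or myfile-01.png.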
I think the issue is here: https://gitlab.freedesktop.org/poppler/poppler/-/blob/master/utils/pdftoppm.cc#L683-689 although I don't know C well enough to understand why it varies. I suspect pg_num_len is set to two digits for page counts between 10 and 99, but to one digit for page counts between 1 and 9, so the sprintf formatting pads the page number with zeroes (see https://gitlab.freedesktop.org/poppler/poppler/-/blob/master/utils/pdftoppm.cc#L619).
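To make the suspected behavior concrete, here is a small Python sketch of the naming rule as I understand it. It is an assumption based on the observed output, not a transcription of the C code: the function name expected_output_name is made up, and width stands in for what I believe pg_num_len holds (the number of digits in the document's total page count).

```python
def expected_output_name(root: str, page: int, page_count: int, ext: str = "png") -> str:
    """Predict the output file name under the suspected padding rule.

    Assumption: the page number is zero-padded to the number of digits
    in the total page count of the document.
    """
    width = len(str(page_count))  # assumed equivalent of pg_num_len
    return f"{root}-{page:0{width}d}.{ext}"


# Matches the two cases observed in this report:
print(expected_output_name("myfile", 1, 9))   # myfile-1.png
print(expected_output_name("myfile", 1, 10))  # myfile-01.png
```

If this reading is right, the file name for a given page depends on the length of the whole document, which is why the same command gives different names for different PDFs.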
$ pdftoppm -v
pdftoppm version 0.86.1 Copyright 2005-2020 The Poppler Developers - http://poppler.freedesktop.org
Copyright 1996-2011 Glyph & Cog, LLC