Allow page ranges in pdftohtml
Submitted by ulatekh
Assigned to poppler-bugs
Link to original bug (#107419)
Description
Created attachment 140875: Patch to add functionality
I'm using pdftohtml to extract information from PDFs and organize the results into a database, so I had a chance to dig through the code.
The patch adds a "-pg" command-line option to pdftohtml that accepts a list of noncontiguous page ranges.
I don't know what poppler's policy is on using Boost, but I can hand-write a simple integer interval set instead if that's a problem.
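For reference, a Boost-free integer interval set could look roughly like the sketch below. It is not the code from the attached patch. The names PageIntervalSet and parsePageSpec are hypothetical, and the "1-3,7,10-12" argument syntax is an assumption, since the report does not spell out the format that "-pg" accepts. It also assumes C++17 for std::optional.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper, not part of poppler: a sorted list of disjoint,
// inclusive page intervals.
class PageIntervalSet {
public:
    // Add the inclusive interval [first, last], merging it with any
    // intervals it overlaps or touches.
    void add(int first, int last) {
        assert(first <= last);
        std::vector<std::pair<int, int>> merged;
        bool placed = false;
        for (const auto &iv : intervals) {
            if (iv.second < first - 1) {
                merged.push_back(iv); // entirely before the new interval
            } else if (iv.first > last + 1) {
                if (!placed) {        // entirely after: emit the new one first
                    merged.emplace_back(first, last);
                    placed = true;
                }
                merged.push_back(iv);
            } else {                  // overlapping or adjacent: absorb it
                first = std::min(first, iv.first);
                last = std::max(last, iv.second);
            }
        }
        if (!placed)
            merged.emplace_back(first, last);
        intervals = std::move(merged);
    }

    bool contains(int page) const {
        for (const auto &iv : intervals)
            if (page >= iv.first && page <= iv.second)
                return true;
        return false;
    }

    const std::vector<std::pair<int, int>> &ranges() const { return intervals; }

private:
    std::vector<std::pair<int, int>> intervals;
};

// Parse a spec in the assumed form "1-3,7,10-12".
// Returns std::nullopt on malformed input.
std::optional<PageIntervalSet> parsePageSpec(const std::string &spec) {
    PageIntervalSet set;
    std::size_t pos = 0;

    // Read a decimal number at pos; returns 0 if none is present or it is too large.
    auto readNumber = [&]() -> int {
        int value = 0;
        const std::size_t start = pos;
        while (pos < spec.size() && spec[pos] >= '0' && spec[pos] <= '9') {
            if (value > 100000000)
                return 0; // guard against int overflow
            value = value * 10 + (spec[pos] - '0');
            ++pos;
        }
        return pos == start ? 0 : value;
    };

    while (true) {
        const int first = readNumber();
        if (first < 1)
            return std::nullopt; // pages are numbered from 1
        int last = first;
        if (pos < spec.size() && spec[pos] == '-') {
            ++pos;
            last = readNumber();
            if (last < first)
                return std::nullopt; // missing or reversed upper bound
        }
        set.add(first, last);
        if (pos == spec.size())
            return set;
        if (spec[pos] != ',')
            return std::nullopt;
        ++pos;
    }
}
```

With this sketch, parsePageSpec("1-3,7,10-12") yields three intervals, so contains(2) is true and contains(5) is false. Overlapping or adjacent ranges such as "5-6,1-4" collapse into the single interval 1-6.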
The "-pg" command-line option may be useful in other utilities, e.g. pdfseparate.
Patch 140875, "Patch to add functionality":
0004-Now-pdftohtml-takes-a-pg-parameter-with-a-list-of-pa.patch