Poppler fails to open some password-encrypted PDFs which other readers can open (after repairing)
In this downstream Evince issue, a reporter mailed me a password-protected PDF document generated by his Turkish online bank, which he could not attach to GitLab because it contains personal data.
The PDF metadata shows it was created by JasperReports Library version 6.9.0, and the PDF opens fine in Acrobat Reader, Firefox, and Xpdf 4.04.
Xpdf 4.04 shows that the PDF had to be reconstructed before it could be opened. The Xpdf error window shows:
Syntax Error: Couldn't read xref table
Syntax Warning: PDF file is damaged - attempting to reconstruct xref table...
I assume Xpdf 4.04 is able to open this file because it has updated repair code. This assumption comes from the fact that Poppler's XRef.cc has the comment shown below, while XRef.cc in Xpdf 4.04 no longer has that comment, and its code seems to have been refactored to support the aforementioned repair from inside object streams.
// Attempt to construct an xref table for a damaged file.
// Warning: Reconstruction of files where last XRef section is a stream
// or where some objects are defined inside an object stream is not yet supported.
// Existing data in XRef::entries may get corrupted if applied anyway.
bool XRef::constructXRef(bool *wasReconstructed, bool needCatalogDict)
{
    // ... (function body omitted)
}
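To illustrate why the limitation in that comment matters, here is a minimal sketch of the general idea behind xref reconstruction: scan the raw file bytes for `N G obj` headers at the start of a line and record their offsets. This is not Poppler's or Xpdf's actual code; `RebuiltEntry` and `rebuildXRefSketch` are hypothetical names used only for illustration, and the PDF syntax handling is heavily simplified.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical type for illustration only (not a Poppler or Xpdf type).
struct RebuiltEntry {
    std::size_t offset; // byte offset of the "N G obj" header in the file
    int gen;            // generation number
};

// Reads up to 9 decimal digits starting at p; advances p past them.
static bool readInt(const std::string &s, std::size_t &p, int &out)
{
    const std::size_t start = p;
    long value = 0;
    while (p < s.size() && p - start < 9 && std::isdigit(static_cast<unsigned char>(s[p]))) {
        value = value * 10 + (s[p] - '0');
        ++p;
    }
    if (p == start) {
        return false;
    }
    out = static_cast<int>(value);
    return true;
}

// Skips spaces and tabs; returns true if at least one was skipped.
static bool skipSpaces(const std::string &s, std::size_t &p)
{
    const std::size_t start = p;
    while (p < s.size() && (s[p] == ' ' || s[p] == '\t')) {
        ++p;
    }
    return p > start;
}

// Simplified sketch: finds "N G obj" at the start of each line.
// A later definition of the same object number overrides an earlier one.
std::map<int, RebuiltEntry> rebuildXRefSketch(const std::string &data)
{
    std::map<int, RebuiltEntry> entries;
    std::size_t lineStart = 0;
    while (lineStart < data.size()) {
        std::size_t p = lineStart;
        int num = 0;
        int gen = 0;
        if (readInt(data, p, num) && skipSpaces(data, p) && readInt(data, p, gen)
            && skipSpaces(data, p) && data.compare(p, 3, "obj") == 0) {
            entries[num] = RebuiltEntry{lineStart, gen};
        }
        const std::size_t eol = data.find_first_of("\r\n", lineStart);
        if (eol == std::string::npos) {
            break;
        }
        lineStart = eol + 1;
    }
    return entries;
}
```

A scan like this only sees objects written directly in the file body. Objects stored inside an object stream have no `N G obj` header in the raw bytes, so a repair routine has to open those streams (decrypting them first in a password-protected file) to find them, which is the case the comment above says is not yet supported.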
We found where Firefox's PDF.js fixed this same bug, and the good news is that they have a test-case PDF file which we can use publicly:
- issue15893_reduced.pdf (the password is "test")
Anyway, I'm sure our Evince user Sabri Ünal would privately mail the bank PDF file to any Poppler developer interested in trying that too.
So I went ahead and took the time to locate the relevant Xpdf 4.04 code and port it to Poppler, with successful results in opening the problematic password-encrypted files. I'll be sending MRs.