pdftohtml very slow if pdf uses tiling pattern fill
Submitted by Arne de Bruijn
Assigned to poppler-bugs
Description
Created attachment 91677 slow pdf
pdftohtml is very slow if the PDF uses a tiling pattern fill. The attached PDF takes many hours to process.
Backtraces taken during the run show the time is spent in the Gfx::doTilingPatternFill / Gfx::drawForm / Gfx::pushResources path, which reloads all fonts many times.
I saw that the ImageOutputDev backend has an empty tilingPatternFill method to "avoid the potentially slow loop in Gfx.cc". Since HtmlOutputDev handles images in a similar way to ImageOutputDev, it seems appropriate to add the same empty method to HtmlOutputDev as well. With that change the slowness is gone.
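To illustrate why an empty method helps, here is a minimal, self-contained sketch of the idea. It does not use poppler's real headers or signatures: OutputDevSketch, HtmlLikeDev, drawTile and doTilingPatternFillSketch are hypothetical stand-ins, and the real tilingPatternFill takes many more parameters that differ between poppler versions. The assumption is only that the caller asks the device whether it handles tiling fills and skips its own per-tile loop when the device says it did.

```cpp
#include <cstddef>

// Hypothetical, simplified stand-in for an output device interface.
// This is NOT poppler's OutputDev; names and signatures are illustrative.
struct OutputDevSketch {
    virtual ~OutputDevSketch() = default;

    // True if the device wants to handle tiling pattern fills itself.
    virtual bool useTilingPatternFill() const { return false; }

    // True if the device handled the fill; false asks the caller to fall back.
    virtual bool tilingPatternFill(int, int, int, int) { return false; }

    // Stand-in for the expensive per-tile work (in the report: drawing the
    // pattern form, which pushes resources and reloads fonts each time).
    virtual void drawTile(int, int) { ++tilesDrawn; }

    std::size_t tilesDrawn = 0;
};

// Device that claims the fill and does nothing, like the empty method
// described for ImageOutputDev.
struct HtmlLikeDev : OutputDevSketch {
    bool useTilingPatternFill() const override { return true; }
    bool tilingPatternFill(int, int, int, int) override {
        return true;  // do nothing: reports "handled" so the caller skips its loop
    }
};

// Hypothetical stand-in for the generic fill routine: it only runs the
// per-tile loop when the device did not handle the fill itself.
inline void doTilingPatternFillSketch(OutputDevSketch &out,
                                      int x0, int y0, int x1, int y1) {
    if (out.useTilingPatternFill() && out.tilingPatternFill(x0, y0, x1, y1)) {
        return;
    }
    for (int y = y0; y < y1; ++y) {
        for (int x = x0; x < x1; ++x) {
            out.drawTile(x, y);
        }
    }
}
```

With the default device, a 100 by 50 tile area triggers 5000 calls to the expensive per-tile step; with the device that overrides both methods, it triggers none. The trade-off is that the pattern is not rendered at all, which is only acceptable for a backend that does not need the pattern's output.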
Attachment 91677, "slow pdf":
IPOL-JOIN_ET_2013_510979_ANN02__EN.pdf