pdftohtml: Cannot parse text from OCRed document (tesseract 4.0.0)

I am using poppler-utils 0.75.0 on Ubuntu 18.04.

Images that tesseract has converted to searchable PDFs cannot be converted to HTML: only the images are rendered and the text layer is ignored. I have attached an example, produced from http://www.orimi.com/pdf-test.pdf with the following sequence of commands:

convert pdf-test.pdf test.jpg
tesseract test.jpg test pdf
pdftohtml test.pdf
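
To narrow down whether the text layer is missing from the PDF or only skipped by pdftohtml, a check along the following lines may help. This is a minimal sketch, not part of the original report: the function name `has_text_layer` is hypothetical, and it assumes `pdftotext` from poppler-utils is on the PATH.

```shell
#!/bin/sh
# Diagnostic sketch (hypothetical helper, not from the original report).
# Assumes poppler-utils is installed so that pdftotext is available.

has_text_layer() {
    # Exit status 0: pdftotext extracted at least one non-whitespace character.
    # Exit status 1: the file was read but no text came out.
    # Exit status 2: the file does not exist.
    # Exit status 3: pdftotext is not installed.
    pdf="$1"
    [ -f "$pdf" ] || { echo "no such file: $pdf" >&2; return 2; }
    command -v pdftotext >/dev/null 2>&1 || { echo "pdftotext not found" >&2; return 3; }
    # "-" sends the extracted text to stdout instead of a .txt file.
    pdftotext "$pdf" - 2>/dev/null | grep -q '[^[:space:]]'
}

# Example use against the file produced by the commands above:
#   has_text_layer test.pdf && echo "text layer is readable by poppler"
```

If this reports a readable text layer, the OCR text is probably stored as invisible text, which is how tesseract normally writes its searchable PDFs. In that case `pdftohtml -hidden test.pdf` (the option that forces hidden text extraction) may be worth trying, although I have not confirmed that it changes the result for this file.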

Attachments: test.pdf, pdftohtml_output.zip

Edited Mar 25, 2019 by Vassilis Lemonidis
