Extracting all images present in the PDF

Issue Description:

I used pdftohtml to generate an XML file from the PDF, but the boxes containing the text 'Test' are not extracted as separate images.

Working Files:

PDF used for conversion: PHT3.pdf
Generated XML file: PHT3.xml
Extracted Images: PHT3.zip
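
To make it easier to compare what was extracted against what is expected, the image entries recorded in the XML can be listed with a short script. This is only a minimal sketch: it assumes the XML follows the usual pdftohtml -xml layout, with <page> elements containing <image> elements that carry top, left, width, height and src attributes. The SAMPLE_XML string and the list_images helper are made up for illustration and are not taken from PHT3.xml.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the shape pdftohtml -xml usually produces.
# This is NOT the real content of PHT3.xml.
SAMPLE_XML = """<pdf2xml>
<page number="1" position="absolute" top="0" left="0" height="1188" width="918">
<image top="100" left="50" width="200" height="80" src="PHT3-1_1.png"/>
<text top="120" left="60" width="30" height="12" font="0">Test</text>
</page>
</pdf2xml>"""


def list_images(xml_text):
    """Return (page, src, left, top, width, height) for every <image> element."""
    root = ET.fromstring(xml_text)
    found = []
    for page in root.iter("page"):
        page_number = int(page.get("number"))
        for image in page.iter("image"):
            found.append((
                page_number,
                image.get("src"),
                int(image.get("left")),
                int(image.get("top")),
                int(image.get("width")),
                int(image.get("height")),
            ))
    return found


if __name__ == "__main__":
    for entry in list_images(SAMPLE_XML):
        print(entry)
```

For the real file, the text read from PHT3.xml would be passed to list_images instead of the sample. Any box that does not show up in the result was not recorded as an image in the XML.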

It would be great if you could share your advice on how to resolve this.

Attaching more information just in case it helps.

System Information:

Windows 10 Home
Version: 2004

Poppler Details:

pdftohtml version 0.68.0
Copyright 2005-2018 The Poppler Developers - http://poppler.freedesktop.org
Copyright 1999-2003 Gueorgui Ovtcharov and Rainer Dorsch
Copyright 1996-2011 Glyph & Cog, LLC

Thanks!

Edited Aug 19, 2021 by Nikhil Ranka
