pdftohtml -xml: Type 3 font has no size / height in the result

When I convert a PDF with embedded Type 3 fonts using pdftohtml -xml, the output contains zero-height fonts and bounding boxes. The root cause of the incorrect height is clear, and is called out in a comment in the code: the font size is guessed from the width of the character 'm', and these fonts do not include an embedded 'm', so the current hack fails. See https://gitlab.freedesktop.org/poppler/poppler/-/blob/master/utils/HtmlOutputDev.cc#L306 :

...
        // This is a hack which makes it possible to deal with some Type 3
        // fonts.  The problem is that it's impossible to know what the
        // base coordinate system used in the font is without actually
        // rendering the font.  This code tries to guess by looking at the
        // width of the character 'm' (which breaks if the font is a
        // subset that doesn't contain 'm').
        for (code = 0; code < 256; ++code) {
            if ((name = ((Gfx8BitFont *)font)->getCharName(code)) && name[0] == 'm' && name[1] == '\0') {
                break;
            }
        }
        if (code < 256) {
            w = ((Gfx8BitFont *)font)->getWidth(code);
            if (w != 0) {
                // 600 is a generic average 'm' width -- yes, this is a hack
                fontSize *= w / 0.6;
            }
        }
...

Should I extend this hack to check more characters, so that it has a better chance of setting fontSize, or would it be better to derive fontSize from the glyphs themselves?
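
To make the first option concrete, here is a rough sketch of what checking more characters could look like. It is not poppler code: `estimateType3Scale`, `RefWidth` and `kRefWidths` are hypothetical names, and the glyph lookups are passed in as callbacks so the sketch stands alone. Only the 0.6 for 'm' comes from the existing hack; the other reference widths are placeholder guesses that would need to be measured against real fonts. It also assumes the font uses standard glyph names, which a Type 3 subset may not.

```cpp
#include <algorithm>
#include <array>
#include <functional>
#include <string_view>
#include <vector>

// Hypothetical reference entry: glyph name and its assumed typical advance
// width as a fraction of the em.
struct RefWidth {
    std::string_view name;
    double emWidth;
};

// Only the 'm' value (0.6) is taken from the existing hack; the others are
// placeholder assumptions, not measured data.
constexpr std::array<RefWidth, 4> kRefWidths = {{
    {"m", 0.6},
    {"n", 0.5},
    {"o", 0.5},
    {"e", 0.5},
}};

// Returns a scale factor for fontSize: the median of width / reference width
// over every code whose glyph name appears in kRefWidths. Returns 1.0 (leave
// fontSize unchanged) if no reference glyph with a non-zero width is found.
double estimateType3Scale(const std::function<const char *(int)> &charName,
                          const std::function<double(int)> &width)
{
    std::vector<double> ratios;
    for (int code = 0; code < 256; ++code) {
        const char *name = charName(code);
        if (!name) {
            continue;
        }
        const double w = width(code);
        if (w == 0) {
            continue;
        }
        for (const RefWidth &ref : kRefWidths) {
            if (ref.name == name) {
                ratios.push_back(w / ref.emWidth);
                break;
            }
        }
    }
    if (ratios.empty()) {
        return 1.0;
    }
    // The median keeps one unusually wide or narrow glyph from dominating.
    auto mid = ratios.begin() + ratios.size() / 2;
    std::nth_element(ratios.begin(), mid, ratios.end());
    return *mid;
}
```

In HtmlOutputDev.cc the two callbacks would presumably wrap `((Gfx8BitFont *)font)->getCharName(code)` and `((Gfx8BitFont *)font)->getWidth(code)`, and the result would be applied as `fontSize *= estimateType3Scale(...)`. This is still a guess based on glyph names, so it would not help with a subset whose glyphs have non-standard names; that case would need the second option, using the glyphs themselves.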

Edited May 16, 2022 by Brian Rosenfield
