Questionable Fallbacks and Rendering Errors with Microsoft PS/MT Fonts
Affected file: 2020_General_Information_Handbook.pdf
Also available at: http://jetprogramme.org/wp-content/MAIN-PAGE/COMMON/publications/2020GIH_e.pdf
Two sections of this document render incorrectly in evince. I was advised in #evince on GIMPNet to report the issues here. I believe both stem from the same cause, so I'll first summarise the symptoms, then describe my system setup, and then explain what I think the problem might be.
1) The macron diacritic, e.g. in words like jūsho, is rendered incorrectly. See the bottom-left of page 2.
How it looks in evince:
How it looks in Firefox (pdf.js) or mupdf:
As far as I can tell, the Firefox/mupdf rendering is essentially correct.
I ran pdftoppm on the affected page and the output exhibits the same issue (see page-02.ppm). The same happens with pdftocairo, though I'm not sure what output format would be most useful.
2) The section header "Overview of the JET Programme" at the top of page 92 / 94 (depending on how you count the pages) consists of the same text overlaid twice for visual emphasis.
How it looks in evince:
How it looks in Firefox (pdf.js) or mupdf:
The evince rendering is clearly wrong. The Firefox/mupdf rendering is evidently not what the document creator intended either, but it is clearly better.
I ran pdftoppm on the affected page and the output exhibits the same issue (see page-94.ppm). The same happens with pdftocairo, though I'm not sure what output format would be most useful.
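For anyone trying to reproduce the two renderings, the commands can be sketched roughly as below. This is only a sketch: the output prefixes (ppm-page2 and so on), the 150 dpi resolution and the choice of PNG are my own assumptions, not necessarily what I ran originally.

```python
import subprocess


def render_commands(pdf_path, page, dpi=150):
    """Return pdftoppm and pdftocairo command lines that render one page to PNG.

    The output prefixes are hypothetical names chosen for this sketch.
    """
    page_arg = str(page)
    # -f / -l select the first and last page, -r sets the resolution in dpi.
    common = ["-png", "-r", str(dpi), "-f", page_arg, "-l", page_arg, pdf_path]
    return [
        ["pdftoppm", *common, f"ppm-page{page}"],
        ["pdftocairo", *common, f"cairo-page{page}"],
    ]


def render(pdf_path, pages=(2, 94)):
    """Render the two affected pages with both tools (needs poppler's utilities)."""
    for page in pages:
        for cmd in render_commands(pdf_path, page):
            subprocess.run(cmd, check=True)
```

Calling render("2020_General_Information_Handbook.pdf") should leave one PNG per tool and page in the current directory, which can then be compared with the Firefox/mupdf output.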
I'm running Arch Linux with the ttf-ms-win10 fonts package installed - all the fonts were sourced from a Windows 10 install on another machine I own.
poppler 0.88.0-1
evince 3.36.4-1
firefox 77.0.1-1
mupdf-gl 1.17.0-1
I think the problem is that some of the fonts in the PDF are not embedded, and that these fonts are being resolved to substitutes that make no sense:
TimesNewRomanPSMT -> Times New Roman [looks good]
TimesNewRomanPS-BoldMT -> Verdana Bold [very silly, one is serif and the other is sans-serif]
TimesNewRomanPS-ItalicMT -> Verdana Italic [same problem]
TimesNewRomanPS-BoldItalicMT -> Verdana Bold Italic [same problem]
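To make the expectation concrete, here is a rough sketch that lists the non-embedded fonts of a PDF and prints the face I would expect each one to resolve to. It assumes pdffonts (from poppler's utilities) prints a two-line header followed by one row per font, with the emb/sub/uni flags and the object ID as the last five columns. The name-splitting rule in expected_face is my own heuristic for Microsoft-style "PS"/"MT" names, not anything poppler or fontconfig does.

```python
import re
import subprocess


def non_embedded_fonts(pdffonts_output):
    """Return the names of fonts whose 'emb' column is 'no'.

    Assumes the pdffonts layout: name, type, encoding, emb, sub, uni, object ID.
    The type column may contain spaces, so the flags are counted from the right.
    """
    names = []
    for line in pdffonts_output.splitlines()[2:]:  # skip the two header lines
        fields = line.split()
        if len(fields) < 8:
            continue
        name, emb = fields[0], fields[-5]
        if emb == "no" and name not in names:
            names.append(name)
    return names


def expected_face(ps_name):
    """Heuristically split e.g. 'TimesNewRomanPS-BoldMT' into (family, style)."""
    stem = ps_name[:-2] if ps_name.endswith("MT") else ps_name
    base, _, style = stem.partition("-")
    if base.endswith("PS"):
        base = base[:-2]

    def split_words(text):
        # Insert a space at every lowercase-to-uppercase boundary.
        return re.sub(r"(?<=[a-z])(?=[A-Z])", " ", text)

    return split_words(base), split_words(style) or "Regular"


def report(pdf_path):
    """Print each non-embedded font with the face it would sensibly map to."""
    result = subprocess.run(
        ["pdffonts", pdf_path], capture_output=True, text=True, check=True
    )
    for name in non_embedded_fonts(result.stdout):
        family, style = expected_face(name)
        print(f"{name} -> {family} {style}")
```

Running report("2020_General_Information_Handbook.pdf") and comparing its output with the substitutions listed above shows where the resolved font differs from the expected family.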
As far as I can tell, these substitutions aren't coming from the fontconfig configuration on my system, nor from anything provided by the ttf-ms-win10 package. It also doesn't seem possible to override them so that they simply point to Times New Roman (though I would appreciate any help on IRC with trying to achieve this).
The issue looks very similar to this old unresolved one with fontconfig: https://bugs.launchpad.net/ubuntu/+source/fontconfig/+bug/1693709
I would be more than happy to provide more details, either here or on IRC (Freenode #poppler) as aphirst.