FontConfig uses the wrong encoding to decode NameRecord

Currently, FontConfig doesn't use the right encoding to decode NameRecord entries for the Microsoft platform: https://gitlab.freedesktop.org/fontconfig/fontconfig/-/blob/d863f6778915f7dd224c98c814247ec292904e30/src/fcfreetype.c#L88-103

Here is how GDI and DirectWrite do it: https://github.com/MicrosoftDocs/typography-issues/issues/956#issuecomment-1205678068

In brief, if you want to emulate GDI, here is the logic in Python:

# TT_* are the FreeType constants from ttnameid.h; the function name is only illustrative.
def get_name_encoding(platformID, platEncID, nameID):
	if platformID == TT_PLATFORM_MICROSOFT:
		if platEncID == TT_MS_ID_PRC:
			return "cp936"
		elif platEncID == TT_MS_ID_BIG_5:
			if nameID == TT_NAME_ID_FONT_SUBFAMILY:
				return "utf_16_be"
			else:
				return "cp950"
		elif platEncID == TT_MS_ID_WANSUNG:
			if nameID == TT_NAME_ID_FONT_SUBFAMILY:
				return "utf_16_be"
			else:
				return "cp949"
		else:
			return "utf_16_be"

Important: when the encoding is not "utf_16_be", GDI and DirectWrite remove the leading zero byte of each double-byte pair before decoding.

Example in Python:

# These bytes come from the font 文鼎中特廣告體 - Download here: http://fonts.top/Arphic-Fonts/41459.html
string = b"\x00\xa4\x00\xe5\x00\xb9\x00\xa9\x00\xa4\x00\xa4\x00\xaf\x00S\x00\xbc\x00s\x00\xa7\x00i\x00\xc5\x00\xe9"
if platformID == TT_PLATFORM_MICROSOFT and encoding != "utf_16_be":
	name_to_decode = string.replace(b"\x00", b"")
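
To make the stripping step concrete, here is a minimal end-to-end sketch that strips the zero bytes and then decodes. The helper name `decode_name_record` is hypothetical (it is not a FontConfig or FreeType API), and it assumes the encoding has already been chosen with the selection logic above. It also assumes the sample record has platEncID == TT_MS_ID_BIG_5, which I did not state explicitly for these bytes.

```python
def decode_name_record(raw: bytes, encoding: str) -> str:
    """Decode a Microsoft-platform NameRecord (hypothetical helper, not a FontConfig API)."""
    if encoding != "utf_16_be":
        # Legacy code pages: drop the zero bytes before decoding,
        # mirroring the replace() call in the example above.
        raw = raw.replace(b"\x00", b"")
    return raw.decode(encoding)


# Bytes quoted in this issue. Assumption: the record uses platEncID == TT_MS_ID_BIG_5,
# so the selection logic yields "cp950" for it.
sample = (b"\x00\xa4\x00\xe5\x00\xb9\x00\xa9\x00\xa4\x00\xa4\x00\xaf\x00S"
          b"\x00\xbc\x00s\x00\xa7\x00i\x00\xc5\x00\xe9")

print(decode_name_record(sample, "cp950"))  # 文鼎中特廣告體
```

Records whose encoding is "utf_16_be" go through the same helper unchanged, since the zero bytes are significant there.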

For reference, the logic is slightly different for the cmap table:

# TT_* are the FreeType constants from ttnameid.h; the function name is only illustrative.
def get_cmap_encoding(platformID, platEncID):
	if platformID == TT_PLATFORM_MICROSOFT:
		if platEncID in (TT_MS_ID_SYMBOL_CS, TT_MS_ID_UNICODE_CS, TT_MS_ID_UCS_4):
			return "utf_16_be"
		elif platEncID == TT_MS_ID_SJIS:
			return "cp932"
		elif platEncID == TT_MS_ID_PRC:
			return "cp936"
		elif platEncID == TT_MS_ID_BIG_5:
			return "cp950"
		elif platEncID == TT_MS_ID_WANSUNG:
			return "cp949"
		elif platEncID == TT_MS_ID_JOHAB:
			return "cp1361"
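
The same cmap mapping can also be written as a lookup table, which may be easier to port. This is only a sketch: the numeric values are the ones I believe FreeType's ttnameid.h defines for these constants, and the names `CMAP_ENCODINGS` and `gdi_cmap_encoding` are hypothetical. Returning None for unlisted encoding IDs is my assumption; the logic above does not say what happens in that case.

```python
import codecs

# Numeric values as defined in FreeType's ttnameid.h (assumed, please double check).
TT_PLATFORM_MICROSOFT = 3
TT_MS_ID_SYMBOL_CS = 0
TT_MS_ID_UNICODE_CS = 1
TT_MS_ID_SJIS = 2
TT_MS_ID_PRC = 3
TT_MS_ID_BIG_5 = 4
TT_MS_ID_WANSUNG = 5
TT_MS_ID_JOHAB = 6
TT_MS_ID_UCS_4 = 10

# platEncID -> Python codec name, following the cmap logic above.
CMAP_ENCODINGS = {
    TT_MS_ID_SYMBOL_CS: "utf_16_be",
    TT_MS_ID_UNICODE_CS: "utf_16_be",
    TT_MS_ID_UCS_4: "utf_16_be",
    TT_MS_ID_SJIS: "cp932",
    TT_MS_ID_PRC: "cp936",
    TT_MS_ID_BIG_5: "cp950",
    TT_MS_ID_WANSUNG: "cp949",
    TT_MS_ID_JOHAB: "cp1361",
}


def gdi_cmap_encoding(platform_id, plat_enc_id):
    """Hypothetical helper: codec name for a Microsoft-platform cmap subtable, else None."""
    if platform_id != TT_PLATFORM_MICROSOFT:
        return None  # other platforms are out of scope for this sketch
    return CMAP_ENCODINGS.get(plat_enc_id)


# Sanity check: every name in the table resolves to a codec Python knows about.
for name in set(CMAP_ENCODINGS.values()):
    codecs.lookup(name)
```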

Edited Aug 08, 2023 by moi15moi
