Incorrect implementation of prependUnicodeMarker()
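The main purpose of the BOM is to tell a reader the byte order of the data that follows it. So unconditionally prepending the big-endian marker, regardless of how the string's bytes are actually laid out, isn't correct: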
void GooString::prependUnicodeMarker()
{
    insert(0, "\xFE\xFF", 2);
}
Let me demonstrate the consequences:
std::string s8 = u8"test";
std::wstring_convert<std::codecvt_utf8_utf16<char16_t>, char16_t> converter;
std::u16string s16 = converter.from_bytes(s8);
GooString gooStr = GooString((const char *) s16.c_str(), s16.length() * 2);
gooStr.prependUnicodeMarker(); // On a little-endian host, gooStr's content is FE FF 74 00 65 00 73 00 74 00.
QString qStr = UnicodeParsedString(&gooStr);
printf("qStr=%s\n", qStr.toUtf8().constData()); // Prints "qStr=琀攀猀琀".
My implementation:
void GooString::prependUnicodeMarker()
{
    // 0xFEFF is stored in host byte order, so the marker matches the byte order of the data.
    static const uint16_t BOM = 0xFEFF;
    insert(0, (const char *)&BOM, sizeof(BOM));
}
Now gooStr's content is FF FE 74 00 65 00 73 00 74 00 and printf() prints "qStr=test".
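
For anyone who wants to reproduce the difference without Poppler or Qt, here is a minimal standalone sketch. All four helper names (prependFixedBigEndianBom, prependHostOrderBom, hostOrderBytes, decodeUtf16) are hypothetical stand-ins I made up for illustration. They are not part of GooString, and decodeUtf16 is only a simplified BOM-aware reader, not a copy of UnicodeParsedString. It also assumes that input without a recognised BOM is big-endian.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical stand-in for the current behaviour: always prepend FE FF.
std::string prependFixedBigEndianBom(std::string bytes)
{
    bytes.insert(0, "\xFE\xFF", 2);
    return bytes;
}

// Hypothetical stand-in for the proposed behaviour: write U+FEFF in host byte order.
std::string prependHostOrderBom(std::string bytes)
{
    static const std::uint16_t BOM = 0xFEFF;
    bytes.insert(0, reinterpret_cast<const char *>(&BOM), sizeof(BOM));
    return bytes;
}

// Copy UTF-16 code units out exactly as they sit in memory (host byte order).
std::string hostOrderBytes(const std::u16string &s)
{
    return std::string(reinterpret_cast<const char *>(s.data()), s.size() * sizeof(char16_t));
}

// Simplified BOM-aware reader: FE FF means big-endian, FF FE means little-endian.
// Assumption: data without a recognised BOM is treated as big-endian.
std::u16string decodeUtf16(const std::string &bytes)
{
    std::size_t pos = 0;
    bool bigEndian = true;
    if (bytes.size() >= 2) {
        const unsigned char b0 = static_cast<unsigned char>(bytes[0]);
        const unsigned char b1 = static_cast<unsigned char>(bytes[1]);
        if (b0 == 0xFE && b1 == 0xFF) {
            pos = 2;
        } else if (b0 == 0xFF && b1 == 0xFE) {
            bigEndian = false;
            pos = 2;
        }
    }
    std::u16string out;
    for (; pos + 1 < bytes.size(); pos += 2) {
        // Pick the high and low byte of each code unit according to the BOM.
        const unsigned hi = static_cast<unsigned char>(bytes[bigEndian ? pos : pos + 1]);
        const unsigned lo = static_cast<unsigned char>(bytes[bigEndian ? pos + 1 : pos]);
        out.push_back(static_cast<char16_t>((hi << 8) | lo));
    }
    return out;
}
```

With these helpers, decodeUtf16(prependHostOrderBom(hostOrderBytes(u"test"))) gives back u"test" on either kind of host, because the marker and the payload always agree. On a little-endian host, the fixed FE FF marker makes the reader assemble 74 00 as U+7400, which is where the "琀" in the output above comes from. The sketch only checks that marker and payload agree; a format that requires big-endian text would instead need the payload byte-swapped on little-endian hosts.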