Some characters have multiple Cangjie 3 codes. For any such character, the codes we have are ordered alphabetically. This comes from the original data we got when we started working on this with Wan Leung: the whole dataset was indexed by code, alphabetically.
However, we are about to split the multiple codes of any given character so that only the first one has a non-zero frequency and all additional codes have a frequency of 0 (see #104).
A prerequisite to that is that the multiple codes are actually ordered correctly.
This commit fixes the ordering of Cangjie 3 codes for many Chinese characters that have more than one code.
The changes to the data in this commit were made manually, painstakingly comparing our results with the ones from Windows, which we take as the reference implementation for Cangjie 3.
For example, on Windows 沉 only has code ebhu. This means that in Cangjie 3 we should have ebhu as the primary code for that character. We can keep ebhn, but it should come second, so that it doesn't interfere with the expected ordering when we actually implement #104.
All other changes in this commit went through the same process of comparison.
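
For illustration only (the data in this commit was edited by hand, not by a script), here is a minimal Python sketch of why the order matters for #104. The function names and the frequency value of 100 are hypothetical and do not come from our code or data; only the codes ebhu and ebhn are taken from the example above.

```python
def order_codes(codes, primary):
    """Return the codes with `primary` first; the others keep their order."""
    if primary not in codes:
        raise ValueError(f"{primary!r} is not among the known codes")
    return [primary] + [code for code in codes if code != primary]


def split_frequencies(codes, frequency):
    """Sketch of the #104 plan: the first code keeps the frequency, the rest get 0."""
    return [(code, frequency if index == 0 else 0)
            for index, code in enumerate(codes)]


# Alphabetical order, as in the original data: ebhn comes before ebhu.
alphabetical = sorted(["ebhu", "ebhn"])

# Windows only knows ebhu for this character, so it becomes the primary code.
ordered = order_codes(alphabetical, "ebhu")

# 100 is a made-up frequency, used only to show the split.
print(split_frequencies(alphabetical, 100))  # wrong code would get the frequency
print(split_frequencies(ordered, 100))       # ebhu gets the frequency
```

With the alphabetical order, the split would give the non-zero frequency to ebhn; with the corrected order it goes to ebhu, which is the code Windows expects.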