Some characters have surprising x-disambiguation codes
I've been staring a lot at our data lately, doing some cleanups and thinking about #55, #91 and #104.
If I understood everything right, the x-disambiguation works as follows:
- characters A and B both have code `abc`
- since A is more frequent, B is given an additional x (`abcx` in Cangjie 3, `xabc` in Cangjie 5)
- as a result, B will have codes `abc` and `abcx` (or `xabc` in Cangjie 5)
The above holds true for pretty much all of our data, except for 8 characters, which have an x-disambiguated code without the corresponding non-x code:
- 亟 has CJ3 codes `mem` and `nemx`
- 妒 has CJ3 codes `vhs` and `visx`
- 扁 has CJ3 codes `hsbt` and `isbtx`
- 毋 has CJ3 codes `wj` and `wkx`
- 袍 has CJ3 codes `fprux` and `lpru`
- 鼎 has CJ3 codes `buux` and `buvml`
- 鼐 has CJ3 codes `nhbux` and `nsbul`
- 覇 has CJ5 codes `mwtjb` and `xmbtj`
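
For reference, here is a rough sketch of the check I have in mind, written under my assumptions above: a trailing `x` marks disambiguation in Cangjie 3 and a leading `x` marks it in Cangjie 5. The names `base_code`, `orphaned_x_codes` and `SAMPLES` are made up for this example and are not part of our tooling. Note that it treats every such `x` as a disambiguation marker, so it could also flag codes where `x` is a genuine part of the decomposition.

```python
def base_code(code, version):
    """Return the code with its assumed disambiguation x removed, or None if it has none."""
    if len(code) < 2:
        return None
    if version == 3 and code.endswith("x"):
        return code[:-1]  # assumption: Cangjie 3 appends the x
    if version == 5 and code.startswith("x"):
        return code[1:]   # assumption: Cangjie 5 prepends the x
    return None


def orphaned_x_codes(codes, version):
    """List x-codes whose corresponding non-x code is missing for the same character."""
    known = set(codes)
    orphans = []
    for code in sorted(known):
        base = base_code(code, version)
        if base is not None and base not in known:
            orphans.append(code)
    return orphans


# The eight characters listed above, keyed by (character, Cangjie version).
SAMPLES = {
    ("亟", 3): ["mem", "nemx"],
    ("妒", 3): ["vhs", "visx"],
    ("扁", 3): ["hsbt", "isbtx"],
    ("毋", 3): ["wj", "wkx"],
    ("袍", 3): ["fprux", "lpru"],
    ("鼎", 3): ["buux", "buvml"],
    ("鼐", 3): ["nhbux", "nsbul"],
    ("覇", 5): ["mwtjb", "xmbtj"],
}

if __name__ == "__main__":
    for (char, version), codes in SAMPLES.items():
        print(char, f"CJ{version}", orphaned_x_codes(codes, version))
```

A character that follows the rule, such as one with codes `abc` and `abcx` in Cangjie 3, yields an empty list, while each of the eight entries above yields its x code.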
I'm not sure what to do with these. Is something wrong with them? Or are they just fine and my assumptions were unfounded?
@yookoala do you have any idea?