
fix for issue #39 (gb18030 encoding test)

Pedro López-Cabanillas requested to merge plcl/uchardet:devel into master

The gb18030 test fails because the sample text is detected as Macedonian encoded in windows-1251. There are two causes: (1) the Macedonian language model is overly optimistic and reports high confidence for the given sample, and (2) the original sample text is extremely short and lacks variety.
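To illustrate why a short sample is ambiguous at the byte level, here is a minimal Python sketch. It is not uchardet code and does not reproduce its language models; the helper names `reinterpret_gb18030_as_cp1251` and `cyrillic_ratio` are hypothetical. It only shows that the bytes of a short gb18030 string can also be read as valid windows-1251 Cyrillic text, which is the kind of input a single-byte Cyrillic model can score highly.

```python
def reinterpret_gb18030_as_cp1251(text: str) -> str:
    """Encode text as gb18030, then decode the same bytes as windows-1251.

    Hypothetical helper for illustration only. Bytes that windows-1251
    does not define are replaced with U+FFFD instead of raising.
    """
    raw = text.encode("gb18030")
    return raw.decode("cp1251", errors="replace")


def cyrillic_ratio(text: str) -> float:
    """Fraction of characters that fall in the Cyrillic block (U+0400..U+04FF)
    after the gb18030 bytes of `text` are misread as windows-1251."""
    decoded = reinterpret_gb18030_as_cp1251(text)
    if not decoded:
        return 0.0
    cyrillic = sum(1 for ch in decoded if "\u0400" <= ch <= "\u04ff")
    return cyrillic / len(decoded)


if __name__ == "__main__":
    sample = "中文"  # a deliberately tiny sample: 4 bytes in gb18030
    print(reinterpret_gb18030_as_cp1251(sample))  # prints: ЦРОД
    print(cyrillic_ratio(sample))                 # prints: 1.0
```

With so few bytes, nothing distinguishes the two readings structurally, so the result depends entirely on which language model reports the higher confidence. A longer and more varied sample gives the detector more evidence to work with.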

Simply adding a good amount of real Chinese literature to the sample file makes the test pass.

This text has been extracted from Wikipedia:
