Wrong encoding detected: utf-8 instead of cp855
Hello!
I found an example where uchardet wrongly detects the encoding as utf-8 instead of cp855.
I debugged my usage (in this for-loop) and saw these confidences:
- multi-byte prober: utf-8 with confidence = 0.752499998
- single-byte prober: cp855 with confidence = 0.685687244
Using a different implementation, UTF Unknown, I got the expected result:
- multi-byte prober: only a GB18030Prober object was checked and returned, with confidence = 0.01
- single-byte prober: cp855 with confidence = 0.8776797
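
To make the difference concrete, here is a minimal sketch that assumes the detector simply reports the candidate with the highest confidence. This is a simplification for illustration, not the actual code of uchardet or UTF Unknown; `pick_best` and the two lists are hypothetical names, and the numbers are the ones from my debugging above.

```python
# Simplified model (assumption): the detector returns the candidate
# with the highest confidence. Not the real uchardet / UTF Unknown code.

def pick_best(candidates):
    """Return the (charset, confidence) pair with the highest confidence."""
    return max(candidates, key=lambda pair: pair[1])

# Confidences observed while debugging uchardet.
uchardet_run = [("utf-8", 0.752499998), ("cp855", 0.685687244)]

# Confidences observed with UTF Unknown.
utf_unknown_run = [("gb18030", 0.01), ("cp855", 0.8776797)]

print(pick_best(uchardet_run)[0])     # utf-8 wins over cp855
print(pick_best(utf_unknown_run)[0])  # cp855 wins
```

Under that assumption, the wrong result in uchardet comes from the multi-byte utf-8 prober scoring higher than the single-byte cp855 prober on this input.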