Recommended checking order does not reflect the reference implementation (nor the test suite).

Hi,

I'm currently writing an implementation of the shared-mime-info specification and validating it against the test suite in this repository.

In the recommended checking order section, the spec says:

Keep only globs with the biggest weight. If the patterns are different, keep only globs with the longest pattern, as previously discussed. If after this, there is one or more matching glob, and all the matching globs result in the same mimetype, use that mimetype as the result.

However, implementing this particular behaviour causes a lot of tests to fail.

For example, the rules for the text/x-python types:

<mime-type type="text/x-python3">
  <sub-class-of type='text/x-python'/>
  <magic priority="60"><!-- higher priority than text/x-python -->
    <!-- magic rules omitted for brevity -->
  </magic>
  <glob pattern="*.py"/><!-- lower priority than in text/x-python -->
  <!-- other globs omitted for brevity -->
</mime-type>
<mime-type type="text/x-python">
  <magic>
    <!-- magic rules omitted for brevity -->
  </magic>
  <glob pattern="*.py" weight="60"/>
  <!-- other globs omitted for brevity -->
</mime-type>

When tested against the test3.py file, these rules would always yield text/x-python under the recommended algorithm.

Both *.py globs would match, but the glob for text/x-python3 would be discarded for having a lower weight, leaving the glob for text/x-python as the only result, therefore skipping magic sniffing entirely.
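To make the problem concrete, here is a minimal sketch of the glob step as I read it from the quoted paragraph. The function name glob_step and the GLOBS table are made up for illustration and are not taken from the reference implementation. The sketch assumes the default glob weight of 50 for the text/x-python3 rule, which declares no weight.

```python
from fnmatch import fnmatchcase

DEFAULT_WEIGHT = 50  # assumed default for globs that declare no weight

# (mimetype, pattern, weight), mirroring the two rules quoted above
GLOBS = [
    ("text/x-python3", "*.py", DEFAULT_WEIGHT),
    ("text/x-python", "*.py", 60),
]


def glob_step(filename, globs=GLOBS):
    """Return a mimetype if the glob step alone is conclusive, else None."""
    matches = [g for g in globs if fnmatchcase(filename, g[1])]
    if not matches:
        return None

    # "Keep only globs with the biggest weight."
    best_weight = max(weight for _, _, weight in matches)
    matches = [g for g in matches if g[2] == best_weight]

    # "If the patterns are different, keep only globs with the longest pattern."
    longest = max(len(pattern) for _, pattern, _ in matches)
    matches = [g for g in matches if len(g[1]) == longest]

    # If all remaining globs agree on one mimetype, that is the result
    # and magic sniffing is never consulted.
    mimetypes = {mimetype for mimetype, _, _ in matches}
    return mimetypes.pop() if len(mimetypes) == 1 else None


result = glob_step("test3.py")  # text/x-python3 is dropped at the weight filter
```

With these two rules, the weight filter removes text/x-python3 before the "same mimetype" check runs, so the step is always conclusive for any *.py file.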

So it seems to me that either:

the spec should be fixed to reflect the actual behaviour of the reference implementation, or
the reference implementation (including test suite and database files) should be fixed to reflect the spec behaviour.

Edited Jun 11, 2022 by ju1ius