Seems to be an issue with the "magic" for the MPEG-2 TS entry, it has priority "50". The Linguist entry also has an implicit priority of 50 (the default priority). If the MPEG-2 TS priority is increased to 70 in an "override.xml" file (in the /usr/share/mime/packages folder), this fixes the issue.
Logged in KDE as: https://bugs.kde.org/show_bug.cgi?id=440631
originally: https://bugs.kde.org/show_bug.cgi?id=420939
Which version of shared-mime-info is this with? It does not reproduce with all files (we have samples in our test suite for which it doesn't), so we'd need a reproducer file to be able to diagnose further.
I'm using the freedesktop.org.xml included in Neon Testing.
Sorry, I don't see a version number in the file, but looking through your archives it seems to match your 2.0 version (I did a vimdiff of freedesktop.org.xml.in and /usr/share/mime/packages/freedesktop.org.xml). I guess Neon should move up to 2.1, although I'm not sure this will affect this issue.
It may be that I'm not grasping this properly, but if I try to follow the "Recommended Checking Order" in:
If I have a file test.m2ts, this will match the glob ("*.m2ts") for video/mp2t
If I rename the file to test.ts, it will match the globs for both video/mp2t and text/vnd.qt.linguist. Neither has an explicit weight, so the situation is "multiple conflicting mimetypes".
In that case look at the "magic" and there is a "magic" match for video/mp2t
What should happen now?
... If any of the mimetypes resulting from a glob match is equal to or a subclass of the result from the magic sniffing, use this as the result ...
There is a mimetype from the "glob match" that is equal to the "result from the magic sniffing", so I would have thought it'd return "video/mp2t"...
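To make the quoted rule concrete, here is a minimal Python sketch of that single step. It is not the code used by xdgmime, Qt or any other implementation; the function name `resolve`, the `parents` mapping and the `example/*` types are hypothetical, and the steps of the checking order that follow when the rule does not apply are deliberately not modelled.

```python
def resolve(glob_matches, magic_result, parents=None):
    """Return the first glob match that equals, or is a subclass of, the
    magic result. Returns None when the quoted rule does not apply
    (the later fallback steps are not modelled here)."""
    parents = parents or {}  # hypothetical map: mimetype -> list of parent types

    def same_or_subclass(mime, ancestor):
        # Walk the sub-class-of chain upwards, guarding against cycles.
        seen, stack = set(), [mime]
        while stack:
            current = stack.pop()
            if current == ancestor:
                return True
            if current in seen:
                continue
            seen.add(current)
            stack.extend(parents.get(current, ()))
        return False

    if magic_result is None:
        return None
    for mime in glob_matches:
        if same_or_subclass(mime, magic_result):
            return mime
    return None


globs = ["video/mp2t", "text/vnd.qt.linguist"]
print(resolve(globs, "video/mp2t"))  # prints video/mp2t
```

Under this reading, a file named test.ts whose magic sniffing yields video/mp2t would resolve to video/mp2t, which is the expectation stated above.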
Looking at an MPEG-2 TS, test.ts, misidentified as text/vnd.qt.linguist
Whereas when trying to construct a "reproducer", these returned video/mp2t
However progressively truncating test.ts...
when truncated to 4445 bytes, still misidentified as text/vnd.qt.linguist
one byte further, to 4444 bytes, it is correctly identified as video/mp2t
That's interesting, and Ah!!! There's a sequence "dX %" at position 4441 of test.4445.ts
This is the 0x64582025 magic value (priority 60) for audio/vnd.dts.hd
The test.ts sample embedded some DTS HD audio, that's another level of complexity...
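The one-byte difference between the 4445-byte and 4444-byte files can be illustrated with a small Python sketch. It builds a synthetic buffer, not the attached reproducers: the 188-byte spacing of 0x47 sync bytes is general MPEG-TS background rather than something stated in this thread, the zero padding is arbitrary, and the helper names are made up for illustration.

```python
DTS_HD_MAGIC = bytes.fromhex("64582025")  # the four bytes "dX %"


def find_magic(data, magic=DTS_HD_MAGIC):
    """Return every offset at which `magic` occurs in `data`."""
    offsets = []
    start = data.find(magic)
    while start != -1:
        offsets.append(start)
        start = data.find(magic, start + 1)
    return offsets


def make_sample(length=4445, magic_offset=4441):
    """Synthetic buffer: a 0x47 sync byte every 188 bytes, zero padding,
    and the four magic bytes placed at `magic_offset`."""
    buf = bytearray(length)
    for i in range(0, length, 188):
        buf[i] = 0x47
    buf[magic_offset:magic_offset + len(DTS_HD_MAGIC)] = DTS_HD_MAGIC
    return bytes(buf)


sample = make_sample()
print(find_magic(sample))         # prints [4441]
print(find_magic(sample[:4444]))  # prints []
```

Cutting the buffer at 4444 bytes drops the final "%", so the four-byte sequence no longer matches. That is consistent with the change in behaviour between the two truncated files.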
So, as before:
If I rename the file to test.ts, it will match the globs for both video/mp2t and text/vnd.qt.linguist. Neither has an explicit weight, so the situation is "multiple conflicting mimetypes".
and then:
Look at the "magic" and there are matches for video/mp2t and audio/vnd.dts.hd
So there's something pretty strange happening; there are mimetypes from the "glob match"
video/mp2t
text/vnd.qt.linguist
and results from the "magic"
video/mp2t
audio/vnd.dts.hd,
and the xdg-mime query returns text/vnd.qt.linguist
Attached two artificial "reproducers", test.4444.ts (that is correctly identified) and test.4445.ts (that is misidentified as text/vnd.qt.linguist)
Hope this helps (and I do hope I've got this right...)
I'm not sure what "Neon Testing" is, but shared-mime-info 2.1 includes the already mentioned commit 6b207a2c, which will decrease false linguist positives. With it, neither of the test files provided here results in a linguist magic match. The "file" match still results in linguist due to fallbacks, though. (I did not check whether older versions do. There are still name matches due to the tie with mp2t, but I'm not sure there's anything to be done about that; anyway, let's see about the magics first.)
The issue with 4445 magic match resulting in vnd.dts.hd persists though. I could not locate a reference where the magic of that type would be specified, and the IANA registration has "None" for it: https://www.iana.org/assignments/media-types/audio/vnd.dts.hd
It's the "raw" version of KDE Neon, https://neon.kde.org/download, where things should be fixed before being rolled out further. It's based on Ubuntu 20.04 LTS, so it could be that's where the freedesktop.org.xml comes from.
... shared-mime-info 2.1 includes the already mentioned commit 6b207a2c which will decrease false linguist positives ...
OK, I see this change included in Fedora 34. I'll check there...
... With it, neither of the test files provided here result in linguist magic match
The test files stripped out everything except the "minimum" video/mp2t and audio/vnd.dts.hd bytes. You can check with "od -c test.4445.ts". There's no "<TS" (as is, or as "<TS " or "<TS>"). Would mean there's some strange leakage from the glob match.
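The same inspection can be scripted instead of reading the od output by eye. This is only a sketch in Python: it checks for the literal bytes "<TS" anywhere in the data, which also covers "<TS " and "<TS>", and it makes no claim about how the actual linguist magic rule is evaluated. The function names are made up for illustration.

```python
LINGUIST_MARKER = b"<TS"  # also a prefix of "<TS " and "<TS>"


def has_linguist_marker(data):
    """True if the literal bytes "<TS" occur anywhere in `data`."""
    return LINGUIST_MARKER in data


def check_file(path):
    """Read `path` in binary mode and look for the marker."""
    with open(path, "rb") as handle:
        return has_linguist_marker(handle.read())
```

Calling `check_file("test.4445.ts")` on the attached reproducer would be expected to return False if the description above is right.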
But I'll get back after looking on F34 - and maybe I can copy the freedesktop.org.xml file from that back to my test system.
The issue with 4445 magic match resulting in vnd.dts.hd persists though
This seems to overlap with issue #167 (closed), where the "magic" gives matches for both the container and the embedded audio.
I could not locate a reference where the magic of that type would be specified
Get some hits when searching for 0x64582025, but as to whether any of these are reliable sources :-)
There's no code in KDE Frameworks for mimetype determination. It's all done by Qt Core.
In doubt, please use kmimetypefinder5 to test what KDE/Qt thinks the mimetype is.
Testing xdg-mime query is not ideal since it forwards to various different implementations.
There's just one interaction with KF5, it's the fact that kcoreaddons installs a mimetype definition file called kde5.xml, but I don't see anything related in it (and it hasn't changed in years).
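For comparing what the two tools mentioned above report for the same file, a small Python wrapper along these lines may help. It assumes that `kmimetypefinder5 <file>` and `xdg-mime query filetype <file>` each print the detected type on standard output; the helper names are hypothetical, and a tool that is not installed simply yields None.

```python
import shutil
import subprocess


def query_tool(argv, path):
    """Run argv + [path]; return stripped stdout, or None if the tool
    is missing, times out or exits with a non-zero status."""
    if shutil.which(argv[0]) is None:
        return None
    try:
        done = subprocess.run(argv + [path], capture_output=True,
                              text=True, timeout=10)
    except (OSError, subprocess.TimeoutExpired):
        return None
    return done.stdout.strip() if done.returncode == 0 else None


def compare(path):
    """Collect the answer of each tool for one file."""
    return {
        "kmimetypefinder5": query_tool(["kmimetypefinder5"], path),
        "xdg-mime": query_tool(["xdg-mime", "query", "filetype"], path),
    }
```

Running `compare("test.4445.ts")` on an affected system would show whether the Qt-based lookup and the xdg-mime front end disagree.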
If I copy the freedesktop.org.xml from F34 to Neon Testing, I get:
Doesn't this amount to copying an old file (without the linguist fix) onto a newer file (since Neon Testing is likely to be much more bleeding edge than Fedora 34)? shared-mime-info version numbers would avoid guessing, in any case.
Yes, I would have said so, and I have looked for a version number in the freedesktop.org.xml file. I feel I am missing something obvious 8-/
The audio/vnd.dts.hd magic is a short sequence of bytes searched for within the first 18725 bytes of a file, with a priority higher than the default and somewhat doubtful references to the "specifications" for the magic. I guess the intent of the priority was to make it take precedence over audio/vnd.dts, but implemented like this it is prone to false positives. It should have copied over the magic matches of audio/vnd.dts and applied its additional one on top of those, since they aren't inherited via sub-class-of. I will file an MR for this.
It is very clear to me that your Neon Testing distribution is using the older shared-mime-info 1.15-1.
There's no need to include version numbers in each and every file installed by a package. The package manager knows which version is installed, as you saw.
This is likely a naive test and probably not awfully helpful.
I was not able to sort out what was happening when I tried to build "reproducers" from scratch (the test.4444.ts and test.4445.ts files mentioned above).
I still see strange behaviour, particularly with the test.4445.ts file and KDE (querying with kmimetypefinder), but at the moment that probably fits better in https://bugs.kde.org/show_bug.cgi?id=440631