video .ts files are not well recognized and mistaken for text/vnd.trolltech.linguist
Submitted by Jörg Höhle
Assigned to Shared Mime Info group
Description
Hi,
my set-top box produces .ts (Transport Stream) files that the MIME database does not recognize correctly: it mistakes them for text files. The MIME DB was changed in 2010 to stop MPEG videos from being mistaken for Trolltech/Linguist files; see bug #14276 and lp: #502642 at https://bugs.launchpad.net/ubuntu/+source/shared-mime-info/+bug/502642. However, the filter defined back then appears too restrictive in light of the files produced by my STB (a Dreambox from Dream Multimedia GmbH, famous for the moddable Enigma Python engine that drives the TV GUI and web interface).
This is my proposed change:
diff -U3 /rofs/usr/share/mime/packages/ /usr/share/mime/packages/freedesktop.org.xml
--- /rofs/usr/share/mime/packages/freedesktop.org.xml 2015-10-23 23:52:35.000000000 +0200
+++ /usr/share/mime/packages/freedesktop.org.xml 2017-01-23 01:44:11.218484934 +0100
@@ -33852,7 +33852,7 @@
     <acronym>MPEG-2 TS</acronym>
     <expanded-acronym>Moving Picture Experts Group 2 Transport Stream</expanded-acronym>
     <magic priority="50">
-      <match value="0x47400010" type="big32" offset="0" mask="0xff4000df"/>
+      <match value="0x47000000" type="big32" offset="0" mask="0xffa000c0"/>
     </magic>
For reference, see https://en.wikipedia.org/wiki/MPEG_transport_stream. These are the header bits I expect to be unset (0) or leave unchecked (=):
0 : Transport Error Indicator
= : Payload Unit Start Indicator -- apparently cannot count on that bit
0 : Transport Priority -- might be set, but I've not seen it
= : PID
0 : Transport Scrambling Control (unscrambled, as before)
= : Adaptation Field Control -- why require payload?
= : Continuity Counter -- it appears plainly wrong to expect 0 here.
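To make the bit list above easier to check against the mask, here is a minimal Python sketch that decodes the 4-byte header and rebuilds the proposed mask from the individual field masks. The function name `parse_ts_header` and the dictionary keys are made up for illustration; the field layout follows the Wikipedia page cited above.

```python
import struct

def parse_ts_header(data: bytes) -> dict:
    """Decode the 4-byte MPEG-TS packet header at the start of `data`."""
    (word,) = struct.unpack(">I", data[:4])  # big-endian 32 bit, like type="big32"
    return {
        "sync_byte": (word >> 24) & 0xFF,
        "transport_error": bool(word & 0x00800000),
        "payload_unit_start": bool(word & 0x00400000),
        "transport_priority": bool(word & 0x00200000),
        "pid": (word >> 8) & 0x1FFF,
        "scrambling_control": (word >> 6) & 0x3,
        "adaptation_field_control": (word >> 4) & 0x3,
        "continuity_counter": word & 0xF,
    }

# Fields checked by the proposed rule: sync byte, error indicator,
# transport priority and scrambling control. Everything else is left alone.
PROPOSED_MASK = 0xFF000000 | 0x00800000 | 0x00200000 | 0x000000C0
assert PROPOSED_MASK == 0xFFA000C0
```

For example, `parse_ts_header(bytes.fromhex("4740001e"))` reports a set Payload Unit Start Indicator and a non-zero continuity counter, which is why the 2010 rule (which expects the counter to be 0) cannot match such a file.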
These rules still prevent recognizing 'GIMP' as video (cf. the Launchpad bug above).
After running (as root)
update-mime-database -n -V /usr/share/mime
the above change causes all files from my STB (over 1000) to be recognized as videos (no logout required). Linux Mint/Cinnamon's file manager nemo started to create and display thumbnails, yeah! Alternatively, one can use the CLI tools gnomevfs-info, gvfs-info, xdg-mime or mimetype to check what the MIME DB thinks of sample files.
Here's a selection of "hexdump -C | head -1" output from my set-top box:
00000000  47 40 00 1e 00 00 b0 21  04 1b e1 00 00 00 00 e0  |G@.....!........|
00000000  47 42 94 14 00 00 01 e0  00 00 8c 80 05 2f 03 c1  |GB.........../..|
00000000  47 53 f7 38 07 70 bf 5e  ce 09 fe 68 00 00 01 e0  |GS.8.p.^...h....|
00000000  47 00 24 11 02 2c ca e4  26 92 04 b9 29 b9 e9 b6  |G.$..,..&...)...|
00000000  47 02 a8 1f 02 2c d6 e4  31 1c e0 2a 86 e6 04 ce  |G....,..1..*....|
00000000  47 00 e6 12 02 2c ea e4  31 92 61 8c 9d 5d 8c 9d  |G....,..1.a..]..|
Hmm, based on these samples, the "payload present" bit could be required too, just as it was before. (I'll check that later against my 1000 files.) That would yield:
+      <match value="0x47000010" type="big32" offset="0" mask="0xffa000d0"/>
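As a quick cross-check, the following Python sketch applies the three value/mask pairs from this report to the first four bytes of the six hexdump samples and to the string 'GIMP'. It only mimics what I understand a big32 match with a mask to do, namely (word & mask) == value; the names `RULES`, `SAMPLE_HEADERS` and `matches` are invented for this example.

```python
import struct

# (value, mask) pairs taken from the report.
RULES = {
    "old (2010)": (0x47400010, 0xFF4000DF),
    "proposed": (0x47000000, 0xFFA000C0),
    "proposed + payload": (0x47000010, 0xFFA000D0),
}

# First four bytes of each hexdump line quoted in the report.
SAMPLE_HEADERS = [
    bytes.fromhex(h)
    for h in ("4740001e", "47429414", "4753f738",
              "47002411", "4702a81f", "4700e612")
]

def matches(header: bytes, value: int, mask: int) -> bool:
    """Assumed semantics of a masked big32 match at offset 0."""
    (word,) = struct.unpack(">I", header[:4])
    return (word & mask) == value

if __name__ == "__main__":
    for name, (value, mask) in RULES.items():
        hits = sum(matches(h, value, mask) for h in SAMPLE_HEADERS)
        gimp = matches(b"GIMP", value, mask)
        print(f"{name}: {hits}/{len(SAMPLE_HEADERS)} samples, GIMP matched: {gimp}")
```

On these six headers, both new rules accept every sample and reject 'GIMP', while the 2010 rule accepts none of them, because every sample has a non-zero continuity counter.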
About testing: Given that the rule applies to 4 bytes only, I believe reasonable test vectors would consist of a single frame of 188 bytes only, or perhaps 188 + 4 bytes from the next frame. There's no need for 10MB single test files (as seen on some bug trackers). I can put together a collection of such samples from my STB, if you like.
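Such small test vectors could be cut from a recording along the following lines. This is only a sketch under the assumption of plain 188-byte packets starting at offset 0 (no 192-byte or 204-byte variants); `make_test_vector` and `write_test_vector` are hypothetical helper names.

```python
TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47

def make_test_vector(data: bytes, extra: int = 4) -> bytes:
    """Return the first packet plus `extra` bytes of the following one."""
    needed = TS_PACKET_SIZE + extra
    if len(data) < needed:
        raise ValueError(f"need at least {needed} bytes, got {len(data)}")
    if data[0] != SYNC_BYTE:
        raise ValueError("first byte is not the 0x47 sync byte")
    # When bytes of the next packet are included, it must start with 0x47 too.
    if extra > 0 and data[TS_PACKET_SIZE] != SYNC_BYTE:
        raise ValueError("no sync byte at offset 188; not 188-byte packets?")
    return data[:needed]

def write_test_vector(src_path: str, dst_path: str, extra: int = 4) -> None:
    """Read only the head of `src_path` and write the test vector to `dst_path`."""
    with open(src_path, "rb") as src:
        head = src.read(TS_PACKET_SIZE + extra)
    with open(dst_path, "wb") as dst:
        dst.write(make_test_vector(head, extra))
```

With the default `extra=4`, each test file is 192 bytes, and the check at offset 188 guards against accidentally submitting a file that is not packetized the way the rule assumes.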
Regards, Jörg Höhle