id3v2mux: Transcoding from flac to mp3, UTF-8 tags get mangled
Submitted by Clarke Wixon
I'm using GStreamer to transcode FLAC files to MP3 while keeping the metadata, using the following pipeline:
gst-launch filesrc location="foo.flac" ! decodebin ! audioconvert ! \
lamemp3enc target=quality quality=4 encoding-engine-quality=2 ! xingmux ! id3v2mux ! \
It looks like UTF-8 encoded tags in the flac files are being treated as 8-bit ISO-8859-1 tags, so each byte of a multi-byte UTF-8 character in the original tag is mistakenly treated as a separate character and re-encoded into UTF-8, resulting in garbage.
For example, I have foo.flac containing the following metadata (according to metaflac --list):
comment: producer=Peter Tägtgren
That's in UTF-8, so the "ä" character is encoded as two bytes, C3 A4.
After the pipeline, I get a bar.mp3 containing this (according to id3demux ! fakesink -t):
That's also in UTF-8, so the original single "ä" character is now four hideous bytes, C3 83 C2 A4.
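To illustrate what I think is happening, here is a small Python sketch that reproduces the same four bytes. It only models the suspected misinterpretation (UTF-8 bytes read as ISO-8859-1, then encoded to UTF-8 again); it is not taken from id3v2mux, and the function names are made up for this example.

```python
def misread_as_latin1(text: str) -> bytes:
    """Model the suspected bug: UTF-8 bytes are read as ISO-8859-1,
    then the resulting characters are encoded to UTF-8 a second time."""
    utf8_bytes = text.encode("utf-8")            # "ä" -> C3 A4
    misread = utf8_bytes.decode("iso-8859-1")    # each byte becomes one character
    return misread.encode("utf-8")               # each of those is encoded again


def undo_misread(data: bytes) -> str:
    """Reverse the double encoding, assuming it happened exactly once."""
    return data.decode("utf-8").encode("iso-8859-1").decode("utf-8")


if __name__ == "__main__":
    mangled = misread_as_latin1("ä")
    print(mangled.hex(" "))        # c3 83 c2 a4
    print(undo_misread(mangled))   # ä
```

The reverse step only works on strings that were mangled exactly once in this way, so it is a way to check the diagnosis rather than a real fix.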
Should (or can) id3v2mux identify what encoding is used by incoming metadata? If not, is there a manual workaround?