Support XMP metadata for title, author etc.
Submitted by Reuben Thomas
Assigned to poppler-bugs
Link to original bug (#103530)
Description
At present, poppler doesn't use a document's XMP metadata. This is a shame, not just because some documents appear to lack metadata in Evince, Okular, etc., but because it particularly disadvantages, for example, PDF/A-compliant documents. As an author, I'd like to produce documents that comply with 10-year-old standards without my users complaining that they're broken!
The most obvious fix would seem to be to modify the various getDocInfo* methods so that they also look in the XMP metadata, according to the relevant PDF specs (I haven't yet determined which these are or what conditions they impose). This would make the method names somewhat misleading, since they would no longer look only in the DOCINFO dictionary, but that can be ignored for the moment.
Since the XMP is XML, it seems this will need libxml2 (or equivalent). The poppler maintainers might desire this to be an optional dependency, at least at first.
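To make the XML side concrete, here is a minimal sketch of pulling the title and authors out of an XMP packet. It uses Python's standard-library ElementTree rather than libxml2 purely for illustration; the sample packet and the helper names `xmp_title` and `xmp_authors` are made up for this example and are not part of poppler. It assumes the common serialization in which `dc:title` is a language-alternative array and `dc:creator` is an ordered array, and it ignores other forms a real packet might use.

```python
import xml.etree.ElementTree as ET

# Namespace URIs used by the Dublin Core properties in XMP.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}
# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical sample packet, trimmed to the two properties of interest.
SAMPLE_XMP = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="" xmlns:dc="http://purl.org/dc/elements/1.1/">
   <dc:title>
    <rdf:Alt>
     <rdf:li xml:lang="x-default">Example Title</rdf:li>
    </rdf:Alt>
   </dc:title>
   <dc:creator>
    <rdf:Seq>
     <rdf:li>First Author</rdf:li>
     <rdf:li>Second Author</rdf:li>
    </rdf:Seq>
   </dc:creator>
  </rdf:Description>
 </rdf:RDF>
</x:xmpmeta>"""


def xmp_title(root):
    """Return the x-default title, else the first alternative, else None."""
    items = root.findall(".//dc:title/rdf:Alt/rdf:li", NS)
    for li in items:
        if li.get(XML_LANG) == "x-default":
            return li.text
    return items[0].text if items else None


def xmp_authors(root):
    """Return the dc:creator entries in document order (possibly empty)."""
    return [li.text for li in root.findall(".//dc:creator/rdf:Seq/rdf:li", NS)]


root = ET.fromstring(SAMPLE_XMP)
title = xmp_title(root)
authors = xmp_authors(root)
```

Note that the XMP author field is a list, whereas the DOCINFO Author entry is a single string, so any mapping between the two has to decide how to join or split names.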
Also worth considering: might clients of poppler want to know the source of a particular piece of metadata (i.e. whether it comes from DOCINFO or XMP)? If so, is that best achieved by making the existing APIs "finer-grained", or by leaving the existing APIs unaltered and adding new ones?
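One possible shape for a source-aware lookup is sketched below. Everything in it is hypothetical: `MetadataValue` and `lookup` are illustrative names, not poppler APIs, and the two stores are plain dictionaries standing in for already-parsed DOCINFO and XMP data. The default order of preference (XMP first) is only an assumption, since the relevant specs have not been checked, which is why it is a parameter.

```python
from dataclasses import dataclass
from typing import Mapping, Optional, Sequence


@dataclass(frozen=True)
class MetadataValue:
    value: str
    source: str  # "xmp" or "docinfo"


def lookup(
    key: str,
    docinfo: Mapping[str, str],
    xmp: Mapping[str, str],
    prefer: Sequence[str] = ("xmp", "docinfo"),  # assumed order, not from a spec
) -> Optional[MetadataValue]:
    """Return the first non-empty value for key, tagged with where it came from."""
    stores = {"xmp": xmp, "docinfo": docinfo}
    for name in prefer:
        value = stores[name].get(key)
        if value:
            return MetadataValue(value, name)
    return None


# Hypothetical data: the title exists only in DOCINFO, the author only in XMP.
docinfo = {"Title": "Old title"}
xmp = {"Author": "A. Writer"}

title = lookup("Title", docinfo, xmp)
author = lookup("Author", docinfo, xmp)
missing = lookup("Subject", docinfo, xmp)
```

A client that does not care about provenance can read `.value` and ignore `.source`, so an interface like this could sit alongside the existing getters rather than replace them.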
Is there any need to provide low-level access to the XMP metadata through public APIs, or, since it's all XML with well-known schemas, is that redundant?