poppler-glib: poppler_document_new_from_data() receives data as UTF-8 string not binary
Submitted by okimoto
Assigned to poppler-bugs
Link to original bug (#104961)
Description
Created attachment 137181 Add type annotation to poppler_document_new_from_data()
In Poppler-0.18.gir, poppler_document_new_from_data() is defined as follows:
<constructor name="new_from_data"
c:identifier="poppler_document_new_from_data"
throws="1">
<doc xml:space="preserve">Creates a new #PopplerDocument. If %NULL is returned, then @error will be
set. Possible errors include those in the #POPPLER_ERROR and #G_FILE_ERROR
domains.</doc>
<return-value transfer-ownership="full">
<doc xml:space="preserve">A newly created #PopplerDocument, or %NULL</doc>
</return-value>
<parameters>
<parameter name="data" transfer-ownership="none">
<doc xml:space="preserve">the pdf data contained in a char array</doc>
</parameter>
<parameter name="length" transfer-ownership="none">
<doc xml:space="preserve">the length of #data</doc>
</parameter>
<parameter name="password" transfer-ownership="none">
<doc xml:space="preserve">password to unlock the file with, or %NULL</doc>
</parameter>
</parameters>
</constructor>
With this definition, the data parameter is validated as a UTF-8 string when PDF data is read with poppler_document_new_from_data() via gobject-introspection. In general, PDF data is not a valid UTF-8 string, so the data parameter should be treated as binary.
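To illustrate why UTF-8 validation is a problem here, the following is a minimal sketch in plain Python that does not use Poppler at all. The byte string is an illustrative assumption: a PDF header followed by the kind of binary comment line (bytes of 0x80 and above) that PDF writers commonly emit to mark the file as binary. The helper name is_valid_utf8 is hypothetical.

```python
# Illustrative PDF-like prefix (an assumption, not taken from a real file):
# the header line followed by a comment line made of bytes >= 0x80.
pdf_prefix = b"%PDF-1.4\n%\xe2\xe3\xcf\xd3\n"


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# The ASCII-only header alone is valid UTF-8 ...
print(is_valid_utf8(b"%PDF-1.4\n"))  # True
# ... but the header plus the binary marker bytes is not.
print(is_valid_utf8(pdf_prefix))     # False
```

Any binding that insists on a valid UTF-8 string for data would presumably reject or mangle input like pdf_prefix, which is why a byte-array type seems more appropriate for this parameter.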
I wrote a tiny patch and confirmed that the poppler_document_new_from_data() definition is updated after applying it. However, to be honest, I'm not familiar with gobject-introspection.
Patch 137181, "Add type annotation to poppler_document_new_from_data()":
add-type-annotation-to-poppler_document_new_from_data.diff