poppler issues
https://gitlab.freedesktop.org/poppler/poppler/-/issues
2019-12-09T23:52:17Z

https://gitlab.freedesktop.org/poppler/poppler/-/issues/830
0.83.0: cmake fails
2019-12-09T23:52:17Z
Tomasz Kłoczko

<pre>+ /usr/bin/cmake -DBUILD_SHARED_LIBS=ON -DCMAKE_AR=/usr/bin/gcc-ar -DCMAKE_C_FLAGS_RELEASE=-DNDEBUG -DCMAKE_CXX_FLAGS_RELEASE=-DNDEBUG -DCMAKE_Fortran_FLAGS_RELEASE=-DNDEBUG -DCMAKE_INSTALL_PREFIX=/usr -DCMAKE_NM=/usr/bin/gcc-nm -DCMAKE_RANLIB=/usr/bin/gcc-ranlib -DCMAKE_VERBOSE_MAKEFILE=ON -DINCLUDE_INSTALL_DIR=/usr/include -DLIB_INSTALL_DIR=/usr/lib64 -DLIB_SUFFIX=64 -DSHARE_INSTALL_PREFIX=/usr/share -DSYSCONF_INSTALL_DIR=/etc . -Bx86_64-redhat-linux-gnu -DENABLE_CMS=lcms2 -DENABLE_DCTDECODER=libjpeg -DENABLE_GTK_DOC=ON -DENABLE_LIBOPENJPEG=openjpeg2 -DENABLE_UNSTABLE_API_ABI_HEADERS=ON -DENABLE_ZLIB=OFF -DTEST_BIG_ENDIAN=OFF
-- The C compiler identification is GNU 9.2.1
-- The CXX compiler identification is GNU 9.2.1
-- Check for working C compiler: /usr/bin/gcc
-- Check for working C compiler: /usr/bin/gcc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/g++
-- Check for working CXX compiler: /usr/bin/g++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Performing Test GCC_HAS_AS_NEEDED
-- Performing Test GCC_HAS_AS_NEEDED - Success
-- Found PkgConfig: /usr/bin/pkg-config (found version "1.6.3")
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Check if compiler accepts -pthread
-- Check if compiler accepts -pthread - no
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Check if the system is big endian
-- Searching 16 bit integer
-- Looking for sys/types.h
-- Looking for sys/types.h - found
-- Looking for stdint.h
-- Looking for stdint.h - found
-- Looking for stddef.h
-- Looking for stddef.h - found
-- Check size of unsigned short
-- Check size of unsigned short - failed
-- Check size of unsigned int
-- Check size of unsigned int - failed
-- Check size of unsigned long
-- Check size of unsigned long - failed
CMake Error at /usr/share/cmake/Modules/TestBigEndian.cmake:50 (message):
no suitable type found
Call Stack (most recent call first):
CMakeLists.txt:21 (test_big_endian)
-- Configuring incomplete, errors occurred!
See also "/home/tkloczko/rpmbuild/BUILD/poppler-0.81.0/x86_64-redhat-linux-gnu/CMakeFiles/CMakeOutput.log".
See also "/home/tkloczko/rpmbuild/BUILD/poppler-0.81.0/x86_64-redhat-linux-gnu/CMakeFiles/CMakeError.log".
error: Bad exit status from /var/tmp/rpm-tmp.QOZXAF (%build)
</pre>

https://gitlab.freedesktop.org/poppler/poppler/-/issues/829
pdffonts: long font names break fixed-width output
2019-10-15T22:42:13Z
tamunro

In ~pdffonts 0.80.0, fonts with long names break the fixed-width table format of the output. Here are examples from the attached file,
[Font_problems.pdf](/uploads/bf3d5e0b05d8211469274543a219a32d/Font_problems.pdf)
```
name                                 type              encoding         emb sub uni object ID
------------------------------------ ----------------- ---------------- --- --- --- ---------
MHAFVK+Times#20New#20Roman           TrueType          WinAnsi          yes yes no     103  0
BMPSSD+Times#20New#20Roman,BoldItalic TrueType          WinAnsi          yes yes no     105  0
GXZEYL+Times#20New#20Roman,Bold_00   TrueType          Custom           yes yes no     108  0
WLZFBK+Times#20New#20Roman,Bold_0    TrueType          WinAnsi          yes yes no     110  0
GXZEYL+Times#20New#20Roman,BoldItalic_1_00 TrueType          Custom           yes yes no     112  0
LOMNPV+Times#20New#20Roman,Italic    TrueType          WinAnsi          yes yes no     114  0
RXNFTE+Times#20New#20Roman_2_00      TrueType          Custom           yes yes no     117  0
QWBTND+Symbol_00                     TrueType          Custom           yes yes no     120  0
AQAEUC+Times#20New#20Roman,Italic_3_01 TrueType          Custom           yes yes no     123  0
#cb#ce#cc#e5-GBK-EUC-H-Identity-H-Identity-H CID Type 0C       Identity-H       yes no  no      59  0
CLIMBI+TimesNewRomanPSMT             Type 1C           Custom           yes yes yes     54  0
CLIMDK+TimesNewRomanPS-BoldItalicMT  Type 1C           Custom           yes yes yes     52  0
CLIMDL+Times#20New#20Roman           Type 1C           Custom           yes yes yes     56  0
CLIMCI+TimesNewRomanPS-BoldMT        Type 1C           Custom           yes yes yes     63  0
CLIMEL+TimesNewRomanPS-ItalicMT      Type 1C           Custom           yes yes yes     60  0
MS#20Mincho-90ms-RKSJ-H-Identity-H-Identity-H CID Type 0C       Identity-H       yes no  no      66  0
```
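A small sketch of why the columns shift, and of what a tab-separated variant could look like. It assumes `printf`-style field widths taken from the dashes row above (36, 17, 16, 3, 3, 3, then the object ID); `FontRow`, `formatFixed` and `formatTsv` are hypothetical names used for illustration only, not part of pdffonts.

```cpp
#include <cstdio>
#include <string>

// One row of font information, mirroring the columns of the table above.
// Hypothetical type for illustration; not a poppler structure.
struct FontRow {
    std::string name, type, encoding;
    bool emb, sub, uni;
    int objNum, objGen;
};

static const char *yesNo(bool v) { return v ? "yes" : "no"; }

// Fixed-width layout with the assumed widths. "%-36s" pads short names to 36
// characters but never truncates long ones, so a longer name pushes every
// later column to the right.
std::string formatFixed(const FontRow &r) {
    char buf[512];
    std::snprintf(buf, sizeof buf, "%-36s %-17s %-16s %-3s %-3s %-3s %6d %2d",
                  r.name.c_str(), r.type.c_str(), r.encoding.c_str(),
                  yesNo(r.emb), yesNo(r.sub), yesNo(r.uni), r.objNum, r.objGen);
    return buf;
}

// Tab-separated layout: exactly one tab between fields, so field boundaries
// do not depend on how long a name is or on spaces inside a field.
std::string formatTsv(const FontRow &r) {
    return r.name + '\t' + r.type + '\t' + r.encoding + '\t' +
           yesNo(r.emb) + '\t' + yesNo(r.sub) + '\t' + yesNo(r.uni) + '\t' +
           std::to_string(r.objNum) + '\t' + std::to_string(r.objGen);
}
```

With the fixed-width formatter, a name longer than 36 characters makes its row longer by exactly the overflow, which is the misalignment seen in the table. The tab-separated form stays unambiguous even for values such as `CID Type 0C` that contain spaces.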
This introduces errors when importing into a spreadsheet. I've tried replacing spaces with tabs using regular expressions, but some font types and names contain spaces, so that introduces other errors. A simple fix for this could be to widen the name column by 20 characters, at least as an option. More elaborate, but probably more robust and useful, would be to offer tab-separated output as an option.

https://gitlab.freedesktop.org/poppler/poppler/-/issues/828
Support addition of revisions ("replies") to annotations with Qt5 bindings
2019-10-15T22:39:25Z
Hugo Damme

The current Qt5 binding allows retrieving the replies to a text/highlight annotation by using the revisions() method.
However, there is no method to allow adding new revisions to an annotation.
A method could be provided to add either a single annotation to the revision list, such as addRevision(const Annotation *), or a list of revisions, such as addRevisions(const QVector<const Annotation *> &).

https://gitlab.freedesktop.org/poppler/poppler/-/issues/825
'Bogus memory allocation size' exception during image generation on a PDF
2020-04-21T19:22:29Z
LoickBRIOT

The buggy PDF is here: [dws0rp_k.en.pdf](/uploads/d6dccbbff04801c220e8dacc29bf6c3d/dws0rp_k.en.pdf)
When it generates the image for the third page of this PDF, the program stops with the following exception:
```
Bogus memory allocation size
Syntax Error: Could not find start of jpeg data
```
I was able to reproduce this bug by using:
* the stable 'cpp' API from this file: https://gitlab.freedesktop.org/poppler/poppler/blob/master/cpp/tests/poppler-render.cpp
* the unstable 'poppler' API by using pdftocairo, pdftoppm, etc.

https://gitlab.freedesktop.org/poppler/poppler/-/issues/820
bug(pdftoppm): -: Error writing TIFF header.
2020-09-23T13:58:33Z
Саша Черных

### 1. Note
If I should post this issue somewhere else, please point me to the right place.
### 2. Summary
I can't convert any PDF document to TIFF. In every case I get this error:
```text
-: Error writing TIFF header.
```
### 3. Possible related issues
+ [**Bug 75969**](https://bugs.freedesktop.org/show_bug.cgi?id=75969)
+ User Harsh Rai [**reported**](http://disq.us/p/21m5qus) this problem
### 4. Data
+ [**KiraGoddess.pdf**](https://app.box.com/s/dyunssgvwrybl5te7vq65nus4qfoeou9): a simple PDF file that contains only the text `Kira Goddess!`.
### 5. Steps to reproduce
I download and unpack `poppler-0.68.0_x86` from [**here**](https://blog.alivate.com.au/poppler-windows/) as [**described on Stack Overflow**](https://stackoverflow.com/q/18381713/5951529) → I add `poppler-0.68.0_x86\poppler-0.68.0\bin` folder to user `PATH` environment variable → I run this command
```text
pdftoppm -tiff KiraGoddess.pdf KiraGoddess
```
### 6. Actual behavior
```text
-: Error writing TIFF header.
```
### 7. Expected behavior
The conversion to TIFF should succeed, as it does for JPEG:
```text
pdftoppm -jpeg KiraGoddess.pdf KiraGoddess
```
+ `KiraGoddess-1.jpg`:
![JPEG](https://i.imgur.com/uAlRJ24.jpg)
### 8. Environment
+ Windows 10 Enterprise LTSB 64-bit EN
Thanks.

https://gitlab.freedesktop.org/poppler/poppler/-/issues/819
Font Selection doesn't use postscript font name
2022-10-10T13:09:06Z
Elahn Ientile

Currently, poppler selects fonts by family name, using the font names specified in the PDF. However, PDF fonts are often specified by their PostScript names.
This behaviour is present in 0.62 and git d70f77ee6a1bdee8b17f08f3066c0cd685853d21. Tested with fontconfig 2.12.6-0ubuntu2.
Steps to reproduce:
- Install msttcorefonts
- Download: https://www.unicode.org/notes/tn28/UTN28-PlainTextMath-v3.1.pdf
- `pdffonts -subst UTN28-PlainTextMath-v3.1.pdf`:
![Screenshot_from_2019-08-19_19-30-40](/uploads/06f318c384d22e2f6716dd6c7094d96a/Screenshot_from_2019-08-19_19-30-40.png)
```bash
$ fc-match ArialMT
Vera.ttf: "Bitstream Vera Sans" "Roman"
$ fc-match :postscriptname=ArialMT
Arial.ttf: "Arial" "Regular"
$ fc-match TimesNewRomanPSMT
Vera.ttf: "Bitstream Vera Sans" "Roman"
$ fc-match :postscriptname=TimesNewRomanPSMT
Times_New_Roman.ttf: "Times New Roman" "Regular"
$ fc-match TimesNewRomanPS-ItalicMT
Vera.ttf: "Bitstream Vera Sans" "Roman"
$ fc-match :postscriptname=TimesNewRomanPS-ItalicMT
Times_New_Roman_Italic.ttf: "Times New Roman" "Italic"
```

https://gitlab.freedesktop.org/poppler/poppler/-/issues/810
pdftocairo -paper crop
2019-07-27T16:19:32Z
vstepaniuk

```
-paper size
Set the paper size to one of "letter", "legal", "A4", or "A3" (PS,PDF,SVG only). This can also be set to "match", which will set
the paper size of each page to match the size specified in the PDF file. If none the -paper, -paperw, or -paperh options are
specified the default is to match the paper size.
```
I suggest adding a `crop` value to the `pdftocairo -paper` option, meaning the dimensions of the crop area specified by the `-W` and `-H` options.

https://gitlab.freedesktop.org/poppler/poppler/-/issues/807
Font rendering issue unrelated to substituted fonts
2019-07-24T22:59:14Z
Krešimir Čohar

Hey guys, so we noticed something that might be a poppler (cairo) issue in https://gitlab.gnome.org/GNOME/evince/issues/1221
See "st udents" and "n ew" in Evince (left) vs Chromium (right, students and new, respectively):
![image](/uploads/ea37bb8d47c9b7832c0ca04906c8b7e1/image.png)
Using pdftocairo, this happens:
![image](/uploads/8db6e47fb9dc05961e445cc5787ca503/image.png)
So, while Evince's font smoothing exacerbates the problem, it doesn't seem to be the root cause of it.
The fonts in this PDF aren't substituted but embedded.
Is this fixable?
(Running Ubuntu 19.04, GNOME 3.32.1, Evince 3.32.0)

https://gitlab.freedesktop.org/poppler/poppler/-/issues/800
Sign existing signature field
2022-01-23T17:12:38Z
Gyuris Gellért

Please add support for signing an existing signature field.
In some PDF editors (e.g. Adobe Acrobat) we can add an empty signature field to the page. With FOSS software we can create this field type with the [eforms](https://ctan.org/pkg/eforms) and [digsig](http://home.htp-tel.de/lottermose2/tex/dist/digsig.sty) LaTeX packages or with [Apache PDFBox](https://pdfbox.apache.org/) ([example]()).
With some PDF readers (e.g. Adobe Reader, Master PDF Editor) we can sign these fields. It would be great to be able to sign these fields with poppler.
See attached examples (empty signature field and signed).
According to #465.
- [Empty_signature_field_example__eform_.pdf](/uploads/677f27341a6001f4d6859df0acc46b11/Empty_signature_field_example__eform_.pdf)
- [Empty_signature_field_example__eform__signed_with_Adobe_Reader_DC_Win.pdf](/uploads/812aff4de92c53359eeb1d010f73c3c4/Empty_signature_field_example__eform__signed_with_Adobe_Reader_DC_Win.pdf)
- [Empty_signature_field_example__digsig_.pdf](/uploads/c60d38c4040513b53c1f3608bd07da72/Empty_signature_field_example__digsig_.pdf)
- [Empty_signature_field_example__digsig__signed_with_Adobe_Reader_DC_Win.pdf](/uploads/e5a0b6dda582fbf438abdb1abc1853bf/Empty_signature_field_example__digsig__signed_with_Adobe_Reader_DC_Win.pdf)
- [Empty_signature_field_example__Adobe_Acrobat_.pdf](/uploads/ab69b27c70a207d5768e864127c94282/Empty_signature_field_example__Adobe_Acrobat_.pdf)
- [Empty_signature_field_example__Adobe_Acrobat__signed_with_Master_PDF_Editor.pdf](/uploads/ead29160ada3cdce7721e4a92c48fecf/Empty_signature_field_example__Adobe_Acrobat__signed_with_Master_PDF_Editor.pdf)

https://gitlab.freedesktop.org/poppler/poppler/-/issues/786
TextOutputDev neglects the translation in the FontMatrix of type3 fonts
2019-08-16T18:17:10Z
sgerwk

The enclosed file uses the Type 3 font from the PDF spec, changed by adding an x translation to the font matrix.
[type3.pdf](/uploads/f4a803c61d21e1aa16c903a81e819a92/type3.pdf)
Drawing the rectangles of the text layout on top of it produces this:
[type3-boxes.pdf](/uploads/c57a666dcb1a249ca45c79d681e433bf/type3-boxes.pdf)
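For reference, the PDF specification writes a matrix as six numbers `[a b c d e f]` and maps a point (x, y) to (a·x + c·y + e, b·x + d·y + f), so `e` and `f` are the translation. The sketch below contrasts applying the full matrix with applying only its linear part. The matrix values are made up for illustration, and this is not poppler's TextOutputDev code.

```cpp
#include <array>

struct Point { double x, y; };

// A PDF matrix [a b c d e f], stored in that order.
using Matrix = std::array<double, 6>;

// Full affine map: x' = a*x + c*y + e, y' = b*x + d*y + f.
Point transform(const Matrix &m, Point p) {
    return { m[0] * p.x + m[2] * p.y + m[4],
             m[1] * p.x + m[3] * p.y + m[5] };
}

// The same map with the translation (e, f) dropped. For a font matrix with a
// non-zero e, this places the glyph origin at a different x position.
Point transformLinearOnly(const Matrix &m, Point p) {
    return { m[0] * p.x + m[2] * p.y,
             m[1] * p.x + m[3] * p.y };
}
```

For a hypothetical font matrix `{0.001, 0, 0, 0.001, 0.5, 0}`, the glyph-space origin maps to (0.5, 0) with the full transform but to (0, 0) when the translation is ignored.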
I grant that fonts with a translation in the font matrix are uncommon, yet the spec seems to allow them. Otherwise, the characters are placed incorrectly.

https://gitlab.freedesktop.org/poppler/poppler/-/issues/785
MacOS Mojave: pdftoppm removes bold fonts
2019-06-27T20:50:05Z
laub blatt

I generate PDF files with R and then convert them to PNG.
Unfortunately the bold fonts (Helvetica-Bold) are not rendered and some other font is shown.
Conversion with Preview: everything looks good:
![testigureR_boldfont](/uploads/ea6dfd97cb6c2247cc163ada63a47eb0/testigureR_boldfont.png)
Conversion with pdftoppm
`pdftoppm -singlefile -rx 100 -ry 100 -png testigureR_boldfont.pdf testigureR_boldfont.pdf`
![testigureR_boldfont.pdf](/uploads/136a4c2952198478458d2cb0cc45ceea/testigureR_boldfont.pdf.png)
pdftoppm version 0.77.0
On the most recent MacOS I installed poppler
brew upgrade poppler
On another machine with macOS 10.12 everything is OK.
**Any suggestions why this happens and how to resolve this are welcome!**

https://gitlab.freedesktop.org/poppler/poppler/-/issues/783
Evince and other Poppler apps have trouble loading POPPLER_ANNOT_POLY_LINE from Onyx Boox annotations
2019-06-26T19:49:32Z
anarcat

This is a [bug that was originally submitted against Evince](https://gitlab.gnome.org/GNOME/evince/issues/1195) but that, after further inspection, affects other Poppler-related PDF readers.
The following PDF:
[about_empty-Exported.pdf](/uploads/851e8812521859e4317a83c5c902e680/about_empty-Exported.pdf)
Was created from the following:
[about_empty.pdf](/uploads/2fe4a19aff752ee3dece9fd440c29222/about_empty.pdf)
to which was added the following text, in the appropriate tool, on an Onyx Boox Note Pro ebook reader:
```
light pen (2)
thick pen (10)
really thick pen (20)
pencil (light, thick, really)
line: [drawing of a line using the line tool]
square: [drawing of a square]
circle: [drawing of a circle]
triangle: [drawing of a triangle]
colors: [black scribbled square] black
[ ] red
[ ] green
[ ] blue
[ ] white <---- white
```
* The `pen` lines are drawn with the "pen" tool (or "brush"?) with varying intensity. The "really thick" is picked by manually selecting weight "20" that is not available by just choosing the presets.
* The `pencil` lines, and all remaining text, are drawn with the "pencil" tool (the one right of the "pen"), with varying intensity, similarly to the pen tool
* The line, square, circle and triangles have samples of those geometric shapes that can be created with the scribbling app
* the colors are all drawn with the pencil as well, and presumably affect other tools similarly.
This should give us a good basis covering almost all combinations. I think the only ones not covered are the colored pen and drawing tools, but I'd be surprised if they didn't work right when their black relatives do.
This is how it should render:
![image](/uploads/6023b11037c9d1199fbeabecf05bb424/image.png)
This is how it actually renders:
![image](/uploads/f97fea682771877f5f7c99ad334bcb3f/image.png)
I.e., the "brush" strokes do not render correctly. In Evince, we get the error:
```
** (evince:15296): WARNING **: 12:33:56.566: Unimplemented annotation: POPPLER_ANNOT_POLY_LINE, please post a bug report in Evince issue tracker (https://gitlab.gnome.org/GNOME/evince/issues) with a testcase.
```

https://gitlab.freedesktop.org/poppler/poppler/-/issues/778
Different behavior in Poppler when font is not found on FormFieldText
2019-06-25T22:24:32Z
Emil Sedgh

I have some PDFs (with form annotations) that, when written by Poppler (Okular or any other Poppler-based application), look different than when written by other editors (I have tried Master PDF and Chrome so far).
I investigated and here are the findings:
Imagine a scenario where there is a:
* `FormFieldText` with `da = /Arial 10`
* `Form` with `da = /Helv 0`
Poppler tries to locate the `Arial` font.
If that fails, it falls back to the Form's `da`, which is `/Helv 0`.
Since the font size on the `Form` is `0` (0 means automatic sizing), the text appears too big,
although the field is set to have size 10.
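The fallback described above, next to an alternative that keeps the field's own size, might look like this sketch. `DefaultAppearance`, `fallbackWholesale` and `fallbackFontOnly` are hypothetical names for illustration; this is not poppler's actual code.

```cpp
#include <string>

// Parsed form of a default appearance string such as "/Arial 10".
// A size of 0 means automatic sizing. Hypothetical type for illustration.
struct DefaultAppearance {
    std::string font;
    double size;
};

// Behaviour described above: when the field's font cannot be found, the
// form-level default appearance replaces the field's one entirely,
// including its size.
DefaultAppearance fallbackWholesale(const DefaultAppearance &field,
                                    const DefaultAppearance &form,
                                    bool fieldFontFound) {
    return fieldFontFound ? field : form;
}

// Alternative: take only the font name from the form-level default
// appearance and keep the size that was set on the field.
DefaultAppearance fallbackFontOnly(const DefaultAppearance &field,
                                   const DefaultAppearance &form,
                                   bool fieldFontFound) {
    if (fieldFontFound) {
        return field;
    }
    return DefaultAppearance{form.font, field.size};
}
```

With a field of `/Arial 10` and a form of `/Helv 0`, the first variant yields Helv at automatic size, while the second yields Helv at size 10.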
It seems that other generators switch to Helv for the font but keep the font size defined on the field.

https://gitlab.freedesktop.org/poppler/poppler/-/issues/776
pdfinfo sends arbitrary bytes to stdout
2019-10-20T20:58:58Z
Daniel Kahn Gillmor

I've found a PDF file whose "producer" string has an embedded NUL byte in it. Running `pdfinfo` on it sends the NUL byte to stdout.
This suggested to me that a suitably nasty PDF document can emit terminal escape sequences, which is potentially a security vulnerability (e.g. similar to [CVE-2009-4487](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2009-4487)).
[x.pdf](/uploads/3ff44aeda3195a206e3fa93313342b1f/x.pdf) is a relatively benign PDF that demonstrates some of this. It will set the window title, add a fake `Subject:` line (via an injected `\n`), set some colors, and add some blinking text if you run `pdfinfo x.pdf`. I have not tried to weaponize it, but i don't think it would be hard to do, if `pdfinfo` is run in a vulnerable terminal. (see the `console_codes(4)` manpage to read up on the range of things that terminal escape sequences can do in some contexts)
Interestingly, an embedded NUL byte makes a typical `pdfinfo | grep` pipeline fail, because grep says `Binary file (standard input) matches`. So even if it were safe (it's not, because the ability to inject newlines screws up `pdfinfo`'s structured output), you'd have to use `grep -a` to force grep to consider it not a binary file.
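One possible mitigation, sketched under the assumption that escaping every ASCII control byte is acceptable for display purposes. `escapeControlBytes` is a hypothetical helper, not an existing poppler function, and it does not deal with the C1 control range or with invalid UTF-8.

```cpp
#include <cstdio>
#include <string>

// Replace ASCII control bytes (0x00-0x1f and 0x7f, which includes NUL, ESC,
// tab and newline) with a visible \xNN escape. All other bytes pass through
// unchanged, so ordinary UTF-8 text is preserved. Meant for display only:
// a literal backslash is not escaped, so the result cannot be decoded back.
std::string escapeControlBytes(const std::string &in) {
    std::string out;
    for (unsigned char c : in) {
        if (c < 0x20 || c == 0x7f) {
            char buf[5];  // "\xNN" plus the terminating NUL
            std::snprintf(buf, sizeof buf, "\\x%02x", static_cast<unsigned>(c));
            out += buf;
        } else {
            out += static_cast<char>(c);
        }
    }
    return out;
}
```

Applied to every document-supplied string before printing, this would keep each metadata field on one line and stop escape sequences from reaching the terminal.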
pdfinfo should probably try to render any document-supplied text more carefully.

https://gitlab.freedesktop.org/poppler/poppler/-/issues/775
layout of graphics in the page
2019-06-04T12:11:19Z
sgerwk

Currently, poppler-glib only provides the layout of text (poppler_page_get_text_layout), images (poppler_page_get_image_mapping), etc. As far as I can see, there is currently no way to detect the position of graphics in the page, the kind drawn by commands like m, l, v, etc.
For example, the box is not an image in the enclosed file. More generally, it seems not possible to detect the position of a diagram made of lines, rectangles, Bezier curves, etc.
[noimages.pdf](/uploads/c5611338f3c920d2b604256cf1e0a728/noimages.pdf)

https://gitlab.freedesktop.org/poppler/poppler/-/issues/774
pdftocairo generates PS that crashes Adobe engine
2019-12-12T17:07:06Z
Alex K

When some PDFs are converted to PS by pdftocairo using cairo 1.15.10 or later, the resulting PS files crash printers that use the Adobe PS engine (in my case, Ricoh printers).
Adobe reviewed the PS files and said this:
> Cairo incorporates pattern data into form data. When printing, the form data is drawn repeatedly, but the cache memory overflows because the pattern data is included. Due to this, an illegal memory access occurs and the job is canceled.
>
> If Cairo does not include pattern data in the form data, it is possible to work around the problem.
Does this pattern data need to be included in the form?
Perhaps it's still possible to return to 1.15.8 behavior, or pinpoint the commit that I could remove just for my environment?
Here are two example PDF files, and two resulting PS files for each PDF.
Cairo 1.15.8
pdftocairo -ps -level3 [pwc.pdf](/uploads/4b116753fd71f98039d7b9859776cc40/pwc.pdf) [pwc158.ps](/uploads/244b5e7b88165b3dd47197f8e0ae5d17/pwc158.ps)
pdftocairo -ps -level3 [sub.pdf](/uploads/542415215e2f672623e0f6d2b5166faa/sub.pdf) [sub158.ps](/uploads/394979596b4345e5fc12d6cad0bd32fa/sub158.ps)
Cairo 1.16.0 (I'm using 1.16.0 in this example, but the issue occurs with 1.15.10 onward).
pdftocairo -ps -level3 [pwc.pdf](/uploads/4b116753fd71f98039d7b9859776cc40/pwc.pdf) [pwc160.ps](/uploads/ef14cb790292400e55d3544eaf3246f7/pwc160.ps)
pdftocairo -ps -level3 [sub.pdf](/uploads/542415215e2f672623e0f6d2b5166faa/sub.pdf) [sub160.ps](/uploads/4f396f64ed5f96d0ad2a1659a6954ab6/sub160.ps)

https://gitlab.freedesktop.org/poppler/poppler/-/issues/773
Missing text in PDF
2019-05-27T22:43:35Z
Bastien Nocera

The PDF at https://panda-bg.com/datasheet/1582-256001-Illuminated-Push-Button-Switch-M12-LAS2-series.pdf shows text correctly in Firefox's PDF.js viewer, but not in the poppler-powered evince or GIMP.
Using `poppler-0.73.0-9.fc30.x86_64`
Test file:
[1582-256001-Illuminated-Push-Button-Switch-M12-LAS2-series.pdf](/uploads/a64ba19444a8209fe7f248703c48a269/1582-256001-Illuminated-Push-Button-Switch-M12-LAS2-series.pdf)
Screenshot:
![pdf-comparison](/uploads/b6674ca653438aebddcfe7865bf63fc3/pdf-comparison.png)

https://gitlab.freedesktop.org/poppler/poppler/-/issues/770
Watermark removal breaks PDF rendering in poppler
2019-05-27T09:36:07Z
baerbock

I've removed the watermark from [this book PDF](https://www.peterlang.com/downloadpdf/title/17680.pdf) with a [script/PDFtk](https://github.com/agarden/remove-pdf-watermark/blob/master/removewatermark).
poppler renders the [resulting PDF](https://github.com/agarden/remove-pdf-watermark/files/3217023/x.scrubbed.pdf) only as white pages. MuPDF, however, renders it correctly.

https://gitlab.freedesktop.org/poppler/poppler/-/issues/764
pdftocairo takes 20+ minutes to process this PDF
2019-05-08T21:58:13Z
Alex K

On my 12-core machine this PDF conversion takes 20+ minutes and produces a huge 127 MB PDF.
pdftocairo -pdf [wojtka.pdf](/uploads/ea07e11941ae9699a4364de2018e5006/wojtka.pdf) wojtka-out.pdf
```
$ time pdftocairo -pdf ./wojtka.pdf wojtka-out.pdf
Syntax Warning: Invalid Font Weight
Syntax Warning: Invalid Font Weight
Syntax Warning: Invalid Font Weight
Syntax Warning: Invalid Font Weight
Syntax Warning: Invalid Font Weight
Syntax Warning: Invalid Font Weight
real 20m46.750s
user 20m45.753s
sys 0m0.977s
```
Perhaps something could be done to speed it up?
cairo 1.16.0
poppler 0.61.1

https://gitlab.freedesktop.org/poppler/poppler/-/issues/763
find_text and get_text_for_area use a different 0 point on y axis
2019-05-08T21:54:28Z
Coyoteazul

I was trying to find the coordinates of a certain text in a document and then fetch the text in an area close to it. However, I found that the two functions apparently use different conventions for where the zero point of the y axis is.
**main.cpp**
```
#include <iostream>
#include "poppler.h"
#include <string>

int main (int argc, char *argv[]){
    // argv[1] is passed to poppler as a URI (file:///...).
    PopplerDocument *doc = poppler_document_new_from_file(argv[1], NULL, NULL);
    PopplerPage *pag1 = poppler_document_get_page(doc, 0);

    // Find the search string and take the rectangle of the first match.
    GList *lista = poppler_page_find_text(pag1, "WUNMUN-2018-0039");
    PopplerRectangle *rect = (PopplerRectangle *)lista->data;
    std::cout << "rect1 : \tx1: " << rect->x1 << " \ty1: " <<rect->y1 << " \tx2: " << rect->x2 << " \ty2: " <<rect->y2 << std::endl;

    // Read the text back from the very same rectangle.
    std::string hallado = poppler_page_get_text_for_area(pag1, rect);
    std::cout << "text on rect1: " << hallado << std::endl<< std::endl;

    // Flip the y coordinates against the page height and read again.
    double izq, arriba;  // page width and height
    poppler_page_get_size(pag1, &izq, &arriba);
    PopplerRectangle *rect2 = poppler_rectangle_copy(rect);
    rect2->y1 = arriba - rect2->y1;
    rect2->y2 = arriba - rect2->y2;
    std::cout << "rect2 : \tx1: " << rect2->x1 << " \ty1: " <<rect2->y1 << " \tx2: " << rect2->x2 << " \ty2: " <<rect2->y2 << std::endl;
    std::string hallado2 = poppler_page_get_text_for_area(pag1, rect2);
    std::cout << "text on rect2: " << hallado2 << std::endl<< std::endl;
    return 0;
}
```
**terminal output**
```
$ ./pruebas file:///home/hernan/Documentos/pruebas/bin/Debug/Leiva.pdf
rect1 : x1: 84.128 y1: 549.304 x2: 162.36 y2: 556.704
text on rect1: ante Autorizado
1,00
U. Medida
unidades
Pág. 1/1
Esta Administración Fe
rect2 : x1: 84.128 y1: 292.696 x2: 162.36 y2: 285.296
text on rect2: WUNMUN-2018-0039
```
The text on rect1 is actually on another area of the document, much lower than WUNMUN.
poppler_page_get_text_for_area seems to assume that the zero point of the y axis is at the top, while poppler_page_find_text returns coordinates where the zero point of the y axis is at the bottom (as it should be).