Opened Dec 16, 2020 by zeidoo@zeidoo

Selecting text from annotation rect selects the wrong text

Preface:

I am using the latest code from the master branch. I searched the existing issues and found nothing related; apologies if this is already known or is expected behavior. If it is not expected behavior, I'd gladly submit a PR to fix the issue.

Problem:

For highlight annotations, it's useful to extract the highlighted text. Annotation coordinates have their origin (point 0,0) at the bottom left of the page, but text selection seems to use an origin at the top left of the page. At least, this is the case for the GLib API. Note that the annotation was made in Okular, which uses Poppler's Qt5 API.

As such the following does not work:

char *text = poppler_page_get_text_for_area(demo->page, &amapping->area);

[Two screenshots]

Nor does this (same results as the screenshots above):

PopplerRectangle rect;
poppler_annot_get_rectangle(amapping->annot, &rect);
char *text = poppler_page_get_text_for_area(demo->page, &rect);

This does get the correct text (792 is the page height):

PopplerRectangle rect;
poppler_annot_get_rectangle(amapping->annot, &rect);

rect.y1 = 792 - rect.y1;
rect.y2 = 792 - rect.y2;

char *text = poppler_page_get_text_for_area(demo->page, &rect);

[Two screenshots]
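The workaround above hardcodes 792 as the page height. A minimal sketch of the same flip as a reusable helper is below. It assumes the only difference between the two coordinate systems is the vertical origin, as the results above suggest. `Rect` and `flip_rect_vertically` are hypothetical names, not part of Poppler; `Rect` only mirrors the four double fields of `PopplerRectangle` so that the sketch compiles without poppler-glib.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for PopplerRectangle (x1, y1, x2, y2 as doubles). */
typedef struct {
    double x1, y1, x2, y2;
} Rect;

/* Flip a rectangle between bottom-left-origin and top-left-origin
 * coordinates on a page of the given height. This mirrors the manual
 * "792 - y" workaround: x is untouched and each y becomes
 * page_height - y. The flip is its own inverse, so the same function
 * converts in either direction. */
Rect flip_rect_vertically(const Rect *rect, double page_height)
{
    assert(rect != NULL);

    Rect out = *rect;
    out.y1 = page_height - rect->y1;
    out.y2 = page_height - rect->y2;
    return out;
}
```

With poppler-glib, the height would presumably come from `poppler_page_get_size(demo->page, &width, &height)` rather than a literal, so the same code would work for pages that are not 792 points tall.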

Test files

1.pdf 2.pdf

Debug info

[Screenshot]

Edited Dec 16, 2020 by zeidoo
Reference: poppler/poppler#1009