
Closed
Open
Opened Jan 20, 2021 by Reza @rezahsnz

Extra space after a bold-faced letter when retrieving text

While trying to detect figure labels in ebooks, I stumbled upon an odd bug.
If a word (in my use case, a label for an image or figure) contains bold-faced letters, the text retrieved (either through text selection in Evince or through the Poppler GLib API) contains an extra space right after the bold-faced letter. An example:
The rendered text: 'Figure 5.54: bla bla bla'
The retrieved text: 'F igure 5.54: bla bla bla' (note the extra space after the F)
I should also note that searching for 'F igure 5.54: bla bla bla' finds nothing. So it seems that this bug has something to do with the way bold-faced text is retrieved. I've attached the PDF file that triggered this bug. Open the file with Evince or the API and look at the labels below the figures (e.g. pages 18, 29, 32, 43, 46, 48, 49, 51, ...).
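
For anyone who wants to check their own extracted text for this symptom, here is a minimal Python sketch. It is not part of Poppler and not a fix for the bug: the function name `find_split_labels` and the regular expression are my own assumptions, modelled only on the 'F igure 5.54:' example above. It assumes the label is a single capital letter, one space, lowercase letters, a dotted number, and a colon.

```python
import re

# Assumed label shape (hypothetical, based on the single example above):
# one capital letter, a spurious space, the rest of the word in lowercase,
# a dotted figure number, then a colon.
SPLIT_LABEL = re.compile(r"\b([A-Z]) ([a-z]+) (\d+(?:\.\d+)*):")


def find_split_labels(text):
    """Return (found, repaired) pairs for labels that look split after the first letter."""
    results = []
    for match in SPLIT_LABEL.finditer(text):
        first, rest, number = match.groups()
        # Rejoin the first letter with the rest of the word.
        results.append((match.group(0), f"{first}{rest} {number}:"))
    return results


if __name__ == "__main__":
    retrieved = "F igure 5.54: bla bla bla"
    for found, repaired in find_split_labels(retrieved):
        print(f"{found!r} looks like it should be {repaired!r}")
```

This heuristic can produce false positives: a legitimate phrase such as 'A figure 3:' matches the same pattern, so it is only useful for spotting candidates, not for repairing text blindly.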

Evince:

  • Version: 3.28.4

Poppler (GLib):

  • Architecture: amd64
  • Version: 0.62.0-2ubuntu2.12

Please note that I have no idea whether this is universal, as I do not have other PDF files to test with.

The_Book_of_Why__The_New_Science_of_Cause_and_Effect.pdf

Reference: poppler/poppler#1031