poppler
  • Issues

  • Open 60
  • Closed 5
  • All 65
  • pdftohtml converts all ll into l then a space
    #4 · opened Jan 08, 2013 by Bugzilla Migration User   pdftohtml
    • 0
    updated Jun 22, 2020
  • Unnecessary conversion upper case letters to lower case
    #22 · opened Sep 01, 2011 by Bugzilla Migration User   pdftohtml
    • 2
    updated Jun 22, 2020
  • pdftohtml very slow if pdf uses tiling pattern fill
    #32 · opened Jan 08, 2014 by Bugzilla Migration User   pdftohtml
    • 1
    updated Jun 22, 2020
  • Solid rectangles are visible on resultant HTML out, PDF does not show them
    #42 · opened Sep 07, 2012 by Bugzilla Migration User   pdftohtml
    • 4
    updated Aug 20, 2018
  • Bullets are converted to other characters when converting PDF to HTML
    #45 · opened Jun 18, 2013 by Bugzilla Migration User   pdftohtml
    • 6
    updated Dec 13, 2018
  • pdftohtml with -f -l option numbers image not right
    #79 · opened Apr 19, 2015 by Bugzilla Migration User   pdftohtml
    • 3
    updated Aug 20, 2018
  • Wrong font id used when first word of a line has certain style applied (xml)
    #91 · opened May 13, 2012 by Bugzilla Migration User   pdftohtml
    • 1
    updated Aug 20, 2018
  • why not equal Chinese characters per line width ?
    #96 · opened Nov 30, 2012 by Bugzilla Migration User   pdftohtml
    • 1
    updated Aug 20, 2018
  • misplacement of subsequent rows after symbol ¹ etc.
    #117 · opened Apr 11, 2012 by Bugzilla Migration User   pdftohtml
    • 0
    updated Aug 20, 2018
  • -xml does not render all images despite -c rendering correctly
    #127 · opened Sep 17, 2012 by Bugzilla Migration User   pdftohtml
    • 1
    updated Aug 20, 2018
  • pdftohtml - Add fontspec transform to XML output
    #148 · opened Jun 18, 2013 by Bugzilla Migration User   pdftohtml
    • 0
    updated Oct 08, 2018
  • pdftohtml ignore png format option and extract inverted jpg images
    #151 · opened Oct 13, 2015 by Bugzilla Migration User   pdftohtml
    • 1
    updated Aug 20, 2018
  • pdftohtml loses some double lls in duplicate check
    #165 · opened May 10, 2010 by Bugzilla Migration User   pdftohtml
    • 9
    updated Oct 07, 2018
  • Pdftohtml 0.17.2 does a reasonable job on this pdf, whereas 0.17.3 fails totally
    #169 · opened May 12, 2012 by Bugzilla Migration User   pdftohtml
    • 1
    updated Aug 20, 2018
  • Power values doesnot appear properly in PDF to HTML conversion
    #205 · opened Jun 18, 2013 by Bugzilla Migration User   pdftohtml
    • 1
    updated Aug 20, 2018
  • PDF cropping ignored
    #214 · opened Oct 26, 2011 by Bugzilla Migration User   pdftohtml
    • 4
    updated Aug 20, 2018
  • Difference in style and font in output HTML
    #227 · opened Jun 18, 2013 by Bugzilla Migration User   pdftohtml
    • 0
    updated Aug 20, 2018
  • When converting the attached PDF to HTML, formatting is lost, color of images is different and some images are missing
    #242 · opened Aug 16, 2012 by Bugzilla Migration User   pdftohtml
    • 1
    updated Aug 20, 2018
  • Extended spacing between word and hyperlink in HTML
    #251 · opened Jun 18, 2013 by Bugzilla Migration User   pdftohtml
    • 0
    updated Aug 21, 2018
  • [pdftohtml] Segfault when output set to /dev/null or other place with no write access
    #255 · opened Oct 23, 2014 by Bugzilla Migration User   pdftohtml
    • 1
    updated Oct 08, 2018
  • Links are not correct if using complex and single html mode in pdftohtml
    #270 · opened Oct 04, 2011 by Bugzilla Migration User   pdftohtml
    • 4
    updated Oct 11, 2018
  • poppler: file parsing infinite loop encountered with docs containing image masks (sample attached)
    #283 · opened Apr 03, 2013 by Bugzilla Migration User   crash / hang / abort pdftohtml
    • 3
    updated Oct 05, 2018
  • Add image names to pdftohtml dump in xml mode
    #309 · opened Aug 10, 2010 by Bugzilla Migration User   patch pdftohtml
    • 3
    updated Oct 27, 2018
  • Data overlapping issue in PDF to HTML conversion
    #312 · opened Jun 18, 2013 by Bugzilla Migration User   pdftohtml
    • 1
    updated Aug 21, 2018
  • Vertical text is shown as horizontal in output HTML
    #313 · opened Jun 18, 2013 by Bugzilla Migration User   pdftohtml
    • 0
    updated Aug 21, 2018
  • pdftohtml: fakebold and dropshadow duplicated text
    #321 · opened Jul 16, 2017 by Bugzilla Migration User   pdftohtml
    • 1
    updated Aug 21, 2018
  • Pattern image is repeated thousands of times, producing huge output
    #326 · opened Apr 13, 2015 by Bugzilla Migration User   pdftocairo pdftohtml
    • 0
    updated Oct 27, 2018
  • width set to 0 for vertical text
    #339 · opened Nov 22, 2016 by Bugzilla Migration User   pdftohtml
    • 2
    updated Aug 21, 2018
  • Add image dimensions to pdfhtml dump in xml mode
    #340 · opened Aug 10, 2010 by Bugzilla Migration User   patch pdftohtml
    • 1
    updated Oct 27, 2018
  • Incorrect positioning of text in PDFTOHTML
    #342 · opened Oct 22, 2012 by Bugzilla Migration User   pdftohtml
    • 1
    updated Aug 21, 2018
  • Converting some PDFs results in images being converted in to 1000s of PNGs
    #346 · opened Apr 25, 2014 by Bugzilla Migration User   pdftohtml
    • 0
    updated Aug 21, 2018
  • pdftohtml -c generates html with very ugly/unusual spacing
    #384 · opened Sep 22, 2007 by Bugzilla Migration User   pdftohtml
    • 4
    updated Oct 27, 2018
  • When extracting as XML all new lines are stripped
    #385 · opened Dec 12, 2017 by Bugzilla Migration User   pdftohtml
    • 2
    updated Aug 21, 2018
  • pdftohtml not working on some PDF
    #391 · opened Jan 16, 2014 by Bugzilla Migration User   pdftohtml
    • 0
    updated Aug 21, 2018
  • "pdftohtml -s" produces multiple files.
    #392 · opened Mar 20, 2015 by Bugzilla Migration User   Feature Request pdftohtml
    • 3
    updated Aug 21, 2018
  • pdftohtml crash when converting pdf file that have png images in it
    #405 · opened Apr 06, 2012 by Bugzilla Migration User   pdftohtml
    • 4
    updated Aug 21, 2018
  • Text color output html is wrong
    #407 · opened Apr 19, 2015 by Bugzilla Migration User   pdftohtml
    • 2
    updated Aug 21, 2018
  • pdftohtml -xml fails to extract text that is extracted in pdftotext
    #417 · opened Jun 05, 2012 by Bugzilla Migration User   pdftohtml
    • 1
    updated Aug 21, 2018
  • pdftohtml: add image and font extraction
    #422 · opened Jul 19, 2011 by Bugzilla Migration User   Feature Request patch pdftohtml
    • 11
    updated Oct 08, 2018
  • pdftohtml should include charset encoding in head section of *s.html files
    #457 · opened Sep 17, 2013 by Bugzilla Migration User   pdftohtml
    • 5
    updated Aug 21, 2018
  • Spaces are stripped in all PDF's generated with PhantomJS (Node.js)
    #480 · opened Nov 15, 2017 by Bugzilla Migration User   pdftohtml
    • 0
    updated Aug 21, 2018
  • [PATCH] Feature : extract media in html format
    #482 · opened Nov 07, 2012 by Bugzilla Migration User   pdftohtml
    • 22
    updated Aug 21, 2018
  • pdftotext -htmlmeta should quote text content
    #485 · opened Aug 25, 2014 by Bugzilla Migration User   pdftohtml
    • 3
    updated Oct 08, 2018
  • Import multiple images instead of one large background image
    #492 · opened Nov 26, 2012 by Bugzilla Migration User   pdftohtml
    • 0
    updated Aug 21, 2018
  • Quality of convertion pdf to html
    #518 · opened Nov 28, 2011 by Bugzilla Migration User   pdftohtml
    • 1
    updated Aug 21, 2018
  • windows build of pdftohtml generating malformed first page
    #527 · opened Aug 07, 2011 by Bugzilla Migration User   pdftohtml
    • 2
    updated Aug 21, 2018
  • -xml outputs malformed xml
    #556 · opened Oct 18, 2016 by Bugzilla Migration User   pdftohtml
    • 1
    updated Aug 21, 2018
  • HTML text shifts relatively it underline
    #562 · opened Apr 19, 2015 by Bugzilla Migration User   pdftohtml
    • 2
    updated Aug 21, 2018
  • Add option to omit DOCTYPE for XML output
    #566 · opened Nov 20, 2017 by Bugzilla Migration User   pdftohtml
    • 1
    updated Mar 28, 2019
  • pdftohtml produces wrongly nested tags
    #577 · opened Feb 20, 2015 by Bugzilla Migration User   pdftohtml
    • 9
    updated Aug 21, 2018
  • Emit more font information when pdftohtml is run with -xml
    #605 · opened Jul 21, 2018 by Bugzilla Migration User   patch pdftohtml
    • 0
    updated Oct 08, 2018
  • Allow page ranges in pdftohtml
    #621 · opened Jul 29, 2018 by Bugzilla Migration User   patch pdftohtml
    • 1
    • 3
    updated Oct 08, 2018
  • Text node broken into multiple parts
    #688 · opened Dec 17, 2018 by clark knøsen   pdftohtml
    • 0
    updated Dec 19, 2018
  • Pdf that is made of scans has all images with the wrong dimensions
    #726 · opened Feb 24, 2019 by mirh   pdfimages pdftohtml
    • 6
    updated Feb 27, 2019
  • pdftohtml: Cannot parse text from OCRed document (tesseract 4.0.0)
    #745 · opened Mar 25, 2019 by Vassilis Lemonidis   pdftohtml
    • 6
    updated Mar 29, 2019
  • Bug in PdfToHtml
    #863 · opened Jan 02, 2020 by Saduk   pdftohtml
    • 2
    updated Jan 03, 2020
  • Improving XML output
    #868 · opened Jan 06, 2020 by Saduk   Feature Request pdftohtml
    • 0
    updated Jan 06, 2020
  • Incorrect text in html output from a pdf file with Hebrew text
    #877 · opened Jan 21, 2020 by Manik Cheruku   RTL pdftohtml
    • 6
    updated Jan 23, 2020
  • Alternate text for images in output xml / html
    #879 · opened Feb 01, 2020 by Martijn Burgers   Feature Request pdftohtml
    • 1
    • 0
    updated Feb 03, 2020
  • provide crop region in pdftohtml
    #919 · opened May 19, 2020 by reogen   Feature Request pdftohtml
    • 4
    updated May 22, 2020