poppler issue #818 (Closed)
Issue created Aug 19, 2019 by Jeroen Ooms (@jeroen, Contributor)

poppler-cpp memory leaking on Windows

Several Windows users of the R bindings have complained about major memory leakage and unfortunately I was able to confirm the problem. The R bindings use the poppler-cpp interface and we use mingw-w64 to build on Windows.

I have compared exactly the same code on Linux, macOS and Windows, all with poppler 0.73.0 (our current release version). On macOS the memory usage is stable, whereas on Windows it increases rapidly. I have confirmed this on Windows with both GCC 8.3.0 and GCC 4.9.3.

From some trial and error, it seems that the issue does not yet appear when the document is only loaded with load_from_raw_data().

// Parse a PDF held in memory; the caller takes ownership of the returned document.
static document *read_raw_pdf(RawVector x, std::string opw, std::string upw, bool info_only = false){
  document *doc = document::load_from_raw_data((const char*) x.begin(), x.length(), opw, upw);
  if(!doc)
    throw std::runtime_error("PDF parsing failure.");
  return doc;
}

However, as soon as I read something from the document, such as doc->fonts() or doc->pages(), the document seems to start leaking memory.

List poppler_pdf_fonts (RawVector x, std::string opw, std::string upw) {
  std::unique_ptr<poppler::document> doc(read_raw_pdf(x, opw, upw));
  std::vector<font_info> fonts = doc->fonts();
  ...
}

Even after doc has been deleted, the process keeps holding on to the memory. If we do this for many PDF files, we eventually run out of memory. It seems that something in the document is not being freed on Windows.
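To rule out the wrapper itself, here is a minimal sketch of the ownership pattern used above, with a hypothetical stand-in type (fake_document, not part of poppler) that counts live instances. The names read_fake_pdf, process_one and live_after are also made up for illustration. It only shows that the unique_ptr destroys the object on every call; it says nothing about what poppler does internally.

```cpp
#include <memory>
#include <stdexcept>

// Hypothetical stand-in for poppler::document (an assumption for this sketch).
// It counts how many instances are currently alive.
struct fake_document {
    static int live;
    fake_document() { ++live; }
    ~fake_document() { --live; }
    fake_document(const fake_document&) = delete;
    fake_document& operator=(const fake_document&) = delete;
    int pages() const { return 1; }
};
int fake_document::live = 0;

// Same shape as read_raw_pdf(): returns an owning raw pointer or throws.
static fake_document* read_fake_pdf(bool ok) {
    fake_document* doc = ok ? new fake_document() : nullptr;
    if (!doc)
        throw std::runtime_error("PDF parsing failure.");
    return doc;
}

// Same shape as poppler_pdf_fonts(): the unique_ptr deletes the
// document when the function returns.
static int process_one() {
    std::unique_ptr<fake_document> doc(read_fake_pdf(true));
    return doc->pages();
}

// Process n documents and report how many are still alive afterwards.
static int live_after(int n) {
    for (int i = 0; i < n; ++i)
        process_one();
    return fake_document::live;
}
```

Calling live_after(1000) returns 0 with this stand-in, so the pattern itself releases every document. That is consistent with the memory being retained somewhere below the document destructor rather than in the calling code.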

Is the memory allocation in poppler different on Windows than on Unix? What could be causing this?

Edited Aug 19, 2019 by Jeroen Ooms