1. 30 Sep, 2013 12 commits
    • glib-demo: Pane showing the document structure · 9ab1b225
      Adrián Pérez de Castro authored
      Adds a new pane in poppler-glib-demo showing the structure for Tagged-PDF
      documents. It also serves as an example of how to use the API for
      PopplerStructure and PopplerStructureElement.
    • glib: Implement accessors for element attributes · 3b5294b4
      Adrián Pérez de Castro authored
      Implement inspecting the standard attributes of PopplerStructureElement
      objects.
    • glib: Accessors for document structure references · 7a30796a
      Adrián Pérez de Castro authored
      Implements functions to work with PopplerStructureElement objects which
      are references to other entities in the document. Whether an element is
      a reference can be checked with poppler_structure_element_is_reference(),
      and poppler_structure_element_get_reference_type() returns the type of
      object referenced. For POPPLER_STRUCTURE_REFERENCE_LINK references, a
      PopplerAction (or PopplerLinkMapping) can be retrieved using the
      poppler_structure_element_get_reference_link_action() or
      poppler_structure_element_get_reference_link_mapping() methods.
    • glib: Accessors for form fields of structure elements · 02a790c4
      Adrián Pérez de Castro authored
      Implements two methods which can be used on PopplerStructureElement objects
      of type POPPLER_STRUCTURE_ELEMENT_FORM:
      
      - poppler_structure_element_get_form_field()
      - poppler_structure_element_get_form_field_mapping()
      
      The only difference is that the latter returns a PopplerFormFieldMapping,
      which itself contains the PopplerFormField, plus the rectangle covered by
      the form widget in the page.
    • glib: Expose inline attributes of structure elements · 2bdbec8e
      Adrián Pérez de Castro authored
      Allows obtaining inline text attributes from structure elements. The text
      is divided into "spans", which are groups of consecutive glyphs that share
      their attributes. Each one of those is represented by a PopplerTextSpan,
      which gives information about the text font and color, and the link target
      for links. The list of PopplerTextSpans is created lazily when first used.
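The grouping of glyphs into spans can be illustrated with a minimal, self-contained sketch. It does not use the Poppler API: `Glyph`, `Span`, and `buildSpans` are hypothetical names, and only font name and color are compared, on the assumption that these stand in for the full attribute set.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical per-glyph record; field names are illustrative only.
struct Glyph {
    char ch;
    std::string font;
    unsigned color;  // packed RGB value
};

// Hypothetical span: a run of consecutive glyphs with equal attributes.
struct Span {
    std::string text;
    std::string font;
    unsigned color;
};

// Walks the glyphs in order and starts a new span whenever the font or
// color differs from the span currently being built.
std::vector<Span> buildSpans(const std::vector<Glyph> &glyphs) {
    std::vector<Span> spans;
    for (const Glyph &g : glyphs) {
        const bool sameAsLast = !spans.empty() &&
                                spans.back().font == g.font &&
                                spans.back().color == g.color;
        if (!sameAsLast)
            spans.push_back(Span{std::string(), g.font, g.color});
        spans.back().text += g.ch;
    }
    return spans;
}
```

A caller that wants the lazy behaviour mentioned in the commit message could cache the returned vector the first time it is requested.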
    • glib: Private function _poppler_link_mapping_new_from_form_field() · 5d3a8424
      Adrián Pérez de Castro authored
      Move the code that creates a PopplerFormFieldMapping to its own function,
      and add it as private API. This will avoid duplication of the code when
      creating a PopplerFormFieldMapping from a PopplerStructureElement.
    • glib: Private function _poppler_link_mapping_new_from_annot_link() · c503dd05
      Adrián Pérez de Castro authored
      Move the code that creates a PopplerLinkMapping to its own function, and add
      it as private API. This will avoid duplication of the code when creating a
      PopplerLinkMapping from a PopplerStructureElement.
    • glib: Expose the document structure tree · 454abc80
      Adrián Pérez de Castro authored
      Implements a new PopplerStructureElement class, which builds upon
      StructTreeRoot and StructElement to expose the document structure of
      tagged PDFs in the GLib binding.
      
      Navigation of the structure tree is done by an iterator-based interface,
      using PopplerStructureElementIter.
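As a rough illustration of what iterator-based navigation of a tree looks like, the following sketch walks a small in-memory tree depth first. It is not the PopplerStructureElementIter API: `Node`, `SiblingIter`, and `dumpTree` are hypothetical names chosen for this example, and the real binding operates on GObject types rather than plain structs.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical in-memory node standing in for a structure element.
struct Node {
    std::string type;
    std::vector<Node> children;
};

// Hypothetical iterator over one level of sibling nodes.
class SiblingIter {
public:
    explicit SiblingIter(const std::vector<Node> &siblings)
        : siblings_(&siblings), index_(0) {}

    // True while the iterator points at an element.
    bool valid() const { return index_ < siblings_->size(); }

    // The element currently pointed at; only call while valid().
    const Node &element() const { return (*siblings_)[index_]; }

    // Advances to the next sibling; returns false once past the end.
    bool next() {
        if (valid())
            ++index_;
        return valid();
    }

    // A new iterator over the children of the current element.
    SiblingIter child() const { return SiblingIter(element().children); }

private:
    const std::vector<Node> *siblings_;
    std::size_t index_;
};

// Depth-first walk that appends one indented line per element to `out`.
void dumpTree(SiblingIter iter, std::size_t depth, std::string &out) {
    while (iter.valid()) {
        out.append(depth * 2, ' ');
        out += iter.element().type;
        out += '\n';
        dumpTree(iter.child(), depth + 1, out);
        iter.next();
    }
}
```

The pattern is a sibling loop plus a recursive descent through a child iterator; any further details of the real interface should be checked against the binding's documentation.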
    • Tagged-PDF: Text content extraction from structure elements · 37e73b9a
      Adrián Pérez de Castro authored
      Implement StructElement::getText(), by using MCOutputDev. This output device
      captures a sequence of MCOp structures representing the text drawing
      operations for a particular marked content text object from the page stream.
      Those are then used to convert the individual Unicode characters to the
      returned string.
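The last step, turning captured Unicode characters into a string, might look roughly like the sketch below. It is a simplification under stated assumptions: `TextOp` is a hypothetical stand-in that keeps only a code point (the real MCOp structure is not reproduced here), and UTF-8 output is an assumption, since the commit message does not name an encoding.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified record of one captured text operation.
// Only the Unicode code point is kept in this sketch.
struct TextOp {
    std::uint32_t unicode;
};

// Appends one Unicode code point to `out`, encoded as UTF-8.
void appendUtf8(std::string &out, std::uint32_t cp) {
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Concatenates the characters of all captured operations, in order.
std::string textFromOps(const std::vector<TextOp> &ops) {
    std::string text;
    for (const TextOp &op : ops)
        appendUtf8(text, op.unicode);
    return text;
}
```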
    • Tagged-PDF: Parsing of StructElem standard types and attributes · 7cabf51a
      Adrián Pérez de Castro authored
      Parse attributes and standard types of StructElem nodes of the
      document structure tree. Type name aliases are resolved via the
      RoleMap (and cycles detected). Both standard attributes and user
      properties are mapped to instances of the Attribute class.
      Attributes are parsed both via ClassMap references and directly
      referenced from the StructElem objects.
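Alias resolution with cycle detection can be sketched as follows. This is an illustration of the general technique, not the code from the commit: `RoleMap` here is a plain `std::map`, `resolveRole` is a hypothetical helper, and returning an empty string on a cycle is an assumption made for the example.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical role map: custom type name -> the name it aliases.
using RoleMap = std::map<std::string, std::string>;

// Follows aliases until a name with no RoleMap entry is reached.
// Returns an empty string when the alias chain loops back on itself.
std::string resolveRole(const RoleMap &roleMap, const std::string &name) {
    std::set<std::string> seen;
    std::string current = name;
    while (true) {
        const auto it = roleMap.find(current);
        if (it == roleMap.end())
            return current;          // no further alias: resolved
        if (!seen.insert(current).second)
            return std::string();    // name visited before: cycle
        current = it->second;
    }
}
```

Tracking visited names in a set bounds the loop by the number of RoleMap entries, so a malformed document cannot make resolution run forever.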
    • Tagged-PDF: Implement parsing of StructTreeRoot · a322e14d
      Adrián Pérez de Castro authored
      Implement parsing of the StructTreeRoot entry of the Catalog. Also, the
      Catalog::getStructTreeRoot() and PDFDoc::getStructTreeRoot() methods are
      modified to return an instance of StructTreeRoot instead of an Object.
      
      All elements from the StructTreeRoot are parsed except for:
      
      - IDTree: it is a lookup tree to locate items by their ID, which would
        be barely useful because the whole structure tree is to be kept in
        memory, which should be fast enough to traverse.
      - ParentTreeNextKey: This is needed only when the ParentTree object is
        to be modified. For the moment the implementation deals only with
        reading, so this has been deliberately left out.
      
      StructElem tree nodes from the document structure tree are parsed as
      StructElement instances. Attributes and extraction of content from
      elements are not yet handled.
      
      https://bugs.freedesktop.org/show_bug.cgi?id=64815
    • Implement Object::takeString() method · c6884248
      Adrián Pérez de Castro authored
      Object::takeString() behaves like Object::getString(), but transfers
      ownership of the returned string to the caller. Also, it makes sure that
      calling Object::free() afterwards won't free the string that the Object
      is holding.
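The ownership transfer described above could be sketched roughly like this. It is not Poppler's Object class: `MiniObject` is a hypothetical stand-in, and `std::string` replaces the string type Poppler uses, purely to keep the example self-contained.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for an object that owns a heap-allocated string.
class MiniObject {
public:
    explicit MiniObject(const char *text) : str_(new std::string(text)) {}
    ~MiniObject() { free(); }
    MiniObject(const MiniObject &) = delete;
    MiniObject &operator=(const MiniObject &) = delete;

    // Borrowing accessor: the object keeps ownership of the string.
    const std::string *getString() const { return str_; }

    // Hands the string to the caller and clears the internal pointer,
    // so a later free() has nothing left to delete.
    std::string *takeString() {
        std::string *taken = str_;
        str_ = nullptr;
        return taken;
    }

    // Releases whatever the object still owns.
    void free() {
        delete str_;
        str_ = nullptr;
    }

private:
    std::string *str_;
};
```

After `takeString()` the caller is responsible for deleting the returned string; calling `free()` on the object afterwards is harmless because the internal pointer is already null.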
  2. 21 Sep, 2013 4 commits
  3. 16 Sep, 2013 2 commits
  4. 14 Sep, 2013 1 commit
  5. 31 Aug, 2013 1 commit
  6. 30 Aug, 2013 2 commits
  7. 27 Aug, 2013 1 commit
  8. 26 Aug, 2013 2 commits
  9. 25 Aug, 2013 15 commits