Feature request: command-line utility for document structures
It would be great to have a tool for viewing and editing PDF structures.
Proposed functionality:
pdfstructure <PDF-file> <component> [options]
Components:
table-of-contents, toc
page-numbers, num
Options:
--get, -g: Print the specified component of the PDF to stdout. This is the default behavior.
--set=file, -s: Read the component from <file> and assign it to the PDF. Passing '-' as the file means stdin.
--interactive, -i: Use $EDITOR to view and edit the component.
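To make the proposed interface concrete, here is a minimal sketch of the argument parsing in Python with argparse. It only illustrates the command line described above; nothing here touches a PDF. The names build_parser, resolve_mode and ALIASES are hypothetical, and treating --get, --set and --interactive as mutually exclusive is an assumption, not part of the proposal.

```python
import argparse

# Short aliases from the proposal mapped to their long component names.
ALIASES = {"toc": "table-of-contents", "num": "page-numbers"}


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        prog="pdfstructure",
        description="View and edit PDF document structures.",
    )
    parser.add_argument("pdf_file", metavar="PDF-file")
    parser.add_argument(
        "component",
        choices=["table-of-contents", "toc", "page-numbers", "num"],
    )
    # Assumption: the three modes cannot be combined.
    mode = parser.add_mutually_exclusive_group()
    mode.add_argument("--get", "-g", action="store_true",
                      help="print the component to stdout (default)")
    mode.add_argument("--set", "-s", metavar="file",
                      help="read the component from file; '-' means stdin")
    mode.add_argument("--interactive", "-i", action="store_true",
                      help="view and edit the component in $EDITOR")
    return parser


def resolve_mode(args: argparse.Namespace) -> str:
    """Return 'set', 'interactive' or 'get'; 'get' is the default."""
    if args.set is not None:
        return "set"
    if args.interactive:
        return "interactive"
    return "get"


if __name__ == "__main__":
    args = build_parser().parse_args()
    component = ALIASES.get(args.component, args.component)
    print(resolve_mode(args), component, args.pdf_file)
```

With this sketch, both `pdfstructure book.pdf toc --set=toc.yaml` and `pdfstructure book.pdf toc -s toc.yaml` select the set mode, and omitting all three options falls back to get.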
The syntax of both components should be easy to read and parse. I propose YAML as a candidate. Examples:
# Table of Contents
# Physical page numbers should be prefixed with "+", e.g.: +1, +2, +3, ...
# Logical page numbers should be left as is, e.g.: ii, viii, 1, 8, ...
- ii Preface
- vii Table of Contents
- 1 Chapter 1:
    - 1 Section 1.1
    - 6 Section 1.2
- 2 Chapter 2:
    - 13 Section 2.1
    - 18 Section 2.2:
        - 19 Subsection 2.2.1
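As a rough illustration of how easy this format would be to consume, the sketch below flattens a table of contents that has already been loaded by some YAML parser into nested Python lists and dicts. It assumes that an entry ending in a colon becomes a single-key mapping whose value is the list of its children, and that the page number is separated from the title by the first space. TocEntry, parse_entry and flatten_toc are hypothetical names.

```python
from typing import Iterator, NamedTuple


class TocEntry(NamedTuple):
    depth: int      # 0 for top-level entries, 1 for their children, ...
    page: str       # page number without the "+" prefix
    physical: bool  # True if the page was written as "+N"
    title: str


def parse_entry(text: str, depth: int) -> TocEntry:
    """Split '<page> <title>' and detect the '+' physical-page prefix."""
    page, _, title = text.strip().partition(" ")
    physical = page.startswith("+")
    return TocEntry(depth, page[1:] if physical else page, physical, title.strip())


def flatten_toc(nodes, depth: int = 0) -> Iterator[TocEntry]:
    """Walk the nested structure depth-first, yielding one entry per line."""
    for node in nodes:
        if isinstance(node, dict):
            # An entry with children: the key is the heading, the value its children.
            for heading, children in node.items():
                yield parse_entry(str(heading), depth)
                yield from flatten_toc(children or [], depth + 1)
        else:
            yield parse_entry(str(node), depth)


if __name__ == "__main__":
    toc = ["ii Preface", {"1 Chapter 1": ["1 Section 1.1", "+20 Section 1.2"]}]
    for entry in flatten_toc(toc):
        print("  " * entry.depth + entry.title, entry.page, entry.physical)
```

The flat list of (depth, page, title) tuples is close to what most PDF libraries expect for outlines, which is one reason to keep the text format this simple.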
# Page numbers
- 1-8: {from: 1, style: lower-roman} # 1: i, 2: ii, ..., 8: viii
- 9-: {from: 1} # same as {from: 1, style: decimal}; 9: 1, 10: 2, ...
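The mapping from physical pages to labels implied by this example can be sketched as follows. The rules are assumed to be already loaded into a list of single-key dicts, a missing style means decimal (as in the comment above), and only the two styles shown here are handled. Defaulting from to 1 and falling back to the physical page number when no range matches are assumptions. to_roman, parse_range and page_label are hypothetical names.

```python
from typing import Optional, Tuple

ROMAN = [(1000, "m"), (900, "cm"), (500, "d"), (400, "cd"), (100, "c"),
         (90, "xc"), (50, "l"), (40, "xl"), (10, "x"), (9, "ix"),
         (5, "v"), (4, "iv"), (1, "i")]


def to_roman(n: int) -> str:
    """Convert a positive integer to a lower-case roman numeral."""
    parts = []
    for value, symbol in ROMAN:
        count, n = divmod(n, value)
        parts.append(symbol * count)
    return "".join(parts)


def parse_range(key: str) -> Tuple[int, Optional[int]]:
    """Parse '1-8' as (1, 8), '9-' as (9, None) and '5' as (5, 5)."""
    if "-" not in key:
        return int(key), int(key)
    first, _, last = key.partition("-")
    return int(first), int(last) if last else None


def page_label(rules, physical: int) -> str:
    """Return the logical label of a 1-based physical page number."""
    for rule in rules:
        for key, spec in rule.items():
            first, last = parse_range(str(key))
            if physical >= first and (last is None or physical <= last):
                number = spec.get("from", 1) + physical - first
                if spec.get("style", "decimal") == "lower-roman":
                    return to_roman(number)
                return str(number)
    # Assumption: pages outside every range keep their physical number.
    return str(physical)


if __name__ == "__main__":
    rules = [{"1-8": {"from": 1, "style": "lower-roman"}}, {"9-": {"from": 1}}]
    print([page_label(rules, p) for p in (1, 2, 8, 9, 10)])
```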