Add outline entries for pages in existing PDF files #170

Open
wants to merge 1 commit into
base: master
64 changes: 64 additions & 0 deletions README.md
@@ -104,6 +104,70 @@ The << operator defaults to secure injection by renaming references to avoid con
pdf.pages(nil, false).each {|page| page << stamp_page}
```

## Add outline entries pointing to existing pages

You can add outline entries that point to pages loaded or parsed from existing PDF files. The outlines dictionary also lets you arrange those entries in a tree hierarchy.

To create outline entries for PDF files:

```ruby
pdf = CombinePDF.new
i = 1
CombinePDF.load("file.pdf").pages.each do |page|
  pdf.add_outline_item(page, "Page #{i}")
  pdf << page
  i += 1
end
pdf.save "outlines.pdf"
```

You can also create a tree hierarchy:

```ruby
pdf = CombinePDF.new
i = 1
CombinePDF.load("file.pdf").pages.each do |page|
  if i.eql? 1
    pdf.add_outline_grouper(page, 'Section 1')
    pdf.add_outline_grouper(page, 'Subsection 1.1')
  elsif i.eql? 3
    pdf.go_out_outline_grouping_level
    pdf.add_outline_grouper(page, 'Subsection 1.2')
  elsif i.eql? 4
    pdf.go_outline_root
    pdf.add_outline_grouper(page, 'Section 2')
  elsif i.eql? 5
    pdf.go_out_outline_grouping_level
  end

  pdf.add_outline_item(page, "Page #{i}")

  pdf << page
  i += 1
end
pdf.save "outlines.pdf"
```

This will generate something like:

```
Section 1
  Subsection 1.1
    Page 1
    Page 2
  Subsection 1.2
    Page 3
Section 2
  Page 4
Page 5
Page 6
```

Notice that once you add an outline grouper (`add_outline_grouper`), all subsequent outline entries (`add_outline_item`) are added under the most recent grouper until you manually exit the grouping level (`go_out_outline_grouping_level` or `go_outline_root`).

`go_out_outline_grouping_level`: exits one level in the tree hierarchy.
`go_outline_root`: returns to the root of the tree hierarchy.
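
The grouping behaviour described above can be pictured as a cursor moving through a tree. The following is a minimal plain-Ruby sketch of that idea, not CombinePDF code: the class `OutlineCursorSketch` and its method names are hypothetical stand-ins that only mirror the four outline methods, and it builds a small tree in the same way as the first five pages of the example.

```ruby
# Plain-Ruby model of the grouping cursor; hypothetical, not CombinePDF code.
class OutlineCursorSketch
  Node = Struct.new(:title, :parent, :children)

  attr_reader :root

  def initialize
    @root = Node.new(nil, nil, [])
    @current = @root
  end

  # Add a leaf under the current grouper; the cursor does not move.
  def add_item(title)
    node = Node.new(title, @current, [])
    @current.children << node
    node
  end

  # Add a node and make it the current grouper.
  def add_grouper(title)
    @current = add_item(title)
  end

  # Move the cursor up one level; does nothing at the root.
  def go_out
    @current = @current.parent unless @current.parent.nil?
  end

  # Move the cursor back to the root.
  def go_root
    @current = @root
  end

  # Render the tree as indented lines, two spaces per level.
  def lines(node = @root, depth = 0)
    node.children.flat_map do |child|
      ["#{'  ' * depth}#{child.title}"] + lines(child, depth + 1)
    end
  end
end

sketch = OutlineCursorSketch.new
sketch.add_grouper('Section 1')
sketch.add_grouper('Subsection 1.1')
sketch.add_item('Page 1')
sketch.add_item('Page 2')
sketch.go_out
sketch.add_grouper('Subsection 1.2')
sketch.add_item('Page 3')
sketch.go_root
sketch.add_grouper('Section 2')
sketch.add_item('Page 4')
sketch.go_out
sketch.add_item('Page 5')
puts sketch.lines
```

In this model, adding a grouper is simply adding an item and then moving the cursor onto it, which is why every later entry nests under it until the cursor is moved back up.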

## Page Numbering

adding page numbers to a PDF object or file is as simple as can be:
131 changes: 131 additions & 0 deletions lib/combine_pdf/pdf_protected.rb
@@ -405,5 +405,136 @@ def rename_object(object, _dictionary)
when Hash
end
end

# @private
# This method runs the process to add a new outline entry to the current
# (referenced) outline grouper (the grouper is the parent in the tree
# hierarchy). This method takes 2 parameters:
#
# page:: the page object to which the outline will point.
# title:: the title for the outline.
def add_outline_node(page, title)
new_outline = new_outline_node(page, title)
insert_outline_node(new_outline)
update_children_count(actual_object(new_outline)[:Parent])
new_outline
end

# @private
# This method generates and returns a new outline object. This method takes
# 2 parameters:
#
# page:: the page to which the outline will point.
# title:: the title for the outline.
def new_outline_node(page, title)
{
is_reference_only: true,
referenced_object: {
Count: 0,
Title: title,
Dest: [
{ is_reference_only: true, referenced_object: page },
:XYZ, nil, nil, nil
],
Parent: {
is_reference_only: true,
referenced_object: @current_outline_grouper
}
}
}
end

# @private
# This method inserts a new outline node to the current (referenced)
# outline grouper (the grouper is the parent in the tree hierarchy). This
# method takes 1 parameter:
#
# outline_node:: the outline node to be inserted in the current outline grouper
def insert_outline_node(outline_node)
if outline_grouper_without_children?
insert_first_outline_child(outline_node)
elsif outline_grouper_with_only_one_child?
insert_second_outline_child(outline_node)
else
insert_last_outline_child(outline_node)
end
end

# @private
# This method inserts the first outline node in the current (referenced)
# outline grouper (the grouper is the parent in the tree hierarchy). This
# method takes 1 parameter:
#
# outline_node:: the outline node to be inserted in the current outline grouper
def insert_first_outline_child(outline_node)
@current_outline_grouper[:First] = outline_node
@current_outline_grouper[:Last] = outline_node
end

# @private
# This method inserts the second outline node in the current (referenced)
# outline grouper (the grouper is the parent in the tree hierarchy). This
# method takes 1 parameter:
#
# outline_node:: the outline node to be inserted in the current outline grouper
def insert_second_outline_child(outline_node)
actual_object(@current_outline_grouper[:First])[:Next] = outline_node
actual_object(outline_node)[:Prev] = @current_outline_grouper[:First]
@current_outline_grouper[:Last] = outline_node
end

# @private
# This method appends an outline node to the current (referenced) outline
# grouper (the grouper is the parent in the tree hierarchy) when that
# grouper already has more than 1 outline-child node. This method takes
# 1 parameter:
#
# outline_node:: the outline node to be inserted in the current outline grouper
def insert_last_outline_child(outline_node)
actual_object(@current_outline_grouper[:Last])[:Next] = outline_node
actual_object(outline_node)[:Prev] = @current_outline_grouper[:Last]
@current_outline_grouper[:Last] = outline_node
end

# @private
# This method is executed recursively to update the children count for the
# ascendant parents in the tree hierarchy of the outlines. This method takes
# 1 parameter:
#
# outline_grouper:: the outline grouper whose children count will be updated.
def update_children_count(outline_grouper)
if actual_object(outline_grouper)[:Count].nil?
actual_object(outline_grouper)[:Count] = 0
end
actual_object(outline_grouper)[:Count] += 1
return if outline_root?(outline_grouper)

update_children_count(actual_object(outline_grouper)[:Parent])
end

# @private
# This method checks if the received outline node is the outline root of
# PDF document. This method takes 1 parameter:
#
# outline_node:: the outline object to be evaluated.
def outline_root?(outline_node)
actual_object(outline_node)[:Parent].nil?
end

# @private
# This method returns true if the current (referenced) outline grouper (the
# grouper is the parent in the tree hierarchy) has no outline-children
# nodes.
def outline_grouper_without_children?
# Hash#exclude? comes from ActiveSupport, so use core Ruby's key? instead.
!@current_outline_grouper.key?(:First)
end

# @private
# This method returns true if the current (referenced) outline grouper (the
# grouper is the parent in the tree hierarchy) has only one outline-child
# node.
def outline_grouper_with_only_one_child?
@current_outline_grouper[:First].eql?(@current_outline_grouper[:Last])
end
end
end
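
The sibling linking and count propagation in the methods above can be illustrated with plain hashes. This is a rough sketch under simplifying assumptions: the `is_reference_only` / `referenced_object` wrappers are left out, and `append_child` is a hypothetical helper, not a method from this diff. It also relies on the observation that when a grouper has exactly one child, `:First` and `:Last` refer to the same node, so a single "append after `:Last`" branch covers both the second-child and the later-child cases.

```ruby
# Plain-hash model of the outline sibling links; hypothetical helper,
# with the PDF reference wrappers deliberately omitted.
def append_child(parent, child)
  child[:Parent] = parent
  if parent[:First].nil?
    # First child: it is both the head and the tail of the sibling list.
    parent[:First] = child
  else
    # Later child: link it after the current tail.
    parent[:Last][:Next] = child
    child[:Prev] = parent[:Last]
  end
  parent[:Last] = child

  # Walk up the ancestors, incrementing each Count (a missing Count is 0).
  node = parent
  while node
    node[:Count] = node[:Count].to_i + 1
    node = node[:Parent]
  end
  child
end

root = {}
section = append_child(root, { Title: 'Section 1', Count: 0 })
append_child(section, { Title: 'Page 1', Count: 0 })
append_child(section, { Title: 'Page 2', Count: 0 })
append_child(root, { Title: 'Section 2', Count: 0 })

puts root[:Count]
puts section[:Count]
puts section[:First][:Next][:Title]
```

The loop plays the role of the recursive `update_children_count`: every ancestor up to the root counts the new node as a descendant.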
64 changes: 64 additions & 0 deletions lib/combine_pdf/pdf_public.rb
@@ -104,6 +104,7 @@ def initialize(parser = nil)
@names = parser.names_object || {}
@forms_data = parser.forms_object || {}
@outlines = parser.outlines_object || {}
@current_outline_grouper = @outlines
# rebuild the catalog, to fix wkhtmltopdf's use of static page numbers
rebuild_catalog

@@ -503,5 +504,68 @@ def stamp_pages(stamp, options = {})
# end
# nil
# end

# This method adds a new node to the Outlines dictionary and references it
# as the current outline grouper, so future outline nodes are added as
# children of this one. This method takes 2 parameters:
#
# page:: the page to which the outline will point.
# title:: the title for the outline.
def add_outline_grouper(page, title)
# The page param must be a Hash "Page" object
unless page.is_a?(Hash) && actual_object(page)[:Type] == :Page
warn "Shouldn't point object from outline unless it is a PDF page."
return false
end

# The title param must be a string object
unless title.is_a?(String)
warn 'Title for outline should be a String object'
return false
end

# Reference the new outline node as the current outline grouper in
# the tree hierarchy
@current_outline_grouper = actual_object(add_outline_node(page, title))
end

# This method adds a new node to the Outlines dictionary in the current
# outline grouper. This method takes 2 parameters:
#
# page:: the page to which the outline will point.
# title:: the title for the outline.
def add_outline_item(page, title)
# The page param must be a Hash "Page" object
unless page.is_a?(Hash) && actual_object(page)[:Type] == :Page
warn "Shouldn't point object from outline unless it is a PDF page."
return false
end

# The title param must be a string object
unless title.is_a?(String)
warn 'Title for outline should be a String object'
return false
end

add_outline_node(page, title)
end

# This method takes the current outline grouper out one level in the tree
# hierarchy of the Outlines.
def go_out_outline_grouping_level
return if @current_outline_grouper.nil? || outline_root?(@current_outline_grouper)

@current_outline_grouper = actual_object(@current_outline_grouper[:Parent])
end

# This method takes the current outline grouper to the root level in the
# tree hierarchy of the Outlines.
def go_outline_root
return if @current_outline_grouper.nil? || outline_root?(@current_outline_grouper)

go_out_outline_grouping_level
go_outline_root
end
end
end
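
`add_outline_grouper` and `add_outline_item` repeat the same two argument checks. One possible way to share them is sketched below in standalone Ruby. Both names are assumptions made for illustration: `valid_outline_target?` is a hypothetical helper that does not exist in the diff, and `unwrap` is a local stand-in for the library's reference unwrapping so that the snippet runs on its own.

```ruby
# Stand-in for the library's reference unwrapping; an assumption made so
# this sketch is self-contained.
def unwrap(obj)
  obj.is_a?(Hash) ? (obj[:referenced_object] || obj) : obj
end

# Hypothetical helper: returns true only when both arguments pass the
# checks that the two public methods currently duplicate.
def valid_outline_target?(page, title)
  unless page.is_a?(Hash) && unwrap(page)[:Type] == :Page
    warn "Shouldn't point object from outline unless it is a PDF page."
    return false
  end
  unless title.is_a?(String)
    warn 'Title for outline should be a String object'
    return false
  end
  true
end

page = { Type: :Page }
puts valid_outline_target?(page, 'Intro')
puts valid_outline_target?({ Type: :Catalog }, 'Intro')
puts valid_outline_target?(page, :intro)
```

With a helper along these lines, each public method could begin with a single guard such as `return false unless valid_outline_target?(page, title)`.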