class HexaPDF::Type::Page

See: PDF2.0 s7.7.3.3, s7.7.3.4, Pages
taken from the nearest page tree ancestor that has this value set.
Field inheritance means that if a field is not set on the page object itself, the value is
A number of field values can also be inherited: /Resources, /MediaBox, /CropBox, /Rotate.
/UserUnit) influence how or if the page’s content can be rendered correctly.
from the page’s content like the /Dur field. However, some of them (like /Resources or
A page object contains the meta information for a page. Most of the fields are independent
Represents a page of a PDF document.

def self.media_box(paper_size, orientation: :portrait)

See PAPER_SIZE for the defined paper sizes.

argument is not used in this case.
If an array is specified, it needs to contain exactly four numbers. The +orientation+

Returns the media box for the given paper size or array.

def self.media_box(paper_size, orientation: :portrait)
  return paper_size if paper_size.kind_of?(Array) && paper_size.size == 4 &&
    paper_size.all?(Numeric)
  unless PAPER_SIZE.key?(paper_size)
    raise HexaPDF::Error, "Invalid paper size specified: #{paper_size}"
  end
  media_box = PAPER_SIZE[paper_size].dup
  media_box[2], media_box[3] = media_box[3], media_box[2] if orientation == :landscape
  media_box
end

def [](name)

See: Dictionary#[]

value is retrieved from the ancestor page tree nodes.
If +name+ is an inheritable value and the value has not been set on the page object, its

Returns the value for the entry +name+.

def [](name)
  if value[name].nil? && INHERITABLE_FIELDS.include?(name)
    node = self
    node = node[:Parent] while node.value[name].nil? && node[:Parent]
    node == self || node.value[name].nil? ? super : node[name]
  else
    super
  end
end

def ancestor_nodes

The direct parent is the first node in the array and the root node the last.

Returns all parent nodes of the page up to the root of the page tree.

def ancestor_nodes
  parent = self[:Parent]
  result = [parent]
  result << parent while (parent = parent[:Parent])
  result
end

def annotation?(obj)

Returns +true+ if the given object seems to be an annotation.

def annotation?(obj)
  (obj.kind_of?(Hash) || obj.kind_of?(Dictionary)) &&
    obj&.key?(:Subtype) && obj&.key?(:Rect)
end

def box(type = :crop, rectangle = nil)

See: PDF2.0 s14.11.2

author. The default is the crop box.
The art box defines the region of the page's meaningful content as intended by the
:art::

value is the crop box.
The trim box defines the intended dimensions of the page after trimming. The default
:trim::

when output in a production environment. The default is the crop box.
The bleed box defines the region to which the contents of the page should be clipped
:bleed::

when it is displayed or printed. The default is the media box.
The crop box defines the region to which the contents of the page should be clipped
:crop::

The media box defines the boundaries of the medium the page is to be printed on.
:media::

The following types are allowed:

/BleedBox, /ArtBox or /TrimBox because it also takes the fallback values into account!
This method should be used instead of directly accessing any of /MediaBox, /CropBox,

values or a HexaPDF::Rectangle).
page. Otherwise sets the value for the given box type to +rectangle+ (an array with four
If no +rectangle+ is given, returns the rectangle defining a certain kind of box for the

page.box(type = :crop, rectangle) -> rectangle
page.box(type = :crop) -> box
:call-seq:

def box(type = :crop, rectangle = nil)
  if rectangle
    case type
    when :media, :crop, :bleed, :trim, :art
      self[:"#{type.capitalize}Box"] = rectangle
    else
      raise ArgumentError, "Unsupported page box type provided: #{type}"
    end
  else
    media_box = self[:MediaBox]
    result = case type
             when :media then media_box
             when :crop then self[:CropBox] || media_box
             when :bleed then self[:BleedBox] || self[:CropBox] || media_box
             when :trim then self[:TrimBox] || self[:CropBox] || media_box
             when :art then self[:ArtBox] || self[:CropBox] || media_box
             else
               raise ArgumentError, "Unsupported page box type provided: #{type}"
             end
    unless result == media_box
      if result.right < media_box.left || result.left > media_box.right ||
          result.top < media_box.bottom || result.bottom > media_box.top
        result.value = [0, 0, 0, 0]
      else
        result.left = media_box.left if result.left < media_box.left
        result.right = media_box.right if result.right > media_box.right
        result.top = media_box.top if result.top > media_box.top
        result.bottom = media_box.bottom if result.bottom < media_box.bottom
      end
    end
    result
  end
end

def canvas(type: :page, translate_origin: true)

translated.
and check whether the result is [0, 0]. If it is, then the origin has not been

canvas.pos(0, 0)

To check whether the origin has been translated or not, use

it won't have any effect as the cached canvas is returned.
if a canvas was initially requested with this argument set to false and then with true,
Note that this argument is only used for the first invocation for every canvas type. So

corner of the crop box.
Specifies whether the origin should automatically be translated into the lower-left
translate_origin::

* :underlay for getting the canvas for drawing unter the page contents
* :overlay for getting the canvas for drawing over the page contents
* :page for getting the canvas for the page itself (only valid for initially empty pages)
Can either be
type::

changed.
means that on subsequent invocations the graphic states of the canvases might already be
graphics states are correctly retained without the need for parsing the contents. This also
:page, and :overlay. The canvas objects are cached once they are created so that their
There are potentially three different canvas objects, one for each of the types :underlay,

Returns the requested type of canvas for the page.

def canvas(type: :page, translate_origin: true)
  unless [:page, :overlay, :underlay].include?(type)
    raise ArgumentError, "Invalid value for 'type', expected: :page, :underlay or :overlay"
  end
  cache_key = "#{type}_canvas".intern
  return cache(cache_key) if cached?(cache_key)
  if type == :page && key?(:Contents)
    raise HexaPDF::Error, "Cannot get the canvas for a page with contents"
  end
  create_canvas = lambda do
    Content::Canvas.new(self).tap do |canvas|
      next unless translate_origin
      crop_box = box(:crop)
      if crop_box.left != 0 || crop_box.bottom != 0
        canvas.translate(crop_box.left, crop_box.bottom)
      end
    end
  end
  contents = self[:Contents]
  if contents.nil?
    page_canvas = cache(:page_canvas, create_canvas.call)
    self[:Contents] = document.add({Filter: :FlateDecode},
                                   stream: page_canvas.stream_data)
  end
  if type == :overlay || type == :underlay
    underlay_canvas = cache(:underlay_canvas, create_canvas.call)
    overlay_canvas = cache(:overlay_canvas, create_canvas.call)
    stream = HexaPDF::StreamData.new do
      Fiber.yield(" q ")
      fiber = underlay_canvas.stream_data.fiber
      while fiber.alive? && (data = fiber.resume)
        Fiber.yield(data)
      end
      " Q q "
    end
    underlay = document.add({Filter: :FlateDecode}, stream: stream)
    stream = HexaPDF::StreamData.new do
      Fiber.yield(" Q q ")
      fiber = overlay_canvas.stream_data.fiber
      while fiber.alive? && (data = fiber.resume)
        Fiber.yield(data)
      end
      " Q "
    end
    overlay = document.add({Filter: :FlateDecode}, stream: stream)
    self[:Contents] = [underlay, *self[:Contents], overlay]
  end
  cache(cache_key)
end

def contents

streams' data!
Note: Any modifications done to the returned value *won't* be reflected in any of the

Returns the concatenated stream data from the content streams as binary string.

def contents
  Array(self[:Contents]).each_with_object("".b) do |content_stream, content|
    content << " " unless content.empty?
    content << content_stream.stream if content_stream.kind_of?(Stream)
  end
end

def contents=(data)

or by creating a new one if no content stream exists.
This is done by deleting all but the first content stream and reusing this content stream;

Replaces the contents of the page with the given string.

def contents=(data)
  first, *rest = self[:Contents]
  rest.each {|stream| document.delete(stream) }
  if first
    self[:Contents] = first
    document.deref(first).stream = data
  else
    self[:Contents] = document.add({Filter: :FlateDecode}, stream: data)
  end
end

def copy_inherited_values

position to another) or another page (e.g. when importing a page from another document).
The hash can then be used to update the page itself (e.g. when moving a page from one

the hash.
Copies the page's inherited values from the ancestor page tree nodes into a hash and returns

def copy_inherited_values
  INHERITABLE_FIELDS.each_with_object({}) do |name, hash|
    hash[name] = HexaPDF::Object.deep_copy(self[name]) if value[name].nil?
  end
end

def each_annotation

Yields each annotation of this page.

page.each_annotation -> Enumerator
page.each_annotation {|annotation| block} -> page
:call-seq:

def each_annotation
  return to_enum(__method__) unless block_given?
  Array(self[:Annots]).each do |annotation|
    next unless annotation?(annotation)
    yield(document.wrap(annotation, type: :Annot))
  end
  self
end

def flatten_annotations(annotations = self[:Annots])

field itself.
If an annotation is a form field widget, only the widget will be deleted but not the form

not rendered into the content stream.
page and deleting the annotations themselves. Invisible and hidden fields are deleted but
Flattening means making the appearances of the annotations part of the content stream of the

that couldn't be flattened because they don't have an appearance stream.
Flattens all or the given annotations of the page. Returns an array with all the annotations

def flatten_annotations(annotations = self[:Annots])
  not_flattened = Array(annotations) || []
  unless self[:Annots].kind_of?(PDFArray)
    return (not_flattened == [annotations] ? [] : not_flattened)
  end
  annotations = if annotations == self[:Annots]
                  not_flattened
                else
                  not_flattened & self[:Annots]
                end
  return not_flattened if annotations.empty?
  canvas = self.canvas(type: :overlay)
  if (pos = canvas.pos(0, 0)) != [0, 0]
    canvas.save_graphics_state
    canvas.translate(-pos[0], -pos[1])
  end
  to_delete = Set.new
  not_flattened -= annotations
  annotations.each do |annotation|
    unless annotation?(annotation)
      self[:Annots].delete(annotation)
      next
    end
    annotation = document.wrap(annotation, type: :Annot)
    appearance = annotation.appearance
    if annotation.flagged?(:hidden) || annotation.flagged?(:invisible)
      to_delete << annotation
      next
    elsif !appearance
      not_flattened << annotation
      next
    end
    rect = annotation[:Rect]
    box = appearance.box
    # PDF2.0 12.5.5 algorithm
    # Step 1) Calculate smallest rectangle containing transformed bounding box
    matrix = HexaPDF::Content::TransformationMatrix.new(*appearance[:Matrix].value)
    llx, lly = matrix.evaluate(box.left, box.bottom)
    ulx, uly = matrix.evaluate(box.left, box.top)
    lrx, lry = matrix.evaluate(box.right, box.bottom)
    left, right = [llx, ulx, lrx, lrx + (ulx - llx)].minmax
    bottom, top = [lly, uly, lry, lry + (uly - lly)].minmax
    # Handle degenerate case of the transformed bounding box being a line or point
    if right - left == 0 || top - bottom == 0
      to_delete << annotation
      next
    end
    # Step 2) Fit calculated rectangle to annotation rectangle by translating/scaling
    # The final matrix is composed by translating the bottom-left corner of the transformed
    # bounding box to the bottom-left corner of the annotation rectangle and scaling from the
    # bottom-left corner of the transformed bounding box.
    sx = rect.width.fdiv(right - left)
    sy = rect.height.fdiv(top - bottom)
    tx = rect.left - left + left - left * sx
    ty = rect.bottom - bottom + bottom - bottom * sy
    # Step 3) Premultiply form matrix - done implicitly when drawing the XObject
    canvas.transform(sx, 0, 0, sy, tx, ty) do
      # Use [box.left, box.bottom] to counter default translation in #xobject since that
      # is already taken care of in matrix a
      canvas.xobject(appearance, at: [box.left, box.bottom])
    end
    to_delete << annotation
  end
  canvas.restore_graphics_state unless pos == [0, 0]
  to_delete.each do |annotation|
    if annotation[:Subtype] == :Widget
      annotation.form_field.delete_widget(annotation)
    else
      self[:Annots].delete(annotation)
      document.delete(annotation)
    end
  end
  not_flattened
end

def index

Returns the index of the page in the page tree.

def index
  idx = 0
  node = self
  while (parent_node = node[:Parent])
    parent_node[:Kids].each do |kid|
      break if kid == node
      idx += (kid.type == :Page ? 1 : kid[:Count])
    end
    node = parent_node
  end
  idx
end

def label

See HexaPDF::Document::Pages for details.

index.
Returns the label of the page which is an optional, alternative description of the page

def label
  document.pages.page_label(index)
end

def must_be_indirect?

Returns +true+ since page objects must always be indirect.

def must_be_indirect?
  true
end

def orientation(type = :crop)

:landscape.
Returns the orientation of the specified box (default is the crop box), either :portrait or

def orientation(type = :crop)
  box = self.box(type)
  rotation = self[:Rotate]
  if (box.height > box.width && (rotation == 0 || rotation == 180)) ||
      (box.height < box.width && (rotation == 90 || rotation == 270))
    :portrait
  else
    :landscape
  end
end

def perform_validation(&block)

Ensures that the required inheritable fields are set.

def perform_validation(&block)
  root_node = document.catalog.pages
  parent_node = self[:Parent]
  parent_node = parent_node[:Parent] while parent_node && parent_node != root_node
  return unless parent_node
  super
  unless self[:Resources]
    yield("Required inheritable page field Resources not set", true)
    resources.validate(&block)
  end
  unless self[:MediaBox]
    yield("Required inheritable page field MediaBox not set", true)
    index = self.index
    box_before = index == 0 ? nil : document.pages[index - 1][:MediaBox]
    box_after = index == document.pages.count - 1 ? nil : document.pages[index + 1]&.[](:MediaBox)
    self[:MediaBox] =
      if box_before && (box_before&.value == box_after&.value || box_after.nil?)
        box_before.dup
      elsif box_after && box_before.nil?
        box_after
      else
        self.class.media_box(document.config['page.default_media_box'],
                             orientation: document.config['page.default_media_orientation'])
      end
  end
end

def process_contents(processor)

See: HexaPDF::Content::Processor

Processes the content streams associated with the page with the given processor object.

def process_contents(processor)
  self[:Resources] = {} if self[:Resources].nil?
  processor.resources = self[:Resources]
  Content::Parser.parse(contents, processor)
end

def resources

doesn't exist.
Returns the, possibly inherited, resource dictionary which is automatically created if it

def resources
  self[:Resources] ||= document.wrap({}, type: :XXResources)
end

def rotate(angle, flatten: false)

HexaPDF::Content::Canvas#rotate).
method uses counterclockwise rotation to be consistent with other rotation methods (e.g.
* The /Rotate key of a page object describes the angle in a clockwise orientation but this

page boxes and annotations).
/Rotate key is removed and the existing rotation information incorporated into the canvas,
The only meaningful usage of 0 for +angle+ is when +flatten+ is set to +true+ (so that the
* Specifying 0 for +angle+ is valid and means that no additional rotation should be applied.

(specified via the /Rotate key) and does not replace it.
* The given +angle+ is applied in addition to a possibly already existing rotation

Notes:

annotations.
rotating the canvas itself and all other necessary objects like the various page boxes and
+true+, the rotation is not done via the page's meta (i.e. the /Rotate key) data but by
Positive values rotate the page to the left, negative values to the right. If +flatten+ is

Rotates the page +angle+ degrees counterclockwise where +angle+ has to be a multiple of 90.

def rotate(angle, flatten: false)
  if angle % 90 != 0
    raise ArgumentError, "Page rotation has to be multiple of 90 degrees"
  end
  # /Rotate and therefore cw_angle is angle in clockwise orientation
  cw_angle = (self[:Rotate] - angle) % 360
  if flatten
    delete(:Rotate)
    return if cw_angle == 0
    pbox = box
    matrix = case cw_angle
             when 90  then Content::TransformationMatrix.new(0, -1, 1, 0, -pbox.bottom, pbox.right)
             when 180 then Content::TransformationMatrix.new(-1, 0, 0, -1, pbox.right, pbox.top)
             when 270 then Content::TransformationMatrix.new(0, 1, -1, 0, pbox.top, -pbox.left)
             end
    rotate_box = lambda do |box|
      llx, lly, urx, ury =
        case cw_angle
        when 90  then [box.right, box.bottom, box.left, box.top]
        when 180 then [box.right, box.top, box.left, box.bottom]
        when 270 then [box.left, box.top, box.right, box.bottom]
        end
      box.value.replace(matrix.evaluate(llx, lly).concat(matrix.evaluate(urx, ury)))
    end
    [:MediaBox, :CropBox, :BleedBox, :TrimBox, :ArtBox].each do |box_name|
      next unless key?(box_name)
      rotate_box.call(self[box_name])
    end
    each_annotation do |annot|
      rotate_box.call(annot[:Rect])
      if (quad_points = annot[:QuadPoints])
        quad_points = quad_points.value if quad_points.respond_to?(:value)
        result = []
        quad_points.each_slice(2) {|x, y| result.concat(matrix.evaluate(x, y)) }
        quad_points.replace(result)
      end
      if (appearance = annot.appearance)
        appearance[:Matrix] = matrix.dup.premultiply(*appearance[:Matrix].value).to_a
      end
      if annot[:Subtype] == :Widget
        app_ch = annot[:MK] ||= document.wrap({}, type: :XXAppearanceCharacteristics)
        app_ch[:R] = (app_ch[:R] + 360 - cw_angle) % 360
      end
    end
    before_contents = document.add({}, stream: " q #{matrix.to_a.join(' ')} cm ")
    after_contents = document.add({}, stream: " Q ")
    self[:Contents] = [before_contents, *self[:Contents], after_contents]
  else
    self[:Rotate] = cw_angle
  end
end

def to_form_xobject(reference: true)

make it a fully valid content stream.
reason is that during the copying of the content stream data the contents may be modified to
method should only be called once the contents of the page has been fully defined. The
Note 2: If +reference+ is false and if a canvas is used on this page (see #canvas), this

Note 1: The created Form XObject is *not* added to the document automatically!

decoding/encoding.
If +reference+ is true, the page's contents is referenced when possible to avoid unnecessary

Creates a Form XObject from the page's dictionary and contents for the given PDF document.

def to_form_xobject(reference: true)
  first, *rest = self[:Contents]
  stream = if !first
             nil
           elsif !reference || !rest.empty? || first.raw_stream.kind_of?(String)
             contents
           else
             first.raw_stream
           end
  dict = {
    Type: :XObject,
    Subtype: :Form,
    BBox: HexaPDF::Object.deep_copy(box(:crop)),
    Resources: HexaPDF::Object.deep_copy(self[:Resources]),
    Filter: :FlateDecode,
  }
  document.wrap(dict, stream: stream)
end

Instance Methods

Defined in

lib/hexapdf/type/page.rb

Modules

Classes