class HexaPDF::Type::ObjectStream

Represents PDF type ObjStm, object streams.

An object stream is a stream that can hold multiple indirect objects. Since the objects are
stored inside the stream, filters can be used to compress the stream content and therefore
represent the indirect objects more compactly than would be possible otherwise.

== How are Object Streams Used?

When an indirect object that resides in an object stream needs to be loaded, the object stream
itself is parsed and loaded and #parse_stream is invoked to get an ObjectStream::Data object
representing the stored indirect objects. After that the requested indirect object itself is
loaded and returned using this ObjectStream::Data object. From a user’s perspective nothing
changes when an object is located inside an object stream instead of directly in a PDF file.

The indirect objects initially stored in the object stream are automatically added to the
list of to-be-stored objects when #parse_stream is invoked. Additional objects can be
assigned to the object stream via #add_object or deleted from it via #delete_object.

Before an object stream is written, it is necessary to invoke #write_objects so that the
to-be-stored objects are serialized to the stream. This is automatically done by the Writer.
A user thus only has to define which objects should reside in the object stream.

However, only objects that can be written to the object stream are actually written. The
other objects are deleted from the object stream (#delete_object) and written normally.

See PDF2.0 s7.5.7
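
The compactness claim in the overview can be illustrated without HexaPDF. The following is a minimal sketch, assuming that Flate compression (here via Ruby's standard +zlib+ library) is the filter in use. The helper name +compare_sizes+ and the sample dictionaries are hypothetical and not part of HexaPDF.

```ruby
require 'zlib'

# Returns [raw_size, deflated_size] for the given serialized objects when they
# are joined into one payload, similar to what an object stream stores.
# (Hypothetical helper, not a HexaPDF method.)
def compare_sizes(serialized_objects)
  payload = serialized_objects.join(" ")
  [payload.bytesize, Zlib::Deflate.deflate(payload).bytesize]
end

# Hypothetical sample data: 20 small, similar dictionaries.
sample = (1..20).map do |i|
  "<< /Type /Annot /Subtype /Link /Rect [0 #{i * 10} 100 #{i * 10 + 8}] >>"
end

raw, deflated = compare_sizes(sample)
puts "raw: #{raw} bytes, deflated: #{deflated} bytes"
```

Because the dictionaries share most of their keys, the deflated payload should come out considerably smaller than the raw one. The exact numbers depend on the data.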

def add_object(ref)

Adds the given object to the list of objects that should be stored in this object stream.

The +ref+ argument can either be a reference or any PDF object.

def add_object(ref)
  return if object_index(ref)
  index = objects.size / 2
  objects[index] = ref
  objects[ref] = index
end

def delete_object(ref)

Deletes the given object from the list of objects that should be stored in this object
stream.

The +ref+ argument can either be a reference or a PDF object.

def delete_object(ref)
  index = objects[ref]
  return unless index
  move_index = objects.size / 2 - 1
  objects[index] = objects[move_index]
  objects[objects[index]] = index
  objects.delete(ref)
  objects.delete(move_index)
end
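
The source of #add_object and #delete_object suggests that a single Hash stores both directions of the mapping (index to object and object to index), and that deletion fills the gap with the last entry. The following stand-alone sketch models that idea. The class name +ObjectList+ and its method names are hypothetical, and it assumes that entries are never Integers, since those would collide with the index keys.

```ruby
# Stand-alone model of the container: one Hash holds index => ref and
# ref => index entries, so its size is twice the number of stored refs.
# (Hypothetical class, not part of HexaPDF.)
class ObjectList
  def initialize
    @objects = {}
  end

  # Number of stored refs.
  def size
    @objects.size / 2
  end

  # Appends +ref+ unless it is already stored.
  def add(ref)
    return if @objects.key?(ref)
    index = size
    @objects[index] = ref
    @objects[ref] = index
  end

  # Removes +ref+ by moving the last entry into its slot.
  def delete(ref)
    index = @objects[ref]
    return unless index
    last = size - 1
    @objects[index] = @objects[last]   # move the last entry into the gap
    @objects[@objects[index]] = index  # fix the reverse mapping of the moved entry
    @objects.delete(ref)
    @objects.delete(last)
  end

  # Returns the index of +ref+ or nil if it is not stored.
  def index(ref)
    @objects[ref]
  end

  # Returns the stored refs in index order.
  def to_a
    (0...size).map {|i| @objects[i] }
  end
end

list = ObjectList.new
%i[a b c d].each {|ref| list.add(ref) }
list.delete(:b)
p list.to_a       # => [:a, :d, :c]
p list.index(:d)  # => 1
```

The design keeps both lookup by index and lookup by object at constant time. The trade-off is that deleting an entry changes the position of the last one.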

def object_index(obj)

Returns the index into the array containing the to-be-stored objects for the given
reference/PDF object.

def object_index(obj)
  objects[obj]
end

def objects

Returns the container with the to-be-stored objects.

def objects
  @objects ||=
    begin
      @objects = {}
      parse_stream
      @objects
    end
end

def parse_oids_and_offsets(data)

Parses the object numbers and their offsets from the start of the stream data.

def parse_oids_and_offsets(data)
  oids = []
  offsets = []
  first = value[:First].to_i
  stream_tokenizer = Tokenizer.new(StringIO.new(data))
  !data.empty? && value[:N].to_i.times do
    oids << stream_tokenizer.next_object
    offsets << first + stream_tokenizer.next_object
  end
  [oids, offsets]
end
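
The header layout that #parse_oids_and_offsets reads can be sketched in plain Ruby: the stream data starts with N pairs of object number and relative offset, and each offset is relative to the /First entry. The function below is a simplified stand-in that splits on whitespace instead of using HexaPDF's Tokenizer. The name +parse_header+ and its parameters are hypothetical.

```ruby
# Parses the "oid offset" pairs at the start of an object stream payload.
# +n+ and +first+ stand for the /N and /First entries of the stream dictionary.
# (Hypothetical helper, not a HexaPDF method.)
def parse_header(data, n, first)
  numbers = data[0, first].split.map {|s| Integer(s) }
  pairs = numbers.each_slice(2).first(n)
  oids = pairs.map {|oid, _offset| oid }
  # Stored offsets are relative to +first+, so convert them to absolute ones.
  offsets = pairs.map {|_oid, offset| first + offset }
  [oids, offsets]
end

# Two objects: object 5 is "42", object 7 is "(hi)". The header is 8 bytes long.
payload = "5 0 7 3 42 (hi) "
oids, offsets = parse_header(payload, 2, 8)
p oids                    # => [5, 7]
p offsets                 # => [8, 11]
p payload[offsets[1], 4]  # => "(hi)"
```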

def parse_stream

Parses the stream and returns an ObjectStream::Data object that can be used for retrieving
the objects defined by this object stream.

The object references are also added to this object stream so that they are included when
the object gets written.

def parse_stream
  return @stream_data if defined?(@stream_data)
  data = stream
  oids, offsets = parse_oids_and_offsets(data)
  @objects ||= {}
  oids.each {|oid| add_object(Reference.new(oid, 0)) }
  @stream_data = Data.new(data, oids, offsets)
end

def perform_validation

Validates that the generation number of the object stream is zero.

def perform_validation
  # Assign dummy values so that the validation for required values works since those values
  # are only set on #write_objects
  self[:N] ||= 0
  self[:First] ||= 0
  super
  yield("Object stream has invalid generation number > 0", false) if gen != 0
end

def write_objects(revision)

:call-seq:
  objstm.write_objects(revision)    -> obj_to_stm_hash

Writes the added objects to the stream and returns a hash mapping all written objects to
this object stream.

There are some reasons why an added object may not be stored in the stream:

* It has a generation number other than 0.
* It is a stream object.
* It doesn't reside in the given Revision object.

Such objects are additionally deleted from the list of to-be-stored objects and are later
written as indirect objects.

def write_objects(revision)
  index = 0
  object_info = ''.b
  data = ''.b
  serializer = Serializer.new
  obj_to_stm = {}
  is_encrypt_dict = document.revisions.each.with_object({}) do |rev, hash|
    hash[rev.trailer[:Encrypt]] = true
  end
  while index < objects.size / 2
    obj = revision.object(objects[index])
    # Due to a bug in Adobe Acrobat, the Catalog may not be in an object stream if the
    # document is encrypted
    if obj.nil? || obj.null? || obj.gen != 0 || obj.kind_of?(Stream) ||
        is_encrypt_dict[obj] ||
        obj.type == :Catalog ||
        obj.type == :Sig || obj.type == :DocTimeStamp ||
        (obj.respond_to?(:key?) && obj.key?(:ByteRange) && obj.key?(:Contents))
      delete_object(objects[index])
      next
    end
    obj_to_stm[obj] = self
    object_info << "#{obj.oid} #{data.size} "
    data << serializer.serialize(obj) << " "
    index += 1
  end
  value[:Type] = :ObjStm
  value[:N] = objects.size / 2
  value[:First] = object_info.size
  self.stream = object_info << data
  set_filter(:FlateDecode)
  obj_to_stm
end
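
The payload layout produced by the loop in #write_objects can be reproduced in isolation. The sketch below assumes that the objects are already serialized to strings and leaves out the exclusion checks. The name +build_payload+ is hypothetical and not part of HexaPDF.

```ruby
# Builds an object stream payload from { oid => serialized_object } pairs.
# Returns [payload, n, first], where +n+ and +first+ stand for the /N and
# /First entries of the stream dictionary.
# (Hypothetical helper, not a HexaPDF method.)
def build_payload(serialized)
  object_info = ''.b
  data = ''.b
  serialized.each do |oid, str|
    # Record the object number and its offset relative to the object data.
    object_info << "#{oid} #{data.size} "
    data << str << " "
  end
  [object_info + data, serialized.size, object_info.size]
end

payload, n, first = build_payload({5 => "42", 7 => "(hi)"})
p payload                 # => "5 0 7 3 42 (hi) "
p [n, first]              # => [2, 8]
p payload[first + 3, 4]   # => "(hi)"
```

The header comes first, and /First is the header length, so a reader can locate an object by adding its stored offset to /First.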

Defined in

lib/hexapdf/type/object_stream.rb