class HexaPDF::Revisions
Manages the revisions of a PDF document.

A PDF document has one revision when it is created. Later, new revisions are added when changes
are made. This allows for adding information/content to a PDF file without changing the original
content.

The order of the revisions is important. In HexaPDF the oldest revision always has index 0 and
the newest revision the highest index. This is also the order in which the revisions get
written.

Important: It is possible to manipulate the individual revisions and their objects oneself but
this should only be done if one is familiar with the inner workings of HexaPDF. Otherwise it is
best to use the convenience methods of this class to create, access or delete indirect objects.

See: PDF2.0 s7.5.6, HexaPDF::Revision
def add
Adds a new empty revision to the document and returns it.

*Note*: This method should only be used if one is familiar with the inner workings of HexaPDF

def add
  if @revisions.empty?
    trailer = {}
  else
    trailer = current.trailer.value.dup
    trailer.delete(:Prev)
    trailer.delete(:XRefStm)
  end

  rev = Revision.new(@document.wrap(trailer, type: :XXTrailer))
  @revisions.push(rev)
  rev
end
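The trailer handling in +add+ can be looked at in isolation. The following is a minimal sketch that uses plain Ruby hashes instead of HexaPDF objects; the helper name +next_trailer+ and the sample values are assumptions made for illustration and are not part of HexaPDF.

```ruby
# Hypothetical helper (not part of HexaPDF): derives the trailer for a new
# revision from the trailer of the current one, modeled as a plain Hash.
def next_trailer(current_trailer)
  return {} if current_trailer.nil?

  trailer = current_trailer.dup  # shallow copy, the original stays untouched
  trailer.delete(:Prev)          # offset of the previous cross-reference section
  trailer.delete(:XRefStm)       # offset of a cross-reference stream
  trailer
end

# Made-up sample trailer of an existing revision.
old_trailer = {Size: 10, Root: "1 0 R", Prev: 1234, XRefStm: 5678}
new_trailer = next_trailer(old_trailer)
p new_trailer
```

The copy keeps entries such as +:Size+ and +:Root+ while the two offset entries, which only describe the revision they came from, are dropped.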
def add_object(obj)
:call-seq:
  revisions.add_object(object)   -> object

Adds the given HexaPDF::Object to the current revision and returns it.

def add_object(obj)
  if obj.indirect? && (rev_obj = current.object(obj.oid))
    if rev_obj.data == obj.data
      return obj
    else
      raise HexaPDF::Error, "Can't add object because there is already " \
        "an object with object number #{obj.oid}"
    end
  end

  obj.oid = next_oid unless obj.indirect?
  current.add(obj)
end
def all
Returns a list of all revisions.

*Note*: This method should only be used if one is familiar with the inner workings of HexaPDF

def all
  @revisions
end
def current
Returns the current revision.

*Note*: This method should only be used if one is familiar with the inner workings of HexaPDF

def current
  @revisions.last
end
def delete_object(ref)
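:call-seq:
  revisions.delete_object(ref)
  revisions.delete_object(oid)

Deletes the indirect object given by an exact reference or by an object number from the newest
revision that contains it.

def delete_object(ref)
  @revisions.reverse_each do |rev|
    if rev.object?(ref)
      rev.delete(ref)
      break
    end
  end
end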
def each(&block)
:call-seq:
  revisions.each {|rev| block }   -> revisions
  revisions.each                  -> Enumerator

Iterates over all revisions from oldest to current one.

*Note*: This method should only be used if one is familiar with the inner workings of HexaPDF

def each(&block)
  return to_enum(__method__) unless block_given?
  @revisions.each(&block)
  self
end
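The two call forms of +each+ rely on a common Ruby idiom: without a block, +to_enum+ returns an Enumerator bound to the same method. The sketch below shows that idiom on a toy container; +ToyRevisions+ is a hypothetical stand-in and not the HexaPDF class.

```ruby
# Toy stand-in for a revisions container (assumption, not the HexaPDF class).
class ToyRevisions
  def initialize(revs)
    @revs = revs # index 0 is the oldest entry
  end

  # With a block: yields oldest to newest and returns self.
  # Without a block: returns an Enumerator over the same sequence.
  def each(&block)
    return to_enum(__method__) unless block_given?
    @revs.each(&block)
    self
  end
end

toy = ToyRevisions.new([:oldest, :middle, :current])
enum = toy.each                    # Enumerator form
first_two = enum.take(2)           # Enumerable methods are available on it
result = toy.each { |rev| rev }    # block form returns the container itself
p first_two
```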
def each_object(only_current: true, only_loaded: false, &block)
:call-seq:
  revisions.each_object(only_current: true, only_loaded: false) {|obj| block }        -> revisions
  revisions.each_object(only_current: true, only_loaded: false) {|obj, rev| block }   -> revisions
  revisions.each_object(only_current: true, only_loaded: false)                       -> Enumerator

Yields every object and optionally the revision it is in.

If +only_loaded+ is +true+, only the already loaded objects of the PDF document are yielded.
This only matters when the document instance was created from an existing PDF document.

By default, only the current version of each object is returned, which implies that each object
number is yielded exactly once. If the +only_current+ option is +false+, all stored objects
from newest to oldest are returned, not only the current version of each object.

The +only_current+ option can make a difference because the document can contain multiple
revisions:

* Multiple revisions may contain objects with the same object and generation numbers, e.g.
  two (different) objects with oid/gen [3,0].

* Additionally, there may also be objects with the same object number but different
  generation numbers in different revisions, e.g. one object with oid/gen [3,0] and one with
  oid/gen [3,1].

*Note* that setting +only_current+ to +false+ is normally not necessary and should not be

def each_object(only_current: true, only_loaded: false, &block)
  unless block_given?
    return to_enum(__method__, only_current: only_current, only_loaded: only_loaded)
  end

  yield_rev = (block.arity == 2)
  oids = {}
  @revisions.reverse_each do |rev|
    rev.each(only_loaded: only_loaded) do |obj|
      next if only_current && oids.include?(obj.oid)
      yield_rev ? yield(obj, rev) : yield(obj)
      oids[obj.oid] = true
    end
  end
  self
end
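The effect of +only_current+ can be illustrated without HexaPDF. The following sketch models each revision as an array of plain hashes; the method name +toy_objects+ and the sample data are assumptions for illustration only.

```ruby
# Toy model (not HexaPDF): walks revisions from newest to oldest and, when
# only_current is true, keeps only the first object seen per object number.
def toy_objects(revisions, only_current: true)
  seen = {}
  result = []
  revisions.reverse_each do |rev|
    rev.each do |obj|
      next if only_current && seen.include?(obj[:oid])
      result << obj
      seen[obj[:oid]] = true
    end
  end
  result
end

# Index 0 is the oldest revision; both revisions contain an object [3,0].
revisions = [
  [{oid: 3, gen: 0, value: "old"}, {oid: 4, gen: 0, value: "page"}],
  [{oid: 3, gen: 0, value: "new"}],
]

current_values = toy_objects(revisions).map { |o| o[:value] }
all_values = toy_objects(revisions, only_current: false).map { |o| o[:value] }
p current_values
p all_values
```

With the default, the older object with number 3 is skipped because the newer revision already supplied one; with +only_current: false+ both are returned, newest first.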
def from_io(document, io)
Loads all revisions for the document from the given IO and returns the created Revisions
object.

def from_io(document, io)
  return new(document) if io.nil?

  parser = Parser.new(io, document)
  object_loader = lambda {|xref_entry| parser.load_object(xref_entry) }

  revisions = []
  begin
    offset = parser.startxref_offset
    seen_xref_offsets = {}

    while offset && !seen_xref_offsets.key?(offset)
      # PDF2.0 s7.5.5 states that :Prev needs to be indirect, Adobe's reference 3.4.4 says it
      # should be direct. Adobe's POV is followed here. Same with :XRefStm.
      xref_section, trailer = parser.load_revision(offset)
      seen_xref_offsets[offset] = true

      stm = trailer[:XRefStm]
      if stm && !seen_xref_offsets.key?(stm)
        if xref_section.max_oid == 0 && trailer[:Prev] > stm
          # Revision is completely empty, with xref stream in previous revision
          merge_revision = trailer[:Prev]
        end
        stm_xref_section, = parser.load_revision(stm)
        stm_xref_section.merge!(xref_section)
        xref_section = stm_xref_section
        seen_xref_offsets[stm] = true
      end

      if parser.linearized? && !trailer.key?(:Prev)
        merge_revision = offset
      end

      if merge_revision == offset && !revisions.empty?
        xref_section.merge!(revisions.first.xref_section)
        offset = trailer[:Prev] # Get possible next offset before overwriting trailer
        trailer = revisions.first.trailer
        revisions.shift
      else
        offset = trailer[:Prev]
      end

      revisions.unshift(Revision.new(document.wrap(trailer, type: :XXTrailer),
                                     xref_section: xref_section, loader: object_loader))
    end
  rescue HexaPDF::MalformedPDFError
    raise unless (reconstructed_revision = parser.reconstructed_revision)
    unless revisions.empty?
      reconstructed_revision.trailer.data.value = revisions.last.trailer.data.value
    end
    revisions << reconstructed_revision
  end

  document.version = parser.file_header_version rescue '1.0'
  new(document, initial_revisions: revisions, parser: parser)
end
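The core of the loading loop is a walk along the +:Prev+ chain that stops when an offset repeats, so a malformed file cannot cause an endless loop. The sketch below shows only that part on plain hashes; +revision_offsets+ and the offsets are hypothetical, and the +:XRefStm+ handling, linearization and reconstruction logic are left out.

```ruby
# Toy model (not HexaPDF): trailers maps a byte offset to the trailer found
# at that offset. A real parser would read these from the file.
def revision_offsets(trailers, start_offset)
  seen = {}
  order = []
  offset = start_offset
  while offset && !seen.key?(offset)
    seen[offset] = true
    order.unshift(offset)  # older revisions end up at lower indices
    offset = trailers.fetch(offset, {})[:Prev]
  end
  order
end

# Made-up offsets; the oldest trailer wrongly points back to the newest one.
trailers = {
  900 => {Prev: 500},
  500 => {Prev: 100},
  100 => {Prev: 900},
}

offsets = revision_offsets(trailers, 900)
p offsets
```

Using +unshift+ while walking backwards yields the oldest revision at index 0, matching the order described for this class.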
def initialize(document, initial_revisions: nil, parser: nil)
Creates a new revisions object for the given PDF document.

Options:

initial_revisions::
    An array of revisions that should initially be used. If this option is not specified, a
    single empty revision is added.

parser::
    The parser with which the initial revisions were read. If this option is not specified
    even though the document was read from an IO stream, some parts may not work, like

def initialize(document, initial_revisions: nil, parser: nil)
  @document = document
  @parser = parser

  @revisions = []
  if initial_revisions
    @revisions += initial_revisions
  else
    add
  end
end
def merge(range = 0..-1)
:call-seq:
  revisions.merge(range = 0..-1)    -> revisions

Merges the revisions specified by the given range into one. Objects from newer revisions

def merge(range = 0..-1)
  @revisions[range].reverse.each_cons(2) do |rev, prev_rev|
    prev_rev.trailer.value.replace(rev.trailer.value)
    rev.each do |obj|
      if obj.data != prev_rev.object(obj)&.data
        prev_rev.delete(obj.oid, mark_as_free: false)
        prev_rev.add(obj)
      end
    end
  end
  _first, *other = *@revisions[range]
  other.each {|rev| @revisions.delete(rev) }
  self
end
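The outcome of a merge can be sketched on plain data. In this toy model a revision is a hash with a +:trailer+ and an +:objects+ map; +merge_all+ is a hypothetical name, it always merges every revision, and it builds a new hash, whereas the method above works on a range and folds objects into the oldest revision of that range in place.

```ruby
# Toy model (not HexaPDF): merges all revisions into a single one.
# revisions[0] is the oldest revision.
def merge_all(revisions)
  {
    # The trailer of the newest revision is the one that survives.
    trailer: revisions.last[:trailer].dup,
    # Hash#merge lets entries from later (newer) revisions override earlier ones.
    objects: revisions.reduce({}) { |acc, rev| acc.merge(rev[:objects]) },
  }
end

revisions = [
  {trailer: {Size: 3}, objects: {1 => "catalog", 2 => "page v1"}},
  {trailer: {Size: 4}, objects: {2 => "page v2", 3 => "annotation"}},
]

merged = merge_all(revisions)
p merged
```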
def next_oid
def next_oid
  @revisions.map(&:next_free_oid).max
end
def object(ref)
:call-seq:
  revisions.object(ref)    -> obj or nil
  revisions.object(oid)    -> obj or nil

Returns the current version of the indirect object for the given exact reference or for the
given object number.

For references to unknown objects, +nil+ is returned but free objects are represented by a
PDF Null object, not by +nil+!

def object(ref)
  i = @revisions.size - 1
  while i >= 0
    if (result = @revisions[i].object(ref))
      return result
    end
    i -= 1
  end
  nil
end
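The lookup order and the difference between an unknown and a free object can be sketched as follows. Revisions are modeled as hashes from object number to value, and the symbol +:null+ stands in for a PDF Null object; both are assumptions made for this illustration, as is the method name +lookup+.

```ruby
# Toy model (not HexaPDF): searches from the newest revision to the oldest
# and returns the first entry found for the given object number.
def lookup(revisions, oid)
  revisions.reverse_each do |rev|
    return rev[oid] if rev.key?(oid)
  end
  nil # no revision knows this object number
end

# Index 0 is the oldest revision. The newer revision marks object 2 as free.
revisions = [
  {1 => "catalog", 2 => "page v1"},
  {2 => :null},
]

p lookup(revisions, 1)
p lookup(revisions, 2)
p lookup(revisions, 9)
```

Object 2 resolves to the null placeholder from the newer revision rather than to the older page, while the unknown object number 9 yields +nil+.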
def object?(ref)
:call-seq:
  revisions.object?(ref)    -> true or false
  revisions.object?(oid)    -> true or false

Returns +true+ if one of the revisions contains an indirect object for the given exact
reference or for the given object number.

Even though this method might return +true+ for some references, #object may return +nil+

def object?(ref)
  @revisions.any? {|rev| rev.object?(ref) }
end