class PDF::Reader::XRef

An internal PDF::Reader class that represents the Xref table in a PDF file.

An Xref table is a map of object identifiers and byte offsets. Any time a particular
object needs to be found, the Xref table is used to find where it is stored in the
file.

def initialize (buffer)

Creates a new Xref table based on the contents of the supplied PDF::Reader::Buffer object.

def initialize (buffer)
  @buffer = buffer
  @xref = {}
end

def load (offset = nil)

Read the xref table from the underlying buffer. If offset is specified, the table
will be loaded from there; otherwise the default offset will be located and used.

Raises PDF::Reader::MalformedPDFError if there is no xref table at the requested offset.

def load (offset = nil)
  offset ||= @buffer.find_first_xref_offset
  @buffer.seek(offset)
  token = @buffer.token
  
  if token == "xref" || token == "ref"
    load_xref_table
  else
    raise PDF::Reader::MalformedPDFError, "xref table not found at offset #{offset} (#{token} != xref)"
  end
end

def load_xref_table

Assumes the underlying buffer is positioned at the start of an Xref table and
processes it into memory.

def load_xref_table
  tok_one = tok_two = nil
  begin
    # loop over all subsections of the xref table
    # In a well formed PDF, the 'trailer' token will indicate
    # the end of the table. However we need to be careful in case 
    # we're processing a malformed pdf that is missing the trailer.
    loop do
      tok_one, tok_two = @buffer.token, @buffer.token
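      # guard against a nil token so a truncated file raises MalformedPDFError, not NoMethodError
      if tok_one.nil? || (tok_one != "trailer" && !tok_one.match(/\d+/))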
        raise MalformedPDFError, "PDF malformed, missing trailer after cross reference"
      end
      break if tok_one == "trailer" or tok_one.nil?
      objid, count = tok_one.to_i, tok_two.to_i
      count.times do
        offset = @buffer.token.to_i
        generation = @buffer.token.to_i
        state = @buffer.token
        store(objid, generation, offset) if state == "n"
        objid += 1
      end
    end
  rescue EOFError => e
    raise MalformedPDFError, "PDF malformed, missing trailer after cross reference"
  end
  raise MalformedPDFError, "PDF malformed, trailer should be a dictionary" unless tok_two == "<<"
  trailer = Parser.new(@buffer, self).dictionary
  load(trailer['Prev'].to_i) if trailer.has_key?('Prev')
  trailer
end
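
The subsection loop above can be illustrated without a PDF::Reader::Buffer. The following is a minimal sketch, assuming the table is available as plain text and that splitting on whitespace is an acceptable stand-in for @buffer.token. The method name parse_xref_section and the sample table are hypothetical and not part of PDF::Reader; trailer handling and Prev chaining are left out.

```ruby
# Hypothetical stand-alone sketch; parse_xref_section is not part of PDF::Reader.
def parse_xref_section(text)
  tokens = text.split # whitespace-separated tokens, standing in for @buffer.token
  table  = {}
  raise ArgumentError, "xref keyword not found" unless tokens.shift == "xref"

  until tokens.empty? || tokens.first == "trailer"
    # each subsection starts with "<first object id> <entry count>"
    objid = Integer(tokens.shift, 10)
    count = Integer(tokens.shift, 10)
    count.times do
      offset     = tokens.shift.to_i
      generation = tokens.shift.to_i
      state      = tokens.shift
      # only in-use ("n") entries are recorded; free ("f") entries are skipped
      (table[objid] ||= {})[generation] ||= offset if state == "n"
      objid += 1
    end
  end
  table
end

# Made-up table with two subsections: objects 0..2 and object 5.
sample = <<~XREF
  xref
  0 3
  0000000000 65535 f
  0000000017 00000 n
  0000000081 00000 n
  5 1
  0000000331 00002 n
  trailer
XREF

table = parse_xref_section(sample)
```

Here table ends up keyed by object ID and then by generation, the same shape the class keeps in @xref.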

def object (ref, save_pos = true)

Return a string containing the contents of an entire PDF object. The object is requested
by specifying a PDF::Reader::Reference object that contains the object's ID and revision
number.

If the object is a stream, the stream is returned as well.

def object (ref, save_pos = true)
  return ref unless ref.kind_of?(Reference)
  pos = @buffer.pos if save_pos
  obj, stream = Parser.new(@buffer.seek(offset_for(ref)), self).object(ref.id, ref.gen)
  @buffer.seek(pos) if save_pos
  if stream
    return obj, stream
  else
    return obj
  end
end
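
The save_pos flag follows a common pattern: remember the cursor, jump to the object, read it, then jump back so the caller can keep reading from where it was. A minimal sketch of that pattern, assuming a StringIO in place of the PDF::Reader::Buffer; read_at and the sample string are hypothetical and not part of PDF::Reader.

```ruby
require "stringio"

# Hypothetical helper, not part of PDF::Reader: read `length` bytes at `offset`
# and, when save_pos is true, put the cursor back where the caller left it.
def read_at(io, offset, length, save_pos = true)
  pos = io.pos if save_pos
  io.seek(offset)
  data = io.read(length)
  io.seek(pos) if save_pos
  data
end

io = StringIO.new("%PDF-1.4 1 0 obj (hello) endobj")
io.seek(5)
chunk   = read_at(io, 9, 7)         # cursor is restored afterwards
kept    = io.pos
_unused = read_at(io, 9, 7, false)  # cursor stays just after the bytes read
moved   = io.pos
```

Passing false for save_pos skips both the bookkeeping and the second seek, which is useful when the caller intends to continue reading right after the object.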

def offset_for (ref)

Returns the byte offset for the specified PDF object.

ref - a PDF::Reader::Reference object containing an object ID and revision number

def offset_for (ref)
  @xref[ref.id][ref.gen]
end

def store (id, gen, offset)

Stores an offset value for a particular PDF object ID and revision number. An offset
already stored for the same ID and revision is not overwritten.

def store (id, gen, offset)
  (@xref[id] ||= {})[gen] ||= offset
end
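
Because store uses ||=, the first offset recorded for a given ID and revision wins. load_xref_table stores the entries of the current table before following the trailer's Prev entry, so entries from the table loaded first take precedence over those from older tables. A minimal sketch of that behaviour, using a local hash and a lambda in place of the instance method; the IDs and offsets are made up for illustration.

```ruby
xref = {}

# Same expression as the store method, written as a lambda for a stand-alone demo.
store = lambda do |id, gen, offset|
  (xref[id] ||= {})[gen] ||= offset
end

store.call(4, 0, 9120) # entry from the table loaded first
store.call(4, 0, 310)  # same ID and revision from a table loaded later: ignored
store.call(4, 1, 455)  # different revision of the same object: stored
```

After these calls xref holds offset 9120 for object 4 revision 0 and offset 455 for revision 1.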
