= Tokens

The Tokens class represents a list of tokens returned from
a Scanner.

A token is not a special object, just a two-element Array
consisting of
* the token text (the original source of the token in a String) or
  a token action (begin_group, end_group, begin_line, end_line)
* the token kind (a Symbol representing the type of the token)

A token looks like this:

  ['# It looks like this', :comment]
  ['3.1415926', :float]
  ['$^', :error]

Some scanners also yield sub-tokens, represented by special
token actions, namely begin_group and end_group.

The Ruby scanner, for example, splits "a string" into:

  [
    [:begin_group, :string],
    ['"', :delimiter],
    ['a string', :content],
    ['"', :delimiter],
    [:end_group, :string]
  ]

Tokens is the interface between Scanners and Encoders:
the input is split and saved into a Tokens object, and the Encoder
then builds the output from this object.

Thus, the syntax below becomes clear:

  CodeRay.scan('price = 2.59', :ruby).html
  # the Tokens object is here -------^

See how small it is? ;)

Tokens gives you the power to handle pre-scanned code very easily:
you can convert it to a webpage, a YAML file, or dump it into a
gzipped string that you put in your DB.

It also allows you to generate tokens directly (without using a scanner),
to load them from a file, and still use any Encoder that CodeRay provides.

class CodeRay::Tokens < Array
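As a minimal sketch of the interface defined below (assuming the :html
encoder plugin is loadable), tokens can be generated by hand and then
encoded like any scanner output:

  require 'coderay'

  tokens = CodeRay::Tokens.new
  tokens.begin_group :string
  tokens.text_token '"', :delimiter
  tokens.text_token 'a string', :content
  tokens.text_token '"', :delimiter
  tokens.end_group :string

  tokens.encode :html  # same as tokens.html, see #method_missing below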
Starts a token group of the given kind.

def begin_group kind
  self << :begin_group << kind
end
Starts a line group of the given kind.

def begin_line kind
  self << :begin_line << kind
end
Returns the actual number of tokens: each token is stored flat as two
Array elements (text and kind), so this is half the Array size.

def count
  size / 2
end
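A small sketch of the flat storage, using #text_token as defined below:

  tokens = CodeRay::Tokens.new
  tokens.text_token '1', :integer
  tokens.text_token '+', :operator
  tokens.size   # => 4, the flat Array elements
  tokens.count  # => 2, the actual tokens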
Dumps the object into a String that can be saved
in files or databases.

The dump is created with Marshal.dump;
in addition, it is gzipped using GZip.gzip.

The returned String object includes Undumping,
so it has an #undump method. See Tokens.load.

You can configure the level of compression,
but the default value 7 should be what you want
in most cases as it is a good compromise between
speed and compression rate.

def dump gzip_level = 7
  dump = Marshal.dump self
  dump = GZip.gzip dump, gzip_level
  dump.extend Undumping
end
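A round-trip sketch, assuming Tokens.load (referenced above, defined
elsewhere in CodeRay) as the counterpart of #dump:

  tokens = CodeRay.scan('1 + 1', :ruby).tokens
  blob   = tokens.dump               # gzipped Marshal dump, fit for a DB column
  again  = CodeRay::Tokens.load blob
  again == tokens                    # => true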
Encode the tokens using encoder.

encoder can be
* a symbol like :html or :statistic
* an Encoder class
* an Encoder object

def encode encoder, options = {}
  unless encoder.is_a? Encoders::Encoder
    # Resolve a plugin name like :html to its class; an Encoder class passes through.
    encoder_class = encoder.respond_to?(:to_sym) ? Encoders[encoder] : encoder
    encoder = encoder_class.new options
  end
  encoder.encode_tokens self, options
end
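The three accepted forms, sketched with the HTML encoder; all are
equivalent here:

  tokens.encode :html
  tokens.encode CodeRay::Encoders::HTML
  tokens.encode CodeRay::Encoders::HTML.new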
Resolves the encoder plugin by name and encodes the tokens with it.

def encode_with encoder, options = {}
  Encoders[encoder].new(options).encode_tokens self
end
Ends a token group of the given kind.

def end_group kind
  self << :end_group << kind
end
Ends a line group of the given kind.

def end_line kind
  self << :end_line << kind
end
Ensure that all begin_group tokens have a corresponding end_group.

def fix
  raise NotImplementedError, 'Tokens#fix needs to be rewritten.'
  # tokens = self.class.new
  # # Check token nesting using a stack of kinds.
  # opened = []
  # for type, kind in self
  #   case type
  #   when :begin_group
  #     opened.push [:begin_group, kind]
  #   when :begin_line
  #     opened.push [:end_line, kind]
  #   when :end_group, :end_line
  #     expected = opened.pop
  #     if [type, kind] != expected
  #       # Unexpected end; decide what to do based on the kind:
  #       # - token was never opened: delete the end (just skip it)
  #       next unless opened.rindex expected
  #       # - token was opened earlier: also close tokens in between
  #       tokens << token until (token = opened.pop) == expected
  #     end
  #   end
  #   tokens << [type, kind]
  # end
  # # Close remaining opened tokens
  # tokens << token while token = opened.pop
  # tokens
end
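Since #fix currently raises, only a data sketch of the intended behavior:
an unbalanced flat stream like

  [:begin_group, :string, '"', :delimiter]

would be completed to

  [:begin_group, :string, '"', :delimiter, :end_group, :string]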
Replaces self with the fixed version, see #fix.

def fix!
  replace fix
end
Redirects unknown methods to encoder calls.

For example, if you call +tokens.html+, the HTML encoder
is used to highlight the tokens.

def method_missing meth, options = {}
  encode_with meth, options
rescue PluginHost::PluginNotFound
  super
end
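Sketched usage; the :statistic encoder is mentioned under #encode, and
+no_such_encoder+ is a hypothetical name:

  tokens.html             # == tokens.encode_with :html
  tokens.statistic        # == tokens.encode_with :statistic
  tokens.no_such_encoder  # raises NoMethodError via super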
Returns the tokens compressed by joining consecutive
tokens of the same kind.

This can not be undone, but should yield the same output
in most Encoders. It basically makes the output smaller.

Combined with dump, it saves space for the cost of time.

If the scanner is written carefully, this is not required -
for example, consecutive //-comment lines could already be
joined into one comment token by the scanner.

def optimize
  raise NotImplementedError, 'Tokens#optimize needs to be rewritten.'
  # last_kind = last_text = nil
  # new = self.class.new
  # for text, kind in self
  #   if text.is_a? String
  #     if kind == last_kind
  #       last_text << text
  #     else
  #       new << [last_text, last_kind] if last_kind
  #       last_text = text
  #       last_kind = kind
  #     end
  #   else
  #     new << [last_text, last_kind] if last_kind
  #     last_kind = last_text = nil
  #     new << [text, kind]
  #   end
  # end
  # new << [last_text, last_kind] if last_kind
  # new
end
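Again only a data sketch, since #optimize raises as well: consecutive
comment tokens like

  ['// line 1', :comment, "\n", :comment, '// line 2', :comment]

would be joined into

  ["// line 1\n// line 2", :comment]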
Replaces self with the optimized version, see #optimize.

def optimize!
  replace optimize
end
TODO: Scanner#split_into_lines

Makes sure that:
- newlines are single tokens
  (which means all other tokens are single-line)
- there are no open tokens at the end of the line

This makes it simple for encoders that work line-oriented,
like HTML with list-style numeration.

def split_into_lines
  raise NotImplementedError
end
Replaces self with the line-split version, see #split_into_lines.

def split_into_lines!
  replace split_into_lines
end
Split the tokens into parts of the given +sizes+.

The result will be an Array of Tokens objects. The parts have
the text size specified by the parameter. In addition, each
part closes all opened tokens. This is useful to insert tokens
between them.

This method is used by @Scanner#tokenize@ when called with an Array
of source strings.

def split_into_parts *sizes
  parts = []
  opened = []
  content = nil
  part = Tokens.new
  part_size = 0
  size = sizes.first
  i = 0
  for item in self
    case content
    when nil
      content = item
    when String
      if size && part_size + content.size > size
        # token must be cut
        if part_size < size
          # some part of the token goes into this part
          content = content.dup  # content may not be safe to change
          part << content.slice!(0, size - part_size) << item
        end
        # close all open groups and lines...
        closing = opened.reverse.flatten.map do |content_or_kind|
          case content_or_kind
          when :begin_group
            :end_group
          when :begin_line
            :end_line
          else
            content_or_kind
          end
        end
        part.concat closing
        begin
          parts << part
          part = Tokens.new
          size = sizes[i += 1]
        end until size.nil? || size > 0
        # ...and open them again.
        part.concat opened.flatten
        part_size = 0
        redo unless content.empty?
      else
        part << content << item
        part_size += content.size
      end
      content = nil
    when Symbol
      case content
      when :begin_group, :begin_line
        opened << [content, item]
      when :end_group, :end_line
        opened.pop
      else
        raise ArgumentError, 'Unknown token action: %p, kind = %p' % [content, item]
      end
      part << content << item
      content = nil
    else
      raise ArgumentError, 'Token input junk: %p, kind = %p' % [content, item]
    end
  end
  parts << part
  parts << Tokens.new while parts.size < sizes.size
  parts
end
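A usage sketch; the exact token boundaries depend on the scanner, but
the text sizes of the parts follow the arguments:

  tokens = CodeRay.scan('foo = bar', :ruby).tokens
  parts  = tokens.split_into_parts 4, 5
  parts.size                    # => 2
  parts.map { |p| p.to_s.size } # => [4, 5]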
Adds a text token: the text and its kind are pushed as a flat pair.

def text_token text, kind
  self << text << kind
end
Turns the tokens into a plain string, using the default Encoder,
which simply concatenates the token text.

def to_s
  encode CodeRay::Encoders::Encoder.new
end
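So a scan-and-stringify round trip should recover the source (a sketch):

  CodeRay.scan('1 + 1', :ruby).tokens.to_s  # => "1 + 1"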