class Loofah::Scrubber
A Scrubber wraps up a block (or method) that is run on an HTML node (element):

  # change all <span> tags to <div> tags
  span2div = Loofah::Scrubber.new do |node|
    node.name = "div" if node.name == "span"
  end

Alternatively, this scrubber could have been implemented as:

  class Span2Div < Loofah::Scrubber
    def scrub(node)
      node.name = "div" if node.name == "span"
    end
  end
  span2div = Span2Div.new

This can then be run on a document:

  Loofah.html5_fragment("<span>foo</span><p>bar</p>").scrub!(span2div).to_s
  # => "<div>foo</div><p>bar</p>"

Scrubbers can be run on a document in either a top-down traversal (the
default) or bottom-up. Top-down scrubbers can optionally return
Scrubber::STOP to terminate the traversal of a subtree.
def append_attribute(node, attribute, value)
If the attribute is not set, add it.
If the attribute is set, don't overwrite the existing value; the new value is
appended to it unless it is already present.

  def append_attribute(node, attribute, value)
    current_value = node.get_attribute(attribute) || ""
    current_values = current_value.split(/\s+/)
    updated_value = current_values | [value]
    node.set_attribute(attribute, updated_value.join(" "))
  end
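The merge performed by +append_attribute+ amounts to a set union over whitespace-separated tokens. The following is a minimal sketch of that string logic only, with no Nokogiri node involved; +merge_attribute_value+ is a hypothetical helper name used for illustration and is not part of Loofah.

```ruby
# Hypothetical helper (not a Loofah method): mirrors the token merge
# that append_attribute applies to an attribute's current value.
def merge_attribute_value(current_value, value)
  tokens = (current_value || "").split(/\s+/)
  # Array#| is a set union: value is added only if not already present.
  (tokens | [value]).join(" ")
end

not_set  = merge_attribute_value(nil, "nofollow")                 # attribute missing: value is added
appended = merge_attribute_value("noopener", "nofollow")          # existing value kept, new token appended
repeated = merge_attribute_value("nofollow noopener", "nofollow") # token already present: unchanged
```

Because the union preserves the order of the existing tokens, calling the helper repeatedly with the same value leaves the attribute unchanged.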
def html5lib_sanitize(node)
  def html5lib_sanitize(node)
    case node.type
    when Nokogiri::XML::Node::ELEMENT_NODE
      if HTML5::Scrub.allowed_element?(node.name)
        HTML5::Scrub.scrub_attributes(node)
        return Scrubber::CONTINUE
      end
    when Nokogiri::XML::Node::TEXT_NODE, Nokogiri::XML::Node::CDATA_SECTION_NODE
      if HTML5::Scrub.cdata_needs_escaping?(node)
        node.before(HTML5::Scrub.cdata_escape(node))
        return Scrubber::STOP
      end
      return Scrubber::CONTINUE
    end
    Scrubber::STOP
  end
def initialize(options = {}, &block)
Options may include
  :direction => :top_down (the default)
or
  :direction => :bottom_up

For top_down traversals, if the block returns
Loofah::Scrubber::STOP, then the traversal will be terminated
for the current node's subtree.

Alternatively, a Scrubber may inherit from Loofah::Scrubber,
and implement +scrub+, which is slightly faster than using a
block.

  def initialize(options = {}, &block)
    direction = options[:direction] || :top_down
    unless [:top_down, :bottom_up].include?(direction)
      raise ArgumentError, "direction #{direction} must be one of :top_down or :bottom_up"
    end

    @direction = direction
    @block = block
  end
def scrub(node)
When +new+ is not passed a block, the class may implement
+scrub+, which will be called for each document node.

  def scrub(node)
    raise ScrubberNotFound, "No scrub method has been defined on #{self.class}"
  end
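The relationship between the block form, the subclass form, and the default +scrub+ can be sketched without Loofah itself. The classes below (+MiniScrubber+, +MiniScrubberNotFound+, +Upcaser+) are hypothetical stand-ins that operate on strings rather than document nodes; they only illustrate the dispatch described above, under the assumption that a block, when given, takes precedence over +scrub+.

```ruby
# Hypothetical stand-ins for illustration; none of these are Loofah classes.
class MiniScrubberNotFound < RuntimeError; end

class MiniScrubber
  def initialize(&block)
    @block = block
  end

  # Use the block when one was given, otherwise fall back to #scrub.
  def call(item)
    @block ? @block.call(item) : scrub(item)
  end

  # Default implementation: signals that neither a block nor an override exists.
  def scrub(_item)
    raise MiniScrubberNotFound, "No scrub method has been defined on #{self.class}"
  end
end

# Subclass form: override #scrub instead of passing a block.
class Upcaser < MiniScrubber
  def scrub(item)
    item.upcase
  end
end

from_block    = MiniScrubber.new { |s| s.reverse }.call("ab")
from_subclass = Upcaser.new.call("ab")

error_raised =
  begin
    MiniScrubber.new.call("ab")
    false
  rescue MiniScrubberNotFound
    true
  end
```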
def traverse(node)
Calling +traverse+ will cause the document to be traversed by
either the lambda passed to the initializer or the +scrub+
method, in the direction specified at +new+ time.

  def traverse(node)
    direction == :bottom_up ? traverse_conditionally_bottom_up(node) : traverse_conditionally_top_down(node)
  end
def traverse_conditionally_bottom_up(node)
  def traverse_conditionally_bottom_up(node)
    node.children.each { |j| traverse_conditionally_bottom_up(j) }
    if block
      block.call(node)
    else
      scrub(node)
    end
  end
def traverse_conditionally_top_down(node)
  def traverse_conditionally_top_down(node)
    if block
      return if block.call(node) == STOP
    elsif scrub(node) == STOP
      return
    end
    node.children.each { |j| traverse_conditionally_top_down(j) }
  end
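The difference between the two traversals is easiest to see on a small tree. The sketch below uses a toy +ToyNode+ struct and a local +STOP+ symbol as stand-ins for Nokogiri nodes and Loofah::Scrubber::STOP (both are assumptions made for illustration), and records the order in which nodes are visited.

```ruby
# Toy stand-in for a document node; not a Nokogiri class.
ToyNode = Struct.new(:name, :children)

# Stand-in for Loofah::Scrubber::STOP.
STOP = :stop

# Visit the node first; skip its subtree if the visitor returns STOP.
def walk_top_down(node, &visit)
  return if visit.call(node) == STOP
  node.children.each { |child| walk_top_down(child, &visit) }
end

# Visit all children first, then the node; the return value is ignored.
def walk_bottom_up(node, &visit)
  node.children.each { |child| walk_bottom_up(child, &visit) }
  visit.call(node)
end

tree = ToyNode.new("div", [
  ToyNode.new("script", [ToyNode.new("b", [])]),
  ToyNode.new("p", [ToyNode.new("span", [])])
])

top_down = []
walk_top_down(tree) do |node|
  top_down << node.name
  STOP if node.name == "script" # prune everything below <script>
end

bottom_up = []
walk_bottom_up(tree) { |node| bottom_up << node.name }

# top_down  => ["div", "script", "p", "span"]  ("b" is never visited)
# bottom_up => ["b", "script", "span", "p", "div"]
```

In this sketch, as in the methods above, only the top-down walk checks the return value, so STOP has no effect during a bottom-up traversal.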