class Hpricot::Elements

Once you’ve matched a list of elements, you will often need to handle them as
a group. Or you may want to perform the same action on each of them.
Hpricot::Elements is an extension of Ruby’s array class, with some methods
added for altering elements contained in the array.

If you need to create an element array from regular elements:

  Hpricot::Elements[ele1, ele2, ele3]

Assuming that ele1, ele2 and ele3 contain element objects (Hpricot::Elem,
Hpricot::Doc, etc.)

== Continuing Searches

Usually the Hpricot::Elements you’re working on comes from a search you’ve
done. Well, you can continue searching the list by using the same +at+
and +search+ methods you can use on plain elements.

  elements = doc.search("/div/p")
  elements = elements.search("/a")
  elements = elements.at("img")

== Altering Elements

When you’re altering elements in the list, your changes will be reflected in
the document you started searching from.

  doc = Hpricot("That's my <b>spoon</b>, Tyler.")
  doc.at("b").swap("fork")
  doc.to_html
    #=> "That's my fork, Tyler."

== Getting More Detailed

If you can’t find a method here that does what you need, you may need to
loop through the elements and find a method in Hpricot::Container::Trav
which can do what you need.

For example, you may want to search for all the H3 header tags in a document
and grab all the tags underneath the header, but not inside the header.
A good method for this is +next_sibling+:

  ary = []
  doc.search("h3").each do |h3|
    ele = h3
    # advance from the last node visited, so the loop terminates
    while ele = ele.next_sibling
      ary << ele   # stuff away all the elements under the h3
    end
  end

Most of the useful element methods are in the mixins Hpricot::Traverse
and Hpricot::Container::Trav.
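The sibling walk described under Getting More Detailed can be tried without Hpricot installed. The sketch below uses a hypothetical stand-in, FakeNode, that only mimics +next_sibling+; it is not part of Hpricot, and real elements carry far more behavior. Stopping at the next header of the same name is an added assumption about what "underneath the header" should mean.

```ruby
# FakeNode is a hypothetical stand-in for an Hpricot element; it only
# models a tag name and a link to the following sibling.
FakeNode = Struct.new(:name, :next_sibling)

# Build a flat sibling chain from a list of tag names.
def build_siblings(names)
  nodes = names.map { |n| FakeNode.new(n, nil) }
  nodes.each_cons(2) { |a, b| a.next_sibling = b }
  nodes
end

# Collect every sibling after +header+, up to (not including) the next
# node with the same tag name. The cursor advances on each pass, so the
# loop ends when the chain runs out.
def siblings_under(header)
  collected = []
  ele = header
  while (ele = ele.next_sibling)
    break if ele.name == header.name
    collected << ele
  end
  collected
end

nodes = build_siblings(%w[h3 p ul h3 p])
puts siblings_under(nodes.first).map(&:name).inspect  # prints ["p", "ul"]
```

Keeping a separate cursor (+ele+) rather than calling +next_sibling+ on the header each time is what lets the walk move forward.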
def self.expand(ele1, ele2, excl=false)
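Given two elements, attempt to gather an Elements array of everything between
them. Both elements are included unless +excl+ is true, in which case the
final element of the range is left out.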
  def self.expand(ele1, ele2, excl=false)
    ary = []
    offset = excl ? -1 : 0

    if ele1 and ele2
      # let's quickly take care of siblings
      if ele1.parent == ele2.parent
        ary = ele1.parent.children[ele1.node_position..(ele2.node_position+offset)]
      else
        # find common parent
        p, ele1_p = ele1, [ele1]
        ele1_p.unshift p while p.respond_to?(:parent) and p = p.parent
        p, ele2_p = ele2, [ele2]
        ele2_p.unshift p while p.respond_to?(:parent) and p = p.parent
        common_parent = ele1_p.zip(ele2_p).select { |p1, p2| p1 == p2 }.flatten.last

        child = nil
        if ele1 == common_parent
          child = ele2
        elsif ele2 == common_parent
          child = ele1
        end

        if child
          ary = common_parent.children[0..(child.node_position+offset)]
        end
      end
    end

    return Elements[*ary]
  end
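For the sibling case, the range arithmetic in +expand+ can be illustrated on a plain array. This is only a sketch: +slice_between+ is a hypothetical helper, and array indexes stand in for +node_position+.

```ruby
# Hypothetical helper mirroring the sibling branch of expand: take the
# children from the first position through the second, dropping the
# final element when excl is true.
def slice_between(children, first_pos, last_pos, excl = false)
  offset = excl ? -1 : 0
  children[first_pos..(last_pos + offset)]
end

children = %w[h3 p ul h3]
puts slice_between(children, 0, 3).inspect        # prints ["h3", "p", "ul", "h3"]
puts slice_between(children, 0, 3, true).inspect  # prints ["h3", "p", "ul"]
```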
def self.filter(nodes, expr, truth = true)
  def self.filter(nodes, expr, truth = true)
    until expr.empty?
      _, *m = *expr.match(/^(?:#{ATTR_RE}|#{BRACK_RE}|#{FUNC_RE}|#{CUST_RE}|#{CATCH_RE})/)
      break unless _

      expr = $'
      m.compact!
      if m[0] == '@'
        m[0] = "@#{m.slice!(2,1).join}"
      end

      if m[0] == '[' && m[1] =~ /^\d+$/
        m = [":", "nth", m[1].to_i-1]
      end

      if m[0] == ":" && m[1] == "not"
        nodes, = Elements.filter(nodes, m[2], false)
      elsif "#{m[0]}#{m[1]}" =~ /^(:even|:odd)$/
        new_nodes = []
        nodes.each_with_index {|n,i| new_nodes.push(n) if (i % 2 == (m[1] == "even" ? 0 : 1)) }
        nodes = new_nodes
      elsif "#{m[0]}#{m[1]}" =~ /^(:first|:last)$/
        nodes = [nodes.send(m[1])]
      else
        meth = "filter[#{m[0]}#{m[1]}]" unless m[0].empty?
        if meth and Traverse.method_defined? meth
          args = m[2..-1]
        else
          meth = "filter[#{m[0]}]"
          if Traverse.method_defined? meth
            args = m[1..-1]
          end
        end
        args << -1
        nodes = Elements[*nodes.find_all do |x|
          args[-1] += 1
          x.send(meth, *args) ? truth : !truth
        end]
      end
    end
    [nodes, expr]
  end
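The positional branches of the filter (+:even+, +:odd+, +:first+, +:last+) are plain index arithmetic, so they can be sketched on an ordinary array. +positional_filter+ is a hypothetical helper written for illustration; it covers only those four names and none of the expression parsing.

```ruby
# Hypothetical helper reproducing only the positional branches of the
# filter on a plain array.
def positional_filter(nodes, name)
  case name
  when "even", "odd"
    wanted = name == "even" ? 0 : 1
    # Indexes are zero-based, as in the each_with_index call in the source.
    nodes.each_with_index.select { |_, i| i % 2 == wanted }.map(&:first)
  when "first", "last"
    [nodes.send(name)]
  else
    raise ArgumentError, "unsupported filter: #{name}"
  end
end

items = %w[a b c d e]
puts positional_filter(items, "even").inspect  # prints ["a", "c", "e"]
puts positional_filter(items, "odd").inspect   # prints ["b", "d"]
puts positional_filter(items, "last").inspect  # prints ["e"]
```

Because the index is zero-based, "even" keeps the first, third and fifth items rather than the second and fourth.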
def add_class class_name
Adds the class to all matched elements.

  (doc/"p").add_class("bacon")

  def add_class class_name
    each do |el|
      next unless el.respond_to? :get_attribute
      classes = el.get_attribute('class').to_s.split(" ")
      el.set_attribute('class', classes.push(class_name).uniq.join(" "))
    end
    self
  end
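The string handling inside +add_class+ can be looked at in isolation. The sketch below applies the same split, push, uniq and join steps to a bare class string; +with_class+ is a hypothetical helper, not an Hpricot method.

```ruby
# Hypothetical helper: the same steps add_class applies to an element's
# class attribute, run on a plain string (or nil when the attribute is
# missing).
def with_class(class_attr, class_name)
  classes = class_attr.to_s.split(" ")
  classes.push(class_name).uniq.join(" ")
end

puts with_class("lead intro", "bacon")  # prints lead intro bacon
puts with_class("bacon", "bacon")       # prints bacon
puts with_class(nil, "bacon")           # prints bacon
```

The +uniq+ call is what keeps a class from being listed twice when it is already present.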
def after(str = nil, &blk)
Just after each element in this list, add some HTML.
  def after(str = nil, &blk)
    each { |x| x.parent.insert_after x.make(str, &blk), x }
  end
def append(str = nil, &blk)
Add to the end of the contents inside each element in this list.
  def append(str = nil, &blk)
    each { |x| x.html(x.children + x.make(str, &blk)) }
  end
def at(expr, &blk)
Searches this list for the first element (or child of these elements) matching
the CSS or XPath expression +expr+. Root is assumed to be the element scanned.

  def at(expr, &blk)
    search(expr, &blk).first
  end
def attr key, value = nil, &blk
Gets and sets attributes on all matched elements.

Pass in a +key+ on its own and this method will return the string value
assigned to that attribute for the first element, or +nil+ if the
attribute isn't found.

  doc.search("a").attr("href")
    #=> "http://hacketyhack.net/"

Or, pass in a +key+ and +value+. This will set an attribute for all
matched elements.

  doc.search("p").attr("class", "basic")

You may also use a Hash to set a series of attributes:

  (doc/"a").attr(:class => "basic", :href => "http://hackety.org/")

Lastly, a block can be used to rewrite an attribute based on the element
it belongs to. The block is passed each element in turn and should return
the new value of the attribute.

  records.attr("href") { |e| e['href'] + "#top" }

This example adds a #top anchor to each link.
  def attr key, value = nil, &blk
    if value or blk
      each do |el|
        el.set_attribute(key, value || blk[el])
      end
      return self
    end
    if key.is_a? Hash
      key.each { |k,v| self.attr(k,v) }
      return self
    else
      return self[0].get_attribute(key)
    end
  end
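The calling modes of +attr+ (getter, setter, Hash, block) all follow from the control flow in the source. The sketch below reproduces that flow on hypothetical stand-ins, FakeElem and FakeElements, so it runs without Hpricot; neither class exists in the library, and the attribute storage is an assumption made for the example.

```ruby
# FakeElem is a hypothetical stand-in that stores attributes in a Hash.
class FakeElem
  def initialize(attrs = {})
    @attrs = attrs
  end

  def get_attribute(key)
    @attrs[key.to_s]
  end

  def set_attribute(key, value)
    @attrs[key.to_s] = value
  end

  def [](key)
    get_attribute(key)
  end
end

# FakeElements is a hypothetical Array subclass with the same attr
# control flow as the method shown in the source.
class FakeElements < Array
  def attr(key, value = nil, &blk)
    if value || blk
      each { |el| el.set_attribute(key, value || blk[el]) }
      return self
    end
    if key.is_a?(Hash)
      key.each { |k, v| attr(k, v) }
      self
    else
      self[0].get_attribute(key)
    end
  end
end

# Build a fresh two-link list for the examples.
def sample_links
  FakeElements[FakeElem.new("href" => "/a"), FakeElem.new("href" => "/b")]
end

links = sample_links
puts links.attr("href")                        # prints /a
links.attr("href") { |e| e["href"] + "#top" }
puts links.map { |e| e["href"] }.inspect       # prints ["/a#top", "/b#top"]
links.attr(:class => "basic")
puts links.attr("class")                       # prints basic
```

Note that the getter form reads only the first element, while every setter form touches all of them.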
def before(str = nil, &blk)
Add some HTML just previous to each element in this list.
  def before(str = nil, &blk)
    each { |x| x.parent.insert_before x.make(str, &blk), x }
  end
def copy_node(node, l)
  def copy_node(node, l)
    l.instance_variables.each do |iv|
      node.instance_variable_set(iv, l.instance_variable_get(iv))
    end
  end
def empty
Empty the elements in this list, by removing their insides.

  doc = Hpricot("<p> We have <i>so much</i> to say.</p>")
  doc.search("i").empty
  doc.to_html
    => "<p> We have <i></i> to say.</p>"

  def empty
    each { |x| x.inner_html = nil }
  end
def filter(expr)
  def filter(expr)
    nodes, = Elements.filter(self, expr)
    nodes
  end
def inner_html(*string)
Returns an HTML fragment built of the contents of each element in this list.
  def inner_html(*string)
    if string.empty?
      map { |x| x.inner_html }.join
    else
      x = self.inner_html = string.pop || x
    end
  end
def inner_html=(string)
Replaces the contents of each element in this list. Supply an HTML +string+,
which is loaded into Hpricot objects and inserted into every element in this
list.

  def inner_html=(string)
    each { |x| x.inner_html = string }
  end
def inner_text
Returns a string containing the text contents of each element in this list.

  def inner_text
    map { |x| x.inner_text }.join
  end
def not(expr)
  def not(expr)
    if expr.is_a? Traverse
      nodes = self - [expr]
    else
      nodes, = Elements.filter(self, expr, false)
    end
    nodes
  end
def prepend(str = nil, &blk)
Add to the start of the contents inside each element in this list.
  def prepend(str = nil, &blk)
    each { |x| x.html(x.make(str, &blk) + x.children) }
  end
def pretty_print(q)
  def pretty_print(q)
    q.object_group(self) { super }
  end
def remove
Remove all elements in this list from the document which contains them.

  doc = Hpricot("Remove this: <b>here</b>")
  doc.search("b").remove
  doc.to_html
    => "Remove this: "

  def remove
    each { |x| x.parent.children.delete(x) }
  end
def remove_attr name
Remove an attribute from each of the matched elements.

  (doc/"input").remove_attr("disabled")

  def remove_attr name
    each do |el|
      next unless el.respond_to? :remove_attribute
      el.remove_attribute(name)
    end
    self
  end
def remove_class name = nil
Removes a class from all matched elements.

  (doc/"span").remove_class("lightgrey")

Or, to remove all classes:

  (doc/"span").remove_class

  def remove_class name = nil
    each do |el|
      next unless el.respond_to? :get_attribute
      if name
        classes = el.get_attribute('class').to_s.split(" ")
        el.set_attribute('class', (classes - [name]).uniq.join(" "))
      else
        el.remove_attribute("class")
      end
    end
    self
  end
def search(*expr,&blk)
Searches this list for any elements (or children of these elements) matching
the CSS or XPath expression +expr+. Root is assumed to be the element scanned.

  def search(*expr,&blk)
    Elements[*map { |x| x.search(*expr,&blk) }.flatten.uniq]
  end
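The source of +search+ merges per-element results with +flatten+ and +uniq+, which matters when the elements in the list overlap. The sketch below shows that merge with a hypothetical stand-in, FakeResultNode, whose +search+ simply returns a fixed list; real Hpricot elements evaluate the expression instead.

```ruby
# Hypothetical stand-in: each "element" answers search with a fixed list
# of matches, so overlapping results can be shown.
FakeResultNode = Struct.new(:matches) do
  def search(_expr)
    matches
  end
end

# Same shape as the search method in the source: search each element,
# then flatten the per-element results and drop duplicates.
def search_all(elements, expr)
  elements.map { |x| x.search(expr) }.flatten.uniq
end

outer = FakeResultNode.new(%w[a1 a2])
inner = FakeResultNode.new(%w[a2 a3])
puts search_all([outer, inner], "a").inspect  # prints ["a1", "a2", "a3"]
```

Without the +uniq+ step, a match reachable from two elements in the list would appear twice in the result.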
def to_html
Convert this group of elements into a complete HTML fragment, returned as a
string.

  def to_html
    map { |x| x.output("") }.join
  end
def wrap(str = nil, &blk)
Wraps each element in the list inside the element created by HTML +str+.
If more than one element is found in the string, Hpricot locates the
deepest spot inside the first element.

  doc.search("a[@href]").
      wrap(%{

  def wrap(str = nil, &blk)
    each do |x|
      wrap = x.make(str, &blk)
      nest = wrap.detect { |w| w.respond_to? :children }
      unless nest
        raise "No wrapping element found."
      end
      x.parent.replace_child(x, wrap)
      nest = nest.children.first until nest.empty?
      nest.html([x])
    end
  end