class Nokogiri::XML::Document
Nokogiri::XML::Document is the main entry point for dealing with XML documents. The Document
is created by parsing XML content from a String or an IO object. See
Nokogiri::XML::Document.parse for more information on parsing.
Document inherits a great deal of functionality from its superclass Nokogiri::XML::Node, so
please read that class’s documentation as well.
def add_child(node_or_tags)
def add_child(node_or_tags)
  raise "A document may not have multiple root nodes." if (root && root.name != "nokogiri_text_wrapper") && !(node_or_tags.comment? || node_or_tags.processing_instruction?)

  node_or_tags = coerce(node_or_tags)
  if node_or_tags.is_a?(XML::NodeSet)
    raise "A document may not have multiple root nodes." if node_or_tags.size > 1

    super(node_or_tags.first)
  else
    super
  end
end
def clone(level = 1)
:call-seq:
  clone → Nokogiri::XML::Document
  clone(level) → Nokogiri::XML::Document
Clone this node.
[Parameters]
- +level+ (optional Integer). 0 is a shallow copy, 1 (the default) is a deep copy.
[Returns] The new Nokogiri::XML::Document
def clone(level = 1)
  copy = OBJECT_CLONE_METHOD.bind_call(self)
  copy.initialize_copy_with_args(self, level)
end
def collect_namespaces
:call-seq:
  collect_namespaces() → Hash
Recursively get all namespaces from this node and its subtree and return them as a
hash.
⚠ This method will not handle duplicate namespace prefixes, since the return value is a hash.
Note that this method does an xpath lookup for nodes with namespaces, and as a result the
order (and which duplicate prefix "wins") may be dependent on the implementation of the
underlying XML library.
*Example:* Basic usage
Given this document:
  <root xmlns="default" xmlns:foo="bar">
    <bar xmlns:hello="world" />
  </root>
This method will return:
  {"xmlns:foo"=>"bar", "xmlns"=>"default", "xmlns:hello"=>"world"}
*Example:* Duplicate prefixes
Given this document:
  <root xmlns:foo="bar">
    <bar xmlns:foo="baz" />
  </root>
The hash returned will be something like:
  {"xmlns:foo" => "baz"}
def collect_namespaces
  xpath("//namespace::*").each_with_object({}) do |ns, hash|
    hash[["xmlns", ns.prefix].compact.join(":")] = ns.href if ns.prefix != "xml"
  end
end
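The duplicate-prefix caveat follows directly from how the hash is built. The sketch below reproduces only the hash-building step in plain Ruby; NamespaceStub and namespaces_to_hash are hypothetical stand-ins for the namespace nodes returned by the xpath lookup, not part of Nokogiri.

```ruby
# Hypothetical stand-in for a namespace node; only prefix and href matter here.
NamespaceStub = Struct.new(:prefix, :href)

# Mirrors the key-building logic of collect_namespaces (an illustration only).
def namespaces_to_hash(namespaces)
  namespaces.each_with_object({}) do |ns, hash|
    next if ns.prefix == "xml"

    # A nil prefix is the default namespace, so the key is just "xmlns".
    key = ["xmlns", ns.prefix].compact.join(":")
    hash[key] = ns.href
  end
end

found = [
  NamespaceStub.new(nil, "default"),
  NamespaceStub.new("foo", "bar"),
  NamespaceStub.new("foo", "baz"), # same prefix again: overwrites "bar"
  NamespaceStub.new("xml", "http://www.w3.org/XML/1998/namespace"),
]

result = namespaces_to_hash(found)
p result
```

Whichever declaration is visited last wins, which is why the real method's result depends on the traversal order of the underlying XML library.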
def create_cdata(string, &block)
def create_cdata(string, &block)
  Nokogiri::XML::CDATA.new(self, string.to_s, &block)
end
def create_comment(string, &block)
def create_comment(string, &block)
  Nokogiri::XML::Comment.new(self, string.to_s, &block)
end
def create_element(name, *contents_or_attrs, &block)
:call-seq:
  create_element(name, *contents_or_attrs, &block) → Nokogiri::XML::Element
Create a new Element with `name` belonging to this document, optionally setting contents or
attributes.
This method is _not_ the most user-friendly option if your intention is to add a node to the
document tree. Prefer one of the Nokogiri::XML::Node methods like Node#add_child,
Node#add_next_sibling, Node#replace, etc. which will both create an element (or subtree) and
place it in the document tree.
Arguments may be passed to initialize the element:
- a Hash argument will be used to set attributes
- a non-Hash object that responds to \#to_s will be used to set the new node's contents
A block may be passed to mutate the node.
[Parameters]
- `name` (String)
- `contents_or_attrs` (\#to_s, Hash)
[Yields] `node` (Nokogiri::XML::Element)
[Returns] Nokogiri::XML::Element
*Example:* An empty element without attributes
  doc.create_element("div")
  # => <div></div>
*Example:* An element with contents
  doc.create_element("div", "contents")
  # => <div>contents</div>
*Example:* An element with attributes
  doc.create_element("div", {"class" => "container"})
  # => <div class='container'></div>
*Example:* An element with contents and attributes
  doc.create_element("div", "contents", {"class" => "container"})
  # => <div class='container'>contents</div>
*Example:* Passing a block to mutate the element
  doc.create_element("div") { |node| node["class"] = "blue" if before_noon? }
def create_element(name, *contents_or_attrs, &block)
  elm = Nokogiri::XML::Element.new(name, self, &block)
  contents_or_attrs.each do |arg|
    case arg
    when Hash
      arg.each do |k, v|
        key = k.to_s
        if key =~ NCNAME_RE
          ns_name = Regexp.last_match(1)
          elm.add_namespace_definition(ns_name, v)
        else
          elm[k.to_s] = v.to_s
        end
      end
    else
      elm.content = arg
    end
  end
  if (ns = elm.namespace_definitions.find { |n| n.prefix.nil? || (n.prefix == "") })
    elm.namespace = ns
  end
  elm
end
def create_text_node(string, &block)
def create_text_node(string, &block)
  Nokogiri::XML::Text.new(string.to_s, self, &block)
end
def deconstruct_keys(keys)
:call-seq: deconstruct_keys(array_of_names) → Hash
Returns a hash describing the Document, to use in pattern matching.
Valid keys and their values:
- +root+ → (Node, nil) The root node of the Document, or +nil+ if the document is empty.
In the future, other keys may allow accessing things like doctype and processing
instructions. If you have a use case and would like this functionality, please let us know
by opening an issue or a discussion on the GitHub project.
*Example*
  doc = Nokogiri::XML.parse(<<~XML)
    <?xml version="1.0"?>
    <root>
      <child>
    </root>
  XML
  doc.deconstruct_keys([:root])
  # => {:root=>
  #      #(Element:0x35c {
  #        name = "root",
  #        children = [
  #          #(Text "\n" + "  "),
  #          #(Element:0x370 { name = "child", children = [ #(Text "\n")] }),
  #          #(Text "\n")]
  #        })}
*Example* of an empty document
  doc = Nokogiri::XML::Document.new
  doc.deconstruct_keys([:root])
  # => {:root=>nil}
Since v1.14.0
def deconstruct_keys(keys)
  { root: root }
end
def decorate(node)
def decorate(node)
  return unless @decorators

  @decorators.each do |klass, list|
    next unless node.is_a?(klass)

    list.each { |mod| node.extend(mod) }
  end
end
def decorators(key)
def decorators(key)
  @decorators ||= {}
  @decorators[key] ||= []
end
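The interplay between decorators and decorate can be sketched in plain Ruby. DecoratorRegistry and Shouting below are hypothetical names used only to mirror the shape of the two methods; they are not part of Nokogiri.

```ruby
# Hypothetical registry mirroring the shape of #decorators and #decorate.
class DecoratorRegistry
  # Returns the (lazily created) list of modules registered for a class.
  def decorators(key)
    @decorators ||= {}
    @decorators[key] ||= []
  end

  # Extends the object with every module registered for a class it is an instance of.
  def decorate(object)
    return unless @decorators

    @decorators.each do |klass, mods|
      next unless object.is_a?(klass)

      mods.each { |mod| object.extend(mod) }
    end
  end
end

module Shouting
  def shout
    to_s.upcase
  end
end

registry = DecoratorRegistry.new
registry.decorators(String) << Shouting

text = +"hello" # unfrozen String so it can be extended
number = 42

registry.decorate(text)
registry.decorate(number) # not a String, so left untouched

puts text.shout
```

Because modules are applied per object with extend, only objects decorated after registration gain the extra behavior, which is the same constraint the slop! documentation describes.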
def document
def document
  self
end
def dup(level = 1)
:call-seq:
  dup → Nokogiri::XML::Document
  dup(level) → Nokogiri::XML::Document
Duplicate this node.
[Parameters]
- +level+ (optional Integer). 0 is a shallow copy, 1 (the default) is a deep copy.
[Returns] The new Nokogiri::XML::Document
def dup(level = 1)
  copy = OBJECT_DUP_METHOD.bind_call(self)
  copy.initialize_copy_with_args(self, level)
end
def empty_doc?(string_or_io)
def empty_doc?(string_or_io)
  string_or_io.nil? ||
    (string_or_io.respond_to?(:empty?) && string_or_io.empty?) ||
    (string_or_io.respond_to?(:eof?) && string_or_io.eof?)
end
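The predicate relies on duck typing, so it handles nil, Strings, and IO-like objects alike. Below is a standalone copy for experimentation; empty_input? is a hypothetical name, and StringIO stands in for a real IO.

```ruby
require "stringio"

# Standalone copy of the emptiness check, renamed to avoid implying it is public API.
def empty_input?(string_or_io)
  string_or_io.nil? ||
    (string_or_io.respond_to?(:empty?) && string_or_io.empty?) ||
    (string_or_io.respond_to?(:eof?) && string_or_io.eof?)
end

inputs = [nil, "", StringIO.new(""), "<root/>", StringIO.new("<root/>")]
results = inputs.map { |input| empty_input?(input) }

p results
```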
def fragment(tags = nil)
Create a Nokogiri::XML::DocumentFragment from +tags+
def fragment(tags = nil)
  DocumentFragment.new(self, tags, root)
end
def initialize(*args) # :nodoc: # rubocop:disable Lint/MissingSuper
def initialize(*args) # :nodoc: # rubocop:disable Lint/MissingSuper
  @errors = []
  @decorators = nil
  @namespace_inheritance = false
end
def inspect_attributes
def inspect_attributes
  [:name, :children]
end
def name
def name
  "document"
end
def namespaces
def namespaces
  root ? root.namespaces : {}
end
def parse(
call-seq:
  parse(input) { |options| ... } => Nokogiri::XML::Document
  parse(input, url:, encoding:, options:) => Nokogiri::XML::Document
Parse \XML input from a String or IO object, and return a new XML::Document.
🛡 By default, Nokogiri treats documents as untrusted, and so does not attempt to load DTDs
or access the network. See Nokogiri::XML::ParseOptions for a complete list of options; and
that module's DEFAULT_XML constant for what's set (and not set) by default.
[Required Parameters]
- +input+ (String | IO) The content to be parsed.
[Optional Keyword Arguments]
- +url:+ (String) The base URI for this document.
- +encoding:+ (String) The name of the encoding that should be used when processing the
  document. When not provided, the encoding will be determined based on the document
  content.
- +options:+ (Nokogiri::XML::ParseOptions) Configuration object that determines some
  behaviors during parsing. See ParseOptions for more information. The default value is
  +ParseOptions::DEFAULT_XML+.
[Yields]
If a block is given, a Nokogiri::XML::ParseOptions object is yielded to the block which
can be configured before parsing. See Nokogiri::XML::ParseOptions for more information.
def parse(
  string_or_io,
  url_ = nil, encoding_ = nil, options_ = XML::ParseOptions::DEFAULT_XML,
  url: url_, encoding: encoding_, options: options_
)
  options = Nokogiri::XML::ParseOptions.new(options) if Integer === options
  yield options if block_given?

  url ||= string_or_io.respond_to?(:path) ? string_or_io.path : nil

  if empty_doc?(string_or_io)
    if options.strict?
      raise Nokogiri::XML::SyntaxError, "Empty document"
    else
      return encoding ? new.tap { |i| i.encoding = encoding } : new
    end
  end

  doc = if string_or_io.respond_to?(:read)
    # TODO: should we instead check for respond_to?(:to_path) ?
    if string_or_io.is_a?(Pathname)
      # resolve the Pathname to the file and open it as an IO object, see #2110
      string_or_io = string_or_io.expand_path.open
      url ||= string_or_io.path
    end

    read_io(string_or_io, url, encoding, options.to_i)
  else
    # read_memory pukes on empty docs
    read_memory(string_or_io, url, encoding, options.to_i)
  end

  # do xinclude processing
  doc.do_xinclude(options) if options.xinclude?

  doc
end
def slop!
Explore a document with shortcut methods. See Nokogiri::Slop for details.
Note that any nodes that have been instantiated before #slop!
is called will not be decorated with sloppy behavior. So, if you're in
irb, the preferred idiom is:
  irb> doc = Nokogiri::Slop my_markup
and not
  irb> doc = Nokogiri::HTML my_markup
  ... followed by irb's implicit inspect (and therefore instantiation of every node) ...
  irb> doc.slop!
  ... which does absolutely nothing.
def slop!
  unless decorators(XML::Node).include?(Nokogiri::Decorators::Slop)
    decorators(XML::Node) << Nokogiri::Decorators::Slop
    decorate!
  end

  self
end
def validate
Validate this Document against its DTD. Returns a list of errors on the document, or
+nil+ when the document has no internal subset (DTD).
def validate
  return unless internal_subset

  internal_subset.validate(self)
end
def xpath_doctype
:call-seq:
  xpath_doctype() → Nokogiri::CSS::XPathVisitor::DoctypeConfig
[Returns] The document type which determines CSS-to-XPath translation.
def xpath_doctype
  Nokogiri::CSS::XPathVisitor::DoctypeConfig::XML
end