DOM tree
document_tree is a DOM-style, in-memory representation
of an XML document. It is built on top of the low-level
SAX parsers: loading a document runs a parser internally
and assembles the resulting events into a navigable tree of nodes.
The tree is accessed through lightweight value-type handles.
const_node is a read-only handle that exposes a node’s
type, name, attributes, children and parent, while
node derives from it and adds methods for building and
editing the tree in place. Both handles refer to storage owned by the
enclosing document_tree, so they must not be used after
the tree is destroyed. Names are represented by
entity_name, which pairs a local name with an optional
namespace identifier.
Note
A document_tree is constructed with a reference to
an xmlns_repository, which it uses to create namespace
identifiers for the names it stores. The repository must outlive the tree.
The examples below share the following headers:
#include <orcus/dom_tree.hpp>
#include <orcus/xml_namespace.hpp>
#include <orcus/stream.hpp>
#include <iostream>
#include <filesystem>
namespace fs = std::filesystem;
using namespace orcus; // the snippets below omit the orcus:: prefix
Building a tree
The same API can build a document from scratch.
set_root() installs the root element and
returns a mutable node, which is then populated with
append_element(),
set_attribute() and
append_content():
xmlns_repository repo; // must outlive the tree
dom::document_tree tree(repo);
// set_root() installs a fresh root element and returns a mutable handle
dom::node root = tree.set_root({"message"});
// add the lang="en" attribute to the root element
root.set_attribute("lang", "en");
// append a <greeting> child element, then give it text content
dom::node greeting = root.append_element({"greeting"});
greeting.append_content("Hello, world!");
Note
Each name is passed as an entity_name. The braces
in {"message"} are what construct that entity_name from the string:
passing a bare string literal would not compile, because converting it to an
entity_name requires two user-defined conversions (first to
std::string_view, then to entity_name), and an implicit conversion sequence
may contain at most one. The braced-initializer form sidesteps this by
constructing the argument in place. To give a name a namespace, pass both an
xmlns_id_t and the local name, as in {ns, "message"}.
Finally, serialize the tree back to XML with
dump(). The indent argument gives the
number of spaces per nesting level:
// dump() serializes the tree; the indent is the number of spaces per level
std::cout << tree.dump(2) << std::endl;
This produces the following output:
--- build and serialize ---
<message lang="en">
  <greeting>Hello, world!</greeting>
</message>