Description
xml is a module to parse XML documents into a tree structure. It also supports
validation of XML documents against a DTD.
Note that this is not a streaming XML parser. It reads the entire document into memory and then parses it. This is not a problem for small documents, but it might be a problem for extremely large documents (several hundred megabytes or more).
The public function parse_single_node can be used to parse a single node from
an implementation of io.Reader, which can help parse large XML documents on an
element-by-element basis. Sample usage is provided in the parser_test.v file.
Usage
Parsing XML Files
There are three different ways to parse an XML document:
- Pass the entire XML document as a string to XMLDocument.from_string.
- Specify a file path to XMLDocument.from_file.
- Use a source that implements io.Reader and pass it to XMLDocument.from_reader.
import encoding.xml
//...
doc := xml.XMLDocument.from_file('test/sample.xml')!
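The string-based option can be sketched in the same way. The XML content here is made up for illustration, and the sketch assumes that the parsed document exposes its root element as root with a readable name field.

```v
import encoding.xml

// Hypothetical sample document, used only for illustration.
source := '<?xml version="1.0" encoding="UTF-8"?>
<note id="n1"><to>Tove</to><from>Jani</from></note>'

// from_string parses the whole string in memory; `!` propagates any parse error.
doc := xml.XMLDocument.from_string(source)!

// Print the name of the root element.
println(doc.root.name)
```

XMLDocument.from_file takes a path instead of the contents, so the choice between the two mostly depends on whether the document is already in memory.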
Validating XML Documents
Simply call validate on the parsed XML document.
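A minimal sketch of validation against an inline DTD might look like the following. The document and its DTD are hypothetical, and the sketch assumes that validate returns a result holding the validated document, so both outcomes are handled rather than assumed.

```v
import encoding.xml

// Hypothetical document with an inline DTD, used only for illustration.
source := '<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE note [
  <!ELEMENT note (to,from)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT from (#PCDATA)>
]>
<note><to>Tove</to><from>Jani</from></note>'

doc := xml.XMLDocument.from_string(source)!

// Assumption: validate returns the validated document, or an error.
if validated := doc.validate() {
	println('valid, root element: ${validated.root.name}')
} else {
	println('validation failed: ${err}')
}
```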
Querying
Check the get_element... methods defined on the XMLDocument struct.
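As a rough illustration, the sketch below assumes two members of that family, get_elements_by_tag (returning a list of nodes) and get_element_by_id (returning an optional node). The sample document is hypothetical, so check the module source for the exact names and signatures.

```v
import encoding.xml

// Hypothetical sample document, used only for illustration.
source := '<?xml version="1.0" encoding="UTF-8"?>
<library><book id="b1"><title>Dune</title></book><book id="b2"><title>Emma</title></book></library>'

doc := xml.XMLDocument.from_string(source)!

// Assumption: collects every element named "book" anywhere in the tree.
books := doc.get_elements_by_tag('book')
println('found ${books.len} book elements')

// Assumption: looks up a single element by its id attribute, if one exists.
if book := doc.get_element_by_id('b2') {
	println('element with id b2 is a <${book.name}>')
}
```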
Escaping and Un-escaping XML Entities
When the validate method is called, the XML document is parsed and all text
nodes are un-escaped. This means that the text nodes will contain the actual
text and not the escaped version of the text.
When the XML document is serialized (using str or pretty_str), all text nodes
are escaped.
The escaping and un-escaping can also be done manually using the escape_text
and unescape_text methods.
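A minimal round-trip sketch follows. It assumes that both are callable as xml.escape_text and xml.unescape_text with a single string argument and default entity settings, and that unescape_text returns a result because it can fail on unknown entities.

```v
import encoding.xml

raw := 'Tom & Jerry <cartoon>'

// Replace reserved characters with their XML entities.
escaped := xml.escape_text(raw)
println(escaped)

// Assumption: unescape_text returns a result, so `!` propagates any error.
restored := xml.unescape_text(escaped)!
println(restored)
```

Escaping and then un-escaping should give back the original text, which makes the pair convenient for checking content before it is embedded in a document.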