Welcome to pygixml
pygixml is a high-performance XML parser for Python built on Cython and pugixml. It delivers fast parsing, full XPath 1.0 support, and a clean Pythonic API for reading, writing, and transforming XML.
New to XML? Start with What is XML? for a primer on the format, its structure, and real-world applications.
Note
Enjoy pygixml? Star the project on GitHub to support the development: https://github.com/MohammadRaziei/pygixml
Why pygixml?
Speed — pugixml is one of the fastest XML parsers available. pygixml brings that speed directly to Python:
| Library | Avg Time | Speedup vs ElementTree |
|---|---|---|
| pygixml | 0.0009 s | 9.2× faster |
| lxml | 0.0041 s | 2.0× faster |
| ElementTree | 0.0083 s | 1.0× (baseline) |
(Benchmark: parsing a document with 5 000 elements. See Performance for the full comparison.)
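A comparison of this kind can be approximated locally with the sketch below. It is not the project's benchmark script: the document layout, the repeat count, and the helper names build_document and time_parsers are assumptions made for illustration, and the timings will differ from the table above depending on hardware and library versions. Parsers that are not installed are skipped.

```python
import timeit
import xml.etree.ElementTree as ET


def build_document(n_elements=5000):
    """Build a synthetic XML string with n_elements <item> children (assumed layout)."""
    items = "".join(f'<item id="{i}">Item {i}</item>' for i in range(n_elements))
    return f"<root>{items}</root>"


def time_parsers(xml_text, repeats=20):
    """Return the average parse time in seconds for each available parser."""
    parsers = {"ElementTree": ET.fromstring}
    try:
        import pygixml
        parsers["pygixml"] = pygixml.parse_string
    except ImportError:
        pass  # pygixml not installed; skip it
    try:
        from lxml import etree
        parsers["lxml"] = lambda text: etree.fromstring(text.encode("utf-8"))
    except ImportError:
        pass  # lxml not installed; skip it

    results = {}
    for name, parse in parsers.items():
        total = timeit.timeit(lambda: parse(xml_text), number=repeats)
        results[name] = total / repeats
    return results


if __name__ == "__main__":
    timings = time_parsers(build_document())
    baseline = timings["ElementTree"]
    for name, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
        print(f"{name:12s} {seconds:.4f} s  {baseline / seconds:.1f}x vs ElementTree")
```

Averaging over several repeats smooths out one-off noise, but a single small document is still a narrow measurement; the Performance page remains the reference for the full comparison.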
Features
Blazing-fast parsing — up to 14× faster than ElementTree
Full XPath 1.0 — complete query engine with all standard functions
Memory efficient — zero-copy C++ memory management via pugixml
Pythonic API — intuitive methods and properties, not a direct C++ mirror
Cross-platform — Windows, Linux, macOS
Text extraction — recursive text gathering with configurable joins
XML serialization — output with custom indentation (spaces or integer)
Node iteration — depth-first traversal of the entire document
Node identity — memory-based ID for debugging and comparison
Quick Example
import pygixml

doc = pygixml.parse_string("""
<library>
    <book id="1">
        <title>The Great Gatsby</title>
        <author>F. Scott Fitzgerald</author>
    </book>
</library>
""")

# Access elements and attributes
root = doc.root
book = root.child("book")
print(book.name)                     # → book
print(book.attribute("id").value)    # → 1
print(book.child("title").text())    # → The Great Gatsby

# XPath queries
titles = root.select_nodes("book/title")
for t in titles:
    print(t.node.text())             # → The Great Gatsby

# Create and save
doc = pygixml.XMLDocument()
root = doc.append_child("catalog")
root.append_child("item").set_value("Hello")
doc.save_file("output.xml")
Core Classes
See the API Reference for the complete reference with every class, method, and property documented.
| Class | Description |
|---|---|
|  | Document-level operations: load, save, append-child |
|  | Navigate, read, and modify individual nodes |
|  | Attribute name and value access |
|  | Pre-compiled XPath queries for repeated evaluation |
|  | Single XPath result (wraps a node or attribute) |
|  | Collection of XPath results |
Pythonic Extensions
pugixml gives pygixml its speed, but the API you actually use goes well beyond what the C++ library provides. pygixml adds several features that make working with XML from Python natural and productive:
text(): recursive text extraction with configurable joins. One call gathers all text content from an element and its descendants.
children(): iterate direct child elements only (or all descendants with recursive=True), with no manual sibling walking.
xpath: generate an absolute XPath to any node using a custom O(depth) algorithm. Not available in pugixml natively.
xml: serialize a node and its subtree to a formatted XML string in one property.
mem_id: a unique numeric identifier for each node, ideal for caching and dictionary-based lookups.
to_string(): customizable XML serialization with string or integer indentation.
These are pygixml’s own contributions — you won’t find equivalents in pugixml. See the API Reference for full documentation of every member.
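As a rough illustration of how a few of these members fit together, the sketch below gathers child names, generated paths, text, and serialized XML for a small document. It uses only members named on this page (children(), xpath, text(), xml) with their default arguments; the sample document and the summarize helper are hypothetical. The guarded import only keeps the snippet loadable where pygixml is not installed, and holding doc in a local variable while its nodes are in use is a precaution assumed here, not documented behaviour.

```python
try:
    import pygixml
except ImportError:  # keeps the sketch loadable where pygixml is not installed
    pygixml = None

SAMPLE_XML = """
<library>
  <book id="1"><title>The Great Gatsby</title></book>
  <book id="2"><title>Catch-22</title></book>
</library>
"""


def summarize(xml_text):
    """Collect a few extension members for the root element (hypothetical helper)."""
    doc = pygixml.parse_string(xml_text)  # keep doc referenced while nodes are used
    root = doc.root
    return {
        "child_names": [child.name for child in root.children()],
        "child_paths": [child.xpath for child in root.children()],
        "descendant_names": [node.name for node in root.children(recursive=True)],
        "all_text": root.text(),
        "first_book_xml": root.child("book").xml,
    }


if pygixml is not None:
    for key, value in summarize(SAMPLE_XML).items():
        print(f"{key}: {value!r}")
```

The exact strings returned by xpath, text(), and xml depend on pygixml's defaults, so the sketch prints them without asserting a particular format.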
Note
Properties vs Methods — pygixml uses properties for simple accessors and methods for operations that take arguments:
Properties (no parentheses): node.name, node.value, node.type, node.parent, node.next_sibling, node.previous_sibling, node.xml, node.xpath, attr.name, attr.value, attr.next_attribute, doc.root
Methods (need parentheses): node.child(name), node.first_child(), node.append_child(name), node.child_value(name), node.set_value(v), node.first_attribute(), node.attribute(name), node.select_nodes(query), node.select_node(query), node.text(), node.to_string()
XPath Support
pygixml exposes pugixml’s full XPath 1.0 engine:
Axes: child::, attribute::, descendant::, ancestor::
Predicates: book[@id='1'], book[year > 1950]
Functions: position(), last(), count(), sum(), string(), number(), concat(), substring()
Operators: and, or, not(), =, !=, <, >, +, -, *, div, mod
Wildcards: *, @*, node()
See XPath Support for a detailed walkthrough.
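The sketch below shows how predicates and functions of the kind listed above might be passed to select_nodes. It relies only on calls that appear in the Quick Example (parse_string, doc.root, select_nodes, .node, child, text()); the sample document, the year element, and the titles_for helper are hypothetical. The guarded import only keeps the snippet loadable where pygixml is not installed.

```python
try:
    import pygixml
except ImportError:  # keeps the sketch loadable where pygixml is not installed
    pygixml = None

LIBRARY_XML = """
<library>
  <book id="1"><title>The Great Gatsby</title><year>1925</year></book>
  <book id="2"><title>Catch-22</title><year>1961</year></book>
</library>
"""


def titles_for(xml_text, book_query):
    """Return the title text of every book selected by book_query (hypothetical helper)."""
    doc = pygixml.parse_string(xml_text)  # keep doc referenced while nodes are used
    root = doc.root
    return [match.node.child("title").text() for match in root.select_nodes(book_query)]


if pygixml is not None:
    print(titles_for(LIBRARY_XML, "book[@id='1']"))              # attribute predicate
    print(titles_for(LIBRARY_XML, "book[year > 1950]"))          # numeric comparison
    print(titles_for(LIBRARY_XML, "book[position() = last()]"))  # position functions
```

Each query is evaluated relative to the node it is called on, so the paths here start at book because root is the library element.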
Installation
From PyPI
pip install pygixml
From source
pip install git+https://github.com/MohammadRaziei/pygixml.git