The ElementRXP module

Fredrik Lundh | February 2005 | Originally posted to online.effbot.org

Here’s a simple module that uses the PyRXP parser to build an element tree:

# File: ElementRXP.py

try:
    from cElementTree import Element
except ImportError:
    from elementtree.ElementTree import Element

try:
    from pyRXPU import Parser
except ImportError:
    # fall back on ASCII-only parser
    from pyRXP import Parser

def fixelement((tag, attrib, children, spare)):
    elem = this = Element(tag, attrib)
    for child in children:
        if isinstance(child, tuple):
            this = fixelement(child)
            elem.append(this)
        else:
            # add text fragments to the right place
            if this is elem:
                this.text = child
            else:
                this.tail = child
    return elem

def parse(file):
    if not hasattr(file, "read"):
        file = open(file)
    p = Parser(ExpandEmpty=1)
    return fixelement(p.parse(file.read()))

This is faster than the Python version of ElementTree, but a lot slower than plain cElementTree. However, the PyRXP(U) library supports DTD validation, which can come in handy in some applications.
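To see the tuple-to-element conversion without installing PyRXP, here is a minimal sketch that feeds a hand-built tuple tree to the same kind of routine. It assumes Python 3 and the standard library's xml.etree.ElementTree, and it assumes the (tag, attrib, children, spare) layout that the module above unpacks. The name fix_element and the sample tree are made up for illustration. Unlike the module above, this variant concatenates adjacent text fragments instead of overwriting them, and treats None for attributes or children as empty.

```python
from xml.etree.ElementTree import Element, tostring

def fix_element(node):
    """Convert a (tag, attrib, children, spare) tuple into an Element."""
    tag, attrib, children, _spare = node
    # "last" tracks the most recently added child element, if any
    elem = last = Element(tag, attrib or {})
    for child in children or []:
        if isinstance(child, tuple):
            last = fix_element(child)
            elem.append(last)
        elif last is elem:
            # text that appears before the first child element
            elem.text = (elem.text or "") + child
        else:
            # text that follows a child element goes into its tail
            last.tail = (last.tail or "") + child
    return elem

# hand-built sample in the assumed tuple-tree layout (not parser output)
sample = ("doc", {"id": "1"},
          ["Hello ", ("b", {}, ["bold"], None), " world"],
          None)

root = fix_element(sample)
print(tostring(root, encoding="unicode"))
# prints: <doc id="1">Hello <b>bold</b> world</doc>
```

The text/tail split is the only subtle part: a string seen before any child element becomes the parent's text, while a string seen after a child becomes that child's tail.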

 
