The Element API (Work in Progress)
Fredrik Lundh | August 2007
This document describes the Element interface. It covers both the 1.2 releases (including 1.2.6, which ships with Python 2.5) and the upcoming 1.3 release.
Also see The elementtree.ElementTree Module.
Overview #
Examples #
The Element Class #
Element(tag) ⇒ element
Element(tag, name=value, …) ⇒ element
Creates an element instance. Depending on the implementation, this may be either a factory function or an ordinary class. It takes an element name (the tag), and, optionally, a number of attribute name/value pairs given as keyword arguments.
The element name, attribute names, and attribute values can be either 8-bit ASCII strings or Unicode strings.
Element(tag, attrib) ⇒ element
Element(tag, attrib, name=value, …) ⇒ element
Same, but takes a dictionary with attribute name/value pairs. If you use both a dictionary and keyword name/value pairs, the keyword values override the values stored under the same names in the dictionary.
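As a minimal sketch of the override rule, assuming the library is imported from the standard library's xml.etree.ElementTree package (the tag and attribute names are made up for illustration):

```python
from xml.etree.ElementTree import Element

# "lang" is given both in the dictionary and as a keyword argument.
elem = Element("item", {"id": "1", "lang": "en"}, lang="sv")

# The keyword value wins for the shared name; other names are kept.
print(elem.tag)                      # item
print(sorted(elem.attrib.items()))   # [('id', '1'), ('lang', 'sv')]
```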
The SubElement Class #
SubElement(parent, tag, attrib, name=value, …) ⇒ element
Same as Element, but appends the new element to a given parent.
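A short sketch of building a small tree with SubElement, again assuming the xml.etree.ElementTree import path; the "root" and "child" names are illustrative only:

```python
from xml.etree.ElementTree import Element, SubElement

root = Element("root")

# Each call creates a new element and appends it to root in one step.
first = SubElement(root, "child", {"id": "1"})
second = SubElement(root, "child", id="2")

ids = [child.get("id") for child in root]
print(len(root), ids)   # 2 ['1', '2']
```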
Attributes #
tag #
elem.tag
(Attribute) Element tag. This is normally a string identifying the kind of data this element represents (the element type).
text #
elem.text
(Attribute) Text before first subelement. This is either a string or the value None, if there is no text.
Some implementations set this attribute to an empty string if there is no text, so user code should treat None and an empty string as equivalent.
Both ElementTree and cElementTree allow you to assign other data types to this attribute, something that can be useful when writing de-serialization code. Not all implementations support this, though, so such code isn’t fully portable.
tail #
elem.tail
(Attribute) Text after this element’s end tag, but before the next sibling element’s start tag. This is either a string or the value None, if there is no text.
See text for more information on attribute types and usage.
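The split between text and tail is easiest to see on mixed content. The following sketch parses a made-up fragment with fromstring from xml.etree.ElementTree (an assumed import path) and prints where each piece of text ends up:

```python
from xml.etree.ElementTree import fromstring

para = fromstring("<p>Hello <b>big</b> world</p>")
bold = para[0]

print(repr(para.text))   # 'Hello '  (text before the first subelement)
print(repr(bold.text))   # 'big'     (text inside <b>)
print(repr(bold.tail))   # ' world'  (text after </b>, before </p>)
print(repr(para.tail))   # None      (nothing follows </p>)
```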
attrib #
(Attribute) A dictionary containing all element attributes.
Note that some implementations use custom storage for attributes, and create the dictionary only when needed. For best performance, use the get and set methods when suitable.
value = elem.get("attribute")
value = elem.get("attribute", "default")
elem.set("attribute", "value")
Note that this attribute shouldn’t be replaced; the dictionary can be modified, but if you need to replace it, call clear() followed by update():
elem.attrib.clear()
elem.attrib.update(new_values)
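A small sketch of why in-place replacement matters, assuming the xml.etree.ElementTree implementation: a reference taken to the dictionary earlier still reflects the element's attributes afterwards. The names attrs and new_values are illustrative.

```python
from xml.etree.ElementTree import Element

elem = Element("link", href="a.html", rel="next")

# Some other code may already hold a reference to the dictionary.
attrs = elem.attrib

# Replace all attributes without rebinding elem.attrib.
new_values = {"id": "n1"}
elem.attrib.clear()
elem.attrib.update(new_values)

print(elem.get("href"))   # None
print(attrs)              # {'id': 'n1'}
```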
Methods #
Sequence Interface #
Element objects behave as a sequence of their direct child elements, and support all basic sequence operations, including indexing, slicing and slice assignment, and the len function.
To iterate over an entire subtree, use the iter method (getiterator, in 1.2).
Note that truth testing falls back on length in the 1.2 series, which makes it a bit impractical; an element is only considered to be true if it has subelements. The 1.3 series issues a warning in this case, and future versions will most likely treat any element as true. For portability, use len(elem) to test for subelements, and use explicit comparisons to None when checking the result of find:
if len(elem):
    print("elem has subelements")

e = elem.find(tag)
if e is None:
    print("tag not found")
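The sequence operations described above can be sketched as follows, assuming xml.etree.ElementTree and a made-up three-child document:

```python
from xml.etree.ElementTree import fromstring

root = fromstring("<r><a/><b/><c/></r>")

count = len(root)                       # number of direct children
first_tag = root[0].tag                 # indexing
rest_tags = [e.tag for e in root[1:]]   # slicing gives a list of elements

# Portable tests: explicit length and explicit comparison to None.
leaf_has_children = len(root[0]) > 0
missing = root.find("nope")

print(count, first_tag, rest_tags)         # 3 a ['b', 'c']
print(leaf_has_children, missing is None)  # False True
```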
append #
elem.append(subelement)
Adds a subelement to the end of this element. When serialized, the new element will appear just before the end tag for this element.
extend #
elem.extend(sequence)
Appends subelements from a sequence. The sequence can be any kind of iterable, including lists, tuples and generators. If you pass in another element, its children are added.
insert #
elem.insert(index, element)
Inserts a subelement at the given position in this element.
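The three methods above can be combined as in this sketch (xml.etree.ElementTree import path assumed, tag names invented). Note that extend is only available in implementations that provide it; the last step passes an element to extend, which adds that element's children rather than the element itself.

```python
from xml.etree.ElementTree import Element

root = Element("list")
root.append(Element("b"))                   # add at the end
root.extend([Element("c"), Element("d")])   # add several at once
root.insert(0, Element("a"))                # add at a given position

# Passing an element to extend adds its children.
other = Element("other")
other.append(Element("e"))
root.extend(other)

tags = [e.tag for e in root]
print(tags)   # ['a', 'b', 'c', 'd', 'e']
```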
remove #
elem.remove(subelement)
Removes a matching subelement. Unlike the find methods, this method compares elements based on identity, not on tag value or contents. It raises a ValueError exception if no matching element can be found.
To remove subelements on other conditions, or remove multiple subelements in one step, use the following pattern:
elem[:] = [e for e in elem if condition]
Where condition determines if the element should be kept.
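A sketch of both removal styles, assuming xml.etree.ElementTree; the document and the "keep everything except a" condition are illustrative:

```python
from xml.etree.ElementTree import Element, fromstring

root = fromstring("<r><a/><b/><a/></r>")

# remove() takes the element object itself, not a tag name.
first = root[0]
root.remove(first)
after_remove = [e.tag for e in root]

# A different element with the same tag does not match.
try:
    root.remove(Element("b"))
    raised = False
except ValueError:
    raised = True

# Slice assignment removes every child that fails the condition.
root[:] = [e for e in root if e.tag != "a"]
after_filter = [e.tag for e in root]

print(after_remove, raised, after_filter)   # ['b', 'a'] True ['b']
```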
clear #
elem.clear()
Resets an element. This function removes all subelements, clears all attributes, and sets the text and tail attributes to None.
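A brief sketch of what clear leaves behind, assuming xml.etree.ElementTree and a made-up document. The element keeps its tag and its place in the parent; everything else is reset, including the tail text that followed it.

```python
from xml.etree.ElementTree import fromstring

root = fromstring('<r><item id="1">text<sub/></item>tail</r>')
item = root[0]

item.clear()

print(item.tag, len(item), item.attrib)   # item 0 {}
print(item.text, item.tail)               # None None
print(root[0] is item)                    # True
```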
get #
elem.get(key, default) ⇒ string or None
Gets an element attribute. If the attribute doesn’t exist, returns the default value, or None if no default was given.
This is equivalent to elem.attrib.get(key, default).
set #
elem.set(key, value)
Sets an element attribute.
This is equivalent to elem.attrib[key] = value.
keys #
elem.keys() ⇒ sequence
Gets a list of attribute names. The names are returned in an arbitrary order (just like for an ordinary Python dictionary).
This is equivalent to elem.attrib.keys().
items #
elem.items() ⇒ sequence of (string, string) tuples
Gets element attributes, as (name, value) tuples in a sequence. The attributes are returned in an arbitrary order.
This is equivalent to elem.attrib.items().
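The four attribute methods can be exercised together as in this sketch (xml.etree.ElementTree assumed, attribute names invented). The results are sorted before printing because the order of keys and items should not be relied upon.

```python
from xml.etree.ElementTree import Element

elem = Element("img", src="a.png")
elem.set("alt", "logo")

names = sorted(elem.keys())
pairs = sorted(elem.items())
width = elem.get("width")                  # missing, no default
width_or_default = elem.get("width", "0")  # missing, with default

print(names)                    # ['alt', 'src']
print(pairs)                    # [('alt', 'logo'), ('src', 'a.png')]
print(width, width_or_default)  # None 0
```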
iter #
elem.iter() ⇒ iterator
elem.iter(tag) ⇒ iterator
(New in 1.3) Creates a tree iterator. The iterator loops over this element and all subelements, in document order, and returns all elements with a matching tag. If the tag is omitted, all elements are returned.
If the tree structure is modified during iteration, the result is undefined.
To loop over all matching subelements, except the element itself, use elem.findall(".//" + tag).
Note that this method was renamed in 1.3. In earlier releases, use the getiterator name instead. The old name is still available, for compatibility only.
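The difference between iter and a ".//" search shows up when the starting element itself matches. A sketch, assuming an implementation that provides iter (such as xml.etree.ElementTree) and a made-up document:

```python
from xml.etree.ElementTree import fromstring

root = fromstring("<a><b/><a><b/></a></a>")

# No tag given: every element, in document order, starting with root.
all_tags = [e.tag for e in root.iter()]

# iter includes the root element when it matches; ".//" does not.
with_self = list(root.iter("a"))
below_only = root.findall(".//a")

print(all_tags)                          # ['a', 'b', 'a', 'b']
print(len(with_self), len(below_only))   # 2 1
```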
itertext #
elem.itertext() ⇒ iterator
(New in 1.3) Creates a text fragment iterator. The iterator loops over this element and all subelements, in document order, and returns all inner text.
To get all inner text as a single string, you can use:
text = "".join(elem.itertext())
If the tree structure is modified during iteration, the result is undefined.
The inner text is defined as the text attribute for the given element, and the text and tail attributes for all subelements, returned in document order. An implementation may omit empty text fragments, and may merge multiple fragments into a single string, or split long strings into more than one fragment.
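A sketch of collecting inner text from mixed content, assuming an implementation that provides itertext (such as xml.etree.ElementTree). Since fragment boundaries may differ between implementations, only the joined string is shown:

```python
from xml.etree.ElementTree import fromstring

para = fromstring("<p>Hello <b>big</b> world</p>")

# Join the fragments; how the text is split up is implementation-defined.
text = "".join(para.itertext())
print(text)   # Hello big world
```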
find #
elem.find(path) ⇒ element or None
Finds the first matching subelement, by tag name or path.
elem.find(path, namespaces=dictionary) ⇒ element or None
(New in 1.3) Same, but uses the given dictionary to map prefixes to namespace URIs.
findall #
elem.findall(path) ⇒ list
Finds all matching subelements, by tag name or path.
elem.findall(path, namespaces=dictionary) ⇒ list
(New in 1.3) Same, but uses the given dictionary to map prefixes to namespace URIs.
iterfind #
elem.iterfind(path) ⇒ iterator
Same as findall, but returns an iterator.
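The find family with a prefix map can be sketched as follows, assuming xml.etree.ElementTree. The document, the "dc" prefix choice, and the variable names are illustrative; the prefix used in the path only has to match the dictionary, not the prefix used in the XML source.

```python
from xml.etree.ElementTree import fromstring

DC = "http://purl.org/dc/elements/1.1/"
doc = fromstring(
    '<meta xmlns:dc="%s">'
    "<dc:title>First</dc:title><dc:title>Second</dc:title>"
    "</meta>" % DC
)
ns = {"dc": DC}

first = doc.find("dc:title", namespaces=ns)         # element or None
titles = doc.findall("dc:title", namespaces=ns)     # list
lazy = list(doc.iterfind("dc:title", namespaces=ns))  # iterator, consumed here

# Without a prefix map, spell out the URI in {uri}name form.
same = doc.find("{%s}title" % DC)

print(first.text, [t.text for t in titles])   # First ['First', 'Second']
```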
findtext #
elem.findtext(path) ⇒ text
elem.findtext(path, default) ⇒ text
Finds text for the first matching subelement, by tag name or path. If no matching element can be found, this method returns the default value, or None if no default was given.
If the element is found, but has no text content, this method returns an empty string.
Note that this method returns the contents of the text for the first matching element only; it does not traverse the tree. To get all internal text, you can use something like:
def gettext(elem):
    text = elem.text or ""
    for e in elem:
        text += gettext(e)
        if e.tail:
            text += e.tail
    return text
elem.findtext(path, namespaces=dictionary) ⇒ text
elem.findtext(path, default, namespaces=dictionary) ⇒ text
(New in 1.3) Same, but uses the given dictionary to map prefixes to namespace URIs.
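The three possible outcomes of findtext (text found, element found but empty, element missing) can be sketched like this, assuming xml.etree.ElementTree and a made-up record:

```python
from xml.etree.ElementTree import fromstring

record = fromstring("<person><name>Ann</name><nick/></person>")

name = record.findtext("name")               # element with text
nick = record.findtext("nick")               # element without text
phone = record.findtext("phone")             # no such element
phone_or_na = record.findtext("phone", "n/a")  # no such element, default given

print(repr(name), repr(nick), repr(phone), repr(phone_or_na))
# 'Ann' '' None 'n/a'
```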
getchildren #
elem.getchildren() ⇒ list of elements
(Deprecated) Returns all child elements. The elements are returned in document order.
Note that since elements are containers, you can just loop over an element to get its child elements. To get a list of all subelements, use list(elem).
getiterator #
elem.getiterator() ⇒ iterator
elem.getiterator(tag) ⇒ iterator
(Deprecated) Same as iter.
Note that this method is deprecated. New code should use the shorter name, unless compatibility with pre-1.3 versions (e.g. the version shipped with Python 2.5) is important.
makeelement #
elem.makeelement(tag, attrib) ⇒ element
Creates a new element object of the same type as this element. Both the tag and the attribute dictionary must be given; this method does not support keyword arguments.
This method is mainly provided for code that builds element structures from other sources; user code can usually use Element or SubElement to create new elements.
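A small sketch of makeelement, assuming xml.etree.ElementTree. The new element has the same type as the one it was created from, but it is not attached to anything until it is appended explicitly:

```python
from xml.etree.ElementTree import Element

root = Element("root")

# Both arguments are required; pass an empty dict for "no attributes".
child = root.makeelement("child", {"id": "1"})
children_before = len(root)   # still 0: makeelement does not append

root.append(child)
children_after = len(root)

print(type(child) is type(root))         # True
print(children_before, children_after)   # 0 1
```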
