Reading processing instructions and comments with ElementTree

March 2005 | Fredrik Lundh

The following is an alternative XML parser that adds Comment and ProcessingInstruction elements to the element tree. Since comments and processing instructions can appear outside the root element of an XML document, the parser wraps the entire document in an extra document element.

Note that this uses undocumented and unsupported parts of the ElementTree interface. It does work with ElementTree 1.2.X, but may not work with future versions.

import elementtree.ElementTree as ET

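class PIParser(ET.XMLTreeBuilder):

    def __init__(self):
        # the base class creates self._parser and self._target
        ET.XMLTreeBuilder.__init__(self)
        # assumes ElementTree 1.2.X
        self._parser.CommentHandler = self.handle_comment
        self._parser.ProcessingInstructionHandler = self.handle_pi
        # open the wrapper element that holds the whole document
        self._target.start("document", {})

    def close(self):
        # close the wrapper element before finishing the tree
        self._target.end("document")
        return ET.XMLTreeBuilder.close(self)

    def handle_comment(self, data):
        self._target.start(ET.Comment, {})
        self._target.data(data)
        self._target.end(ET.Comment)

    def handle_pi(self, target, data):
        self._target.start(ET.PI, {})
        self._target.data(target + " " + data)
        self._target.end(ET.PI)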

def parse(source):
    return ET.parse(source, PIParser())
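
The same wrapping idea can be sketched against the xml.etree.ElementTree module in the Python 3 standard library, where XMLParser accepts a custom target object and calls its optional comment and pi methods. This is only a sketch under that assumption, not the parser described above: the DocumentTarget class and the parse_with_wrapper helper are hypothetical names introduced for illustration.

```python
import xml.etree.ElementTree as ET


class DocumentTarget:
    """Hypothetical parser target that wraps everything in a "document" element."""

    def __init__(self):
        self._builder = ET.TreeBuilder()
        # open the wrapper element before any parser events arrive
        self._builder.start("document", {})

    def start(self, tag, attrib):
        return self._builder.start(tag, attrib)

    def end(self, tag):
        return self._builder.end(tag)

    def data(self, text):
        self._builder.data(text)

    def comment(self, text):
        # store the comment as an element whose tag is ET.Comment
        self._builder.start(ET.Comment, {})
        self._builder.data(text)
        self._builder.end(ET.Comment)

    def pi(self, target, data):
        # store the processing instruction as an element whose tag is ET.PI
        self._builder.start(ET.PI, {})
        self._builder.data(target + " " + data)
        self._builder.end(ET.PI)

    def close(self):
        # close the wrapper element and return it as the tree root
        self._builder.end("document")
        return self._builder.close()


def parse_with_wrapper(text):
    """Parse an XML string and return the wrapping "document" element."""
    parser = ET.XMLParser(target=DocumentTarget())
    return ET.fromstring(text, parser=parser)


doc = parse_with_wrapper(
    "<!-- head --><?style sheet?><root>hi<!-- in --></root><!-- tail -->"
)
for child in doc:
    print(child.tag, repr(child.text))
```

Comments and processing instructions that appear before or after the root element end up as direct children of the wrapper, next to the root element itself.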