A Simple XML-Over-HTTP Class

Updated May 13, 2003 | July 12, 2002 | Fredrik Lundh

This module implements a simple helper class, HTTPClient, which can send an XML document (represented either as an element tree or a string) to a remote server, and parse the result into an element tree.

 
A Simple XML-Over-HTTP Helper (File: HTTPClient.py)
from httplib import HTTP
from StringIO import StringIO
import urlparse

# elementtree (from effbot.org/downloads)
from elementtree import ElementTree

class HTTPClient:

    user_agent = "HTTPClient (from effbot.org)"

    def __init__(self, uri):

        scheme, host, path, params, query, fragment = urlparse.urlparse(uri)
        if scheme != "http":
            raise ValueError("only supports HTTP requests")

        # put the path back together again
        if not path:
            path = "/"
        if params:
            path = path + ";" + params
        if query:
            path = path + "?" + query

        self.host = host
        self.path = path

    def do_request(self, body,
        # optional keyword arguments follow
        path=None, method="POST", content_type="text/xml",
        extra_headers=(), parser=None):

        if not path:
            path = self.path

        if isinstance(body, ElementTree.ElementTree):
            # serialize element tree
            file = StringIO()
            body.write(file)
            body = file.getvalue()

        # send xml request
        h = HTTP(self.host)
        h.putrequest(method, path)
        h.putheader("User-Agent", self.user_agent)
        h.putheader("Host", self.host)
        if content_type:
            h.putheader("Content-Type", content_type)
        h.putheader("Content-Length", str(len(body)))
        for header, value in extra_headers:
            h.putheader(header, value)
        h.endheaders()

        h.send(body)

        # fetch the reply
        errcode, errmsg, headers = h.getreply()

        if errcode != 200:
            raise Exception(errcode, errmsg)

        return ElementTree.parse(h.getfile(), parser=parser)

The main workhorse is the do_request method, which uses the httplib library module for all protocol-related stuff. The HTTP class represents a connection to an HTTP server. The putrequest and putheader methods are used to generate the header part of an HTTP message, and send is used for the body. Finally, the getreply method is used to parse the response header, and getfile returns a file handle that can be passed right into the element tree parser.
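For readers on Python 3, where httplib has become http.client, the same call sequence can be sketched roughly as follows. This is an illustration under assumptions, not part of the original module: HTTPConnection stands in for the HTTP class, getresponse replaces the getreply/getfile pair, and the post_xml function and the local EchoHandler test server are hypothetical names introduced here so the example runs without a remote service.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer
from xml.etree import ElementTree


class EchoHandler(BaseHTTPRequestHandler):
    """Hypothetical test server: sends the posted XML straight back."""

    def do_POST(self):
        length = int(self.headers["Content-Length"])
        body = self.rfile.read(length)
        self.send_response(200)
        self.send_header("Content-Type", "text/xml")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def post_xml(host, port, path, body):
    """Send body (bytes) and parse the reply into an element tree."""
    h = HTTPConnection(host, port, timeout=10)
    try:
        h.putrequest("POST", path)  # http.client adds the Host header itself
        h.putheader("Content-Type", "text/xml")
        h.putheader("Content-Length", str(len(body)))
        h.endheaders()
        h.send(body)
        # getresponse parses the status line and the response headers
        response = h.getresponse()
        if response.status != 200:
            raise Exception(response.status, response.reason)
        # the response object is file-like, so the parser can read from it
        return ElementTree.parse(response)
    finally:
        h.close()


# start the test server on a free local port and post a small document
server = HTTPServer(("127.0.0.1", 0), EchoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
try:
    tree = post_xml("127.0.0.1", server.server_address[1], "/echo",
                    b"<doc>hello</doc>")
finally:
    server.shutdown()
    server.server_close()

root_tag = tree.getroot().tag
root_text = tree.getroot().text
```

The division of labour is the same as in do_request: the header part is generated call by call, the body is sent separately, and the reply is handed to the element tree parser as a file-like object.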

You can use the path, method, content_type and extra_headers options to get better control over the request header:

path

Overrides the path. If omitted, the path extracted from the URI given to the constructor is used.

method

What HTTP method to use. The default is “POST”, but you can also use e.g. “PUT”, “GET”, and “HEAD”. Note that some methods don’t take a body; in that case, pass an empty string as the body.

content_type

What type to use for the body. The default is “text/xml”.

extra_headers

A list of (header, value) pairs for extra headers needed by the server. For example, you can add SOAP’s SOAPAction header by passing in [("SOAPAction", action)].
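To see how these options combine, the sketch below collects the headers in the order do_request writes them. The build_headers function is a hypothetical helper written for this illustration (the class itself calls putheader directly), and the host name and SOAPAction value are made up.

```python
def build_headers(host, body, content_type="text/xml",
                  extra_headers=(),
                  user_agent="HTTPClient (from effbot.org)"):
    """Return (header, value) pairs in the order do_request emits them."""
    headers = [("User-Agent", user_agent), ("Host", host)]
    if content_type:
        headers.append(("Content-Type", content_type))
    # Content-Length is always sent, even for an empty body
    headers.append(("Content-Length", str(len(body))))
    # extra headers go last, just before the end of the header part
    headers.extend(extra_headers)
    return headers


# a SOAP-style request: default content type plus one extra header
soap_headers = build_headers(
    "example.org", "<Envelope/>",
    extra_headers=[("SOAPAction", '"urn:example#echo"')])

# a request without a body: empty string, and no content type
head_headers = build_headers("example.org", "", content_type=None)
```

With the class itself, the second case would correspond to something like client.do_request("", method="HEAD", content_type=None). Keep in mind that do_request always parses the reply as XML, so it only suits methods that return a document.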

Sending XML-RPC requests

Let’s put this class to use. The following example sends a pre-defined XML-RPC request to the effbot.org echo service, and prints the result.

request = """\
<?xml version="1.0"?>
<methodCall>
  <methodName>echo</methodName>
  <params>
    <param><value>hello, world</value></param>
  </params>
</methodCall>
"""

from HTTPClient import HTTPClient

client = HTTPClient("http://effbot.org/rpc/echo.cgi")

response = client.do_request(request)

import sys
response.write(sys.stdout)

Here’s the expected output:

<?xml version='1.0'?>
<methodResponse>
<params>
<param>
<value><string>hello, world</string></value>
</param>
</params>
</methodResponse>
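Since do_request also accepts an element tree, the request does not have to be written out as a string. The following sketch builds the same methodCall as a tree, serializes it the way do_request does, and then picks the value out of a response like the one shown above. It assumes the xml.etree.ElementTree module from the standard library rather than the standalone elementtree package, and it works on a canned copy of the response instead of contacting the echo service.

```python
from io import BytesIO
from xml.etree import ElementTree as ET

# build the same methodCall as an element tree instead of a string
call = ET.Element("methodCall")
ET.SubElement(call, "methodName").text = "echo"
param = ET.SubElement(ET.SubElement(call, "params"), "param")
ET.SubElement(param, "value").text = "hello, world"
request_tree = ET.ElementTree(call)

# serialize it into an in-memory file, as do_request does for tree bodies
out = BytesIO()
request_tree.write(out)
request_body = out.getvalue()

# a canned copy of the expected response, parsed back into a tree
response_text = """\
<methodResponse>
<params>
<param>
<value><string>hello, world</string></value>
</param>
</params>
</methodResponse>
"""
response = ET.ElementTree(ET.fromstring(response_text))

# the path is relative to the root element (methodResponse)
result = response.findtext("params/param/value/string")
```

With a live server, request_tree could be passed to do_request directly, and the returned tree queried in the same way.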

For more examples, see Using Element Trees For XML-RPC and You Can Never Have Too Many Stock Tickers!.


Notes:

The implementation currently ignores the charset parameter in the content-type headers. The HTTP protocol allows HTTP transports to convert documents between different encodings on the way (“transcoding”), usually based on accept-charset client headers. If you read data from such a source, the XML parser cannot figure out the encoding by looking at the document; it must use the charset specified by the server.
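One way to honor the server's charset is sketched below. This is not something the class does; it assumes Python 3's xml.etree.ElementTree, whose XMLParser takes an encoding argument that overrides whatever the document declares, and it uses email.message.Message only as a convenient parser for the header value. The two function names are hypothetical.

```python
from email.message import Message
from xml.etree import ElementTree as ET


def charset_from_content_type(value):
    """Return the charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = value
    return msg.get_content_charset()


def parse_with_charset(data, content_type):
    """Parse XML bytes, letting the header charset override the document."""
    charset = charset_from_content_type(content_type)
    # encoding=None leaves the decision to the document itself
    parser = ET.XMLParser(encoding=charset)
    parser.feed(data)
    return parser.close()


# Latin-1 bytes with no encoding declaration; only the header names the charset
root = parse_with_charset(b"<doc>caf\xe9</doc>",
                          "text/xml; charset=ISO-8859-1")
```

Without the override, the parser would assume UTF-8 for this body and reject the 0xE9 byte.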

Also, a strict reading of the HTTP and XML Media Types specifications says that if you set the content-type to text/xml, without any charset parameter, the XML body cannot use 8-bit characters; the body is assumed to contain US-ASCII only. This is no problem if you pass in element trees; the default encoding uses character entities for all non-ASCII characters anyway. But you should probably keep this in mind if you’re generating the body outside the do_request method.
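The difference between the two cases can be checked with a short sketch. It assumes Python 3's xml.etree.ElementTree, where the default serialization is US-ASCII, and is_ascii_only is a hypothetical helper for bodies you build yourself.

```python
from xml.etree import ElementTree as ET

elem = ET.Element("doc")
elem.text = "caf\u00e9"

# default serialization is US-ASCII; the non-ASCII character
# comes out as a numeric character reference
tree_body = ET.tostring(elem)


def is_ascii_only(body):
    """Return True if a str body contains nothing but ASCII characters."""
    try:
        body.encode("ascii")
    except UnicodeEncodeError:
        return False
    return True


handmade_ok = is_ascii_only("<doc>caf\u00e9</doc>")
serialized_ok = is_ascii_only(tree_body.decode("ascii"))
```

A hand-written body that fails this check should either be re-encoded with character references or sent with an explicit charset parameter in the content type.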

 
