
Using the ElementTree Module to Generate SOAP Messages, Part 4: Automatically Decoding Responses

November 23, 2003 | Fredrik Lundh

Note: A distribution kit containing the source code for this article is available from the effbot.org downloads site (look for ElementSOAP 0.3 or later).

Automatically Decoding Responses

The method wrappers we’ve used so far either return the response element as is, or explicitly convert the response to a Python object. For example, the Delayed Stock Quote example used float to convert the return value to a Python float, and the doGetCachedPage method in the Google wrapper used the base64 module to decode the response string.

 

However, if you look at the XML structures returned by the servers, you’ll notice that they contain type annotations that can be used by the SOAP layer. For example, here’s a typical doSpellingSuggestion response envelope (somewhat simplified, namespace declarations not shown):

<soap:Envelope>
  <soap:Body>
    <google:doSpellingSuggestionResponse soap:encodingStyle="...">
      <return xsi:type="xsd:string">python</return>
    </google:doSpellingSuggestionResponse>
  </soap:Body>
</soap:Envelope>

And here’s a doGetCachedPage response:

 
<soap:Envelope>
  <soap:Body>
    <google:doGetCachedPageResponse soap:encodingStyle="...">
      <return xsi:type="soap-encoding:base64">PG1l....Cg==</return>
    </google:doGetCachedPageResponse>
  </soap:Body>
</soap:Envelope>

In both cases, the return element contains an xsi:type attribute, which tells you how the server expects you to handle the data. The xsd:string type is a plain string, while soap-encoding:base64 indicates that the response contains BASE64-encoded binary data.

You can use these annotations to automatically decode the return structures, without having to write method-specific code in the wrapper class. First, you have to do something about the QName issue; the contents of the xsi:type attribute have to be converted from the prefix/name encoding to a true universal name. In this case, xsd:string should actually be “{http://www.w3.org/1999/XMLSchema}string”, and soap-encoding:base64 should be “{http://schemas.xmlsoap.org/soap/encoding/}base64”.
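
The conversion itself is a prefix lookup. Here is a minimal sketch in present-day Python 3, assuming the prefix-to-URI mapping is already available as a dictionary. The NSMAP table and the to_universal_name function are illustrative names and are not part of ElementSOAP:

```python
# Hypothetical prefix table; in a real response the mapping comes from
# the xmlns declarations that are in scope for the element.
NSMAP = {
    "xsd": "http://www.w3.org/1999/XMLSchema",
    "soap-encoding": "http://schemas.xmlsoap.org/soap/encoding/",
}

def to_universal_name(qname, nsmap):
    """Convert 'prefix:name' to '{uri}name' using the given prefix map."""
    prefix, sep, local = qname.partition(":")
    if not sep:
        # no prefix: fall back to the default namespace, if one is mapped
        prefix, local = "", qname
    uri = nsmap.get(prefix)
    if uri is None:
        if prefix:
            raise ValueError("unknown namespace prefix %r" % prefix)
        return local
    return "{%s}%s" % (uri, local)

print(to_universal_name("xsd:string", NSMAP))
print(to_universal_name("soap-encoding:base64", NSMAP))
```

The two print calls should produce the universal names quoted above. An unknown prefix raises ValueError rather than passing the value through silently; that is a design choice of this sketch.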

 

You can do this conversion in the decoder, but it’s probably easier to do it once and for all in the SoapService wrapper. Here’s a version of that class that processes all xsi:type attributes:

class SoapService:
    def __init__(self, url=None):
        self.__client = HTTPClient(url or self.url)
    def call(self, action, request):
        # build SOAP envelope
        envelope = Element(NS_SOAP_ENV + "Envelope")
        body = SubElement(envelope, NS_SOAP_ENV + "Body")
        body.append(request)
        # call the server
        try:
            parser = NamespaceParser()
            response = self.__client.do_request(
                tostring(envelope),
                extra_headers=[("SOAPAction", action)],
                parser=parser
                )
        except HTTPError, v:
            if v[0] != 500:
                raise # not a SOAP fault; pass the error on
            # might be a SOAP fault
            response = ElementTree.parse(v[3], parser)
        response = response.find(body.tag)[0]
        # fix XSI type attributes
        for elem in response.getiterator():
            type = elem.get(NS_XSI + "type")
            if type:
                elem.set(NS_XSI + "type", fixqname(elem, type))
        # look for fault descriptors
        if response.tag == NS_SOAP_ENV + "Fault":
            faultcode = response.find("faultcode")
            raise SoapFault(
                fixqname(faultcode, faultcode.text),
                response.findtext("faultstring"),
                response.findtext("faultactor"),
                response.find("detail")
                )
        return response
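
The NamespaceParser and fixqname helpers used above come from the ElementSOAP kit. To experiment with the same idea using only the standard library's xml.etree.ElementTree (Python 3), one possible approach is to track the namespace declarations that iterparse reports while the tree is being built. This is a sketch of the technique, not the ElementSOAP implementation; parse_and_fix_types is a made-up name, and the 1999 schema-instance URI is assumed to match the one the server uses:

```python
import io
import xml.etree.ElementTree as ET

# Assumed namespace (the 1999 schema-instance draft)
NS_XSI = "{http://www.w3.org/1999/XMLSchema-instance}"

def parse_and_fix_types(xml_text):
    """Parse XML and rewrite prefixed xsi:type values as universal names."""
    ns_stack = []  # (prefix, uri) pairs currently in scope, innermost last
    root = None
    source = io.BytesIO(xml_text.encode("utf-8"))
    events = ("start", "start-ns", "end-ns")
    for event, item in ET.iterparse(source, events):
        if event == "start-ns":
            ns_stack.append(item)  # reported before the element's start event
        elif event == "end-ns":
            ns_stack.pop()  # the declaration went out of scope
        else:
            if root is None:
                root = item
            value = item.get(NS_XSI + "type")
            if value and ":" in value:
                prefix, local = value.split(":", 1)
                # innermost declaration of the prefix wins
                for known_prefix, uri in reversed(ns_stack):
                    if known_prefix == prefix:
                        item.set(NS_XSI + "type", "{%s}%s" % (uri, local))
                        break
    return root

SAMPLE = """\
<resp xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance"
      xmlns:xsd="http://www.w3.org/1999/XMLSchema">
  <return xsi:type="xsd:string">python</return>
</resp>"""

root = parse_and_fix_types(SAMPLE)
print(root.find("return").get(NS_XSI + "type"))
# prints {http://www.w3.org/1999/XMLSchema}string
```

Values with an undeclared prefix are left untouched here, so a later decoding step can decide how to report them.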

The next step is to update the service wrapper; in the following snippet, the explicit calls to findtext and the base64 module have been replaced with calls to a decode_element helper function:

class GoogleService(SoapService):

    ...

    def doGetCachedPage(self, url):
        action = "urn:GoogleSearchAction"
        request = SoapRequest("{urn:GoogleSearch}doGetCachedPage")
        SoapElement(request, "key", "string", self.__key)
        SoapElement(request, "url", "string", url)
        return decode_element(self.call(action, request).find("return"))

    def doSpellingSuggestion(self, phrase):
        action = "urn:GoogleSearchAction"
        request = SoapRequest("{urn:GoogleSearch}doSpellingSuggestion")
        SoapElement(request, "key", "string", self.__key)
        SoapElement(request, "phrase", "string", phrase)
        return decode_element(self.call(action, request).find("return"))

And here’s a first version of the decode_element helper:

 
def decode_element(element):
    if element is None:
        return None
    type = element.get(NS_XSI + "type")
    if type == NS_XSD + "string":
        return element.text or ""
    if type == NS_XSD + "integer":
        return int(element.text)
    if type == NS_XSD + "float" or type == NS_XSD + "double":
        return float(element.text)
    if type == NS_SOAP_ENC + "base64":
        import base64
        return base64.decodestring(element.text)
    raise ValueError("type %s not supported" % type)
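
As a variation, the chain of if statements can be written as a lookup table, which makes it easy to register more types later. The following is a sketch for Python 3 using the standard library's ElementTree. The DECODERS table is a hypothetical name, the namespace constants are assumed to match the ones used by the article's code, and base64.b64decode stands in for decodestring, so binary data comes back as a bytes object:

```python
import base64
import xml.etree.ElementTree as ET

# Assumed values for the namespace constants
NS_XSI = "{http://www.w3.org/1999/XMLSchema-instance}"
NS_XSD = "{http://www.w3.org/1999/XMLSchema}"
NS_SOAP_ENC = "{http://schemas.xmlsoap.org/soap/encoding/}"

# Hypothetical lookup table: universal type name -> converter for the text
DECODERS = {
    NS_XSD + "string": lambda text: text,
    NS_XSD + "integer": int,
    NS_XSD + "float": float,
    NS_XSD + "double": float,
    NS_SOAP_ENC + "base64": base64.b64decode,
}

def decode_element(element):
    if element is None:
        return None
    type_name = element.get(NS_XSI + "type")
    try:
        converter = DECODERS[type_name]
    except KeyError:
        raise ValueError("type %s not supported" % type_name) from None
    # empty elements have text set to None; treat that as an empty string
    return converter(element.text or "")

# usage: build a typed element by hand and decode it
elem = ET.Element("return", {NS_XSI + "type": NS_XSD + "integer"})
elem.text = "42"
print(decode_element(elem))  # prints 42
```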

When you call the new methods, you’ll find that they return the same kind of objects as they did before:

>>> g = GoogleService(key)
>>> g.doGetCachedPage("online.effbot.org")[:40]
'<meta http-equiv="Content-Type" content='
>>> len(g.doGetCachedPage("online.effbot.org"))
11467
>>> g.doSpellingSuggestion("pyhton")
'python'

Decoding Nested Structures

But what about the doGoogleSearch method? The current wrapper returns an element structure, and leaves it to the application to extract the information it needs. Let’s take a look at a typical search response envelope:

 
<soap:Envelope>
  <soap:Body>
    <google:doGoogleSearchResponse soap:encodingStyle="...">
      <return xsi:type="google:GoogleSearchResult">
        <documentFiltering xsi:type="xsd:boolean">true</documentFiltering>
        <estimatedTotalResultsCount xsi:type="xsd:int">22</estimatedTotalResultsCount>
        <directoryCategories xsi:type="soap-encoding:Array">
        </directoryCategories>
        <searchTime xsi:type="xsd:double">0.206521</searchTime>
        <resultElements xsi:type="soap-encoding:Array">
          <item xsi:type="google:ResultElement">
            <cachedSize xsi:type="xsd:string">5k</cachedSize>
            <hostName xsi:type="xsd:string"></hostName>
            <snippet xsi:type="xsd:string">&lt;b&gt;...&lt;/b&gt;</snippet>
            <directoryCategory xsi:type="google:DirectoryCategory">
              <specialEncoding xsi:type="xsd:string"></specialEncoding>
              <fullViewableName xsi:type="xsd:string"></fullViewableName>
            </directoryCategory>
            <relatedInformationPresent xsi:type="xsd:boolean">true</relatedInformationPresent>
            <directoryTitle xsi:type="xsd:string"></directoryTitle>
            <summary xsi:type="xsd:string"></summary>
            <URL xsi:type="xsd:string">http://effbot.org/...</URL>
            <title xsi:type="xsd:string">downloads.effbot.org</title>
          </item>
          <item xsi:type="google:ResultElement">
             ...
          </item>
          ...
        </resultElements>
        <endIndex xsi:type="xsd:int">10</endIndex>
        <searchTips xsi:type="xsd:string"></searchTips>
        <searchComments xsi:type="xsd:string"></searchComments>
        <startIndex xsi:type="xsd:int">1</startIndex>
        <estimateIsExact xsi:type="xsd:boolean">false</estimateIsExact>
        <searchQuery xsi:type="xsd:string">elementsoap</searchQuery>
      </return>
    </google:doGoogleSearchResponse>
  </soap:Body>
</soap:Envelope>

Ouch. There’s lots of stuff in there, including a number of custom data types (google:GoogleSearchResult, google:ResultElement, etc.).

The new types fall in three categories:

  1. New xsd types, including xsd:int and xsd:boolean. To handle these, you can simply add more cases to the decode_element helper.
  2. The soap-encoding:Array type. This type specifier indicates that the child elements can be treated as elements of an array (e.g. a Python list).
  3. The google:GoogleSearchResult, google:ResultElement, and google:DirectoryCategory types. These are custom record types, and can be mapped to Python classes (or dictionaries).
 

Here’s an enhanced decoder, which handles arrays and simple types (via an updated version of decode_element), and treats everything else as a custom record type. To simplify things, custom record types are returned as dictionaries.

def decode(element):
    type = element.get(NS_XSI + "type")
    # is it an array?
    if type == NS_SOAP_ENC + "Array":
        value = []
        for elem in element:
            value.append(decode(elem))
        return value
    # is it a primitive type?
    try:
        return decode_element(element)
    except ValueError:
        if type and type.startswith(NS_XSD):
            raise # unknown primitive type
    # assume it's a structure
    value = {}
    for elem in element:
        value[elem.tag] = decode(elem)
    return value

def decode_element(element):
    if element is None:
        return None
    type = element.get(NS_XSI + "type")
    if type == NS_XSD + "string":
        return element.text or ""
    if type == NS_XSD + "integer" or type == NS_XSD + "int":
        return int(element.text)
    if type == NS_XSD + "float" or type == NS_XSD + "double":
        return float(element.text)
    if type == NS_XSD + "boolean":
        return element.text == "true"
    if type == NS_SOAP_ENC + "base64":
        import base64
        return base64.decodestring(element.text)
    raise ValueError("type %s not supported" % type)

Let’s try it out, using the pprint module to print a nicely rendered version of the resulting Python object structure:

>>> g = GoogleService(key)
>>> from pprint import pprint
>>> r = g.doGoogleSearch("elementsoap")
>>> pprint(decode(r))
{'directoryCategories': [],
 'documentFiltering': True,
 'endIndex': 10,
 'estimateIsExact': False,
 'estimatedTotalResultsCount': 22,
 'resultElements': [
   {'URL': 'http://effbot.org/downloads/...',
    'cachedSize': '5k',
    'directoryCategory': {'fullViewableName': '',
                          'specialEncoding': ''},
    'directoryTitle': '',
    'hostName': '',
    'relatedInformationPresent': True,
    'snippet': '<b>...</b> Contents of <b>...</b>',
    'summary': '',
    'title': 'downloads.effbot.org'},
   {...},
   ...
 ],
 'searchComments': '',
 'searchQuery': 'elementsoap',
 'searchTime': 0.051152999999999997,
 'searchTips': '',
 'startIndex': 1}
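
To experiment with nested decoding without calling the Google service, you can feed the decoder a hand-built element tree. The sketch below is a compact Python 3 reworking that uses the standard library's ElementTree. The SIMPLE table and the typed helper are hypothetical names, the namespace constants are assumed values, and the xsi:type attributes are written directly as universal names (which is what the SoapService wrapper would produce):

```python
import xml.etree.ElementTree as ET

# Assumed values for the namespace constants
NS_XSI = "{http://www.w3.org/1999/XMLSchema-instance}"
NS_XSD = "{http://www.w3.org/1999/XMLSchema}"
NS_SOAP_ENC = "{http://schemas.xmlsoap.org/soap/encoding/}"
TYPE = NS_XSI + "type"

# Hypothetical table of simple types and their converters
SIMPLE = {
    NS_XSD + "string": lambda text: text,
    NS_XSD + "int": int,
    NS_XSD + "double": float,
    NS_XSD + "boolean": lambda text: text in ("true", "1"),
}

def decode(element):
    type_name = element.get(TYPE)
    if type_name == NS_SOAP_ENC + "Array":
        return [decode(child) for child in element]
    if type_name in SIMPLE:
        return SIMPLE[type_name](element.text or "")
    if type_name and type_name.startswith(NS_XSD):
        raise ValueError("type %s not supported" % type_name)
    # anything else is treated as a record and returned as a dictionary
    return {child.tag: decode(child) for child in element}

def typed(parent, tag, type_name, text=None):
    """Append a child element carrying an xsi:type attribute."""
    elem = ET.SubElement(parent, tag, {TYPE: type_name})
    elem.text = text
    return elem

# a tiny stand-in for a search result structure
result = ET.Element("return", {TYPE: "{urn:GoogleSearch}GoogleSearchResult"})
typed(result, "estimateIsExact", NS_XSD + "boolean", "false")
typed(result, "startIndex", NS_XSD + "int", "1")
items = typed(result, "resultElements", NS_SOAP_ENC + "Array")
item = typed(items, "item", "{urn:GoogleSearch}ResultElement")
typed(item, "title", NS_XSD + "string", "downloads.effbot.org")

print(decode(result))
```

The boolean converter in this sketch also accepts "1", which XML Schema allows as an alternative spelling of true.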

Note that the doGoogleSearch method is left intact; it’s probably not a good idea to change the return type for a method in a library that has already been released. We can solve this by adding a more Pythonic search method:

class GoogleService(SoapService):

    ...

    def pyGoogleSearch(self, *args, **opts):
        return decode(self.doGoogleSearch(*args, **opts))
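
The decoder returns custom record types as dictionaries. If attribute-style access is preferred, the decoded structure can be converted afterwards. The following Python 3 sketch uses types.SimpleNamespace for this; the to_record helper is a hypothetical name and the sample dictionary is a shortened stand-in for a decoded search result:

```python
from types import SimpleNamespace

def to_record(value):
    """Recursively turn dictionaries into attribute-style records."""
    if isinstance(value, dict):
        return SimpleNamespace(**{k: to_record(v) for k, v in value.items()})
    if isinstance(value, list):
        return [to_record(v) for v in value]
    return value  # simple values are passed through unchanged

# shortened stand-in for the output of decode()
decoded = {
    "startIndex": 1,
    "resultElements": [{"title": "downloads.effbot.org", "cachedSize": "5k"}],
}

r = to_record(decoded)
print(r.startIndex)               # prints 1
print(r.resultElements[0].title)  # prints downloads.effbot.org
```

This only works smoothly when the element tags are valid Python identifiers, which is the case for the unqualified tags in the responses shown in this article.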
 
