Seeking Elegant Pythonic Solution | Musings of an Anonymous Geek

So, I have some code that queries a data source, and that data source sends me back an XML message. I have to parse the XML message so I can store information from it into a relational database. So, let’s say my XML response looks like this:

<?xml version="1.0"?>
<response>
<results count="2">
  <result>
    <fname>Brian</fname>
    <lname>Jones</lname>
    <gender>M</gender>
    <office_phone_ext>777</office_phone_ext>
    <mobile_phone>201-555-1212</mobile_phone>
  </result>
  <result>
    <fname>Molly</fname>
    <lname>Jones</lname>
    <home_phone>201-555-1234</home_phone>
  </result>
</results>
</response>

So, as you can see, the attributes for each result returned for a query can differ, and if a result doesn’t have a value for some attribute, the corresponding XML element isn’t included at all for that result. If it were just 2 or 3 attributes, I could easily enough get around it by doing something like this:

def __init__(self, xmlresult):
  self.xmlresult = xmlresult
  # find() returns None for a missing element; xpath() returns a list, never None
  if self.xmlresult.find('fname') is not None:
    self.fname = self.xmlresult.findtext('fname')
  if self.xmlresult.find('lname') is not None:
    self.lname = self.xmlresult.findtext('lname')

Like I said, if it were just a few things I needed to check for, I’d do it this way and be done with it. It’s not just a few though — it’s like 50 attributes. Now what?

I decided lxml.objectify would be a great way to go. It would allow me to access these things as object attributes, which should mean I can do something like this:

self.fname = getattr(self.xmlresult, 'fname', None)
self.lname = getattr(self.xmlresult, 'lname', None)
...

So, you *can* do this, technically speaking. Trouble is, you’re asking for an attribute of an ObjectifiedElement object, and when you do that, it returns an object that is not a native Python datatype, which I did not realize when I first started using lxml.objectify. So, in the above, ‘self.fname’ will not be a Python string — it’ll be an lxml.objectify.StringElement object. Of course, my database driver, my ‘join()’ operations, and everything else in my code that relies on native Python datatypes is now broken.
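Here is a quick way to see the type problem for yourself. It is only a minimal sketch: the one-element XML string is made up for illustration, and it assumes lxml is installed.

```python
from lxml import objectify

# Hypothetical one-result document, trimmed down from the sample response.
result = objectify.fromstring("<result><fname>Brian</fname></result>")

fname = getattr(result, "fname", None)
missing = getattr(result, "home_phone", None)  # absent element, so the default comes back

print(type(fname).__name__)    # StringElement, not str
print(isinstance(fname, str))  # False
print(repr(fname.pyval))       # 'Brian', a plain Python str
print(missing)                 # None
```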

What I actually need is the ‘.pyval’ attribute of self.xmlresult.fname, if that element exists at all. In other words, I want something that does what I mean when I write “self.fname = getattr(self.xmlresult, ‘fname.pyval’, None)”. That doesn’t work, because getattr() treats ‘fname.pyval’ as a single attribute name and just returns the default. And, of course, ‘getattr(self.xmlresult, ‘fname’, None).pyval’ doesn’t work either, because None has no attribute ‘pyval’. I’ve tried a couple of other hacks too, but I’ve learned enough Python to know that if it feels like a hack, there’s probably a better way. But I can’t find that better way. Ideas?
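One possible way around this is to nest two getattr() calls, since getattr(None, 'pyval', None) is simply None. This is a sketch and not necessarily the most elegant answer: the helper name pyval_or_none, the FIELDS tuple, and the inline SAMPLE document are all made up for illustration, and the real field list would be the full set of 50 or so element names.

```python
from lxml import objectify

def pyval_or_none(element, name):
    """Return the native Python value of child `name`, or None if it is absent."""
    child = getattr(element, name, None)   # ObjectifiedElement child, or None
    return getattr(child, "pyval", None)   # getattr on None falls back to None

# Hypothetical field list; only the names from the sample response are shown.
FIELDS = ("fname", "lname", "gender", "office_phone_ext",
          "mobile_phone", "home_phone")

# Hypothetical well-formed version of the sample response.
SAMPLE = """
<response>
<results count="2">
  <result>
    <fname>Brian</fname>
    <lname>Jones</lname>
    <gender>M</gender>
    <office_phone_ext>777</office_phone_ext>
    <mobile_phone>201-555-1212</mobile_phone>
  </result>
  <result>
    <fname>Molly</fname>
    <lname>Jones</lname>
    <home_phone>201-555-1234</home_phone>
  </result>
</results>
</response>
"""

root = objectify.fromstring(SAMPLE)

# One dict of native Python values per <result>, with None for missing fields.
records = [
    {name: pyval_or_none(result, name) for name in FIELDS}
    for result in root.results.iterchildren("result")
]

for record in records:
    print(record)
```

Note that objectify infers types, so a value like 777 comes back as an int while the phone numbers stay strings. If everything should be a string for the database, reading the child’s ‘.text’ attribute instead of ‘.pyval’ in the helper should give the raw text.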
