Wonky Bunny Issue “Fixed”

For those who don’t know what the headline means:

  1. Bunny is an open source command line utility written in Python that provides a shell for talking to and testing AMQP brokers (tested on RabbitMQ).
  2. AMQP is a queuing protocol. It’s defined as a binary wire-level protocol as well as a command set. The spec also defines a good portion of the server semantics, so by that logic Bunny should work against other AMQP brokers besides RabbitMQ
  3. RabbitMQ is written in Erlang atop OTP, so clustering is ‘free and easy’. My experience with RabbitMQ so far has been fantastic, though I’d like to see client libraries in general mature a bit further.

So, Bunny had this really odd quirk upon its first release. If you did something to cause an error that resulted in a connection being dropped, bunny wouldn’t trap the error. It would patiently wait for you to enter the next command, and fail miserably. The kicker is that I actually defined a ‘check_conn’ method to make sure that the connection was alive before doing anything else, and that really wasn’t working.

The reason is because py-amqplib (or, perhaps, its interpretation of the AMQP spec, which defines a Connection class), implements a high-level Connection class, along with a Channel class (also defined in the spec), which is what seems to actually map to what you and I as users actually care about: some “thing” that lets us communicate with the server, and without which we can’t talk to the server.

With py-amqplib, a Connection is actually defined as a channel 0, and always channel 0. I gather that channel 0 gets some special treatment in other sections of the library code, and the object that lives at index ’0′ in Connection.channels is actually defined as a Connection object, whereas others are Channel objects.

The result of all of this is that creating a channel in my code and then checking my own object’s ‘chan’ attribute is useless because channels can be dropped on the floor in py-amqplib, and the only way I can tell to figure that out is to check the connection object’s ‘channels’ dictionary. So that’s what I do now. It seems to be working well.

Not only does bunny now figure out that your connection is gone, but it’ll also attempt a reconnect using the credentials you gave it in the last ‘connect’ command. You see, bunny extends the Python built-in cmd.Cmd object, which lets me define my whole program as a single class. That means that whatever you type in, like the credentials to the ‘connect’ command, can be kept handy, since the lifetime of the instance of this class is the same as the lifetime of a bunny session.

So, in summary, bunny is more useful now, but it’s still not “done”. I made this fix over the weekend during an hour I unexpectedly found for myself. It’s “a” solution, but it’s not “the” solution. The real solution is to map out all of the errors that actually cause a connection to drop and give the user a bit more feedback about what happened. I also want to add more features (like support for getting some stats back from Alice to replace bunny’s really weak ‘qlist’ command).

Python Date Manipulation

This post is the result of some head-scratching and note taking I did for a reporting project I undertook recently. It’s not a complete rundown of Python date manipulation, but hopefully the post (and hopefully the comments) will help you and maybe me too :)

The head-scratching is related to the fact that there are several different time-related objects, spread out over a few different time-related modules in Python, and I have found myself in plenty of instances where I needed to mix and match various methods and objects from different modules to get what I needed (which I thought was pretty simple at first glance). Here are a few nits to get started with:

  • strftime/strptime can generate the “day of week” where Sunday is 0, but there’s no way to tell any of the conversion functions like gmtime() that you want your week to start on Sunday as far as I know. I’m happy to be wrong, so leave comments if I am. It seems odd that you can do a sort of conversion like this when you output, but not within the calculation logic.
  • If you have a struct_time object in localtime format and want to convert it to an epoch date, time.mktime() works, but if your struct_time object is in UTC format, you have to use calendar.timegm() — this is lame and needs to go away. Just add timegm() to the time module (possibly renamed?).
  • time.ctime() will convert an epoch date into nicely formatted local time, but there’s no function to provide the equivalent output for UTC time.

There are too many methods and modules for dealing with date manipulation in Python, such that performing fairly common tasks requires importing and using a few different modules, different object types and methods from each. I’d love this to be cleaned up. I’d love it more if I were qualified to do it. More learning probably needs to happen for that. Anyway, just my $.02.

Mission 1: Calculating Week Start/End Dates Where Week Starts on Sunday

My mission: Pull epoch dates from a database. They were generated on a machine whose time does not use UTC, but rather local time (GMT-4).  Given the epoch date, find the start and end of the previous week, where the first day of the week is Sunday, and the last day of the week is Saturday.

So, I need to be able to get a week start/end range, from Sunday at 00:00 through Saturday at 23:59:59. My initial plan of attack was to calculate midnight of the current day, and then base my calculations for Sunday 00:00 on that, using simple timedelta(days=x) manipulations. Then I could do something like calculate the next Sunday and subtract a second to get Saturday at 23:59:59.

Nothing but ‘time’

In this iteration, I’ll try to accomplish my mission using only the ‘time’ module and some epoch math.

Seems like you should be able to easily get the epoch value for midnight of the current epoch date, and display it easily with time.ctime(). This isn’t quite true, however. See here:

>>> etime = int(time.time())
>>> time.ctime(etime)
'Thu May 20 15:26:40 2010'
>>> etime_midnight = etime - (etime % 86400)
>>> time.ctime(etime_midnight)
'Wed May 19 20:00:00 2010'
>>>

The reason this doesn’t do what you might expect is that time.ctime() in this case outputs the local time, which in this case is UTC-4 (I live near NY, USA, and we’re currently in DST. The timezone is EDT now, and EST in winter). So when you do math on the raw epoch timestamp (etime), you’re working with a bare integer that has no idea about time zones. Therefore, you have to account for that. Let’s try again:

>>> etime = int(time.time())
>>> etime
1274384049
>>> etime_midnight = (etime - (etime % 86400)) + time.altzone
>>> time.ctime(etime_midnight)
'Thu May 20 00:00:00 2010'
>>>

So, why is this necessary? It might be clearer if we throw in a call to gmtime() and also make the math bits more transparent:

>>> etime
1274384049
>>> time.ctime(etime)
'Thu May 20 15:34:09 2010'
>>> etime % 86400
70449
>>> (etime % 86400) / 3600
19
>>> time.gmtime(etime)
time.struct_time(tm_year=2010, tm_mon=5, tm_mday=20, tm_hour=19, tm_min=34, tm_sec=9, tm_wday=3, tm_yday=140, tm_isdst=0)
>>> midnight = etime - (etime % 86400)
>>> time.gmtime(midnight)
time.struct_time(tm_year=2010, tm_mon=5, tm_mday=20, tm_hour=0, tm_min=0, tm_sec=0, tm_wday=3, tm_yday=140, tm_isdst=0)
>>> time.ctime(midnight)
'Wed May 19 20:00:00 2010'
>>> time.altzone
14400
>>> time.altzone / 3600
4
>>> midnight = (etime - (etime % 86400)) + time.altzone
>>> time.gmtime(midnight)
time.struct_time(tm_year=2010, tm_mon=5, tm_mday=20, tm_hour=4, tm_min=0, tm_sec=0, tm_wday=3, tm_yday=140, tm_isdst=0)
>>> time.ctime(midnight)
'Thu May 20 00:00:00 2010'
>>>

What’s that now? You want what? You want the epoch timestamp for the previous Sunday at midnight? Well, let’s see. The time module in Python doesn’t do deltas per se. You can calculate things out using the epoch bits and some math if you wish. The only bit that’s really missing is the day of the week our current epoch timestamp lives on.

>>> time.ctime(midnight)
'Thu May 20 00:00:00 2010'
>>> struct_midnight = time.localtime(midnight)
>>> struct_midnight
time.struct_time(tm_year=2010, tm_mon=5, tm_mday=20, tm_hour=0, tm_min=0, tm_sec=0, tm_wday=3, tm_yday=140, tm_isdst=1)
>>> dow = struct_midnight.tm_wday
>>> dow
3
>>> midnight_sunday = midnight - ((dow + 1) * 86400)
>>> time.ctime(midnight_sunday)
'Sun May 16 00:00:00 2010'

You can do this going forward in time from the epoch time as well. Remember, we also want to grab 23:59:59 on the Saturday after the epoch timestamp you now have:

>>> saturday_night = midnight + ((5 - dow+1) * 86400) - 1
>>> time.ctime(saturday_night)
'Sat May 22 23:59:59 2010'
>>>

And that’s how you do date manipulation using *only* the time module. Elegant,no?

No. Not really.

Unfortunately, the alternatives also aren’t the most elegant in the world, imho. So let’s try doing this all another way, using the datetime module and timedelta objects.

Now with datetime!

The documentation for the datetime module says:

“While date and time arithmetic is supported, the focus of the implementation is on efficient member extraction for output formatting and manipulation.”

Hm. Sounds a lot like what the time module functions do. Some conversion here or there, but no real arithmetic support. We had to pretty much do it ourselves mucking about with epoch integer values. So what’s this buy us over the time module?

Let’s try to do our original task using the datetime module. We’re going to start with an epoch timestamp, and calculate the values for the previous Sunday at midnight, and the following Saturday at 23:59:59.

The first thing I had a hard time finding was a way to deal with the notion of a “week”. I thought I’d found it in ‘date.timetuple()’, which help(date.timetuple) says is “compatible with time.localtime()”. I guess they must mean that the output is the same as time.localtime(), because I can’t find any other way in which it is similar. Running time.localtime() with no arguments returns a time_struct object for the current time. date.timetuple() requires arguments or it’ll throw an error, and to make you extra frustrated, the arguments it takes aren’t in the docs or the help() output.

So maybe they mean it takes the same arguments as time.localtime(), eh? Not so much — time.localtime() takes an int representing an epoch timestamp. Trying to feed an int to date.timetuple throws an error saying it requires a ‘date’ object.

So, the definition of “compatible” is a little unclear to me in this context.

So here I’ve set about finding today, then “last saturday”, and then “the sunday before the last saturday”:

def get_last_whole_week(today=None):
    # a date object
    date_today = today or datetime.date.today()

    # day 0 is Monday. Sunday is 6.
    dow_today = date_today.weekday()

    if dow_today == 6:
        days_ago_saturday = 1
    else:
    # If day between 0-5, to get last saturday, we need to go to day 0 (Monday), then two more days.
        days_ago_saturday = dow_today + 2

    # Make a timedelta object so we can do date arithmetic.
    delta_saturday = datetime.timedelta(days=days_ago_saturday)

    # saturday is now a date object representing last saturday
    saturday = date_today - delta_saturday

    # timedelta object representing '6 days'...
    delta_prevsunday = datetime.timedelta(days=6)

    # Making a date object. Subtract the days from saturday to get "the Sunday before that".
    prev_sunday = saturday - delta_prevsunday

This gets me date objects representing the start and end time of my reporting range… sort of. I need them in epoch format, and I need to specifically start at midnight on Sunday and end on 23:59:59 on Saturday night. Sunday at midnight is no problem: timetuple() sets time elements to 0 anyway. For Saturday night, in epoch format, I should probably just calculate a date object for two Sundays a week apart, and subtract one second from one of them to get the last second of the previous Saturday.

Here’s the above function rewritten to return a tuple containing the start and end dates of the previous week. It can optionally be returned in epoch format, but the default is to return date objects.

def get_last_whole_week(today=None, epoch=False):
    # a date object
    date_today = today or datetime.date.today()
    print "date_today: ", date_today

    # By default day 0 is Monday. Sunday is 6.
    dow_today = date_today.weekday()
    print "dow_today: ", dow_today

    if dow_today == 6:
        days_ago_saturday = 1
    else:
        # If day between 0-5, to get last saturday, we need to go to day 0 (Monday), then two more days.
        days_ago_saturday = dow_today + 2
    print "days_ago_saturday: ", days_ago_saturday
    # Make a timedelta object so we can do date arithmetic.
    delta_saturday = datetime.timedelta(days=days_ago_saturday)
    print "delta_saturday: ", delta_saturday
    # saturday is now a date object representing last saturday
    saturday = date_today - delta_saturday
    print "saturday: ", saturday
    # timedelta object representing '6 days'...
    delta_prevsunday = datetime.timedelta(days=6)
    # Making a date object. Subtract the 6 days from saturday to get "the Sunday before that".
    prev_sunday = saturday - delta_prevsunday

    # we need to return a range starting with midnight on a Sunday, and ending w/ 23:59:59 on the
    # following Saturday... optionally in epoch format.

    if epoch:
        # saturday is date obj = 'midnight saturday'. We want the last second of the day, not the first.
        saturday_epoch = time.mktime(saturday.timetuple()) + 86399
        prev_sunday_epoch = time.mktime(prev_sunday.timetuple())
        last_week = (prev_sunday_epoch, saturday_epoch)
    else:
        saturday_str = saturday.strftime('%Y-%m-%d')
        prev_sunday_str = prev_sunday.strftime('%Y-%m-%d')
        last_week = (prev_sunday_str, saturday_str)
    return last_week

It would be easier to just have some attribute for datetime objects that lets you set the first day of the week to be Sunday instead of Monday. It wouldn’t completely alleviate every conceivable issue with calculating dates, but it would be a help. The calendar module has a setfirstweekday() method that lets you set the first weekday to whatever you want. I gather this is mostly for formatting output of matrix calendars, but it would be useful if it could be used in date calculations as well. Perhaps I’ve missed something? Clues welcome.

Mission 2: Calculate the Prior Month’s Start and End Dates

This should be easy. What I hoped would happen is I’d be able to get today’s date, and then create a timedelta object for ’1 month’, and subtract, having Python take care of things like changing the year when the current month is January. Calculating this yourself is a little messy: you can’t just use “30 days” or “31 days” as the length of a month, because:

  1. “January 31″ – “30 days” = “January 1″ — not the previous month.
  2. “March 1″ – “31 days” = “January 30″ — also not the previous month.

Instead, what I did was this:

  1. create a datetime object for the first day of the current month (hard coding the ‘day’ argument)
  2. used a timedelta object to subtract a day, which gives me a datetime object for the last day of the prior month (with year changed for me if needed),
  3. used that object to create a datetime object for the first day of the prior month (again hardcoding the ‘day’ argument)

Here’s some code:

today = datetime.datetime.today()
first_day_current = datetime.datetime(today.year, today.month, 1)
last_day_previous = first_day_current - datetime.timedelta(days=1)
first_day_previous = datetime.datetime(last_day_previous.year, last_day_previous.month, 1)
print 'Today: ', today
print 'First day of this month: ', first_day_current
print 'Last day of last month: ', last_day_previous
print 'First day of last month: ', first_day_previous

This outputs:

Today:  2010-07-06 09:57:33.066446
First day of this month:  2010-07-01 00:00:00
Last day of last month:  2010-06-30 00:00:00
First day of last month:  2010-06-01 00:00:00

Not nearly as onerous as the week start/end range calculations, but I kind of thought that between all of these modules we have that one of them would be able to find me the start and end of the previous month. The raw material for creating this is, I suspect, buried somewhere in the source code for the calendar module, which can tell you the start and end dates for a month, but can’t do any date calculations to give you the previous month. The datetime module can do calculation, but it can’t tell you the start and end dates for a month. The datetime.timedelta object’s largest granularity is ‘week’ if memory serves, so you can’t just do ‘timedelta(months=1)’, because the deltas are all converted internally to a fixed number of days, seconds, or milliseconds, and a month isn’t a fixed number of any of them.

Converge!

While I could probably go ahead and use dateutil, which is really darn flexible, I’d rather be able to do this without a third-party module. Also, dateutil’s flexibility is not without it’s complexity, either. It’s not an insurmountable task to learn, but it’s not like you can directly transfer your experience with the built-in modules to using dateutil.

I don’t think merging all of the time-related modules in Python would be necessary or even desirable, really, but I haven’t thought deeply about it. Perhaps a single module could provide a superclass for the various time-related objects currently spread across three modules, and they could share some base level functionality. Hard to conceive of a timedelta object not floating alone in space in that context, but alas, I’m thinking out loud. Perhaps a dive into the code is in order.

What have you had trouble doing with dates and times in Python? What docs have I missed? What features are completely missing from Python in terms of time manipulation that would actually be useful enough to warrant inclusion in the collection of included batteries? Let me know your thoughts.