Category: Python

The Python User Group in Princeton (PUG-IP): 6 months in

By , November 2, 2011 11:08 pm

In May, 2011, I started putting out feelers on Twitter and elsewhere to see if there might be some interest in having a Python user group that was not in Philadelphia or New York City. A single tweet resulted in 5 positive responses, which I took as a success, given the time-sensitivity of Twitter, my “reach” on Twitter (which I assume is far smaller than what might be the entire target audience for that tweet), etc.

Happy with the responses I received, I still wanted to take a baby step in getting the group started. Rather than set up a web site that I’d then have to maintain, a mailing list server, etc., I went to the cloud. I started a group on meetup.com, and started looking for places to hold our first meeting.

Meetup.com

Meetup.com, I’m convinced, gives you an enormous value if you’re looking to start a user group Right Now, Today™. For $12/mo., you get a place where you can announce future meetups, hold discussions, collect RSVPs so you have a head count for food or space or whatever, and vendors can also easily jump in to provide sponsorship or ‘perks’ in the form of discounts on services to user group members and the like. It’s a lot for a little, and it’s worked well enough. If we had to stick with it for another year, I’d have no real issue with that.

Google Groups

I set up a mailing list using Google Groups about 2-3 months ago now. I only waited so long because I thought meetup.com’s discussion forum might work for a while. After a few meetings, though, I noticed that there were always about five more people in attendance than had RSVP’d on meetup.com. Some people just aren’t going to be bothered with having yet another account on yet another web site I guess. If that’s the case, then I have two choices (maybe more, but these jumped to mind): force the issue by constantly trumpeting meetup.com’s service, or go where everyone already was. Most people have a Google account, and understand its services. Also, since the group is made up of technical people, they mostly like the passive nature of a mailing list as opposed to web forums.

If you’re setting up a group, I’d say that setting up a group on meetup.com and simultaneously setting up a Google group mailing list is the way to go if you want to get a fairly complete set of services for very little money and about an hour’s worth of time.

Meeting Space

Meeting space can come from a lot of different places, but I had a bit of trouble settling on a place at first. Princeton University is an awesome place and has a ton of fantastic places to meet with people, but if you’re not living on campus (almost no students are group members, btw), parking can be a bit troublesome, and Princeton University is famous for having little or no signage, and that includes building names, so finding where to go even if you did find parking can be problematic. So, so far, the University is out.

The only sponsor I had that was willing to provide space was my employer, but we’re nowhere near Princeton, and don’t really have the space. Getting a sponsor for space can be a bit difficult when your group doesn’t exist yet, in part because none of them have engaged with you or your group until the first meeting, when the attendees, who all work for potential sponsors, show up.

I started looking at the web site for the Princeton Public Library. I’ve been involved in the local Linux user group for several years, and they use free meeting space made available by the public library in Lawrenceville, which borders Princeton. I wondered if the Princeton Public Library did this as well, but they don’t, actually. In fact, meeting space at that location can get pretty expensive, since they charge for the space and A/V equipment like projectors and stuff separately (or they did when I started the group – I believe it’s still the case).

I believe I tweeted my disappointment about the cost of meeting at the Princeton Public Library, and did a callout on Twitter for space sponsors and other ideas about meeting space in or near Princeton. The Princeton Public Library got in touch through their @PrincetonPL Twitter account, and we were able to work out a really awesome deal where they became a sponsor, and agreed to host our group for 6 months, free of charge. Awesome!

Now, six months in, we either had to come to some other agreement with the library, or move on to a new space. After six months, it’s way easier to find space, or sponsors who might provide space, but I felt if we could find some way to continue the relationship with the library, it’d be best not to relocate the group. We wound up finding a deal that does good things for the group, the library, the local Python user community, and the evangelism of the Python language….

Knowledge for Space

Our group got a few volunteers together to commit to providing a 5-week training course to the public, held at the Princeton Public Library. Adding public offerings like this adds value to the library, attracts potential new members (they’re a member-supported library, not a state/municipality-funded one), etc. In exchange for providing this service to the library, the library provides us with free meeting space, including the A/V equipment.

If you don’t happen to have a public library that offers courses, seminars, etc., to the general public, you might be able to cut a similar deal with a local community college, or even high school. If you know of a corporation locally that uses Python or some other technology the group can speak or train people in, you might be able to trade training for meeting space in their offices. Training is a valued perk to the employees of most corporations.

How To Get Talks (or “How we stopped caring about getting talks”)

Whether you’re running a publishing outfit, a training event, or user group, getting people to deliver content is a challenge. Some people don’t think they have any business talking to what they perceive as a roomful of geniuses about anything. Some just aren’t comfortable talking in front of audiences, but are otherwise convinced of their own genius. Our group is trying to attack this issue in various ways, and so far it seems to be working well enough, though more ideas are welcome!

Basically, the group isn’t necessarily locked into traditions like “Thou shalt provide a speaker, who shalt bequeath upon our many wisdom of the ages”. Once you’ve decided as a group that having cookie-cutter meetings isn’t necessary, you start to think of all sorts of things you could all be doing together.

Below are some ideas, some in the works, some in planning, that I hope help other would-be group starters to get the ball rolling, and keep it in motion!

Projects For the Group, By the Group

Some members of PUG-IP are working together on building the pugip.org website, which is housed in a GitHub repository under the ‘pugip’ GitHub organization. This one project will inevitably result in all kinds of home-grown presentations & events within the group. As new ideas come up and new features are implemented, people will give lightning talks about their implementation, or we’ll do a group peer review of the code, or we’ll have speakers give talks about third-party technologies we might use (so, we might have two speakers each give a 30-minute talk about two different NoSQL solutions, for example. We’ve already had a great overview of about 10 different Python micro-frameworks), etc.

We may also decide to break up into pairs, and then sprint together on a set of features, or a particularly large feature, or something like that.

As of now, we’ve made enough decisions as a group to get the ball rolling. If there’s any interest I can blog about the setup that allows the group to easily share, review, and test code, provide live demos of their work, etc. The tl;dr version is we use GitHub and free heroku accounts, but new ideas come into play all the time. Just today I was wondering if we could, as a group, make use of the cloud9 IDE (http://cloud9ide.com).

The website is a great idea, but other group projects are likely to come up.

Community Outreach

PUG-IPs first official community outreach project will be the training we provide through the Princeton Public library. A few of us will collaborate on delivering the training, but the rest of the group will be involved in providing feedback on various aspects of the material, etc., so it’s a ‘whole group’ project, really. On top of increasing interactivity among the group members, outreach is also a great way to grow and diversify the group, and perhaps gain sponsorships as well!

There’s another area group called LUG-IP (a Linux user group) that also does some community outreach through a hardware SIG (special interest group), certification training sessions, and participating in local computing events and conferences. I’d like to see PUG-IP do this, too, maybe in collaboration with the LUG (they’re a good and passionate group of technologists).

Community outreach can also mean teaming up with various other technology groups, and one event I’m really looking forward to is a RedSnake meeting to be held next February. A RedSnake meeting is a combined meeting between PhillyPUG (the Philadelphia Python User Group) and Philly.rb (the Philadelphia Ruby Group). As a member of PhillyPUG I participated in last year’s RedSnake meeting, and it was a fantastic success. Probably 70+ people in attendance (here’s a pic at the end – some had already left by the time someone snapped this), and perhaps 10 or so lightning talks given by members of both organizations. We tried to do a ‘matching’ talk agenda at the meeting, so if someone on the Ruby side did a testing talk, we followed that with a Python testing talk, etc. It was a ton of fun, and the audience was amazing.

Socials

Socials don’t have to be dedicated events, per se. For example, PUG-IP has a sort of mini-social after every single meetup. We’re lucky to have our meetings located about a block away from a brewpub, so after each meeting, perhaps half of us make it over for a couple of beers and some great conversations. After a few of these socials, I started noticing that more talk proposals started to spring up.

Of course, socials can also be dedicated events. Maybe some day PUG-IP will…. I dunno… go bowling? Or maybe we’ll go as a group to see the next big geeky movie that comes out. Maybe we’ll have some kind of all-inclusive, bring-the-kids BBQ next summer. Who knows?

As a sort of sideshow event to the main LUG meetings, LUG-IP has a regularly-scheduled ‘coffee klatch’. Some of the members meet up one Sunday per month at (if memory serves) 8-11AM at a local Panera for coffee, pastries, and geekery. It’s completely informal, but it’s a good time.

Why Not Having Talks Will Help You Get Talks

I have a theory that is perhaps half-proven through my experiences with technology user groups: increasing engagement among and between the members of the group in a way that doesn’t shine a huge floodlight on a single individual (like a talk would) eventually breaks down whatever fears or resistance there is to proposing and giving a talk. Sometimes it’s just a comfort level thing, and working on projects, or having a beer, or sprinting on code, etc. — together — turns a “talking in front of strangers” experience into more of a “sharing with my buddies” one.

I hope that’s true, anyway. It seems to be. :)

Thanks For Reading

I hope someone finds this useful. It’s early on in the life of PUG-IP, but I thought it would be valuable to get these ideas out into the ether early and often before they slip from my brain. Good luck with your groups!

pyrabbit Makes Testing and Managing RabbitMQ Easy

By , October 23, 2011 1:05 pm

I have a lot of hobby projects, and as a result getting any one of them to a state where I wouldn’t be completely embarrassed to share it takes forever. I started working on pyrabbit around May or June of this year, and I’m happy to say that, while it’ll never be totally ‘done’ (it is software, after all), it’s now in a state where I’m not embarrassed to say I wrote it.

What is it?

It’s a Python module to help make managing and testing RabbitMQ servers easy. RabbitMQ has, for some time, made available a RESTful interface for programmatically performing all of the operations you would otherwise perform using their browser-based management interface.

So, pyrabbit lets you write code to manipulate resources like vhosts & exchanges, publish and get messages, set permissions, and get information on the running state of the broker instance. Note that it’s *not* suitable for writing AMQP consumer or producer applications; for that you want an *AMQP* module like pika.

PyRabbit is tested with Python versions 2.6-3.2. The testing is automated using tox. In fact, PyRabbit was a project I started in part because I wanted to play with tox.

Here’s the example, ripped from the documentation (which is ripped right from my own terminal session):

>>> from pyrabbit.api import Client
>>> cl = Client('localhost:55672', 'guest', 'guest')
>>> cl.is_alive()
True
>>> cl.create_vhost('example_vhost')
True
>>> [i['name'] for i in cl.get_all_vhosts()]
[u'/', u'diabolica', u'example_vhost', u'testvhost']
>>> cl.get_vhost_names()
[u'/', u'diabolica', u'example_vhost', u'testvhost']
>>> cl.set_vhost_permissions('example_vhost', 'guest', '.*', '.*', '.*')
True
>>> cl.create_exchange('example_vhost', 'example_exchange', 'direct')
True
>>> cl.get_exchange('example_vhost', 'example_exchange')
{u'name': u'example_exchange', u'durable': True, u'vhost': u'example_vhost', u'internal': False, u'arguments': {}, u'type': u'direct', u'auto_delete': False}
>>> cl.create_queue('example_queue', 'example_vhost')
True
>>> cl.create_binding('example_vhost', 'example_exchange', 'example_queue', 'my.rtkey')
True
>>> cl.publish('example_vhost', 'example_exchange', 'my.rtkey', 'example message payload')
True
>>> cl.get_messages('example_vhost', 'example_queue')
[{u'payload': u'example message payload', u'exchange': u'example_exchange', u'routing_key': u'my.rtkey', u'payload_bytes': 23, u'message_count': 2, u'payload_encoding': u'string', u'redelivered': False, u'properties': []}]
>>> cl.delete_vhost('example_vhost')
True
>>> [i['name'] for i in cl.get_all_vhosts()]
[u'/', u'diabolica', u'testvhost']

Hopefully you’ll agree that this is simple enough to use in a Python interpreter to get information and do things with RabbitMQ ‘on the fly’.

How Can I Get It?

Well, there’s already a package on PyPI called ‘pyrabbit’, and it’s not mine. It’s some planning-stage project that has no actual software associated with it. I’m not sure when the project was created, but the PyPI page has a broken home page link, and what looks like a broken RST-formatted doc section. I’ve already pinged someone to see if it’s possible to take over the name, because I can’t think of a cool name to change it to.

Until that issue is cleared up, you can get downloadable packages or clone/fork the code at the pyrabbit github page (see the ‘Tags’ section for downloads), and the documentation is hosted on the (awesome) ReadTheDocs.org site.

Shhh… I’m Hunting Talks

By , October 13, 2011 1:10 pm

Well, it’s that time of year again. The PyCon 2012 Call for Proposals has ended. This means it’s time for the Program Committee to spring into action, evaluating all of the proposals, preparing to champion their favorites, and participating in the interactive meetings that eventually decide the fate of PyCon 2012′s slate of talks, tutorials, and poster sessions.

But… who is the Program Committee?

I asked that question last year. “Who are these people? Is this something I can participate in?” It turns out that *anyone* can join the Program Committee. The genius in the way it works is its simplicity: if you want to be on the Program Committee, you join a mailing list, send an intro email, and you’re added as a committee member.

My first year on the Program Committee was a fantastic experience that served to educate me about the selection process, and to further my conviction that the Python community is the most welcoming, open and inclusive group I’ve ever had the pleasure of being involved with.

So, if you haven’t already, I encourage you to join the Program Committee for this year’s round of meetings to help build what will undoubtedly become the biggest and best PyCon yet.

Thoughts on Python and Python Cookbook Recipes to Whet Your Appetite

By , June 9, 2011 10:33 pm

Dave Beazley and myself are, at this point, waist deep into producing Python Cookbook 3rd Edition. We haven’t really taken the approach of going chapter by chapter, in order. Rather, we’ve hopped around to tackle chapters one or the other finds interesting or in-line with what either of us happens to be working with a lot currently.

For me, it’s testing (chapter 8, for those following along with the 2nd edition), and for Dave, well, I secretly think Dave touches every aspect of Python at least every two weeks whether he needs to or not. He’s just diabolical that way. He’s working on processes and threads at the moment, though (chapter 9 as luck would have it).

In both chapters (also a complete coincidence), we’ve decided to toss every scrap of content and start from scratch.

Why on Earth Would You Do That?

Consider this: when the last edition (2nd ed) of the Python Cookbook was released, it went up to Python 2.4. Here’s a woefully incomplete list of the superamazing awesomeness that didn’t even exist when the 2nd Edition was released:

  • Modules:
    • ElementTree
    • ctypes
    • sqlite3
    • functools
    • cProfile
    • spwd
    • uuid
    • hashlib
    • wsgiref
    • json
    • multiprocessing
    • fractions
    • plistlib
    • argparse
    • importlib
    • sysconfig
  • Other Stuff
    • The ‘with’ statement and context managers*
    • The ‘any’ and ‘all’ built-in functions
    • collections.defaultdict
    • advanced string formatting (the ‘format()’ method)
    • class decorators
    • collections.OrderedDict
    • collections.Counter
    • collections.namedtuple()
    • the ability to send data *into* a generator (yield as an expression)
    • heapq.merge()
    • itertools.combinations
    • itertools.permutations
    • operator.methodcaller()

* If memory serves, the ‘with’ statement was available in 2.4 via future import.

Again, woefully incomplete, and that’s only the stuff that’s in the 2.x version! I don’t even mention 3.x-only things like concurrent.futures. From this list alone, though, you can probably discern that the way we think about solving problems in Python, and what our code looks like these days, is fundamentally altered forever in comparison to the 2.4 days.

To give a little more perspective: Python core development moved from CVS to Subversion well after the 2nd edition of the book hit the shelves. They’re now on Mercurial. We skipped the entire Subversion era of Python development.

The addition of any() and all() to the language by themselves made at least 3-4 recipes in chapter 1 (strings) one-liners. I had to throw at least one recipe away because people just don’t need three recipes on how to use any() and all(). The idea that you have a chapter covering processes and threads without a multiprocessing module is just weird to think about these days. The with statement, context managers, class decorators, and enhanced generators have fundamentally changed how we think about certain operations.

Also something to consider: I haven’t mentioned a single third-party module! Mock, tox, and nosetests all support Python 3. At least Mock and tox didn’t exist in the old days (I don’t know about nose off-hand). Virtualenv and pip didn’t exist (both also support Python 3). So, not only has our code changed, but how we code, test, deploy, and generally do our jobs with Python has also changed.

Event-based frameworks aside from Twisted are not covered in the 2nd edition if they existed at all, and Twisted does not support Python 3.

WSGI, and all it brought with it, did not exist to my knowledge in the 2.4 days.

We need a Mindset List for Python programmers!

So, What’s Your Point

My point is that I suspect some people have been put off of submitting Python 3 recipes, because they don’t program in Python 3, and if you’re one of them, you need to know that there’s a lot of ground to cover between the 2nd and 3rd editions of the book. If you have a recipe that happens to be written in Python 2.6 using features of the language that didn’t exist in Python 2.4, submit it. You don’t even have to port it to Python 3 if you don’t want to or don’t have the time or aren’t interested or whatever.

Are You Desperate for Recipes or Something?

Well, not really. I mean, if you all want to wait around while Dave and I crank out recipe after recipe, the book will still kick ass, but it’ll take longer, and the book’s world view will be pretty much limited to how Dave and I see things. I think everyone loses if that’s the case. Having been an editor of a couple of different technical publications, I can say that my two favorite things about tech magazines are A) The timeliness of the articles (if Python Magazine were still around, we would’ve covered tox by now), and B) The broad perspective it offers by harvesting the wisdom and experiences of a vast sea of craftspeople.

What Other Areas Are In Need?

Network programming and system administration. For whatever reason, the 2nd edition’s view of system administration is stuff like checking your Windows sound system and spawning an editor from a script. I guess you can argue that these are tasks for a sysadmin, but it’s just not the meat of what sysadmins do for a living. I’ll admit to being frustrated by this because I spent some time searching around for Python 3-compatible modules for SNMP and LDAP and came up dry, but there’s still all of that sar data sitting around that nobody ever seems to use and is amazing, and is easy to parse with Python. There are also terminal logging scripts that would be good.

Web programming and fat client GUIs also need some love. The GUI recipes that don’t use tkinter mostly use wxPython, which isn’t Python 3-compatible. Web programming is CGI in the 2nd edition, along with RSS feed aggregation, Nevow, etc. I’d love to see someone write a stdlib-only recipe for posting an image to a web server, and then maybe more than one recipe on how to easily implement a server that accepts them.

Obviously, any recipes that solve a problem that others are likely to have that use any of the aforementioned modules & stuff that didn’t exist in the last edition would really rock.

How Do I Submit?

  1. Post the code and an explanation of the problem it solves somewhere on the internet, or send it (or a link to it) via email to PythonCookbook@oreilly.com or to @bkjones on Twitter.
  2. That’s it.

We’ll take care of the rest. “The rest” is basically us pinging O’Reilly, who will contact you to sign something that says it’s cool if we use your code in the book. You’ll be listed in the credits for that recipe, following the same pattern as previous editions. If it goes in relatively untouched, you’ll be the only name in the credits (also following the pattern of previous editions).

What Makes a Good Recipe?

A perfect recipe that is almost sure to make it into the cookbook would ideally meet most of the criteria set out in my earlier blog post on that very topic. Keep in mind that the ability to illustrate a language feature in code takes precedence over the eloquence of any surrounding prose.

What If…

I sort of doubt this will come up, but if we’ve already covered whatever is in your recipe, we’ll weigh that out based on the merits of the recipes. I want to say we’ll give new authors an edge in the decision, but for an authoritative work, a meritocracy seems the only valid methodology.

If you think you’re not a good writer, then write the code, and a 2-line description of the problem it solves, and a 2-line description of how it works. We’ll flesh out the text if need be.

If you just can’t think of a good recipe, grep your code tree(s) for just the import statements, and look for ideas by answering questions on Stackoverflow or the various mailing lists.

If you think whatever you’re doing with the language isn’t very cool, then stop thinking that a cookbook is about being cool. It’s about being practical, and showing programmers possibly less senior than yourself an approach to a problem that isn’t completely insane or covered in warts, even if the problem is relatively simple.

Slides, an App, a Meetup, and More On the Way

By , June 2, 2011 10:27 pm

I’ve been busy. Seriously. Here’s a short dump of what I’ve been up to with links and stuff. Hopefully it’ll do until I can get back to my regular blogging routine.

PICC ’11 Slides Posted

I gave a Python talk at PICC ’11. If you were there, then you have a suboptimal version of the slides, both because I caught a few bugs, and also because they’re in a flattened, lifeless PDF file, which sort of mangles anything even slightly fancy. I’m not sure how much value you’ll get out of these because my presentation slides tend to present code that I then explain, and you won’t have the explanation, but people are asking, so here they are in all their glory. Enjoy!

I Made a Webapp Designed To Fail

No really, I did. WebStatusCodes is the product of necessity. I’m writing a Python module that provides an easy way for people to talk to a web API. I test my code, and for some of the tests I want to make sure my code reacts properly to certain HTTP errors (or in some cases, to *any* HTTP status code that’s not 200). In unit tests this isn’t hard, but when you’re starting to test the network layers and beyond, you need something on the network to provide the errors. That’s what WebStatusCodes does. It’s also a simple-but-handy reference for HTTP status codes, though it is incomplete (418 I’m a teapot is not supported). Still, worth checking out.

Interesting to note, this is my first AppEngine application, and I believe it took me 20 minutes to download the SDK, get something working, and get it deployed. It was like one of those ‘build a blog in -15 minutes’ moments. Empowering the speed at which you can create things on AppEngine, though I’d be slow to consider it for anything much more complex.

Systems and Devops People, Hack With Me!

I like systems-land, and a while back I was stuck writing some reporting code, which I really don’t like, so I started a side project to see just how much cool stuff I could do using the /proc filesystem and nothing but pure Python. I didn’t get too far because the reporting project ended and I jumped back into all kinds of other goodness, but there’s a github project called pyproc that’s just a single file with a few functions in it right now, and I’d like to see it grow, so fork it and send me pull requests. If you know Linux systems pretty well but are relatively new to Python, I’ll lend you a hand where I can, though time will be a little limited until the book is done (see further down).

The other projects I’m working on are sort of in pursuit of larger fish in the Devops waters, too, so be sure to check out the other projects I mention later in this post, and follow me on github.

Python Meetup Group in Princeton NJ

I started a Meetup group for Pythonistas that probably work in NYC or PA, but live in NJ. I work in PA, and before this group existed, the closest group was in Philly, an hour from home. I put my feelers out on Twitter, found some interest, put up a quick Meetup site, and we had 13 people at the first meetup (more than had RSVP’d). It’s a great group of folks, but more is always better, so check it out if you’re in the area. We hold meetings at the beautiful Princeton Public Library (who found us on twitter and now sponsors the group!), which is just a block or so from Triumph, the local microbrewery. I’m hoping to have a post-meeting impromptu happy hour there at some point.

Python Cookbook Progress

The Python Cookbook continues its march toward production. Lots of work has been done, lots of lessons have been learned, lots of teeth have been gnashed. The book is gonna rock, though. I had the great pleasure of porting all of the existing recipes that are likely to be kept over to Python 3. Great fun. It’s really amazing to see just how it happens that a 20-line recipe is completely obviated by the addition of a single, simple language feature. It’s happened in almost every chapter I’ve looked at so far.

If you have a recipe, or stumble upon a good example of some language feature, module, or other useful tidbit, whether it runs in Python 3 or not, let me know (see ‘Contact Me’). The book is 100% Python 3, but I’ve gotten fairly adept at porting things over by now :) Send me your links, your code, or whatever. If we use the recipe, the author will be credited in the book, of course.

PyRabbit is Coming

In the next few days I’ll be releasing a Python module on github that will let you easily work with RabbitMQ servers using that product’s HTTP management API. It’s not nearly complete, which is why I’m releasing it. It does some cool stuff already, but I need another helper or two to add new features and help do some research into how RabbitMQ broker configuration affects JSON responses from the API. Follow me on github if you want to be the first to know when I get it released. You probably also want to follow myYearbook on github since that’s where I work, and I might release it through the myYearbook github organization (where we also release lots of other cool open source stuff).

Python Asynchronous AMQP Consumer Module

I’m also about 1/3 of the way through a project that lets you write AMQP consumers using the same basic model as you’d write a Tornado application: write your handler, import the server, link the two (like, one line of code), and call consume(). In fact, it uses the Tornado IOLoop, as well as Pika, which is an asynchronous AMQP module in Python (maintained by none other than my boss and myYearbook CTO,  @crad), which also happens to support the Tornado IOLoop directly.

Book Review: Python Standard Library by Example

By , June 2, 2011 7:08 pm

Quick Facts:

  • Author: Doug Hellmann
  • Pages: 1344
  • Publisher: Addison-Wesley (Developer’s Library)
  • ETA: June 5, 2011
  • Amazon link: http://www.amazon.com/Python-Standard-Library-Example-Developers/dp/0321767349/ref=sr_1_1?ie=UTF8&qid=1307109464&sr=1-1-spell

What this book says it does:

From the book’s description:

This book is a collection of essays and example programs demonstrating how to use more than 100 modules from Python standard library. It goes beyond the documentation available on python.org to show real programs using the modules and demonstrating how you can use them in your daily programming tasks.

What this book actually does:

This book actually kinda rocks, in part because of its a unique take on documentation of the Python standard library. The Python standard library documentation is actually a pretty good high-level reference, and this book doesn’t seek to duplicate what’s there. Instead, it specifically seeks out places in the existing documentation that are underdocumented, undocumented, don’t have clear enough examples, or just don’t provide the value to the end user that they should for whatever reason. Even as good as the standard library documentation is, Doug easily cranked out 1000+ pages of invaluable information that has given me a much greater insight into the standard library modules that I use on a regular basis (and plenty that I don’t use on a regular basis).

How it works

The book is simply laid out by module. Using the multiprocessing module? It’s right there in the Table of Contents. It’s as easy to use as the standard library docs from a navigational perspective, and the index, it could be argued, is an improvement over docs.python.org’s  search behavior.

When you get to the module you’re looking for, you’ll primarily see code. There is enough English text to explain what the code actually does, but the main illustrative tool in this book is the code. This is not an easy thing to accomplish, but Doug provides a very nice and balanced presentation of the real meaty parts of your favorite standard library modules.

What’s Great About it

First, it provides both depth and breadth, it’s easy to find whatever you’re looking for, and if it’s not needed (usually because it’s well-covered in the standard library docs) it’s not there.

Second, the book is written by an authoritative, knowledgeable, experienced, and prolific Python developer. While he’s a creative thinker, his work is balanced by a healthy dose of pragmatism and grounded in best practices. Contrived as the examples might get at times, you won’t typically find code written by Doug that would garner sideways glances by experienced Python developers.

Third, it’s not a rehashing of the docs. In fact it skips coverage of things that are well-documented in the docs. Yes, the book does contain simple introductory material for each module to give the uninitiated some context, but that’s different from a book that takes existing docs and just moves the letters around. Doug does a great job of getting you into the good stuff without much fluff.

Fourth, there’s almost zero fluff. I’d love to see a statistical breakdown of the number of lines of code vs. text in this book. And the good part really isn’t that he’s put so much code in the book, it’s that he presents with code alongside the text in a way that insures readers don’t get lost.

Fifth, this wasn’t a rush job. Doug has been writing this content in the form of his Python Module of the Week blog series for a few years now. Most of the work was editing, finessing, updating, and testing (and retesting) the code, not developing the content from scratch. So what’s there in the book is not just a braindump from Doug’s brain: it’s had the benefit of peer review and feedback from the blog, email, etc., and that adds a ton of value to the final product in my eyes.

What’s Not Great About it

I insist on including bad things about everything I review, because nothing is perfect, and the more people talk about things they don’t like, the more makers start to listen and make things better.

To be honest, the only thing I found lacking in this book is the index. This should not shock anyone who is a tech bibliophile. Most indexes, at least on tech books, are pretty bad (ironic since tech books often serve as references, which makes the index pretty crucial). Also, consider that this review is based on a review copy of the book, so it’s possible that the final version will have a totally awesome index.

Ah, one other thing (which also might be due to this being a review copy): there are no tab markers. Since Python defines scope using whitespace, not having indentation markers in any medium containing page breaks can lead to confusion if a code sample crosses page boundaries. The alternative to having the markers is to insure that the code samples don’t cross page boundaries. It’s not possible for me to know if they’ll do one or both of these before the final printing.

The Final Word

My petty complaints about the index and indentation markers are not only trivial, but they both may be fixed in the final printing. I saw nothing so bad that I wouldn’t highly recommend this book, and I’ve seen tons and tons of stuff that would make me highly recommend this book. I’m using this book myself on a fairly regular basis, and it’s an effective, easy-to-use tool that makes a great companion reference to the standard library.

Buy it.

‘Grokking Python’ Going to PICC Conference!

By , April 7, 2011 8:45 pm

In conjunction with my involvement as co-author of the upcoming Python Cookbook, 3rd Ed. (not yet released), a tutorial at this year’s PyCon in Atlanta, an internal (and ongoing) lunchtime seminar series entitled ‘Snakes On a Plate’, and other recent Python-related projects, I’ve also been refining and revising what I can now call a completely awesome 3-hour introduction to the Python programming language.

If you’re a sysadmin, operations engineer, devops engineer, or just want to get your hands dirty with Python, I can’t think of a better more cost-effective way to do it than to attend the ‘Grokking Python’ tutorial at this year’s PICC conference, which is being held in New Brunswick, NJ, April 29-30.

While I do plan for the tutorial to run through the basics, I also assume attendees have programmed in some other language before. In addition, I firmly believe that, properly presented, most would find that Python is a very simple language to get to know and understand. That being the case, the most basic elements of the language (control statements, loops, etc) will be covered in the first hour (and the materials will be available for later reference).

Once we’re through that, it’s head first into what admin/ops engineers do for a living. Python was developed by a systems programmer for systems programming. As such, support for a huge swath of admin tasks (and far, far beyond) is baked into the language, and enormous tomes have been written covering third party tools and modules to do anything else you can possibly imagine.

We’re going to look at some of the more ho-hum parts of scripting, like accepting input from users, command line options and arguments, and file handling, but before it’s over we’re going to have a look at the basics of email, networking, multiprocessing, threading, coroutines, SSH, and more.

We’re also going to cover use of the Python interactive shell, which will not only help speed your mastery of the language and its standard library, but also holds promise as a sysadmin tool in its own right.

The blowing of minds is a goal of the tutorial. Bring a laptop, and bring some bandages ;-)

Lessons Learned Porting Dateutil to Python 3

By , February 25, 2011 11:04 pm

The dateutil module is a very popular third-party (pure) Python module that makes it easier (and in some cases, possible) to perform more advanced manipulations on dates and date ranges than simply using some combination of Python’s ‘included batteries’ like the datetime, time and calendar modules.

Dateutil does fuzzy date matching, Easter calculations in the past and future, relative time delta calculations, time zone manipulation, and lots more, all in one nicely bundled package.

I decided to port dateutil to Python 3.

Why?

For those who haven’t been following along at home, David Beazley and I are working on the upcoming Python Cookbook 3rd Edition, which will contain only Python 3 recipes. Python 2 will probably only get any real treatment when we talk about porting code.

When I went back to the 2nd edition of the book to figure out what modules are used heavily that might not be compatible with Python 3, dateutil stuck out. It’s probably in half or more of the recipes in the ‘Time and Money’ chapter in the 2nd Edition. I decided to give it a look.

How Long Did it Take?

Less than one work day. Seriously. It was probably 4-5 hours in total, including looking at documentation and getting to know dateutil. I downloaded it, I ran 2to3 on it without letting 2to3 do the in-place edits, scanned the output for anything that looked ominous (there were a couple of things that looked a lot worse than they turned out to be), and once satisfied that it wasn’t going to do things that were dumb, I let ‘er rip: I ran 2to3 and told it to go ahead and change the files (2to3 makes backup copies of all edited files by default, by the way).

What Was the Hardest Part?

Well, there were a few unit tests that used the base64 module to decode some time zone file data into a StringIO object before passing the file-like object to the code under test (I believe the code under test was the relativedelta module). Inside there, the file-like StringIO object is subjected to a bunch of struct.unpack() calls, and there are a couple of plain strings that get routed elsewhere.

The issue with this is that there are NO methods inside the base64 module that return strings anymore, which makes creating the StringIO object more challenging. All base64 methods return Python bytes objects. So, I replaced the StringIO object with a BytesIO object, all of the struct.unpack() calls “just worked”, and the strings that were actually needed as strings in the code had a ‘.decode()’ appended to them to convert the bytes back to strings. All was well with the world.

What Made it Easier?

Two things, smaller one first:

First, Python built-in modules for date handling haven’t been flipped around much, and dateutil doesn’t have any dependencies outside the standard library (hm, maybe that’s 2 things right there). The namespaces for date manipulation modules are identical to Python 2, and I believe for the most part all of the methods act the same way. There might be some under-the-hood changes where things return memoryview objects or iterators instead of lists or something, but in this and other porting projects involving dates, that stuff has been pretty much a non-event most of the time

But the A #1 biggest thing that made this whole thing take less than a day instead of more than a week? Tests.

Dateutil landed on my hard drive with 478 tests (the main module has about 3600 lines of actual code, and the tests by themselves are roughly 4000 lines of code). As a result, I didn’t have to manually check all kinds of functionality or write my own tests. I was able to port the tests fairly easily with just a couple of glitches (like the aforementioned base64 issue). From there I felt confident that the tests were testing the code properly.

In the past couple of days since I completed the ‘project’, I ported some of the dateutil recipes from the 2nd edition of the book to Python 3, just for some extra assurance. I ported 5 recipes in under an hour. They all worked.

Had You Ported Stuff Before?

Well, to be honest most of my Python 3 experience (pre-book, that is) is with writing new code. To gain a broader exposure to Python 3, I’ve also done lots of little code golf-type labs, impromptu REPL-based testing at work for things I’m doing there, etc. I have ported a couple of other small projects, and I have had to solve a couple of issues, but it’s not like I’ve ever ported something the size of Django or ReportLab or something.

The Best Part?

I had never seen dateutil in my life.

I had read about it (I owned the Python Cookbook 2nd Edition since its initial release, after all), but I’d never been a user of the project.

The Lessons?

  1. This is totally doable. Stop listening to the fear-inducing rantings of naysayers. Don’t let them hold you back. The pink ponies are in front of you, not behind you.
  2. There are, in fact, parts of Python that remain almost unchanged in Python 3. I would imagine that even Django may find that there are swaths of code that “just works” in Python 3. I’ll be interested to see metrics about that (dear Django: keep metrics on your porting project!)
  3. Making a separation between text and data in the language is actually a good thing, and in the places where it bytes you (couldn’t resist, sorry), it will likely make sense if you have a fundamental understanding of why text and data aren’t the same thing. I predict that, in 2012, most will view complainers about this change the same way we view whitespace haters today.

“I Can’t Port Because…”

If you’re still skeptical, or you have questions, or you’re trying and having real problems, Dave and I would both love for *you* to come to our tutorial at PyCon. Or just come to PyCon so we can hack in the hallway on it. I’ve ported, or am in the process of porting, 3 modules to Python 3. Dave has single-handedly ported something like 3-5 modules to Python 3 in the past 6 weeks or so. He’s diabolical.

I’d love to help you out, and if it turns out I can’t, I’d love to learn more about the issue so we can shine a light on it for the rest of the community. Is it a simple matter of documentation? Is it a bug? Is it something more serious? Let’s figure it out and leverage the amazing pool of talent at PyCon to both learn about the issue and hopefully get to a solution.

PyCon 2011 Predictions

By , January 28, 2011 4:39 pm

PyCon 2011 is right around the corner. Are you going? You should. The talks are sick. You can still register – it’s too late to be an early bird, but registration is still open!

Well, I am, and I’m here to get the rumor mill started by sharing some predictions for this year’s PyCon.

Packaging

While last year there was some hubbub around packaging, most notably the poster images declaring that pip and distribute were ‘the new black’, this year that’s old news. The pip and distribute mantra will bear repeating for those who are still behind the times of course, but this year I think there will be more hand-wringing over packaging, probably focusing more on the data and distribution mechanism than the tools. I, for one, am looking forward to any light Tarek Ziade and other’s on the distutils2 team can shed on new metadata elements, new build procedures (I hear setup.py is going away someday), and what’s going on with distutils, distribute, pip, distutils2, PyPI, etc.

My prediction? Packaging, while not very exciting, will garner a lot of attention again this year.

Python 3

Dave Beazley and I will be doing a tutorial on Python 3, but besides that, I think there’s going to be a lot of hallway (and lounge) discussion about all things Python 3. This would mirror the clamor around the topic recently online, which I don’t really expect to slow down. Python 3 is, of course, discussed every year, but I think this year’s PyCon will be Python 3′s Woodstock: the time and place where minds merge onto an idea together and harmonize and all that stuff.

My prediction? We’ll see a good number of commitments from projects to port to Python 3, GSOC projects related to Python 3, and more funny looks at projects not getting their porting underway in 2011 as a direct result of PyCon 2011.

Scale

Multiprocessing and threading will be hot topics again for two reasons: First, lots of people just look at the GIL and reflexively get heartburn without doing any actual analysis to see if they even need to care about it in the context of a particular application. Second, the uptake for Python in ‘web scale’ environments continues to grow, so that’s bound to be reflected in the attendee demographics if you will, and therefore the perceived interest in the topic.

PyPy is going to loom large at this year’s conference. Some may decide after PyCon 2011 that PyPy is the future for them. Unladen Swallow will be declared dead, in spite of some eager undergrads making empty promises about picking it up. Most who care will opt for the project that appears to be gaining steam, not losing it.

Generator-based coroutines will continue to be a really cool but relatively unused oddity, and Eventlet and the like will continue to have their adherents and go with the flow.

Event-based/async frameworks will garner some attention. Tornado, gUnicorn, and others will do well. Twisted followers will say that all others are either redundant or fatally flawed.

My prediction? Tornado and ZeroMQ will make a big splash, this being the first year it’s being covered (I think). People will still advocate multiprocessing over threading after the conference, and Twisted will start to see declines in usership after PyCon 2011.

Potpourri

  • Python core development contributions will increase thanks to efforts by the PSF and core development communities, though not without some growing pains.
  • Michael Foord’s ‘mock‘ library will take off as a de facto standard module in the toolbelts of Python developers everywhere.
  • Fabric’s parallel branch will be merged as the default either before or as a result of activity at PyCon 2011.
  • Flask will announce a merger with another web framework in 2011, in part due to conversations and activity at PyCon.
  • PyCharm will become recognized as the preeminent Python IDE as a result of everyone getting to PyCon only to find that everyone else they were going to advocate it to is already using it.
  • Some outside the scientific community will attend a Python-Science talk anyway and come away with valuable techniques they can apply to distributed computing problems. Some outside the web scalability domain will attend related talks anyway and take something back to the enterprise computing space. Some outside the cloud space will attend a cloud talk and find a cool project to put in the cloud. As a result:
  • At least 30 new Python projects will be created as a direct result of PyCon 2011.

Python 3: Informal String Formatting Performance Comparison

By , January 2, 2011 11:03 pm

If you haven’t heard the news, Dave Beazley and I have officially begun work on the next edition of the Python Cookbook, which will be completely overhauled using absolutely nothing but Python 3. Yay!

Right now, I’m going through some string formatting recipes from the 2nd edition to see if they still work, and if Python 3 offers any preferred alternatives to the solutions provided. As usual, it turns out that the answer to that is often ‘it depends’. For example, you might decide on a slower solution that’s more readable. Conversely, you might need to run an operation in a loop a million times and really need the speed.

New string formatting operations like the built-in format() function (separate from the str.format method) and the format mini-language are available in 2.6, and made nicer in 2.7. All of it is backported from the 3.x tree to my knowledge, and I’ll be using a Python 3.2b2 interpreter session for my examples.

I want to focus specifically on string alignment here, because there are very obviously multiple ways to solve alignment needs. Here’s an example solution from the 2nd edition:

>>> print '|' , 'hej'.ljust(20) , '|' , 'hej'.rjust(20) , '|' , 'hej'.center(20) , '|'
| hej             |             hej |       hej       |

Note that this is of course in Python 2.x syntax, but this works in Python 3.2 if you just make it a function call instead of a statement (so, just add parens and it works). The string methods used here are still in Python 3.2, with no notices of deprecation or preference for newer methods available now. That said, this looks messy to me, and so I wondered if I could make it more readable without losing performance, or at least without losing so much performance that it’s not worth any gains in the area of readability.

Single String Formatting

Here are three ways to get the same string alignment behavior in Python 3.2b2:

>>> '{:+<20s}'.format('hej')
'hej+++++++++++++++++'
>>> format('hej', '+<20s')
'hej+++++++++++++++++'
>>> 'hej'.ljust(20, '+')
'hej+++++++++++++++++'

Ok, so they all work the same. Now I’m going to wrap each one in a function and use the timeit module to help me get an idea what the difference is in terms of performance.

>>> def runit():
...     format('hej', '+<20s')
...
>>> def runit2():
...     'hej'.ljust(20, '+')
...
>>> def runit3():
...     '{:+<20s}'.format('hej')
...
>>> timeit(stmt=runit3, number=1000000)
0.6168370246887207
>>> timeit(stmt=runit3, number=1000000)
0.6109819412231445
>>> timeit(stmt=runit3, number=1000000)
0.6166291236877441
>>> timeit(stmt=runit2, number=1000000)
0.49651098251342773
>>> timeit(stmt=runit2, number=1000000)
0.4870288372039795
>>> timeit(stmt=runit2, number=1000000)
0.49135899543762207
>>> timeit(stmt=runit, number=1000000)
0.7751290798187256
>>> timeit(stmt=runit, number=1000000)
0.7771239280700684
>>> timeit(stmt=runit, number=1000000)
0.7805869579315186

Turns out using the old, tried and true str.* methods are fastest in this case, though I think in a more complex case like the recipe from the 2nd edition I’d opt for something more readable if I had the chance.

One String, Three Ways

Let’s look at a more complex case. Let’s take each of the methodologies used in runit, runit2, and runit3, and see how things pan out when we want to do something like the 2nd edition recipe. I’ll start with the bare interpreter operation to compare the output:


>>> '|' + format('hej', '+<20s') + '|' + format('hej', '+^20s') + '|' + format('hej', '+>20s') + '|'
'|hej+++++++++++++++++|++++++++hej+++++++++|+++++++++++++++++hej|'
>>> '|' + 'hej'.ljust(20, '+') + '|' + 'hej'.center(20, '+') + '|' + 'hej'.rjust(20, '+') + '|'
'|hej+++++++++++++++++|++++++++hej+++++++++|+++++++++++++++++hej|'
>>> '|{0:+<20s}|{0:+^20s}|{0:+>20s}|'.format('hej')
'|hej+++++++++++++++++|++++++++hej+++++++++|+++++++++++++++++hej|'

Unless you go through the rigamarole of creating a sequence and using ‘|’.join(myseq), the last method seems the most readable to me. I’d really just like to use the built-in print function with a “sep=’|'” argument, but that won’t cover the pipes at the beginning and end of the string unless I’ve missed something.

Here are the functions and timings:


>>> def threeways():
...     '|' + format('hej', '+<20s') + '|' + format('hej', '+^20s') + '|' + format('hej', '+>20s') + '|'
...

>>> def threeways2():
...     '|' + 'hej'.ljust(20, '+') + '|' + 'hej'.center(20, '+') + '|' + 'hej'.rjust(20, '+') + '|'
...

>>> def threeways3():
...     '|{0:+<20s}|{0:+^20s}|{0:+>20s}|'.format('hej')
...

>>> timeit(stmt=threeways, number=1000000)
2.4910600185394287
>>> timeit(stmt=threeways, number=1000000)
2.50291109085083
>>> timeit(stmt=threeways, number=1000000)
2.4913830757141113
>>> timeit(stmt=threeways2, number=1000000)
1.9027390480041504
>>> timeit(stmt=threeways2, number=1000000)
1.8975908756256104
>>> timeit(stmt=threeways2, number=1000000)
1.8957319259643555
>>> timeit(stmt=threeways3, number=1000000)
1.311446189880371
>>> timeit(stmt=threeways3, number=1000000)
1.3099820613861084
>>> timeit(stmt=threeways3, number=1000000)
1.3031558990478516

The threeways3 function has a bit of an advantage in not having to muck with concatenation at all, and this probably explains the difference. Changing threeways() to use a list and '|'.join() brought it from about 2.50 to about 2.30. Better. Changing threeways2() in the same way was also a small improvement from ~1.90 to ~1.77. No big wins there, and they’re not particularly readable in either case. For this one arguably trivial corner case, the new formatting mini-language wins in both performance and (IMO) readability.

Big Assumptions

This of course assumes I didn’t overlook something in creating the comparison functions, that there’s not yet a different way to do this that blows all of my work out of the water. If you see a completely different way to do this that’s both readable and performant, or I did something bone-headed, please let me know in the comments. :)

Panorama Theme by Themocracy