Category Archives: Django

Tornado’s Big Feature is Not ‘Async’

I’ve been working with the Tornado web server pretty much since its release by the Facebook people several months ago. If you’ve never heard of it, it’s a sort of hybrid Python web framework and web server. On the framework side of the equation, Tornado has almost nothing. It’s completely bare bones when compared to something like Django. On the web server side, it is also pretty bare bones in terms of hardcore features like Apache’s ability to be a proxy and set up virtual hosts and all of that stuff. It does have some good performance numbers though, and the feature that draws most people to Tornado seems to be that it’s asynchronous and pretty fast.

I think some people come away from their initial experiences with Tornado a little disheartened because only upon trying to benchmark their first real app do they come face to face with the reality of “asynchronous”: Tornado can be the best async framework out there, but the minute you need to talk to a resource for which there is no async driver, guess what? No async.
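
To make that concrete, here is a minimal sketch of the problem. It uses asyncio from the Python standard library as a stand-in for Tornado’s event loop, and the two “drivers” are hypothetical: one sleeps in a blocking way (like a database library with no async support), the other yields to the loop while it waits. The names and the 0.2 second delay are assumptions made up for the illustration, not anything from Tornado itself.

```python
import asyncio
import time

DELAY = 0.2  # seconds each pretend query takes (arbitrary value for the demo)


def blocking_query():
    # Hypothetical stand-in for a driver with no async support:
    # it stalls the whole event loop while it waits.
    time.sleep(DELAY)
    return "rows"


async def cooperative_query():
    # Hypothetical stand-in for an async-aware driver: it yields to the
    # event loop while waiting, so other requests can make progress.
    await asyncio.sleep(DELAY)
    return "rows"


async def blocking_handler():
    return blocking_query()


async def cooperative_handler():
    return await cooperative_query()


async def time_requests(handler, count=3):
    # Fire `count` "requests" at once and measure total wall-clock time.
    start = time.monotonic()
    await asyncio.gather(*(handler() for _ in range(count)))
    return time.monotonic() - start


def compare():
    blocking = asyncio.run(time_requests(blocking_handler))
    cooperative = asyncio.run(time_requests(cooperative_handler))
    return blocking, cooperative


if __name__ == "__main__":
    blocking, cooperative = compare()
    print(f"blocking driver:    {blocking:.2f}s")
    print(f"cooperative driver: {cooperative:.2f}s")
```

With the blocking driver the three requests run one after another, so the total is roughly three times the delay. With the cooperative one they overlap and finish in about the time of a single query. Same event loop, same handlers; the only difference is whether the thing you’re talking to knows how to wait politely.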

Some people might even leave the ring at this point, and that’s a shame, because to me the async features in Tornado aren’t what attract me to it at all.

Why Tornado, if Not For Async?

For me, there’s an enormous win in going with Tornado (or other things like it), and to get this benefit I’m willing to deal with some of Tornado’s warts and quirks. I’m willing to deal with the fact that the framework provides almost nothing I’m used to having after being completely spoiled by Django. What’s this magical feature you ask? It’s simply the knowledge that, in Tornado-land, there’s no such thing as mod_wsgi. And no mod_python either. There’s no mod_anything.

This means I don’t have to think about sys.path, relative vs. absolute paths, whether to use daemon or embedded mode, “Cannot be loaded as Python module” errors, “No such module” errors, permissions issues, subtle differences between Django’s dev server and Apache/mod_wsgi, reconciling all of these things when using/not using virtualenv, etc. It means I don’t have to metascript my way into a working application. I write the app. I run the app.

Wanna see how to create a Tornado app? Here’s one right here:

import tornado.httpserver
import tornado.ioloop
import tornado.web

# A handler maps HTTP verbs to methods; this one answers GET requests.
class MainHandler(tornado.web.RequestHandler):
    def get(self):
        self.write("This is a Tornado app")

# Route table: URL regex -> handler class.
application = tornado.web.Application([
    (r"/", MainHandler),
])

if __name__ == "__main__":
    http_server = tornado.httpserver.HTTPServer(application)
    http_server.listen(8888)
    # Start the event loop; this call blocks until the process is stopped.
    tornado.ioloop.IOLoop.instance().start()

Save this to whatever file you want, run it, and do ‘curl http://localhost:8888’ and you’ll see ‘This is a Tornado app’ on your console.

Simplistic? Yes, absolutely. But when you can just run this script, put it behind nginx, and have it working in under five minutes, you dig a little deeper and see what else you can do with this thing. Turns out, you can do quite a bit.

Can I Do Real Work With This?

I’ve actually been involved in a production launch of a non-trivial service running on Tornado, and it was mind-numbingly easy. It was several thousand lines of Python, all of which was written by two people, and the prototype was up and running inside of a month. Moving from prototype to production was a breeze, and the site has been solid since its launch a few months ago.

Do You Miss Django?

I miss *lots* of things about Django, sure. Most of all I miss Django’s documentation, but Tornado is *so* small that you actually can find what you need in the source code in 2 minutes or less, and since there aren’t a ton of moving parts, when you find what you’re looking for, you just read a few lines and you’re done: you’re not going to be backtracking across a bunch of files to figure out the process flow.

I also miss a lot of what I call Django’s ‘magic’. It sure does a lot to abstract away a lot of work. In place of that work, though, you’re forced to take on a learning curve that is steeper than most. I think it’s worth getting to know Django if you’re a web developer who hasn’t seen it before, because you’ll learn a lot about Python and how to architect a framework by digging in and getting your hands dirty. I’ve read seemingly most books about Django, and have done some development work in Django as well. I love it, but not for the ease of deployment.

I spent more time learning how to do really simple things with Django than it took to:

  1. Discover Tornado
  2. Download/install and run ‘hello world’
  3. Get a non-trivial, commercial application production-ready and launch it.

Deadlines, indeed!

Will You Still Work With (Django/Mingus/Pinax/Coltrane/Satchmo/etc)?

Sure. I’d rather not host it, but if I have to I’ll get by. These applications are all important, and I do like developing with them. It’s mainly deployment that I have issues with.

That’s not to say I wouldn’t like to see a more mature framework made available for Tornado either. I’ve worked on one, though it’s not really beyond the “app template” phase at this point. Once the app template is able to get out of its own way, I think more features will start to be added more quickly… but I digress.

In the end, the astute reader will note that my issue isn’t so much with Django-like frameworks (though I’ll note that they don’t suit every purpose), but rather with the current trend of using mod_wsgi for deployment. I’ll stop short of bashing mod_wsgi, because it too is an important project that has done wonders for the state of Python in web development. It really does *not* fit my brain at all, though, and I find when I step into a project that’s using it and it has mod_wsgi-related problems, identifying and fixing those problems is typically not a simple and straightforward affair.

So, if you’re like me and really want to develop on the web with Python, but mod_wsgi eludes you or just doesn’t fit your brain, I can recommend Tornado. It’s not perfect, and it doesn’t provide the breadth of features that Django does, but you can probably get most of your work done with it in the time it took you to get a mod_wsgi “Hello World!” app to not return a 500 error.

Released django-taxonomy on github

Hi all,

I did a post several months ago about creating a generic taxonomy app for Django that was loosely coupled, unobtrusive, and could evolve with an app that needed categories today, but then tags later, or labels later, or some other classification mechanism later. I wanted one app to just be generic enough to deal with it, so I created django-taxonomy… and then did nothing with it.

Well, *I* did stuff with it, but I never put it anywhere where anyone else could do anything with it. So now I have: django-taxonomy is up on github. Please fork it and send me merge requests, because this app is not super high on my priority list, which is why I’m releasing it: it’ll get further and be more useful to you with the help of the community :)

New Job, Car, Baby, and Other News

New Baby!

I know this is my geek blog, but geeks have kids too, so first I want to announce the birth of our second daughter, Sadie, who was born on September 15th. She’s now over a month old. This is the first time I’ve stayed up late enough to blog about her. Everyone is healthy, if slightly sleep-deprived :)

New Job!

The day before Sadie’s birth, I got a call with an offer for a job. A *full-time* job, as a Senior Operations Developer for MyYearbook.com. After learning about the cool and very geeky things going on at MyYearbook during the interview process, I couldn’t turn it down. I started on October 5, and it’s been a blast digging into all of the cool stuff going on there. While I’m certainly doing my fair share of PHP code review, maintenance, and general coding, I’m also getting plenty of hours in working out the Python side of my brain. I’m finding that while it’s easier switching gears than I had anticipated, I do make some really funny minor syntax errors, like using dot notation to access object attributes in PHP ;-P

What I find super exciting is something that might turn some people’s stomachs: at the end of my first week, I sat back and looked at my monitors to find roughly 15 tabs in Firefox open to pages explaining various tools I’d never gotten to use, protocols I’d never heard of, etc. I had my laptop and desktop both configured with 2 virtual machines for testing and playing with new stuff. I had something north of 25 terminal windows open, and 8 files open in Komodo Edit.

Now THAT, THAT is FUN!

The projects I’m working on run the gamut from code cleanups that nobody else has had time to do (a good tool for getting my brain wrapped around various parts of the code base), to working on scalability solutions and new offerings involving my background in coding *and* system administration. It’s like someone cherry-picked a Bay Area startup and dropped it randomly 30 minutes from my house.

My own business is officially “not taking new clients”. I have some regular clients that I still do work for, so my “regulars” are still being served, but they’ve all been put on notice that I’m unavailable until the new year.

New Car!

I’m less excited about the new car, really. I used to drive a Jeep Liberty, and I loved it. However, in early September, before Sadie’s arrival, it became clear to me that putting two car seats in that beast wasn’t going to happen. The Jeep is great for drivers, and it has some cargo space. It’s not a great vehicle for passengers, though.

At the same time, I was running a business (this was before the job offer came along), and I was finding myself slightly uncomfortable delivering rather serious business proposals in a well-used 2003 Jeep. So, I needed something that could fit my young family (my oldest is 2 yrs), and that was presentable to clients. So, I got a Lexus ES350.

I like most things about the car, except for the audio system. It seems schizophrenic to me to have like 6 sound ‘zones’ to isolate the audio to certain sets of speakers, but then controls like bass and treble only go from 0 to 5. Huh? And the sound always sounds like it’s lying on the floor for some reason. It’s not at all immersive. The sound system on my Jeep completely kicked ass. I miss it. A lot.

Other News

I’ve submitted an article to Python Magazine about my (relatively) recent work with Django and my (temporarily stalled) overhaul of LinuxLaboratory.org, and my experiences with various learning resources related to Django. If you’re looking to get into Django, it’s probably a good read.

I’ve been getting into some areas of Python that were previously dark, dusty corners, so hopefully I’ll be writing more about Python here, because writing about something helps me to solidify things in my own brain. Short of that, it serves as a future reference point in case it didn’t get solidified enough :)

My sister launched The Dance Jones, a blog where she talks about fitness, balance, dance, and stuff I should probably pay much more attention to (I’m close to declaring war on my gut). Also, if you ever wanted to know how to shoulder shimmy (and who hasn’t wanted to do that?), you should check it out :)

Sys/DB Admin and Coder Seeks Others To Build Web “A-Team”

UPDATE: There’s no location requirement. I kind of assume that I’m not going to find the best people by geographically limiting my search for potential partners. :)

Me: Live in Princeton, NJ area. Over 10 years experience with UNIX/Linux administration, databases and data modeling, and PHP/Perl. About 3 years experience using Python for back-end scripting and system automation, and less than a year of Django experience. Former Director of Technology for AddThis.com (it was bought out), Infrastructure Architect at cs.princeton.edu, and systems consultant/trainer. Creator of Python Magazine, former Editor in Chief of both php|architect *and* Python Magazine, and co-author of “Linux Server Hacks, volume 2” (O’Reilly).

You are one of these:

  • Web graphic designer who has worked on several web-based projects for clients in various industries, understands current best practices and standards, has the tools and experience necessary to create custom graphics, and has some familiarity (secondarily) with PHP and/or Python, Javascript and Ajax. If you regularly make use of table-based web designs or ActiveX controls, this isn’t you.
  • Hardcore web developer with at least 6 years experience doing nothing but web-based projects using Javascript and (at some point) *both* PHP and Python, and has worked with or has an interest in Django, Cake, and other frameworks, and understands that client needs often don’t coincide with the religion of fanboyism. Knowledge of Javascript, Ajax, web standards and security is essential here. If your last “big project” was volunteer work to build a website for your kid’s soccer team, this isn’t you.
  • A generalist webmaster (sysadmin/db admin/scripter) with at least 6 years experience working in production *nix environments with good familiarity in the areas of high availability, web servers (specifically Apache), proxy servers and monitoring, and has worked with/supported users like the ones mentioned above on web-based projects. If you have to look at the documentation to figure out how to implement a 301 redirect, this probably isn’t you.

Experience working on a team in larger projects with multiple people would be good. Note that I’m looking for people to partner with on projects, I’m not hiring full time employees. Future partnership in a proper business is certainly a possibility, but… baby steps! I do have a couple of domains that would be great for use with this kind of project if it ever progresses that far :-)

I know that other people are out there looking for people to partner with on projects, but there doesn’t appear to be a common place for them to interact. Maybe that can be a project we undertake together :) If there *is* a place where people meet up for this kind of thing, let me know!

Let’s have fun, and take over the world! Shoot me an email at “bkjones” @ Google’s mail domain.

If You Code, You Should Write

The Practice of Programming

Programmers are, in essence, problem solvers. They live to solve problems. When
they identify a problem that needs solving, they cannot resist the temptation
to study it, poke and prod it, and get to know it intimately. They then start
considering solutions. At this point, the programmer is not often thinking in
code — they’re thinking about the problem using high-level concepts and terms
that most non-programmers would understand.

Consider the problem of how to post a news story to a website. The programmer
might think about the solution this way:

  • Log in
  • Go to ‘new story’ page
  • Enter title and text
  • Press ‘submit’

Of course, there are a million details in between those points, and after them
as well. The programmer knows this, but defers thinking about details until the
higher-level solution makes sense and seems reasonable/plausible. Later in the
process they’ll think about things like the site’s security model, WYSIWYG
editors, tags and categories, icons, avatars, database queries and storage, and
the like.

Once they’ve reached a point where they’re satisfied that their solution will
work and accounts for the major points that need considering, they open an
editor and begin to type things that make no sense to their
immediate family. Programmers express their solutions in code, of course, but
they express them nonetheless, and this is not a trivial point.

The Parallels Between Programming and Writing

Writers often take the exact same course as do programmers. Programmers and
writers alike are often given assignments. Assignments take the form of a
problem that needs solving. For a programmer it’s a function or method or class
that needs implementing to perform a certain task. For a writer it’s an article
or column or speech that covers a particular topic. So in these cases, the
problem identification is done for you (not that more discovery can’t be done
in both cases).

Next is the conception of the solution. Programmers puzzle over the problem,
its context in the larger application or system, its scope, and its complexity.
Writers puzzle over their topic space, its breadth and depth, and its context
in the bigger picture of what their publication tries to accomplish. In both
cases, writer and programmer alike take some time and probably kill some trees
as they attempt to organize their thoughts.

At some point, for both writer and programmer, the time comes to use some tool
to express their thoughts using some language. A writer opens a text editor or
word processor and writes in whatever language the publication publishes in. A
programmer opens an IDE or editor and writes using the standard language for
their company, or perhaps their favorite language, or (in rare cases) the best
language for accomplishing the task.

In neither case is this the end of the story. Programmers debug, tweak, and
reorganize their code all the time. Writers do the exact same thing with their
articles (assuming they’re of any length). Both bounce ideas off of their
colleagues, and both still have work to do after their first take is through.
Both will go at it again, both with (hopefully) a passion that exists not
necessarily for the particular problem they’re solving, but for the sheer act
of solving a problem (or covering a topic), whatever it may be.

Finally, once things are reviewed, and all parts have been carefully
considered, the writer submits his piece to an editor for review, and the
programmer submits to a version control system which may also be attached to an
automated build system. Both may have more work to do.

Starting Out

The process is essentially the same. If you’re a new programmer, you can expect
to have more than your fair share of bugs. If you’re a new writer, you can
likewise expect your piece to look a bit different in final form than it did
when you submitted it to the editor.

Just like programming, writing isn’t something you do perfectly from day one.
It’s something that takes practice. At first it seems like an arduous process,
but you get through it. As time passes, you start to realize that you’re going
faster, and stumbling less often. Eventually you get to a point where you can
crank out 1500-2000 words on your lunch hour without needing too much heavy
revising.

You Should Write

So, I say “you should write”. As someone who owes his career to books and
articles (not to mention friendly people far more experienced than myself), I
consider it giving back to the medium that launched my career, and helping
others like others helped me. I hope I can make the technological landscape
better in some small way. If we all did that, we’d be able to collectively
raise the bar and improve things together.

If altruism isn’t your bag, or you’re just hurting from the recent economic
crisis, know that it’s also possible to make money writing. It’s not
likely to become your sole occupation unless you happen to live in a VW Bus, or
you do absolutely nothing else but write full time, all the time. However, it
can be a nice supplement to a monthly salary, and if done regularly over the
course of a year is more than enough to take care of your holiday shopping
needs.

I’ve had good experiences writing for editors at php|architect and Python
Magazine (I *was* an editor at both magazines, but you don’t edit your own
work!), O’Reilly (oreillynet.com and a book as well), Linux.com (when it was
under the auspices of the OSTG), TUX and Linux Magazine (both now defunct), and
others. I encourage you to go check out the “write for us” links on the sites
of your favorite publications, where you’ll find helpful information about
interacting with that publication’s editors.

Cool Mac/Mobile Software for Sysadmins, Programmers, and People

I recently upgraded my primary workhorse (a MacBook Pro) to Snow Leopard. Before I did, I decided to go through and take stock of all of the documents and software I’d accumulated. While I was doing this, I simultaneously got into a conversation with a buddy of mine about the software he uses on his Macs. Turns out he maintains a whole page devoted to (mostly non-geek, but still somehow geeky) Mac software he uses.

I decided to go ahead and list the software I use for stuff whether it was geeky or not. Then I realized that pretty much all of the software I use is kinda geeky. I guess if you’re someone who’s going to create a list of software you use, it’s pretty hopeless.

So… here’s what I’m using. Suggestions welcome in the comments!

Social Media

My Twitter account updates my Facebook status. My Brightkite checkins update the location information on my Twitter account. It also sends a tweet… which updates my Facebook status. I pay less attention to the ongoing status in my LinkedIn account, but it gets updated automatically as well; I just don’t remember how or by what anymore.

I’ve tried a bunch of Twitter clients. Tweetie is “good enough”. It’s the one I use most often. If I need something hardcore I use Tweetdeck or TweetGrid, which has the benefit of being web-based.

TwitterLocal lets you put in your location and a radius, and then shows you tweets from people who are discernibly near you. I think Brightkite does a better overall job with this, since its whole reason for being is to be location-aware, but it seems like I get fewer updates than with TwitterLocal.

Communication

  • Colloquy
  • Tweetie
  • Mail
  • Skype
  • Google Talk

Right. Twitter is also a communication tool. I have, in fact, checked in with people via Twitter. It’s not how I typically use it, but I think it counts :)

I have to use both Skype and Google Talk because I’m on the road a lot (I’m a consultant) and there are enough hotels that do stupid things with their network that I’m forced to use whichever one works on that particular network. Though I mostly use GMail for mail, it’s gone down a few times on me, so it’s good to have Mail around. I’ve recently found GMail notifier to be almost useless as well, so when I use Mail, I find that getting alerted to incoming messages frees my brain. I use Mail.appetizer to show me previews of incoming mail so I don’t have to switch gears from what I’m doing to see the latest spam. Note, however, that it’s not quite ready for Snow Leopard.

I haven’t tried Mail in Snow Leopard yet. If they ever fix the search functionality (I find it useless) I’ll stop using the GMail interface. I’ve tried Thunderbird, but its search is even worse (or was, the last time I tried it).

Fun Stuff

I play guitar and piano, and have also played drums, saxophone, and lots of other noise-making apparatuses. I like that GarageBand will let me put down bass and drum tracks without having to own a bass or drum set.

I also enjoy photography, though I don’t often get out on long quiet hikes in nature or gastronomical adventures that would make for the kinds of stunning things I see on Flickr all the time. However, I do have a family, and we do travel, so while not even 10% of my pics on Flickr are stock quality photos, at least 90% of them are interesting to me personally :)

iPhoto I see as a necessary evil these days. I used to love it, but now that it tries to help me out by autocategorizing on things that, as it turns out, are pretty arbitrary in the context of my life, I don’t like it as much. It’s good for quick touch-ups though. I’ve saved a number of pics with it.

StellaOSX is an Atari 2600 emulator for the Mac that comes with like, I dunno, thousands of ROMs? If you miss your old Atari games, and you have a Mac, it’s all you’ll ever need.

Sim City 4 is a city-building game. If you haven’t heard of Sim City before, it’s not like the Sims. At all. I don’t get that game, in fact. Sim City is a game where you have to try to build a city, build its wealth and prestige, and try to keep the residents happy as well.

Productivity

Things for Mac is the first application I’ve personally seen that seamlessly syncs with Things for my iPhone. It works great. It’s not a full-blown project management solution, but it’s more than a todo list. It’s not about work-related stuff, either. Things is really about keeping my personal things in order. I have to call the township for an inspection on my recent AC replacement, schedule for a followup doctor visit for my dog, hire an insulation contractor by the fall, send out my quarterly taxes, make a dentist appointment… that kind of stuff. It’s also a great place to put ideas for blog posts and stuff, and since it’s right there on my iPhone, I don’t forget as many ideas anymore. I can’t say enough good things about Things, so I’ll just say go try it.

Google Calendar and iCal are kept in sync, so I don’t have to use the horrifically slow Google Calendar on my iPhone. I can sync to iCal on the desktop, sync that to my iPhone, and use iCal on the phone as well. Why the whole calendar synchronization thing has to *still* be hard after like 4 years of trying is beyond me.

Office

Keynote makes doing things that are hard in PowerPoint and impossible in OpenOffice or Google Docs easy as all getout. As a trainer, I spend a lot of time putting content together and trying to find new ways to make it more engaging, less boring, etc. (not that I’ve been accused of being boring, mind you) ;-)

I deliver all of my training from a MacBook Pro using either the remote that came with my laptop or the Remote iPhone application. Usually I can’t use Remote for iPhone because of restrictions regarding the wireless network, but I sometimes use it at home to rehearse new content.

I do use Google Docs for lots of other stuff. It’s not what I’d call full-featured, but when you discover that it’s integrated with Google Talk, it actually makes real-time collaboration pretty nice. Sadly, Microsoft Word is still the only word processing application I’ve seen with offline collaboration features that I’d call “pretty good”. Nothing I’ve seen recently can do what Word did 5 years ago in terms of collaboration. Again — sad.

Preview is a PDF viewer, but it also will do screen grabs. I know there’s a keyboard shortcut to do screen captures. I think it’s shift-command-4. I’m just as happy opening Preview, which is right there on the Dock anyway. It’s better than the old utility Apple provided for this, which would only save in TIFF format.

I feel like people look at me strange when I say that I use a dictionary every single day I’m on the computer (so… every day). I used it for this post, as a matter of fact (“apparatuses” still doesn’t sound right to me). I wish there was an app that could tell you how often you’ve used an app in the last day, week, month, etc. I’ll bet the Dictionary app outnumbers Mail (I usually only use Mail when GMail is down).

System Maintenance

  • Time Capsule/Time Machine
  • AppCleaner
  • Disk Inventory X
  • Apple Remote Desktop

I bought a Time Capsule. It’s an Apple product. It’s an enclosed 1TB hard drive inside of a wireless access point. It also has a USB port where you can connect a hub and then connect up other external USB hard drives, and a USB printer that can then be shared with the whole network without running a long-in-the-tooth Mac G4 with the mirrored doors and the fan that sounds like the landing of the mothership…. uh…. I mean… It’s really easy to use! I use it to back up all of the Macs in the house. The iPhone backs up to my Mac, so that’s covered too.

AppCleaner isn’t horribly useful, but I do use it, and it helps slightly. Maybe. It’s supposed to help you get rid of apps you no longer use, but it still leaves behind seemingly everything that would normally be left behind if you just opened Terminal and typed “sudo rm -rf ./AppName”. I give it the benefit of the doubt. Maybe it catches some stuff sometimes, and then I know all of the usual suspects that hang on to old app cruft, so I can clean some of it out manually without too much fuss.

Disk Inventory X is pretty cool. It presents a tree map view of the contents of your hard drive which makes it dead easy to spot where the disk hogs are. And here I was writing scripts for this ;-) It’s a great spotting tool, but because it’s constantly scraping the disk, it’s quite slow. You also can’t select multiple things in the interface and move them all to the trash at one time, which would be nice. Still, it definitely helped me find stuff I didn’t know was there, and that was taking up lots and lots of space.
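
For the curious, the kind of script I mean looks something like the following. This is a rough sketch rather than the actual script I was using: the function names are made up for this post, and it only uses os from the standard library. It totals up the bytes under each immediate child of a directory and lists the biggest ones first.

```python
import os


def directory_sizes(root):
    """Return {immediate child name: total bytes beneath it} for `root`."""
    sizes = {}
    for entry in os.scandir(root):
        if entry.is_symlink():
            continue  # don't follow links out of the tree
        if entry.is_file():
            sizes[entry.name] = entry.stat().st_size
        elif entry.is_dir():
            total = 0
            for dirpath, _dirnames, filenames in os.walk(entry.path):
                for name in filenames:
                    path = os.path.join(dirpath, name)
                    if os.path.islink(path):
                        continue
                    try:
                        total += os.path.getsize(path)
                    except OSError:
                        pass  # file vanished or is unreadable; skip it
            sizes[entry.name] = total
    return sizes


def biggest(root, limit=10):
    """List the `limit` largest children of `root`, largest first."""
    ranked = sorted(directory_sizes(root).items(),
                    key=lambda item: item[1], reverse=True)
    return ranked[:limit]


if __name__ == "__main__":
    # Report on the home directory; sizes shown in megabytes.
    for name, size in biggest(os.path.expanduser("~")):
        print(f"{size / 1024 ** 2:10.1f} MB  {name}")
```

It’s nowhere near as nice as a tree map, and it has to walk the whole directory every time you run it, but it does point you at the hogs.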

Apple Remote Desktop isn’t something I use often, but it’s handy to have around. It lets you do all kinds of advanced stuff by connecting to the desktop of a remote Mac, but I just do simple things with it. If you didn’t know about it, it’s worth at least being aware of.

System Administration/Geekery

  • Terminal
  • Vim
  • SSH Tunnel Manager
  • VMware Fusion
  • Cisco VPN Client

This is the “where do I start” section for me. I do lots of geekery, and these tools facilitate a lot of the geekery. I stuck with the basics here. I use Terminal because tons of what I do is on the command line. There are things I do on the command line for which GUI applications exist, but to be honest, some of those cost money, and none of them are as efficient or reliable as the command line. I know that makes me sound like an old graybeard, but it’s mostly true. A GUI that really makes something you already know how to do on the command line easier is rare.

Vim, of course, runs inside of Terminal. If I’m writing a bunch of code across lots of files or something, I’ll try to use Komodo Edit (and I might upgrade to Komodo IDE), but if I’m on a remote machine, or I just need to do a quick edit here or there, one file at a time, I’ll just use Vim. Vim can do window splitting and code folding and stuff like that, so Komodo isn’t a requirement for me, it’s just slightly more convenient, and it has Vi key bindings :)

SSH Tunnel Manager is a GUI for managing SSH tunnels. Go figure. I’ve been using it for years now, but to be honest, if I don’t use it for a while, the interface becomes unintuitive to me and I go back to the command line or my SSH config file to set up tunnels.

VMware Fusion is great. I can test the latest Linux distros without devoting a whole machine to them, or I can run Windows and test web stuff in IE. There seems to be no end to the stuff I find myself using VMware Fusion for. Surprising.

I’m told there’s a VPN client built into Snow Leopard, but I haven’t tested it out yet. Some have reported issues, so hopefully they don’t bite me.

Programming/Development

Komodo Edit is my favorite editor for writing code, period. If it didn’t have Vi keybindings, I’d likely just use Vim. And I do, sometimes. My first-choice language these days is Python, but I still write plenty of PHP, shell, SQL, Perl, etc. The Mac comes with XCode as an optional install, and I should really give it another shot, but in the past I’ve felt that it was kind of overwhelming, not to mention kinda clunky and slow.

Django is a Python web framework that comes with a development stand-in web server so you can do all of your development on the laptop, test it all locally, then push out to some environment that more closely matches production.

Speaking of pushing out changes, I mostly use Mercurial for my own projects nowadays, and I rather like it, but lots of things still use Subversion, which is wildly popular. My open source project actually uses Subversion with Google Code, but Google recently announced Mercurial support for hosted projects, so I’ll need to look at changing that over.

Fabric is a deployment tool. It’s written in Python and uses the paramiko library, which I found interesting, because I’d written a couple of automation scripts using paramiko that would have been easier to do with Fabric. I’ve only done simple things with Fabric so far, but it’s worth a look if you do a lot of rsync-ish stuff, followed by some “ssh in a for loop” stuff, supported by some cron jobs…. Fabric can really ease your life.
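
To show what I mean by “rsync-ish stuff followed by ssh in a for loop”, here is a bare-bones sketch of that pattern using only subprocess from the standard library. To be clear, this is not Fabric and doesn’t use its API; it’s the sort of hand-rolled script that Fabric can replace. The host names, paths, and restart command are all invented for the example.

```python
import shlex
import subprocess


def plan_deploy(hosts, local_dir, remote_dir, remote_command):
    """Build the rsync and ssh command lines for each host, without running them."""
    plan = []
    for host in hosts:
        # Push the local tree to the host, then run a command there.
        plan.append(["rsync", "-az", "--delete",
                     f"{local_dir}/", f"{host}:{remote_dir}/"])
        plan.append(["ssh", host, remote_command])
    return plan


def run_plan(plan, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in plan:
        print(shlex.join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)  # stop at the first failure


if __name__ == "__main__":
    # Hypothetical hosts and paths, for illustration only.
    hosts = ["web1.example.com", "web2.example.com"]
    run_plan(plan_deploy(hosts, "./site", "/srv/site",
                         "sudo service myapp restart"))
```

It defaults to a dry run that only prints the commands, which is about the level of safety net you get when you write these things yourself. Error handling, parallelism, and per-host settings are where the home-grown version starts to hurt.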

VMware Fusion is used in a programming context in two ways: to test web stuff on IE (I have an XP VM), and to work with libraries that are more convenient to work with under Linux than on the Mac. Sometimes Linux distros have things built-in that I’d have to build from source (along with all the dependencies) on the Mac.

Firebug is just basically a necessity if you do any kind of web development. It lets you inspect the design elements on the page visually, as well as in code, which makes debugging your CSS so easy it’s almost a non-event.

So… what tools are you using?

Representing Relationships in Django Templates Without Writing Extra Code (RelatedManager and ManyRelatedManager)

I’m writing an application that deals with some slightly complex relationships. There are several offices, and each office has several workers. Workers can have multiple projects, and each project can have multiple workers. In addition, each project can serve multiple clients.

Here’s what that’d look like in a Django models.py file:

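# Imports the models below rely on (localflavor path as shipped with Django 1.x)
from django.contrib.auth.models import User
from django.contrib.localflavor.us.models import PhoneNumberField, USStateField
from django.db import models

class Office(models.Model):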
   office_code = models.CharField(max_length=24, blank=True)
   street_num = models.CharField(max_length=24)
   street_name = models.CharField(max_length=64)
   bldg_no = models.CharField(max_length=12, blank=True)
   suite = models.CharField(max_length=12, blank=True)
   city = models.CharField(max_length=100)
   state = USStateField() # grandiose assumption
   zipcode = models.CharField(max_length=10)
   main_phone = PhoneNumberField()
   site_mgr = models.ForeignKey(User, unique=True)

class Worker(models.Model):
   user = models.ForeignKey(User, unique=True)
   extension = models.CharField(max_length=8)
   office = models.ForeignKey(Office)

class Client(models.Model):
   fname = models.CharField(max_length=64)
   lname = models.CharField(max_length=64)
   street_num = models.CharField(max_length=24, blank=True)
   street_name = models.CharField(max_length=128, blank=True)
   apt_no = models.CharField(max_length=24, blank=True)
   city = models.CharField(max_length=128, blank=True)
   state = USStateField(blank=True)
   zipcode = models.CharField(max_length=10, blank=True)

class Project(models.Model):
   date_started = models.DateTimeField()
   worker = models.ManyToManyField(Worker)
   office = models.ForeignKey(Office)
   client = models.ManyToManyField(Client)

While writing the template for my worker detail page, I decided that I didn’t want to just list the projects for that worker, but I also wanted to list the clients for each project. I ran into a bit of an issue at first in doing this. I tried something like this:

{% block content %}
   <h2>Projects for {{object.user.first_name}} {{object.user.last_name}}</h2>
   <ul>
   {% for project in object.projects %}
      <li><a href="{{project.get_absolute_url}}">{{project.id}} (Opened: {{project.date_started.date}})</a>
      <h2>Clients for project {{project.id}}</h2>
      <ul>
         {% for obj in project.get_clients %}
            <li>{{obj.lname}}, {{obj.fname}}</li>
         {% endfor %}
      </ul>
      </li>
   {% endfor %}
   </ul>
{% endblock %}

Looking back at the models, you’ll note that there’s no “projects” attribute of the Worker class. There’s also no “get_clients” method for the Project class. After reading some forum and blog posts, I got the idea to add these to my models manually. It seems a lot of people solve similar issues this way, and I don’t believe it’s necessary, which is why I’m posting this. What I added to my models looked something like this:

###
### in Worker model
###
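def projects(self):
   # project_set is the reverse accessor Django adds for Project.worker.
   # It is already limited to this worker, so no extra filter is needed.
   return self.project_set.all()

###
### in Project model
###
def get_clients(self):
   # "client" is the ManyToManyField declared on Project itself,
   # so there is no client_set here; the field is the manager.
   return self.client.all()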

Adding these to the models actually does solve the problem, but it’s reinventing the wheel. Perhaps at some point in history Django’s ORM didn’t have the functionality it does now, but these days Django takes care of accessing the objects of related entities for you through the use of the RelatedManager (for the reverse side of one-to-many, foreign key relationships) and the ManyRelatedManager (for many-to-many relationships).

When you create a ForeignKey field or ManyToMany field in a Django model, Django’s ORM becomes aware of the relationship, and implements lots of shortcuts to help you in managing/exploiting it.

After reading some online documentation (see the “Related Objects” area, for one), I was able to get all of the data I wanted into my template without adding a single character of code to my models, which is what I had hoped for. Here’s the right way to do this, unless I’m mistaken (please let me know the best way if I am):

{% block content %}
<h2>Projects for {{object.user.first_name}} {{object.user.last_name}}</h2>
<ul>
   {% for project in object.project_set.all %}
      <li><a href="{{project.get_absolute_url}}">{{project.id}} (Opened: {{project.date_started.date}})</a>
      <h2>Clients for project {{project.id}}</h2>
      <ul>
         {% for obj in project.client.all %}
            <li>{{obj.lname}}, {{obj.fname}}</li>
         {% endfor %}
      </ul>
      </li>
   {% endfor %}
</ul>
{% endblock %}

I’ve replaced “object.projects” with “object.project_set.all”. Note that when a generic view renders a Django template, unless you specify otherwise, the name of a single object passed to the template is “object”, so in this case, “object” is a “Worker” object. The Worker model makes no mention at all of projects, and yet I’m able to easily grab data about project objects. This is because Django’s ORM actually gives you access to related object data from either side of a relationship. Since the Project model has a ManyToMany field referencing Worker, you can access project data from the worker object by using “object.<model>_set.all”, where <model> is replaced with the lower-cased name of the model that points to “object”. Hence, “object.project_set.all”.

Now, in the second case, I’ve replaced “project.get_clients” with “project.client.all”. The Project model directly contains a field named “client” that is a ManyToManyField pointing at the Client model. When the field lives on the model you’re starting from, Django will happily traverse the relationship for you by just referencing the model’s field directly! The “all” method is a standard method of any Manager object I’m aware of, and it’s inherited by the RelatedManager and ManyRelatedManager objects.
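If it helps to see what those two lookups boil down to, here is a rough, Django-free sketch using Python's built-in sqlite3 module. The table names, column names and sample rows are all made up for illustration (Django's real tables carry an app prefix), and the SQL is only an approximation of what the ORM asks for, not a copy of what it generates. The point is that each ManyToManyField lives in its own join table, which is why the relationship can be walked from either side:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE worker  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE client  (id INTEGER PRIMARY KEY, lname TEXT, fname TEXT);
    CREATE TABLE project (id INTEGER PRIMARY KEY, date_started TEXT);
    -- One join table per ManyToManyField; the model tables stay untouched.
    CREATE TABLE project_worker (project_id INTEGER, worker_id INTEGER);
    CREATE TABLE project_client (project_id INTEGER, client_id INTEGER);

    -- Hypothetical sample data
    INSERT INTO worker  VALUES (1, 'alice');
    INSERT INTO client  VALUES (1, 'Smith', 'Pat'), (2, 'Jones', 'Lee');
    INSERT INTO project VALUES (1, '2009-06-01'), (2, '2009-07-15');
    INSERT INTO project_worker VALUES (1, 1), (2, 1);
    INSERT INTO project_client VALUES (1, 1), (1, 2), (2, 2);
""")


def projects_for_worker(worker_id):
    """Roughly the question worker.project_set.all() asks the database."""
    rows = conn.execute(
        "SELECT p.id FROM project p "
        "JOIN project_worker pw ON pw.project_id = p.id "
        "WHERE pw.worker_id = ? ORDER BY p.id",
        (worker_id,),
    )
    return [row[0] for row in rows]


def clients_for_project(project_id):
    """Roughly the question project.client.all() asks the database."""
    rows = conn.execute(
        "SELECT c.lname, c.fname FROM client c "
        "JOIN project_client pc ON pc.client_id = c.id "
        "WHERE pc.project_id = ? ORDER BY c.lname",
        (project_id,),
    )
    return list(rows)


print(projects_for_worker(1))
print(clients_for_project(1))
```

Neither query needs a column on the worker or client table, which matches what the template does: one side uses the field name, the other uses the generated “_set” accessor.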

One interesting thing I found, too, was that there’s no mention of “RelatedManager” or “ManyRelatedManager” in the online Django documentation. This is highly unusual. In my experience, Django’s documentation blows away the docs for just about any project in existence. Did I miss something?

Lessons Learned While Creating a Generic Taxonomy App for Django

So, when I first picked up a guitar, the first song I sat down to learn, by ear, was Stairway to Heaven, not “Twinkle, Twinkle, Little Star”. So goes my experience with Django :)

The Background

I was humming along on my recreation of LinuxLaboratory.org. I got a simple blog in place in just a couple of days, a code-sharing app in place a few days later (if that), and a very simple CMS I threw together using flatpages. A good bit of the base code I used came from the 2nd edition of “Practical Django Projects”, but I soon veered off in other directions, and started analyzing the work I’d already done a bit more closely.

One of the things that was glaringly obvious to me was that my method of classifying content was a little schizophrenic. I had three separate apps to represent different types of content, which is great, but each separate app had its own “Category” model. Yuck. On top of that, I was using django-tagging to enable tagging in addition to the categorization each app supported.

The Problem

So… for one type of classification (Categories), it’s built into the specific application, and for the other (tagging), it’s not built in, but it’s pretty tightly coupled. There are a few fundamental drawbacks to this approach:

First, you have to make a pretty big commitment to these things. The easiest way to implement them in your app is to add support for them at the outset, because adding them in later is going to be a bit of a headache. Categories aren’t quite so bad — I implemented them the way the book does, which is with a ManyToMany field. In Django, when you create a ManyToMany field in a model, there’s no corresponding field in that model’s table in the database. Instead, Django creates a lookup table for you, which is nice, because it means you *could* add categories at a later time without *too* much trouble. Tags use the django-tagging app, which implements tags as a multi-valued field in the database table representing the model that will use the tags. So adding this in later is a little bit more of a hassle.

The second issue is that this approach doesn’t treat classification in a consistent manner. One is in the app, the other is a separate app, one is a field, the other is a model, one affects the model’s table, the other doesn’t, etc. One place where this inconsistency becomes obvious is in your templates, where you’re likely to want to give users the ability to browse by category, or browse by tag. Browsing by category across all the different content types is going to be pretty tough if they all have their own implementation. Tags are a little easier, but it’s still a little cumbersome.

The third issue is specific to Categories, and has to do with maintenance: if I come up with some fantastic idea for the Category implementation (like, I dunno, subcategories?), I have to implement it separately in all of the apps that are using categories. No Bueno™.

The Dream

Wouldn’t it be nice if you could just say “give me a list of the taxonomy types used to classify this piece of content, whatever it is, whichever app it comes from, and also a list of the taxonomy terms involved”? Wouldn’t it be nice if you could just add support for categories and tags using a single app that doesn’t add anything to your existing tables? Wouldn’t it be nice to be able to come up with your own taxonomies and, perhaps, hierarchical taxonomies and more complex relationships? Wouldn’t it be nice to be able to say “here’s a category name, show me all of the content associated with it, sorted by content type” and then change your mind and say “no wait, show it to me sorted by title, with a content type indicator over here”, and then change your mind again and say “er, how about showing the taxonomy type, then the term, then all of the content objects under that type:term pairing”?

The Solution

I think it would, so I started creating this beast that I just call “taxonomy”. Right now it’s pretty simplistic, and it’ll likely change slightly based on some things I’ve learned, but surprisingly, I think my first shot at it is really darn close! I’ve stopped being surprised at how quickly I’m able to prototype in Django: getting this together, including the creation of the models, getting it into the admin interface, and getting it linked with any random content type from any app that wants to use it within the admin interface (to add taxonomies and labels to a piece of content in the ‘edit’ interface) took probably 4 hours, including time to read documentation and fall down a few times.

The admin interface for taxonomy lets you create a taxonomy, so if none of your apps currently support the notion of a “Category”, you can go create a taxonomy called “Category”. Once that’s there, you can create a “taxonomy term”, where you’d select the “type” for this term (your new Category), and then a term. So if your term was “Django”, then you would have just created a category called “Django” that could be used by any other app/model in your project. The same, of course, would go for tags, and whatever other classification devices you want.

There’s support for parent-child relationships at the taxonomy term level (so you can have subcategories, or even subtags if you want, etc. I guess you could even categorize tags, and tag categories! They’re coming to take me away!!!). I haven’t given much thought to having hierarchical relationships for the taxonomies themselves. That would be a little overboard, no? I’m interested to hear realistic use cases for that :)

Once you’ve created a taxonomy and a term, the next thing to do is figure out how to associate your actual content to it. So, if our taxonomy is “Category” and the category name (the “term”) is “Django”, the way I’ve implemented it is that you’d go into the edit interface for the article, and a form for associating it with your category appears. This was created using a GenericInlineModelAdmin, which was a gem of a find in the documentation. Inlines let you easily create a form to update a piece of content using concepts and attributes from other models, and even other applications. If you don’t know much about Django, this sounds like a big mess, but in reality, it’s fairly elegant.

I’ve done some testing to see that I can pull things out of the database and associate things properly in the presentation layer, but I’d like to work on making it smoother before I go releasing code or anything like that…. which reminds me that I *did* look and ask around about an app that maybe already did this and came up dry. If anyone sees this and says “why not just use x”, let me know, because it’s not really a goal to write code for the sake of coding. I actually thought this was an interesting feature and couldn’t find it.

Lessons Learned #1: URLConf is a Choice, Not a Requirement

First, I learned that it’s completely possible to create an application that doesn’t have a URLConf at all. Currently, taxonomies actually work in testing, and there’s no URLConf. There actually *will* be one when I figure out how I want the data to be used on my own site, and how to enable users to do whatever they want with it as well. One thought, for example, is that it would be really awesome to be able to go to “/categories/django” and have my app somehow “just know” that “categories”, when singularized, is “category”, which is a taxonomy. From there, the taxonomy app takes over, and magic happens. I have faith that I can make this happen without having the word “taxonomy” in the url. We’ll see.

Anyway, the point is that you don’t have to have a URLconf, and that hadn’t really occurred to me. For the record, django-tagging also doesn’t have a URLconf.
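To make the “/categories/django” idea a bit more concrete, here’s a small sketch in plain Python of how a URL could be mapped to a taxonomy and a term. None of this is in the app yet: the function names are hypothetical, the singularizer is deliberately naive (it only knows “ies” and a trailing “s”), and the set of known taxonomies would really come from the Taxonomy table rather than being passed in by hand.

```python
def singularize(word):
    """Very naive English singularizer, for illustration only."""
    if word.endswith("ies"):
        return word[:-3] + "y"      # categories -> category
    if word.endswith("s") and not word.endswith("ss"):
        return word[:-1]            # tags -> tag
    return word


def resolve_taxonomy_url(path, known_taxonomies):
    """Map '/categories/django' to ('category', 'django').

    Returns None when the path is not exactly two segments long or the
    first segment does not singularize to a known taxonomy name.
    """
    parts = [p for p in path.strip("/").split("/") if p]
    if len(parts) != 2:
        return None
    taxonomy = singularize(parts[0].lower())
    if taxonomy not in known_taxonomies:
        return None
    return taxonomy, parts[1]


known = {"category", "tag"}
print(resolve_taxonomy_url("/categories/django", known))
print(resolve_taxonomy_url("/blog/2009", known))
```

Anything that comes back as None would simply fall through to whatever other URL handling the project has, so the word “taxonomy” never needs to appear in the URL.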

Lessons Learned #2: ContentTypes Let Your Models Be All-Knowing

The second thing I learned was that, using the ContentTypes framework within Django, it’s possible to create a model that will deal with data, and relationships to data, in a dynamic way, such that you don’t have to know what type of data your models will be working with at the time you create them.

For example, my taxonomy app can be used with my blog’s “Entry” model, my code-sharing app’s “Snippet” model, and my CMS’s “Page” model. If I pass the app to you, you can use it for your news site’s “Story” model, your ad network’s “Ad” model, and your Twitter clone’s “Tweet” model. No problem. This is in the docs, but here’s what I’ve done:

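# Imports the fields below rely on
from django.contrib.contenttypes import generic
from django.contrib.contenttypes.models import ContentType
from django.db import models

class Taxonomy(models.Model):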
 """A facility for creating custom content classification types"""
 type = models.CharField(max_length=50, unique=True)

class TaxonomyTerm(models.Model):
 """Terms are associated with a specific Taxonomy, and should be generically usable with any contenttype"""
 type = models.ForeignKey(Taxonomy)
 term = models.CharField(max_length=50)
 parent = models.ForeignKey('self', null=True,blank=True)

class TaxonomyMap(models.Model):
 """Mappings between content and any taxonomy types/terms used to classify it"""
 term        = models.ForeignKey(TaxonomyTerm, db_index=True)
 type        = models.ForeignKey(Taxonomy, db_index=True)
 content_type = models.ForeignKey(ContentType, verbose_name='content type', db_index=True)
 object_id      = models.PositiveIntegerField(db_index=True)   
 object         = generic.GenericForeignKey('content_type', 'object_id')

Note that I’ve removed some stuff from the model defs — what you see here are just the fields, which are the relevant bit for what I’m explaining.

The TaxonomyMap model (a model is a class definition, by the way) has foreign keys to map to a human readable ‘term’ and ‘type’ in the other models. TaxonomyMap is just to store mappings between content objects and taxonomies (lower-level details of this might change to make it cleaner/more efficient – I know it’s not perfect). So, how does my app know that I’m storing a mapping to an “Entry” from my blog app? How does it get the id for that Entry? What’s going on?

Well, Django stores a list of every content type used by Django and any installed apps, and I’ve made a foreign key to ContentType so I can access the content type of the object that’s being dealt with and get its ID. I also have a “GenericForeignKey” field, which essentially creates a “dynamic” foreign key to the table that represents the object that’s being dealt with, so if I’m dealing with an “Entry” object from “monk” (which is the name of my blog app), then the foreign key will point to “monk_entry”, which is the table that stores my blog entries. When you create a taxonomy, and a term, and associate them to a piece of content, the resulting rows in the affected tables look like this:

mysql> select * from taxonomy_taxonomy;
+----+--------------+
| id | type         |
+----+--------------+
|  1 | TestCategory |
+----+--------------+
1 row in set (0.01 sec)

mysql> select * from taxonomy_taxonomyterm;
+----+---------+----------+-----------+
| id | type_id | term     | parent_id |
+----+---------+----------+-----------+
|  1 |       1 | TestTerm |      NULL |
+----+---------+----------+-----------+
1 row in set (0.00 sec)

mysql> select * from taxonomy_taxonomymap;
+----+---------+---------+-----------------+-----------+
| id | term_id | type_id | content_type_id | object_id |
+----+---------+---------+-----------------+-----------+
|  1 |       1 |       1 |              10 |         2 |
+----+---------+---------+-----------------+-----------+
1 row in set (0.00 sec)

Note that the table for the model that’s using the taxonomy app is untouched. Only taxonomy tables are used.
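For anyone wondering how a row like (content_type_id=10, object_id=2) turns back into a blog entry, here’s a rough sketch of the lookup, written against Python’s built-in sqlite3 module rather than Django itself. The content type table is simplified, the entry titles are invented, and the two-step lookup is my mental model of what a generic foreign key does, not Django’s actual implementation:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified stand-in for Django's content type table
    CREATE TABLE django_content_type (
        id INTEGER PRIMARY KEY, app_label TEXT, model TEXT);
    CREATE TABLE monk_entry (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE taxonomy_taxonomymap (
        id INTEGER PRIMARY KEY, term_id INTEGER, type_id INTEGER,
        content_type_id INTEGER, object_id INTEGER);

    INSERT INTO django_content_type VALUES (10, 'monk', 'entry');
    -- Hypothetical blog entries
    INSERT INTO monk_entry VALUES (1, 'First post'), (2, 'Hello taxonomy');
    -- Same mapping row as in the MySQL output above
    INSERT INTO taxonomy_taxonomymap VALUES (1, 1, 1, 10, 2);
""")


def resolve_generic(map_id):
    """Follow a TaxonomyMap row to the content object it points at."""
    content_type_id, object_id = conn.execute(
        "SELECT content_type_id, object_id "
        "FROM taxonomy_taxonomymap WHERE id = ?",
        (map_id,),
    ).fetchone()
    app_label, model = conn.execute(
        "SELECT app_label, model FROM django_content_type WHERE id = ?",
        (content_type_id,),
    ).fetchone()
    # Default Django table naming is <app_label>_<model>. The name comes
    # from our own lookup table, not from user input.
    table = f"{app_label}_{model}"
    row = conn.execute(
        f"SELECT id, title FROM {table} WHERE id = ?", (object_id,)
    ).fetchone()
    return table, row


print(resolve_generic(1))
```

The content type row says which table to look in, and object_id says which row in that table, so the mapping table never needs to know about “monk_entry” ahead of time.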

Seeing this, you might think that it’d be hard to put a form in the admin interface for arbitrary content types to classify them with taxonomies. Not so — which brings me to more lessons I learned.

Lessons Learned #3: Collecting Data About a Model Without Extending the Model and Creating Database Badness

If you have a model (we’ll use “Entry” again), and it has a core set of attributes, but you want to associate data with instances of this model not represented in the model definition (like, say, a taxonomy, for an arbitrary example), you can add a form to the admin interface for that model in about 5 minutes. This rocks, for those who didn’t know, because the alternative would either involve really ugly code, or really ugly data (you’d have to store the taxonomies in the table for the model, creating either tons of duplicate data, or multi-valued fields… and you’d still have duplicate data).

Typically, it seems that the normal use case for this is to relate models in the admin interface that are part of the same application and are explicitly related through a direct foreign key reference. This might even be enforced in the case of “InlineModelAdmin” objects, but I haven’t dealt with those personally. However, while reading about “GenericInlineModelAdmin” objects, it occurred to me that it shouldn’t matter that the related items are from different apps. I tried it, and it worked. Here’s what I did:

from django.contrib import admin
from django.contrib.contenttypes import generic
from monk.models import Entry
from taxonomy.models import TaxonomyMap

class TaxonomyMapInline(generic.GenericTabularInline):
   model = TaxonomyMap

class EntryAdmin(admin.ModelAdmin):
   prepopulated_fields = { 'slug': ['title'] }
   inlines = [ TaxonomyMapInline, ]

admin.site.register(Entry, EntryAdmin)

Again, I’ve edited out the irrelevant bits. The above comes from my blog app’s admin.py file. What I did was create an “inline” called “TaxonomyMapInline”, and then associate that inline with the “EntryAdmin” ModelAdmin object using ModelAdmin’s ‘inlines’ attribute, which takes a Python list, which means you can keep adding more inlines all day long if you like.

The result is that, when I go to edit a blog entry, there’s now a form at the bottom that lets the user select a taxonomy type and term (e.g. “Category” and “Django”), and associate it with the post. When I added the inline to the admin.py file, it was a test to see what would happen. Since TaxonomyMap doesn’t hold anything but numeric IDs, I assumed I would have to go back and manually map the IDs to human readable values. Not true. Apparently, if the field being presented in the admin form maps to a ForeignKey field, Django automagically does the lookup for you and presents the human-readable text! And, when you save, it converts everything back to numeric IDs before going to the database, so everything “just works”. So the work I thought I’d be doing myself was already done for me!

Using a robots.txt File With Django and Apache (on Webfaction)

I’ve developed in a few different environments, including multi-tier ones with middle tier Java app servers and stuff, but it always seemed pretty straightforward to serve something directly from disk. And in the case of PHP, everything is served from disk. There’s no middleware to speak of, so you can throw a robots.txt file in place and it “just works”. With Django, it’s slightly different because of two things:

  1. Django shouldn’t be serving static content (and therefore makes it a little inconvenient though not impossible to do so).
  2. Django works kinda like an application server that expects to receive URLs, and expects there to be some configuration in place telling it how to deal with that URL.

If you have Django serving static content, you’re wasting resources, so I’m not covering that here. My web host is webfaction, and they give you access to the configuration of your own Apache instance in addition to your Django installation’s configuration (in fact, I’m just running an svn checkout of django-trunk), so this gives you a lot of flexibility in how you deal with static files like CSS, images, or a robots.txt file. To handle robots.txt on my “staging” version of my site, I added the following lines to my apache httpd.conf file:

LoadModule alias_module modules/mod_alias.so
<Location "/robots.txt">
 SetHandler None
</Location>
alias /robots.txt /home/myusername/webapps/mywsgiapp/htdocs/robots.txt

If you don’t load mod_alias, you’ll get an error saying that the keyword “alias” is a misspelling or is not supported by Apache. I use “<Location>” here instead of “<Files>” or “<Directory>” because I’m applying the rule only to incoming requests for “/robots.txt” explicitly, and it isn’t likely that I’ll have more than one way of reaching that file, since I’m not aware of engines that look for robots.txt in some other way. <Directory> applies rules to an entire directory and its subdirectories, and <Files> applies rules to a file on disk, so the rules will apply even if there’s more than one URL that maps to the file.

Django Settings in Dev and Production: Why the hoops?

So, I’ve taken a break from active development on my project to take a step back and really get a good development workflow together. I’ve been fighting with various components of my development workflow, and in the end decided to compromise: I won’t have something that looks exactly like production, but I’ll have something that works and is easy to use. I’ll make up for it by having a staging environment on the same host as the production deployment, which will catch any differences between dev and production that result from my non-identical setup on my laptop.

In getting things going, one thing I ran into immediately was that things on my dev box are different from production: database credentials, paths to media, etc., and since I’m not using Apache and a reverse proxy some of the paths in settings.py will also be different. So what to do?

Turns out there’s an entire page on the Django wiki detailing the ways in which people keep their dev and production settings from trampling each other. There are also various blog posts where people have come up with interesting ways to make things work properly, usually by taking advantage of the fact that your app’s settings are just python code. As such, you can perform any valid Python wizardry you want to make the right things happen.
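To give a flavor of the kind of wizardry I mean, here’s a minimal sketch of one common trick: pick overrides based on the machine’s hostname. This isn’t lifted from the wiki page or any particular blog post; the hostnames, setting values and function name are all invented for the example.

```python
import socket

# Settings shared by every environment (values are placeholders)
BASE = {
    "DEBUG": False,
    "DATABASE_NAME": "mysite",
    "MEDIA_ROOT": "/srv/mysite/media",
}

# Per-machine overrides, keyed by hostname (hypothetical hostnames)
PER_HOST = {
    "my-laptop": {
        "DEBUG": True,
        "DATABASE_NAME": "mysite_dev",
        "MEDIA_ROOT": "/Users/me/mysite/media",
    },
}


def build_settings(hostname=None):
    """Return BASE with any overrides for this host layered on top."""
    hostname = hostname or socket.gethostname()
    merged = dict(BASE)                          # copy, so BASE stays intact
    merged.update(PER_HOST.get(hostname, {}))    # unknown hosts get BASE as-is
    return merged


print(build_settings("my-laptop")["DATABASE_NAME"])
print(build_settings("some-production-box")["DEBUG"])
```

In a real settings.py you’d finish with something like globals().update(build_settings()) so the merged values become module-level settings. The appeal is that one file lives in version control and works everywhere; the cost is that every machine’s details end up in that one file.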

I don’t think there’s anything necessarily wrong with doing all of that stuff, but what I’m wondering is this: why not just tell your version control system of choice to *ignore* your settings.py file? This way, you can have a settings.py file on your dev box that works perfectly for your dev environment, and a separate one in production that works perfectly for that environment. Never the twain shall meet.

If you’re using one of the methods described on the Django site or any of the blog posts, what are you getting out of it that couldn’t be accomplished by ignoring settings.py? It just seems like it’s simpler and cleaner to do it that way, but I assume there’s some benefit to jumping through these hoops that I’m missing.

Input is hereby solicited. Please tweet/link/post this wherever, ‘cos I’d like some opinions on the matter.