Tornado’s Big Feature is Not ‘Async’

I’ve been working with the Tornado web server pretty much since its release by the Facebook people several months ago. If you’ve never heard of it, it’s a sort of hybrid Python web framework and web server. On the framework side of the equation, Tornado has almost nothing. It’s completely bare bones when compared to something like Django. On the web server side, it is also pretty bare bones in terms of hardcore features like Apache’s ability to be a proxy and set up virtual hosts and all of that stuff. It does have some good performance numbers though, and the feature that seems to drive people to Tornado is that it’s asynchronous and pretty fast.

I think some people come away from their initial experiences with Tornado a little disheartened because only upon trying to benchmark their first real app do they come face to face with the reality of “asynchronous”: Tornado can be the best async framework out there, but the minute you need to talk to a resource for which there is no async driver, guess what? No async.
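
To make that concrete, here’s a minimal sketch of the effect. It uses the standard library’s asyncio event loop as a stand-in for Tornado’s IOLoop (an assumption for illustration, not something from my actual apps), and the names blocking_query, handler_blocking, handler_cooperative, serve and measure are all hypothetical. time.sleep plays the part of a driver with no async support.

```python
import asyncio
import time


def blocking_query(seconds):
    """Hypothetical stand-in for a database driver with no async support."""
    time.sleep(seconds)  # blocks the whole thread, event loop included
    return "rows"


async def handler_blocking(delay):
    # Calls the blocking driver directly, so no other handler can run meanwhile.
    return blocking_query(delay)


async def handler_cooperative(delay):
    # What an async-aware driver would do: yield to the loop while waiting.
    await asyncio.sleep(delay)
    return "rows"


async def serve(handler, n, delay):
    """Run n concurrent 'requests' through handler and return elapsed seconds."""
    start = time.perf_counter()
    await asyncio.gather(*(handler(delay) for _ in range(n)))
    return time.perf_counter() - start


def measure(handler, n=5, delay=0.1):
    return asyncio.run(serve(handler, n, delay))


if __name__ == "__main__":
    print("blocking driver:    %.2fs" % measure(handler_blocking))
    print("cooperative driver: %.2fs" % measure(handler_cooperative))
```

With five requests and a 0.1 second “query”, the blocking version should take roughly the sum of the delays, while the cooperative one should take roughly a single delay. Same event loop, same handlers, and the only difference is whether the driver yields.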

Some people might even leave the ring at this point, and that’s a shame, because to me the async features in Tornado aren’t what attract me to it at all.

Why Tornado, if Not For Async?

For me, there’s an enormous win in going with Tornado (or other things like it), and to get this benefit I’m willing to deal with some of Tornado’s warts and quirks. I’m willing to deal with the fact that the framework provides almost nothing I’m used to having after being completely spoiled by Django. What’s this magical feature, you ask? It’s simply the knowledge that, in Tornado-land, there’s no such thing as mod_wsgi. And no mod_python either. There’s no mod_anything.

This means I don’t have to think about sys.path, relative vs. absolute paths, whether to use daemon or embedded mode, “Cannot be loaded as Python module” errors, “No such module” errors, permissions issues, subtle differences between Django’s dev server and Apache/mod_wsgi, reconciling all of these things when using/not using virtualenv, etc. It means I don’t have to metascript my way into a working application. I write the app. I run the app.

Wanna see how to create a Tornado app? Here’s one right here:

import tornado.httpserver
import tornado.ioloop
import tornado.web

class MainHandler(tornado.web.RequestHandler):
    def get(self):
        self.write("This is a Tornado app")

application = tornado.web.Application([
    (r"/", MainHandler),
])

if __name__ == "__main__":
    http_server = tornado.httpserver.HTTPServer(application)
    http_server.listen(8888)
    tornado.ioloop.IOLoop.instance().start()

Save this to whatever file you want, run it, and run ‘curl http://localhost:8888’ and you’ll see ‘This is a Tornado app’ on your console.

Simplistic? Yes, absolutely. But when you can just run this script, put it behind nginx, and have it working in under five minutes, you dig a little deeper and see what else you can do with this thing. Turns out, you can do quite a bit.

Can I Do Real Work With This?

I’ve actually been involved in a production launch of a non-trivial service running on Tornado, and it was mind-numbingly easy. It was several thousand lines of Python, all of which was written by two people, and the prototype was up and running inside of a month. Moving from prototype to production was a breeze, and the site has been solid since its launch a few months ago.

Do You Miss Django?

I miss *lots* of things about Django, sure. Most of all I miss Django’s documentation, but Tornado is *so* small that you actually can find what you need in the source code in 2 minutes or less, and since there aren’t a ton of moving parts, when you find what you’re looking for, you just read a few lines and you’re done: you’re not going to be backtracking across a bunch of files to figure out the process flow.

I also miss a lot of what I call Django’s ‘magic’. It sure does a lot to abstract away a lot of work. In place of that work, though, you’re forced to take on a learning curve that is steeper than most. I think it’s worth getting to know Django if you’re a web developer who hasn’t seen it before, because you’ll learn a lot about Python and how to architect a framework by digging in and getting your hands dirty. I’ve read seemingly most books about Django, and have done some development work in Django as well. I love it, but not for the ease of deployment.

I spent more time learning how to do really simple things with Django than it took to:

  1. Discover Tornado
  2. Download/install and run ‘hello world’
  3. Get a non-trivial, commercial application production-ready and launch it.

Deadlines, indeed!

Will You Still Work With (Django/Mingus/Pinax/Coltrane/Satchmo/etc)?

Sure. I’d rather not host it, but if I have to I’ll get by. These applications are all important, and I do like developing with them. It’s mainly deployment that I have issues with.

That’s not to say I wouldn’t like to see a more mature framework made available for Tornado either. I’ve worked on one, though it’s not really beyond the “app template” phase at this point. Once the app template is able to get out of its own way, I think more features will start to be added more quickly… but I digress.

In the end, the astute reader will note that my issue isn’t so much with Django-like frameworks (though I’ll note that they don’t suit every purpose), but rather with the current trend of using mod_wsgi for deployment. I’ll stop short of bashing mod_wsgi, because it too is an important project that has done wonders for the state of Python in web development. It really does *not* fit my brain at all, though, and I find when I step into a project that’s using it and it has mod_wsgi-related problems, identifying and fixing those problems is typically not a simple and straightforward affair.
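
For reference, the application side of a WSGI deployment is tiny; the trouble I’m describing lives in the hosting layer around it. Here’s a minimal sketch using only the standard library’s wsgiref. The callable is named application because that’s the name mod_wsgi looks for by default; call_app is a hypothetical helper I’ve added to invoke the callable in-process, with no Apache involved.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """Minimal WSGI callable, as described by PEP 3333."""
    body = b"Hello World!\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_app(app, path="/"):
    """Hypothetical helper: call a WSGI app directly and return (status, body)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required WSGI environ keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body


if __name__ == "__main__":
    print(call_app(application))
```

If something like this works when called directly but returns a 500 under Apache, the difference is in how the server process finds and imports the code (sys.path, permissions, daemon vs. embedded mode), not in the callable.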

So, if you’re like me and really want to develop on the web with Python, but mod_wsgi eludes you or just doesn’t fit your brain, I can recommend Tornado. It’s not perfect, and it doesn’t provide the breadth of features that Django does, but you can probably get most of your work done with it in the time it took you to get a mod_wsgi “Hello World!” app to not return a 500 error.

  • http://christophermahan.com/ Christopher Mahan

    Have you had a look at fapws?

  • http://www.tomaz-muraus.info Tomaž Muraus

    I can see where you are going with the sys.path, mod_wsgi and all the other environment related issues (I’ve unfortunately spent a lot of time dealing with those issues as well).

    Even when you have a good deployment script set up, things tend to break :\

    That is also one of the reasons I like to develop apps on App Engine using the webapp framework, which (at least to me) is pretty simple, and I haven’t had any major issues with it so far.

  • http://blog.dscpl.com.au Graham Dumpleton

    As much as bashing up on mod_wsgi is getting more popular, the problems often have got nothing to do with mod_wsgi at all. Take for example my recent post at ‘http://blog.dscpl.com.au/2010/03/improved-wsgi-script-for-use-with.html’. This goes into quite a bit of detail about why there are differences in running on Django development server versus mod_wsgi.

    As it turns out, mod_wsgi is doing nothing wrong at all. The problem is that the documented way for running under WSGI hosting mechanisms for Django is too simplified and doesn’t actually do things in the same way as the development server does. As a result, it fails to set up the Django process run time environment the same. If the setup isn’t done the same, how can you expect it to run the same way?

    This problem isn’t isolated to Django and is a problem with other WSGI capable frameworks which embed their own development WSGI server. They do hidden things to make it so the development servers just work, but at the same time don’t ensure that things are as simple when deploying to an alternate production WSGI server.

    A lot of the problems would therefore be solved through better documentation on WSGI hosting being provided by the frameworks and actually providing better WSGI application entry points which do all the required initialisation, the same way, rather than just providing a handler and leaving you to work out the magic incantations to work out how to set up the environment yourself.

  • m0j0

    @Graham

    I haven’t read up much on the mod_wsgi bashing, and I’m sorry this came across as purposefully bashing mod_wsgi, I guess it’s hard to make this kind of post *not* come across that way. I *do* mention that mod_wsgi is a very important development in the world of Python web hosting, and I and countless others *do* very much appreciate your work in that area.

    That said, I wonder if you’ve thought about the sheer number of hours you’ve spent defending mod_wsgi against bashers, clarifying usage, documenting and redocumenting, and the like? It’s great that you’re such a devoted and diligent project leader, but I wonder if all of that work isn’t necessitated by either some piece of mod_wsgi or its use that is overly complex and causing issues, or a lack of a “one true document” or “one true set of documents” that clearly explains the execution path and how it relates to the various configuration settings that are possible? Or maybe it’s as simple as something like “mod_wsgi needs better error messages” or something like that?

    I’d *really* like to see (and link to it if it exists please!) a document that says something like “When a request for yourapp.com/ comes in, Apache looks for your wsgi script in the location defined by $x, and from there, Python needs to know the locations of $y and $z, which are defined by $a and $b respectively. If those aren’t found, you might see an error like $m or $n. This can also happen if there’s no __init__.py or if $p. If there are no errors up to this point, then the code in $r is executed, and you should see its output.” This document would, ideally, have nothing to do with any specific framework. Maybe I’ll write one some day, because clearly I need it, maybe more than most; 90% of my work is *not* related to mod_wsgi, so I’m often coming at it after some time away from it.

    You can point the finger at Django or other frameworks, but you should know that I’ve never actually read the Django docs regarding mod_wsgi, so it’s not really applicable here. I’ve read your documentation and tried to work out the differences in my various configurations, and then I’ve read at least 20 blog posts about how to get it to work, and all of them do something different, and seem to work for whoever the blogger is, even though their environmental setups don’t appear to be drastically different from one another. Certainly there are problems with HOWTO posts like that: they almost never explain why they chose the settings they did except for the ones that are already completely obvious. :-/

    Anyway, I’m not telling people to move away from….anything. I *am* pointing to a tool that can get them up and running quickly, but I’m also pointing out that Django/et al and mod_wsgi are both still extremely important, to me and the rest of the webfaring public, and I continue to use them both. I just wish I could find some consistency in how things are deployed.

    PS – I read your post the day it came out, and tried your script. The project du jour continues to be a failure. The major frustration, of course, is that “I’ve done this successfully before, why won’t it work now?”

  • http://xtat.net Todd Troxell

    Couldn’t agree more– for me the best feature of tornado is that it’s not everything but the kitchen async :)

  • http://syntacticbayleaves.com Carlo Cabanilla

    I’ve been working on a Tornado app for the past month and it’s a great experience. I have to agree that you come for the async but stay for the simplicity. I really wish it had an async db api though, like Twisted adbapi. To mitigate slow db calls, I have nginx proxying to a pool of 3 Tornados. It also helps to delegate static file handling to nginx.

  • http://ludovf.net Ludovico Fischer

    Ah, mod_php, my beloved: what else compares to your gardens of unattainable bliss?

  • http://glyph.twistedmatrix.com/ Glyph Lefkowitz

    If your concern is deployment, why not just deploy Django with Twisted? It only takes a few seconds to get started:

    http://blog.dreid.org/2009/03/twisted-django-it-wont-burn-down-your.html

    if you want a slightly more robust deployment setup, you can also check this out:

    http://clemesha.org/blog/2009/apr/23/Django-on-Twisted-using-latest-twisted-web-wsgi/

  • http://www.thinkingscreen.com Robert Mela

    mod_wsgi’s been great for us and no problem at all to configure. We’ve used it with standalone wsgi, with Django, and with Flask.

    The problems you describe seem to be related to getting Python path set up. This to me is a generic Python issue, not specific to mod_wsgi.

    The way we do it for our apps is pretty simple — we include our wsgi files in the app repository, typically under the root of the application.

    So, for a simple WSGI or Flask app –

    myapp/__init__.py
    myapp/myscript.py
    myapp/wsgifiles/myapp.wsgi

    Then, in myapp.wsgi —

    import sys,os

    # Append parent directory of this wsgi file to python path
    pathElements = os.path.dirname(__file__).split('/')[0:-1]
    sys.path.append("/%s" % "/".join(pathElements))

    # Now that the path is set, load the symbol for the wsgi application

    from myapp import foobar as application # where foobar is a wsgi application

    There — now that isn’t so hard…. tho Django IIRC requires the parent of the application directory be in the path — and that IIRC was a minor headache.

  • Mark

    With Tornado’s fork Anzu you can do this:

    @location('/login')
    class LoginHandler(BaseHandler):
        # …

        @error_handler(get)
        @validate(validators={'email': validators.Email(not_empty=True, min=7, max=64),
                              'password': validators.UnicodeString(not_empty=True, min=6, max=32, strip=False)})
        def post(self):
            # …

  • whardier

    I honestly like the simplicity of smaller frameworks as long as there is a method for dealing with thread safe database calls. I spent far too many hours dealing with how Django preferred to deal with queries and joins, even after lazy loading. I was impressed by how the ORM worked, but above and beyond slow, unscalable queries, mapping/pickling/reclassing/etcing the data (even if I was using MongoDB or something similar) eventually became a huge bottleneck.

    So I see a lot of benefit from the async side of tornado.. but the biggest feature is simple and direct access for people that are beyond the prototype stage and don’t need the handy helpers but still want a python oriented solution that has enough basic modules to be useful.

    I’m using tornado for sockets, for redis access, for mongodb and I’m using it to create tools that complement the front end.. so I’m lovin it. I hope you’re still on the boat.
