Quick Loghetti Update

For the familiar and impatient: Loghetti has moved to GitHub and has been updated. An official release hasn’t been made yet, but cloning the repository and installing argparse will result in perfectly usable code. More on the way.

For the uninitiated, Loghetti is a command-line log sifting/reporting tool written in Python to parse Apache Combined Format log files. It was initially released in late 2008 on Google Code. I used loghetti for my own work, which involved sifting log files with tens of millions of lines. Needless to say, it needed to be reasonably fast and give me a decent amount of control over the data returned. It also had to be easy to use; just because it’s fast doesn’t mean I want to retype my command because of confusing options or the like.
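
To give a rough idea of what sifting Combined Format logs involves, here is a minimal sketch. It is not loghetti's actual code: the regular expression, the function names `parse_lines` and `filter_status`, and the sample lines are all assumptions made up for illustration, and the pattern does not handle escaped quotes inside fields.

```python
import re

# Apache Combined Log Format, roughly:
#   host ident user [time] "request" status bytes "referer" "user-agent"
# Illustrative pattern only; it assumes no escaped quotes inside quoted fields.
COMBINED_RE = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_lines(lines):
    """Yield one dict per line that matches the pattern; skip the rest."""
    for line in lines:
        match = COMBINED_RE.match(line)
        if match:
            yield match.groupdict()


def filter_status(records, status):
    """Yield only records whose status code equals `status` (a string)."""
    for record in records:
        if record["status"] == status:
            yield record


# Hypothetical sample data; a real run would iterate over an open file instead.
sample = [
    '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" '
    '200 2326 "http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"',
    'not a log line',
    '10.0.0.5 - - [10/Oct/2000:13:56:01 -0700] "GET /missing HTTP/1.0" '
    '404 209 "-" "curl/7.0"',
]

hits = list(filter_status(parse_lines(sample), "404"))
```

Because both functions are generators, lines are processed one at a time, which keeps memory use flat even on very large files.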

So, loghetti is reasonably fast, and reasonably easy, and gives a reasonable amount of control to the end user. It’s certainly a heckuva lot easier than writing regular expressions into ‘grep’ and doing the ol’ ‘press & pray’.

Loghetti suffered a bit over the last several months because one of its dependencies broke backward compatibility with earlier releases. Such is the nature of development. Last night I finally got to crack open the code for loghetti again, and was able to put a solution together in an hour or so, which surprised me.

I was able to completely replace Doug Hellmann’s CommandLineApp with argparse very, very quickly. Of course, CommandLineApp was taking on responsibility for actually running the app itself (the main loghetti class was a subclass of CommandLineApp), and was dealing with the options, error handling, and all that jazz. It’s also wonderfully generic, and is written so that pretty much any app, regardless of the type of options it takes, could run as a CommandLineApp.

argparse was not a fast friend of mine. I stumbled a little over whether I should just update the namespace of my main class via argparse, or if I should pass in the Namespace object, or… something else. Eventually, I got what I needed, and not much more.
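
For anyone facing the same choice, here is a minimal sketch of the two approaches. The class names, the `build_parser` helper, and the `--code` and `--count` options are hypothetical and are not loghetti's real interface; the only argparse feature relied on is the documented `namespace` argument to `parse_args`.

```python
import argparse


def build_parser():
    # Hypothetical options, for illustration only.
    parser = argparse.ArgumentParser(description="hypothetical log filter")
    parser.add_argument("logfile")
    parser.add_argument("--code", help="only show this HTTP status code")
    parser.add_argument("--count", action="store_true")
    return parser


class FilterWithOptions:
    """Approach 1: accept the parsed Namespace and keep a reference to it."""

    def __init__(self, options):
        self.options = options


class FilterAsNamespace:
    """Approach 2: let argparse set the parsed values directly on the instance."""


argv = ["access.log", "--code", "404"]

# Approach 1: parse first, then hand the Namespace to the application object.
app_a = FilterWithOptions(build_parser().parse_args(argv))

# Approach 2: pass an existing object as the namespace; argparse fills it in
# by setting one attribute per option.
app_b = FilterAsNamespace()
build_parser().parse_args(argv, namespace=app_b)
```

The first approach keeps option values in one place (`app_a.options.code`), while the second makes them plain attributes (`app_b.code`) at the cost of possible name clashes with the class's own attributes.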

So loghetti now requires argparse, which is not part of the standard library. Why replace what I knew with some other (foreign) library? Because argparse is, as I understand it, slated for inclusion in Python 3, at which point optparse will be deprecated.

So, head on over to the GitHub repo, give it a spin, and send your pull requests and patches. Let the games begin!

  • slestak

    Do you think Doug will end up porting CommandLineApp to use argparse in the end, though? Not being critical, just wondering. Logging has been on my mind all day, thanks for your post on loghetti.

  • m0j0

    @slestak — I’m not sure it would make sense for Doug. His code sorta steps in the middle of things done by something like argparse, and something you often have to write code to do yourself. It’s not that he couldn’t use argparse, but it might not add much value in CommandLineApp. I’m not an expert on his code — I only peered in to see how a couple of things were handled.

    I didn’t switch to argparse because CLA couldn’t handle the job. The main reason I moved over is that I wanted one less external module requirement going forward, and argparse is slated for inclusion somewhere in 3.0-land. That means two things:

    1. Users don’t have to run easy_install or do anything extra to get things working.

    2. I don’t have to think about whether the module is going to be maintained in perpetuity (more or less), or if it’s going to break my code, or if I’m gonna have to fork and maintain my own version of it, taking time away from loghetti.

    Doug is a contributor to loghetti outside of CommandLineApp, and it was Doug who pointed me at argparse in the first place, so I’m not committing an atrocity or an affront to Doug by going in this direction or anything :)

  • Saint Aardvark (http://saintaardvarkthecarpeted.com/blog)

    Just out of curiosity, why did you move from Google Code? Was it just the switch to git (which rocks), or something else?

  • John

    Hi m0j0,

    Just curious: did you consider using [apachelog](http://pypi.python.org/pypi/apachelog)?

    Also, I’m unable to find Loghetti at the Cheeseshop. Any plans to upload there?

    Thanks.

  • m0j0

    I hadn’t ever had a request to put it on the Cheeseshop, and it never occurred to me to do it. I suppose I could do that if there were some actual demand. I have no idea how to actually do that, either, but I’m sure the Cheeseshop has instructions, right?

    I should probably do a good code review and/or at the very least write tests for the thing before I do that, though. I don’t really want to have a breakneck release cycle that involves deployment to the Cheeseshop and end users being constantly 3 versions behind.

    Thanks for the suggestion, though — I’ll make it a goal to get things in shape for that to happen (you could submit an issue on github as an added reminder, too). :)

  • m0j0

    @john

    Whoops, missed the other question: I didn’t know about *that* apachelog project at the time loghetti was developed. In the meantime, I found one that was written pretty simply and sanely, and we were able to get it to do what we wanted pretty easily. It probably wouldn’t be too difficult to replace what we have, but it’s just not clear to me that the project you point to adds a lot of value for the added overhead. That project is also written pretty simply (and sanely), but I’m not a fan of how it puts together the dictionary, and the module that ships with loghetti is a generator model, which I also like for dealing with very large log files (several million lines) without taxing resources too much, etc. I’m willing to be sold, though, if you wish to do a comparison of how the two work and report back any benefits you perceive in using the newer module.

    I also would rather not be tracking changes of yet another project to ensure that loghetti is always in a working state.

  • John

    For creating and uploading a package to the Cheeseshop, there are some (still developing) instructions here: http://guide.python-distribute.org/