A simple nanny script in Python

I have a support issue with a provider of mine, but was able to reverse engineer the problem and put in a stop-gap measure to keep it from ruining my weekend. The issue is a misconfigured daemon supplied by the provider, and occasionally, this daemon just goes away. I don’t know much about the daemon, but the underlying system is standard CentOS, so what I really needed was a way to detect whether the daemon had failed, and then restart it if so. The script that does this exists in every shop I’ve ever worked in, and is traditionally called a “nanny script”.

There are actually some nice-looking projects that deal with this issue and others, but I didn’t really have time to read all the docs (yet), and I suspected they might be overkill. Still, it might be nice to have a daemon instead of a script running from cron.

Anyway, I was shocked that I was unable to find a simple nanny script out on the web – in *any* language. Maybe my google-fu is out of whack. So I went ahead and wrote one up *very* quickly using Python. If you need a script to run every minute or few out of cron and restart a misbehaving daemon if it’s not running, feel free to use my nanny script.
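For a sense of what such a script involves, here is a minimal sketch of the idea. It is an illustration only, not the downloadable script itself, and it rests on one big assumption: that the daemon writes its PID to a pidfile. The function names and the command-line convention (pidfile first, restart command after it) are made up for this example.

```python
#!/usr/bin/env python3
"""Nanny sketch: restart a daemon when its process is gone (meant for cron).

Assumption: the daemon records its PID in a pidfile. If yours does not,
the liveness check below would need to be replaced with something else.
"""
import os
import subprocess
import sys


def pid_is_alive(pid):
    """Return True if a process with this PID currently exists."""
    if pid <= 0:
        return False  # 0 and negative values address process groups, not a PID
    try:
        os.kill(pid, 0)  # signal 0 performs the existence check, sends nothing
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def daemon_is_running(pidfile):
    """Return True if the pidfile names a live process."""
    try:
        with open(pidfile) as f:
            pid = int(f.read().strip())
    except (OSError, ValueError):
        return False  # missing, unreadable, or garbled pidfile
    return pid_is_alive(pid)


def nanny(pidfile, restart_cmd, run=subprocess.call):
    """Run restart_cmd if the daemon is down; return True if that happened."""
    if daemon_is_running(pidfile):
        return False
    run(restart_cmd)
    return True


if __name__ == "__main__" and len(sys.argv) > 2:
    # Hypothetical usage: nanny.py PIDFILE RESTART_COMMAND [ARGS...]
    if nanny(sys.argv[1], sys.argv[2:]):
        print("daemon was down; restart attempted", file=sys.stderr)
```

Saved as, say, /usr/local/bin/nanny.py (a hypothetical path), it could be run every minute with a crontab entry such as `* * * * * /usr/local/bin/nanny.py /var/run/exampled.pid /sbin/service exampled start`, where `exampled` stands in for the real daemon name. A stale pidfile whose PID has been reused by another process would fool this check, so treat it as a stop-gap rather than real monitoring.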

  • http://na root

    If you’ve got root on the box, you should look at /etc/inittab, and man inittab.

    If you’re not root, you’ll need your own process to monitor another process.

    I’ve not heard that called a ‘nanny script’. I’ve heard it called a ‘watchdog process’. For example, I found a perl watchdog script here:

    http://snippets.dzone.com/posts/show/1737

    I didn’t look too hard for python watchdog scripts, but I’m sure they’re out there.

  • Larry Hastings

    I was shocked too. Here’s one that was presented at PyCon this year:
    http://supervisord.org/

  • http://www.doughellmann.com/ Doug Hellmann

    There was a good presentation on supervisor at PyCon this year: http://supervisord.org/

  • http://www.protocolostomy.com m0j0

    I *might* consider running a small, non-resource-intensive, non-production daemon out of inittab, but generally I try to avoid running daemons out of inittab if I can help it, and generally it’s not hard to avoid. I know there are apps (even large commercial ones) that *recommend* this practice, but I think there needs to be one of those “…Considered Harmful” articles written about that. I don’t profess to know that it’s a bad idea *always*, but I’ve seen it cause as many problems as it solves. If you’ve done this a lot without issue, congratulations, but when you have issues, you’ll understand precisely what I’m talking about :-)

    Of course, the problem with any solution that seeks to restart failed daemons in an automated way at all is that it has a tendency to put off the debugging process indefinitely, which is bad. In my case, though, I’m just looking to keep something running until some support goon gets around to fixing the issue.

  • http://here.the.ycros.be Ycros

    Give Monit (http://www.tildeslash.com/monit/) a whirl. I set it up a few days ago on my server; it only took me 5 minutes to work out how to use it, and it works great.

  • http://felimwhiteley.wordpress.com Félim Whiteley

    Eh I’m not sure if I’m doing something wrong but I don’t seem to see it on the download page… unless both Konq and Firefox are playing up on me ??

  • http://www.protocolostomy.com m0j0

    My bad. Try it again. I apparently used Drupal’s Project module incorrectly when trying to make that available. It’s fixed now.

  • http://felimwhiteley.wordpress.com Félim Whiteley

    Ah so it is :) Cheers !