Design Patterns in System Administration

Most readers of my blog know that I consult, in addition to usually having a day job. I started my career working for a consulting firm, and couldn’t let go of the endless fascinating problems that exist in the “technological landscape”, or of the seemingly endless number of ways to solve them. I’ve learned a great deal about how people and institutions approach technical problems in system design and, maybe more importantly, how they think about the problems and solutions.

I’ve worked in huge enterprises (several Fortune 100 companies), academia (cs.princeton.edu, to be more exact), government (gfdl.noaa.gov, for example), and a few startups and small businesses. I also grew up around small businesses at about the time that technology was becoming affordable enough to creep into even small offices. I helped run wire for my father’s first modem-connected office network around 1988 or so, and my mother’s office (admittedly much larger) had a mainframe and a few terminals, from which ASCII posters of JFK and MLK were printed and hung on my walls when I was as young as 7 or 8. Observing and working with people to solve technical problems continues to bring me a lot of joy, and to present plenty of challenges.

Over the years, I’ve done a decent bit of what I’ll loosely call “programming”. 10 years ago I might not have qualified that, but working for 6 years in support of graduate computer science research has a way of humbling a guy (and, really, for most grad students, actually *doing* 6 years of graduate research is probably just as humbling, if not more). One thing I’ve tried to do is keep up with trends in how programs are deployed, how teams working in what are considered separate problem domains interact to make applications useful to people, how systems are organized, how programs are designed, and finally, how to program… um… “better” (for some undefined but surely long-winded definition of that term). As I’m starting to witness something of a convergence of programming and systems work (at least in my neck of the woods), programming is something I’m spending even more time doing, and learning.

Design patterns, whether in the context of extreme programming, agile methodologies, or whatever the project management philosophy is, appear to be extremely useful, but I’ve wondered why there doesn’t appear to be any movement in the system administration community toward defining patterns for solving problems in the realm of systems infrastructure architecture. A few years back I stumbled upon infrastructures.org, which I think is an excellent general methodology for building infrastructures, but I think a fuller treatment of the topic could be had, preferably one that addresses a broader set of problems prevalent in a wider variety of environments. For example, I found that the tools and methodologies there map perfectly onto government and academic environments, and portions of that work can be mapped onto small business problems, but it leaves enterprise environments, and some larger government environments, with unanswered questions or unaddressed problems.

I don’t blame the folks at infrastructures.org — on the contrary, I applaud their work! But why has it been so difficult to find solutions to problems those nice folks just didn’t have, or didn’t have to focus on in their part of the organization?

So much of what we do is tribal knowledge, or knowledge earned “the hard way”: in the trenches, at 4am, on a Sunday, uphill, both ways… etc. Many of these stories sound similar enough to discern a pattern, and horror stories at conferences are universally met with “me too” and “you should’ve done x, y and z, and it wouldn’t have been an issue”. Still, I have yet to see these patterns codified in any meaningful way in a single work or an organized volume of works (no, mailing lists do *not* constitute an organized volume of works).

If something as complex and diverse as programming can have patterns applied to it, I have to believe that the same could hold true for building systems. If there were such a work, it could potentially serve as a de facto “best practices” reference — one that could be referred to by both technicians and higher-level decision makers, define a common language that both could understand, and help overcome some of the inevitable “people issues” that sysadmins (and, indeed, managers) often blame for a lack of forward movement.

Does such a work exist? Is this in the works now? Though I try to keep my finger on the pulse of the publishing market, I have yet to see any real commitment to the idea that a large swath of problems in systems can be solved using variants of pre-defined patterns. It’s not that we’re not using them, of course, and it’s not that there aren’t large numbers of us who could probably recite them off the top of our heads. But if you’re one of those people, you’re a “senior” system administrator (or better). If that’s the case, imagine what your career might’ve been like if you had had such a reference. Then let me know what the “you” with 1 year of sysadmin experience would’ve loved to have, or what the “you” of today would love to see the junior folks reading.

  • Andrew

    The Practice of System and Network Administration is probably the best book out there for this. While it doesn’t give exact how-to instructions, it is a great general resource for system administrators. I would have loved this book if I had discovered it in the early days of my career.

    http://www.amazon.com/Practice-System-Network-Administration-2nd/dp/0321492668/ref=pd_bbs_sr_1?ie=UTF8&s=books&qid=1217868163&sr=8-1

    Something I have been wondering about is why we like to Open Source our software, but then treat our networks as proprietary information. Security? Shame? Laziness?

    – A.

  • m0j0

    TPOSANA is a great timeless tome for system administrators. I read the first version all the way through, but I haven’t yet made it through volume two. As for your point about the paradox between open software and closed network designs, I’ve often wondered the same thing. Certainly lots of people are under a blanket of secrecy put on them by their employer. In other cases, it’s a matter of specialization: in a LOT of places, no single person really knows all there is to know.

    As for not wanting to expose badness, I tend to find two trends:

    1. People readily talk about badness in the hopes that someone can give them a clue about how to make it better, or

    2. The people I’m talking to (clients, usually) don’t even know the badness exists, otherwise it might not!

  • Julian Simpson (http://www.build-doctor.com/)

    Have you read this book? http://www.intel.com/intelpress/sum_book2.htm

    I found it really interesting: an attempt to address the lack of patterns in infrastructure design. It really only scratches the surface by describing things like n-tier designs. But it’s a start.

    Great blog post, BTW!

  • Steve Loughran (http://www.1060.org/blogxter/publish/5)

    I’m building a ‘patterns of deployment’ wiki, where deployment == the act of getting a working system up and running. The focus is mostly on CM-tool deployment (and Java apps). And, patterns-style, it looks at the disadvantages of various approaches too: http://wiki.smartfrog.org/wiki/display/sf/Patterns+of+Deployment