Every now and then, something comes around that is as useful as it is novel. Of course, the notion of virtualized systems isn’t new. In fact, systems running something like what we now call a “hypervisor” have existed for, literally, decades. But what about that next level of virtualization? What if you could not only run multiple system images on a single piece of hardware, but run an entire infrastructure — load balancers and switches and the like, in addition to your web servers and such — all in a virtual data center?
I’m happy to say that it’s working quite well. I’m an infrastructure architect, and I work and consult for a few different clients. So far, only one has allowed me to actually migrate the infrastructure to an AppLogic grid, but successes there make me more confident in approaching this as a solution for others. Maybe someday I’ll be able to use AppLogic’s tools to be a completely remote consulting infrastructure architect who can design, lay out, demo, deploy, and support rather complex infrastructures without leaving the comfort of my favorite chair (shown at right, for effect).
It’s completely possible. In fact, the AppLogic deployment I’ve already done was done from that very chair. So let me get to some of the questions I’ve been asked about AppLogic:
What’s the big deal?
It’s more cost-effective to leverage some of AppLogic’s features than it is to buy those features in a hosted environment. If you already have your own colo or physical machine room, then depending on what stage of growth you’re in, there may not be a big deal here – but there probably is.
AppLogic has redundancy built-in, for example. You can keep enough resources in reserve on your grid such that (without any other work on your part) the failure of any single component results in that component being immediately brought back up using those resources on another node within the grid. This isn’t something you have to configure – it’s inherent in how the system operates. This kind of redundancy can be costly to build and manage.
AppLogic also makes it more cost-effective to build complex architectures than a hosted environment does. In a hosted environment, you pay some number of thousands of dollars for servers, and nothing but servers. Load balancers are hundreds of dollars more, additional storage is extra, you don’t have much control over the OS install, and sometimes you’re forced to manage the systems through some weird interface. No Bueno(tm).
How long did it take?
I worked the equivalent of “full time” on a project to simultaneously rearchitect an infrastructure and move it from a hosted VPS solution that was beginning to struggle to a much more robust AppLogic-based solution, all in less than 60 days. That includes the two or three weeks it took to get to know how the existing architecture and applications worked together, and the period also spanned the Christmas and New Year’s holidays. Actual working days to complete the deployment and get into production? 35. Note, also, that I was the sole designer and engineer on the project: I did the design work, I executed everything, and those 60 days also cover the time it took to learn AppLogic itself.
Is it as easy as they say?
No. It’s not. However, I have to say that the support I’ve received so far has been really top notch, and they deal with learning issues in addition to problems with systems and software. So if I went to AppLogic and told them that I wanted to do “x” but couldn’t find anything about it in the docs (and I’m more than happy to read them if I missed them), they either pointed me to the proper document, or told me how to do it, or told me how AppLogic works so I could figure out my own compromise if need be. They’ve been pretty open about telling me roughly how AppLogic works under the covers, and it has, in some instances, helped me get work done.
What was the hardest part to get used to?
Application reboots — but you can architect your way out of that compromise, thankfully. You see, when you purchase access to a grid that is managed by AppLogic, you get a graphical interface, which reminds me a bit of Visio, for laying out your architecture. You do this within the context of an “application”. So one application can contain a couple of web servers, a caching server or two, a couple of databases, a monitoring server, a management bastion host of some kind, a load balancer — the works. It’s great, because when you’re done setting it all up, you can clone the entire application to a grid running at a completely different provider, creating a completely redundant site in relatively little time. The downside to architecting like this, though, is that if you need to make a change that affects how AppLogic initializes a component – say, allocating more disk or RAM to one of the web servers – the *entire application* – all of those components I mentioned a second ago – needs to be rebooted. That can take upwards of 10 minutes. This is bad.
However, there’s nothing stopping you from setting up multiple applications on the same grid: put two web servers in one, two in another, and create a third application to load balance between them. Then, if you need to do something that requires an application restart, you’ve minimized the pain, and the load balancer helps to ensure that your users don’t incur any downtime.
Do you have to use a GUI?
Nope. Not to manage a deployed grid, anyway. I don’t know why you *wouldn’t* use the GUI to lay everything out and get a rough architecture into test mode quickly, but I guess technically you don’t have to use a GUI for that either. I have personally not used the GUI since the grid was in production. I use the standard UNIX-guy key-based ssh tunnels for everything. If I need to make a change to a component, I do it using a key-based ssh tunnel to the “grid controller” which is assigned to you when you get your grid.
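To make that concrete, here’s the kind of ~/.ssh/config entry I mean – note that the hostname, user, key path, and addresses below are all placeholders, since your provider assigns the actual grid controller address:

```
# ~/.ssh/config -- everything here is a placeholder; substitute the
# grid controller address and key your provider assigns you.
Host gridctl
    HostName gridctl.example.com
    User root
    IdentityFile ~/.ssh/grid_key
    # Forward a local port through the controller to a component
    # that isn't directly reachable from the outside:
    LocalForward 3307 10.0.1.12:3306
```

With that in place, “ssh gridctl” drops you on the controller, and tools on your workstation can reach the tunneled component at 127.0.0.1:3307.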
Do I have to run <some bad distro here>?
You can technically run whatever you want if you’re willing to set it up manually; I don’t know of any restrictions on Linux distribution. There are some special instructions for building custom kernels so that they work with the kernel-level virtualization AppLogic uses, but that’s all I’ve run into so far. They also provide some stock “template” components that can save you some headaches getting started. If you want a CentOS 5 MySQL server, it’s already there, and it’s not laden with AppLogic nonsense – it’s all stock as supplied by CentOS. If you want a blank-slate install so you can add only what you need when you need it, that’s there as well. What I’ve found so far is that the AppLogic builds do a much better job of creating a functional but minimal installation than I tend to do myself.
Our VPS uses all kinds of software on our OS and it eats up resources. How’s AppLogic in that regard?
Tell me about it. I was looking at a server that ran something like three mail servers, three different server management packages (Virtuozzo, cPanel, and one other), several web-based apps that weren’t even being used, tons of monitoring stuff, and… enough stuff to bog down a “blazing fast” server before any real “stuff” was running on it.
In contrast, AppLogic’s presence on your servers is really not noticeable at all. There is a stock monitoring device that they supply, and it talks to a daemon that runs on every server by default (I think – I haven’t used *every* stock catalog component they supply – yet). I don’t use the monitoring device, so I was able to kill off that daemon (I got confirmation that that was OK to do, by the way). The monitoring device wasn’t particularly useful to me: it was basically just pretty pictures showing activity for the last 3 minutes. I need traps, alerts, and some history to help with capacity planning, so I’m rolling my own in that department. You can, of course, install whatever the heck you want on the systems — I coded a MySQL backup routine in Python (I’ll release that code at some point if anyone cares), installed snmptrapd and friends on all of the servers, and there’s more to come.
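For the curious, here’s a minimal sketch of the general shape that backup routine takes – this is not the code I mentioned releasing, and the database name, backup directory, and credentials file below are all placeholders:

```python
"""Minimal sketch of a mysqldump-based backup routine.

The database name, backup directory, and credentials file are
placeholders -- adjust for your own environment.
"""
import datetime
import os
import subprocess


def backup_command(database, backup_dir="/var/backups/mysql",
                   defaults_file="~/.my.cnf", when=None):
    """Build the mysqldump command line and a timestamped output path."""
    when = when or datetime.datetime.now()
    outfile = os.path.join(
        backup_dir,
        "%s-%s.sql" % (database, when.strftime("%Y%m%d-%H%M%S")))
    cmd = ["mysqldump",
           # read credentials from a file instead of the command line
           "--defaults-extra-file=%s" % os.path.expanduser(defaults_file),
           "--single-transaction",  # consistent snapshot for InnoDB
           "--result-file=%s" % outfile,
           database]
    return cmd, outfile


def run_backup(database):
    """Run the dump and return the path to the backup file."""
    cmd, outfile = backup_command(database)
    subprocess.check_call(cmd)
    return outfile
```

From there it’s a matter of a cron entry per database and something to prune old dumps.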
What was the biggest problem you had?
I had set up redundant Apache servers with KeepAlive set to “On”, not realizing that one of the load balancers (which I’d treated as a black box up until that point) was running a version or configuration of the “pound” daemon that didn’t like that at all. I was getting crazy log messages, timeouts, 500 errors, the works. I got on the phone with support and they got the right people involved in just a couple of minutes. As it turned out, I fixed the issue myself by turning KeepAlive off in Apache, but it’s nice to know they’re there.
There’s some debate about using KeepAlives behind a load balancer, or in high-performance scenarios in general – that much I knew. I also knew that some reverse proxies puked on KeepAlives, so once I turned them off, the problem magically disappeared.
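For anyone hitting the same thing, the fix is a one-line change in the Apache configuration (the directive names here are standard Apache; whether to leave it off permanently is a judgment call for your workload):

```
# httpd.conf: disable persistent connections so every request gets
# its own connection, which the proxy in front handles happily.
KeepAlive Off
```

If you later want KeepAlives back on behind a friendlier proxy, KeepAliveTimeout and MaxKeepAliveRequests are the other two directives worth tuning.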
Would you do it again?
Yeah, if the circumstances were right, I definitely would. I have one government client that I don’t think could ever get an approval to use something like this, and I have a smaller client who might consider it overkill, but for a privately held small-to-midsize company looking for a cost-effective way to get some benefits like redundancy and easier scalability into their architecture, this is good stuff.