Thursday, August 25, 2005

On the Utility of the GRID, part II

In the previous post on this matter, I defined Utility Computing not as a technology but rather as a "goes-without-saying", trivial, merciless business requirement for today's real-time, adaptive, partially or entirely virtual Enterprises. I defined the GRID as the technical solution for that business requirement. Also, in a very simplistic yet deliberate manner, I described the essence of the GRID as an automation layer for a scaling-out procedure. Scale-out architecture preceded the GRID by years; it's the automation of the scaling-out that's new.

But don't let me trick you so easily. Our "simple" automation procedure, if well realized, brings the best IT Management system an Enterprise could possibly dream of. And that's because this simple automation process can work ONLY when the following exist:

1. An up-to-the-second inventory of all nodes. In order to perform an automatic scale-out, our simple process must know which nodes are available and which are already in use.

2. This inventory cannot be just a list of available nodes; it MUST be a CMDB (Configuration Management Database). Our simple process cannot scale out to just any node on the inventory's free list. The candidate node must meet the business application's system configuration requirements! Configuration information must therefore be meticulously managed, so that no erroneous scale-out happens.

3. Our simple process MUST be able to install on the fly whatever software is required for the operation of the business application; it must also connect the new node to the relevant disks (SAN/NAS/JBOD), network and storage switches, load balancers and so on.

4. In order to correctly perform the above, our simple process must be aware of all applicable Enterprise policies, as well as vendors' restrictions. (Looks like our simple process is actually an automation of all the sysadmins' provisioning checklists…)
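To make requirements 1 through 4 a bit more tangible, here is a minimal Python sketch of what the "simple" process has to do before it can scale out. Everything in it is hypothetical and invented for illustration: the `Node` and `AppRequirements` records stand in for CMDB entries, `policy_checks` stands in for Enterprise policies and vendor restrictions, and `provision` only pretends to install software. A real GRID layer would talk to an actual CMDB and to real provisioning tools.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical CMDB record for one node in the inventory."""
    name: str
    in_use: bool
    cpu_cores: int
    ram_gb: int
    os: str
    installed: set = field(default_factory=set)


@dataclass
class AppRequirements:
    """Hypothetical configuration requirements of a business application."""
    min_cpu_cores: int
    min_ram_gb: int
    os: str
    software: set


def pick_scale_out_node(inventory, req, policy_checks=()):
    """Return the first free node that meets the requirements and every policy."""
    for node in inventory:
        if node.in_use:
            continue  # requirement 1: know which nodes are already taken
        if node.cpu_cores < req.min_cpu_cores or node.ram_gb < req.min_ram_gb:
            continue  # requirement 2: configuration must match
        if node.os != req.os:
            continue
        if not all(check(node) for check in policy_checks):
            continue  # requirement 4: policies and vendor restrictions
        return node
    return None  # no suitable node: an erroneous scale-out is avoided


def provision(node, req):
    """Requirement 3 (simulated): 'install' missing software, mark node as used."""
    missing = req.software - node.installed
    node.installed |= missing
    node.in_use = True
    return sorted(missing)


inventory = [
    Node("n1", True, 16, 64, "linux"),
    Node("n2", False, 2, 4, "linux"),
    Node("n3", False, 8, 32, "linux", {"jvm"}),
]
requirements = AppRequirements(4, 16, "linux", {"jvm", "app-server"})
chosen = pick_scale_out_node(inventory, requirements)
```

In this toy inventory `n1` is busy and `n2` is too small, so `n3` is the only candidate. Note that the selection returns nothing rather than settling for a node that almost fits, which is exactly why the CMDB data has to be accurate.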

Hang on a second: what invoked our simple process in the first place?

5. Our simple process MUST be tightly integrated with a monitoring system that invokes it when the business application needs a scale-out.

6. When a fault occurs in a sub-component of the Business Application (say, a disk array), there's no way the monitoring system will be able to link the fault to the business application unless it has access to the topology of the Enterprise data center world and the capability to perform impact analysis.

And so on.
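The impact analysis of requirement 6 is, at its core, a walk over the data center topology. The following Python sketch assumes the topology is available as a simple "what depends on what" mapping; the component names and the `impacted_apps` function are made up for illustration and do not come from any particular monitoring product.

```python
from collections import deque

# Hypothetical topology: each component maps to the components that depend on it.
TOPOLOGY = {
    "disk-array-7": ["db-server-2"],
    "db-server-2": ["billing-db"],
    "billing-db": ["billing-app"],
    "switch-3": ["db-server-2", "web-server-1"],
    "web-server-1": ["web-portal"],
}

# Hypothetical set of components that are business applications.
BUSINESS_APPS = {"billing-app", "web-portal"}


def impacted_apps(failed, topology, apps):
    """Breadth-first walk from the failed component to every dependent app."""
    seen = {failed}
    queue = deque([failed])
    hit = set()
    while queue:
        current = queue.popleft()
        if current in apps:
            hit.add(current)
        for dependent in topology.get(current, []):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return sorted(hit)


disk_fault = impacted_apps("disk-array-7", TOPOLOGY, BUSINESS_APPS)
switch_fault = impacted_apps("switch-3", TOPOLOGY, BUSINESS_APPS)
```

With this toy topology, a fault in `disk-array-7` is traced to `billing-app` only, while a fault in `switch-3` reaches both `billing-app` and `web-portal`. Without the topology, the monitoring system would see nothing more than a broken disk array.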

What has been furtively described here is the dream of any Enterprise: complete life-cycle data center management, where data center objects (hardware and software), policies and users are all linked together to yield Utility Computing IT infrastructures.

This is the basis for the removal of all reasons for downtime, except application bugs. As all provisioning processes are automated by the GRID layer, misconfigurations (reason 3) can no longer happen, and users can no longer abuse security holes in systems and run unauthorized programs (reason 4).

If you're interested in all the prerequisites for an Enterprise Grid solution, do have a look at this excellent Enterprise Grid Reference Architecture, by the Enterprise Grid Alliance.

