
Perhaps the most timely proof that fault tolerance is relevant in today’s world is the fact that many high-tech companies boldly profess to offer five 9s availability in some flavor (see IBM’s recent webcast), and IT departments still pursue uptime perfection because their clients expect nothing less. Good-enough technology just is not good enough for some business operations.

What's your point of view?


Comments

Re: the discussion above. Is it the industry or the application that determines availability needs? (Actually, it's the user who determines importance!) Here is an interesting example. Beijing constructed an underground roadway to ease surface congestion, connect parking lots, and improve access to 2008 Olympic event venues. Abundant data collection devices and video cameras report back to a control center to enable rapid response to conditions and mishaps within the 5.5 km loop. The environment defines this traffic management and control software as mission-critical, even if only for the duration of the event. Similar considerations can affect a virtualized infrastructure when, at certain times during the day, week or month, a particular application becomes mission-critical. For that period of time, the application is migrated to a server resource pool that is fault-tolerant.

I think virtualization could be driving more mainstream consideration of continuous availability and fault tolerance. Running day-to-day apps in a virtual environment alongside a number of other virtual environments on one machine tends to make that machine a mission-critical piece of the business. So, the calculus of how reliable is reliable enough begins to change, IMHO.

More IT managers need to realize that the physical platform (along with the virtualization layer itself) can be a single point of failure, ideally before they have a crisis. Regular servers and clusters are not up to the task.

I wonder if fault tolerance is really only relevant in obvious industries (the banks, the airports and so on). I'd be surprised if some standard enterprise would go fault-tolerant just to support day-to-day apps.

I would suggest that FT is relevant by application more so than by industry, and that application value is defined not only by the cost of downtime but by the value users of the service attach to it. Exchange software is a perfect example; more than a few users and businesses will define its availability as essential to their operations.

Compared to alternative solutions like clusters, FT carries a pretty hefty premium. Even basic servers are incredibly reliable, so I have to wonder if fault tolerance is really worth that price difference.

Alternative solutions definitely have their place in the availability hierarchy. But consider Gartner’s estimate that the average cost of downtime is $108K per hour (and that’s only hard dollars), and the price differential between a 99.95% solution and a 99.999% solution is rendered meaningless. System price is only one data point and, when factored in with the cost of administering, managing and servicing an app over its lifecycle, a relatively minor one.
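
To put rough numbers on that comparison, here is a back-of-the-envelope sketch in Python. It assumes a 365-day year of round-the-clock operation and that the quoted $108K hourly average applies evenly to every hour of downtime; neither assumption comes from Gartner, and the function names are purely illustrative.

```python
# Back-of-the-envelope downtime arithmetic (illustrative only).
# Assumptions: 24x7 operation over a 365-day year, and a flat
# hourly downtime cost equal to the quoted $108K average.

HOURS_PER_YEAR = 365 * 24  # 8760 hours in a non-leap year
COST_PER_HOUR = 108_000    # quoted average, hard dollars only


def annual_downtime_hours(availability_pct: float) -> float:
    """Expected hours of downtime per year at a given availability."""
    return (1 - availability_pct / 100) * HOURS_PER_YEAR


def annual_downtime_cost(availability_pct: float,
                         cost_per_hour: float = COST_PER_HOUR) -> float:
    """Expected yearly downtime cost under the flat hourly-cost assumption."""
    return annual_downtime_hours(availability_pct) * cost_per_hour


if __name__ == "__main__":
    for pct in (99.95, 99.999):
        hours = annual_downtime_hours(pct)
        cost = annual_downtime_cost(pct)
        print(f"{pct}% -> {hours:.4f} h/year, ${cost:,.0f}/year")
```

Under those assumptions, 99.95% works out to roughly 4.4 hours of downtime a year (about $473K), while 99.999% works out to a little over five minutes (under $10K). The gap of roughly $460K a year is what the system price premium should be weighed against.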
