Thursday, November 16, 2006

Reliability and Availability

One of the pleasures of being an industry analyst is that you get to read a lot of vendor material.

Yesterday I came across the following statement in a white paper by Jonathan Purdy of Tangosol on Data Grids and SOA.
"As Business Services are integrated into increasingly complex work-flows, the added dependencies decrease availability. If a Business Process depends on a number of services, the availability of the process is actually the product of the weaknesses of all the composed services. For example, if a Business Process depends on six services, each of which achieves 99% uptime, then the Business Process itself will have a maximum of 94% uptime, corresponding to more than 500 hours of unplanned downtime each year."
This might be true under certain circumstances, but it depends on how the business process is designed, and the degree of coupling within the process. A typical design objective is to compose services in a way that doesn't multiply the dependencies in this way. One of the principles of distributed systems has always been to avoid single points of failure, and this principle is surely inherited by SOA. When used intelligently, loose coupling and asynchrony should make a business more robust.
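
For reference, the arithmetic behind Purdy's figure can be reproduced in a few lines. This is only a sketch of the worst case he describes: it assumes the process needs all six services to be up at the same moment and that their failures are independent. The function names are my own, chosen for illustration.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def series_availability(availabilities):
    """Availability of a process that needs every service up at once.

    Assumes failures are independent, so the availabilities multiply.
    """
    result = 1.0
    for a in availabilities:
        result *= a
    return result


def downtime_hours_per_year(availability):
    """Expected hours of downtime per year for a given availability."""
    return (1.0 - availability) * HOURS_PER_YEAR


six_services = [0.99] * 6
process = series_availability(six_services)
downtime = downtime_hours_per_year(process)

print(f"{process:.4f}")   # 0.9415, i.e. roughly 94% uptime
print(f"{downtime:.0f}")  # 513 hours of downtime per year
```

The multiplication is only valid for this tightly coupled, all-or-nothing design, which is exactly the assumption in question.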

It is sometimes possible to orchestrate services in such a way that the reliability of the whole is greater than the reliability of the parts. Firstly, there may be underlying services that are not required for every transaction, so the reliability of these underlying services only partially affects the reliability of the process. Secondly, there may be several services that provide the same capability, or an alternative to it - alternative process paths can then be defined to make the process more fault-tolerant.
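
Both effects can be put into rough numbers. The sketch below is a simplification under stated assumptions: failures are independent, failover between alternative providers is itself perfectly reliable, and the fraction of transactions that need an optional service is known. The function names and the example figures (99% availability, 10% usage) are hypothetical.

```python
def parallel_availability(availabilities):
    """Availability of a capability offered by several alternative services.

    The capability is lost only if every alternative is down at once
    (assuming independent failures and perfect failover).
    """
    all_down = 1.0
    for a in availabilities:
        all_down *= (1.0 - a)
    return 1.0 - all_down


def optional_dependency(availability, usage_fraction):
    """Effective availability of a service needed by only some transactions.

    A transaction fails because of this service only if it needs the
    service AND the service is down.
    """
    return 1.0 - usage_fraction * (1.0 - availability)


# Two alternative providers, each 99% available.
print(f"{parallel_availability([0.99, 0.99]):.4f}")  # 0.9999

# A 99% available service needed by one transaction in ten.
print(f"{optional_dependency(0.99, 0.1):.3f}")       # 0.999
```

In both cases the composed figure is better than the 99% of the individual service, which is the opposite of the multiplication in Purdy's example.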

That's not going to make the problem of reliability and availability go away, of course, and there are undoubtedly some useful things that vendors such as Tangosol can offer in the physical implementation of SOA. And Purdy is right to warn his readers of the risks associated with complexity.

But what this raises for me is the general difficulty of reasoning about non-functional requirements. Do you add them, do you multiply them, do you average them, or do you need to perform a more complicated bit of algebra?
