Showing posts with label cloud. Show all posts

Thursday, March 09, 2017

Inspector Sands to Platform Nine and Three Quarters

Last week was not a good one for the platform business. Uber continues to receive bad publicity on multiple fronts, as noted in my post on Uber's Defeat Device and Denial of Service (March 2017). And on Tuesday, a fat-fingered system admin at AWS managed to take out a significant chunk of the largest platform on the planet, namely the S3 storage service in the Northern Virginia (US-EAST-1) Region, seriously degrading online retail. According to one estimate, performance at over half of the top internet retailers was hit by 20 percent or more, and some websites were completely down.

What have we learned from this? Yahoo Finance tells us not to worry.
"The good news: Amazon has addressed the issue, and is working to ensure nothing similar happens again. ... Let’s just hope ... that Amazon doesn’t experience any further issues in the near future."

Other commentators are not so optimistic. For Computer Weekly, this incident
"highlights the risk of running critical systems in the public cloud. Even the most sophisticated cloud IT infrastructure is not infallible."

So perhaps one lesson is not to trust platforms. Or at least not to practice wilful blindness when your chosen platform or cloud provider represents a single point of failure.

One of the myths of cloud, according to Aidan Finn,
"is that you get disaster recovery by default from your cloud vendor (such as Microsoft and Amazon). Everything in the cloud is a utility, and every utility has a price. If you want it, you need to pay for it and deploy it, and this includes a scenario in which a data center burns down and you need to recover. If you didn’t design in and deploy a disaster recovery solution, you’re as cooked as the servers in the smoky data center."

Interestingly, Amazon itself was relatively unaffected by Tuesday's problem. This may have been because it splits its own deployment across multiple geographical regions. However, as Brian Guy points out, there are significant costs involved in multi-region deployment, as well as data protection issues. He also notes that this question is not (yet) addressed by Amazon's architectural guidelines for AWS users, known as the Well-Architected Framework.
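
To make the single-point-of-failure argument concrete, here is a minimal sketch of client-side failover between regions. It is not taken from any of the articles cited. The helper `fetch_with_failover`, the simulated backend `fake_fetch` and the choice of a second region are all hypothetical, and the sketch assumes the data has already been replicated to every region, which is exactly where the costs of multi-region deployment arise.

```python
from typing import Callable, List, Sequence, Tuple


class RegionUnavailable(Exception):
    """Raised by a fetch function when a region cannot serve the request."""


def fetch_with_failover(key: str,
                        regions: Sequence[str],
                        fetch: Callable[[str, str], bytes]) -> Tuple[str, bytes]:
    """Try each region in order; return (region, data) from the first that answers."""
    failures: List[str] = []
    for region in regions:
        try:
            return region, fetch(region, key)
        except RegionUnavailable as exc:
            failures.append(f"{region}: {exc}")
    raise RegionUnavailable("all regions failed: " + "; ".join(failures))


def fake_fetch(region: str, key: str) -> bytes:
    """Simulated backend (hypothetical): us-east-1 is down, other regions are healthy."""
    if region == "us-east-1":
        raise RegionUnavailable("storage service not responding")
    return f"{key} served from {region}".encode()


if __name__ == "__main__":
    region, data = fetch_with_failover("catalogue.json",
                                       ["us-east-1", "eu-west-1"],
                                       fake_fetch)
    print(region, data.decode())
```

With only one region in the list, the same call has nowhere to go and raises `RegionUnavailable`, which is the position many retailers found themselves in.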

Amazon recently added another pillar to the Well-Architected Framework, namely operational excellence. This includes such practices as performing operations with code: in other words, automating operations as much as possible. Did someone say Fat Finger?
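
As an illustration of what performing operations with code might buy you, the sketch below wraps a capacity-removal operation in a guard that rejects requests that are obviously too large. Everything in it is hypothetical: the function name, the fleet model and the 10 percent limit are assumptions made for the example, not a description of Amazon's tooling.

```python
class UnsafeOperation(Exception):
    """Raised when a requested change exceeds the configured safety limits."""


def plan_capacity_removal(fleet_size: int,
                          requested: int,
                          min_fleet: int,
                          max_fraction: float = 0.10) -> int:
    """Validate a request to remove servers and return the number approved.

    The limits are illustrative assumptions: never drop below `min_fleet`,
    and never remove more than `max_fraction` of the fleet in one step.
    """
    if requested <= 0:
        raise ValueError("requested must be positive")
    remaining = fleet_size - requested
    if remaining < min_fleet:
        raise UnsafeOperation(
            f"removing {requested} would leave {remaining} servers, "
            f"below the minimum of {min_fleet}")
    if requested > fleet_size * max_fraction:
        raise UnsafeOperation(
            f"removing {requested} exceeds {max_fraction:.0%} of the fleet")
    return requested
```

A mistyped argument (say 60 instead of 6) then fails loudly instead of being executed.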




Abel Avram, The AWS Well-Architected Framework Adds Operational Excellence (InfoQ, 25 Nov 2016)

Julie Bort, The massive AWS outage hurt 54 of the top 100 internet retailers — but not Amazon (Business Insider, 1 March 2017)

Aidan Finn, How to Avoid an AWS-Style Outage in Azure (Petri, 6 March 2017)

Brian Guy, Analysis: Rethinking cloud architecture after the outage of Amazon Web Services (GeekWire, 5 March 2017)

Daniel Howley, Why you should still trust Amazon Web Services even though it took down the internet (Yahoo Finance, 6 March 2017)

Chris Mellor, Tuesday's AWS S3-izure exposes Amazon-sized internet bottleneck (The Register, 1 March 2017)

Shaun Nichols, Amazon S3-izure cause: Half the web vanished because an AWS bod fat-fingered a command (The Register, 2 March 2017)

Cliff Saran, AWS outage shows vulnerability of cloud disaster recovery (Computer Weekly, 6 March 2017)

Monday, March 18, 2013

Cloud and Continuity of Supply Risk

@dougnewdick points out the risk of a company becoming over-dependent on Google. His particular example is prompted by Google's announcement that Google Reader will be discontinued.

I have previously commented on the subject of Creeping Business Dependency, the fact that many companies have allowed themselves to become dependent on a particular company, product or technology. Especially Google. If Google decides your website offends against some search engine rules, it is perfectly capable of making your website disappear from searches. (BMW disappeared from Google for three days in 2006 - see my post BMW Search Requests.) A company might well go bust before it could sort the problem out.

Of course, you can't avoid some dependencies, but I think it is important that any significant dependency should be clearly visible in the business architecture. (In general, business architects usually neglect this kind of dependency until I point out specific examples to them.)

When looking at this kind of dependency, it is important to remember the principles of asymmetry - the Product is not the Technology, and the Company is not the Product. A few popular products and platforms whose owners lost interest, including Bloglines (formerly owned by Ask) and Delicious (formerly owned by Yahoo), have been revived under new ownership. Users of a popular platform may feel that a large user base provides grounds for optimism that someone will want to keep it going, even if the original owner doesn't wish to. However, there are many products and platforms that have not survived.

More fundamental is the question of the underlying technology. A few years ago, there was considerable confidence and investment in RSS and Atom feeds, and a number of products and platforms were developed to exploit this technology. If there is a healthy ecosystem of different products and platforms, with relatively low switching costs, it doesn't matter much if one product drops out. But if Google and others are losing interest in this technology, that's a much more fundamental problem for anyone who is heavily committed to it.

If Google stops providing a free service, those who really want it may have to pay to get a decent service elsewhere. But this alters the economics of the service ecosystem, with unpredictable consequences. Clearly there is a risk that the service you want (or the service you need your customers to use) is increasingly expensive, inconvenient and ultimately unavailable.


Doug Newdick, Cloud and Continuity of Supply Risk (March 2013)

Monday, January 26, 2009

Cloud Computing or what?

In previous posts, I have contrasted SOA with JBOWS (just a bunch of web services) and CEP with JBOCE (just a bunch of complex events).

I am now happy to announce a new acronym to contrast with Cloud Computing: JSOWD (just a shower of water droplets). I did have a ruder version but I thought it was a bit early in the morning.

Seriously, we are now facing the usual semantic confusion. On his blog, J.P. Morgenthal explains How To Define SOA and Cloud Computing. He is not offering a definition as such, merely suggesting the kind of work that would be required to produce a satisfactory definition.

"Instead of trying to define them in a narrative manner, I believe we need to define them as a scheme with many components that are interrelated."

In a comment on JP's blog, John Evdemon writes

"Let the pundits debate their concepts and bizarre acronym versioning schemes - I'd rather help customers solve their problems."

I certainly don't want to get bogged down in debates about what Cloud Computing is "all about" - or even the negative debate about what it's not all about. Like most other bits of jargon, cloud computing involves a loose bundle of characteristics, and it is unlikely that everyone will agree as to which of these characteristics are essential or even desirable.

Which is where a schema comes in. A good schema should help practical people reason about the dependencies between real-life problems, specific technology characteristics, and potential benefits, as well as sharing practical knowledge. John's customers don't want to know what abstract labels and dogmas to attach to their projects, but I guess they do want practical and specific guidance. (If I can put in a plug for the CBDI Forum, this is the approach we have taken with SOA. See the CBDI Forum SAE Model.)

Tuesday, July 15, 2008

Clouds and Clocks 4

An interesting debate on SaaS and Control


Gianpaolo starts out by suggesting a simple trade-off between user control and the economics of scale. With SaaS, the user forgoes some control, in order to benefit from (some share of) the economics of scale achieved by the service provider. Gianpaolo divides control into two aspects: Control of Features and Control of SLA.

Charlie thinks it comes down to what we mean by "control". He suggests a different notion of control, whereby a contract with a service provider would give you more legal control than doing the work in-house.

But as far as I can see, both of them are talking about Command (setting the target conditions) rather than Control (regulating the system to satisfy the target conditions). See Alberts and Hayes, Understanding Command and Control (pdf).

Gianpaolo associates the Control of Features with who builds the service (Implementation) and the Control of SLA with who runs the service (Deployment). (This is a bit of a simplification; I shall ignore that for the moment.) But in Service-Oriented Architecture, there is a third and perhaps the most important role: who specifies the service. It is possible for the user (or a community of users, or an independent regulatory body) to provide the specification, and for the provider merely to deliver an acceptable implementation and deployment. Returning to the distinction between Command and Control, we may associate the Command of both Features and SLA with who specifies the service.
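
The distinction between Command and Control can be sketched in a few lines. This is a toy illustration only: the function names, the single SLA target and the system model (latency equals load divided by workers) are all assumptions of mine, not anything proposed by Gianpaolo, Charlie, or Alberts and Hayes.

```python
def command(target_latency_ms: float) -> dict:
    """Command: whoever specifies the service sets the target condition."""
    return {"target_latency_ms": target_latency_ms}


def control(spec: dict, load: float, workers: int = 1, max_steps: int = 100) -> int:
    """Control: whoever runs the service adjusts capacity until the target is met.

    The toy system model (latency = load / workers) is an assumption made
    purely for illustration.
    """
    for _ in range(max_steps):
        observed = load / workers
        if observed <= spec["target_latency_ms"]:
            break  # target condition satisfied, stop adjusting
        workers += 1  # regulate: add capacity and observe again
    return workers


if __name__ == "__main__":
    spec = command(target_latency_ms=200)   # set by the specifier
    print(control(spec, load=1000))         # carried out by the provider
```

The user who writes the specification never touches the loop, and the provider who runs the loop never chooses the target.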

But what kind of service are we talking about here? The economics of scale are most obviously associated with simple one-size-fits-all services that are symmetric and replicable - what Philip Boxer calls r-type services. In most cases such services are unilaterally specified by the service provider, offer little if any contextual variety to the service user, and control tends to focus on quality control and cost control.

With more complex types of service, the service user typically demands a much greater fit to a specific use-context, and the question of control expands to include requisite variety and the economics of governance. But does this mean that the economics of scale go out of the window? The challenge for very large and complex service-based systems is to combine the economics of scale AND scope AND governance.

Friday, February 22, 2008

Software + Services

This week I visited the Microsoft campus in Reading, to hear some presentations on their Software-and-Services (S+S) strategy, and have a chat with Gianpaolo Carraro.

Phil Wainewright recently described Microsoft's S+S strategy as bunkum, and accused Gianpaolo of drinking too much Kool-Aid. Phil's point was that people didn't want to worry about buying software AND buying services.

However, Gianpaolo isn't really focusing on the commercial side of the equation. He spends most of his time talking about deployment. From this viewpoint, it makes sense in many situations to have a combination of stuff running on your own machines (called software) and stuff running on other people's machines (called services). Actually that's what most users have anyway - both corporate users and domestic consumers.

But isn't it all software anyway? And who cares where the stuff is being run? Perhaps all of the people care some of the time, and some of the people care all of the time. I care when it affects my service level. For example, there are some places where broadband is unavailable, or at least very expensive. So if I want to work on an aeroplane, I may need to make sure the relevant documents are physically loaded onto my laptop beforehand.

The original vision of open distributed processing (ODP) included not only location transparency but other forms of transparency including migration and relocation. This means I don't notice when stuff moves around. Suppose I'm working on a bunch of documents that are currently on the server. Imagine my laptop knows my travel plans, knows that I'm going to be on an aeroplane this afternoon, works out which documents I'm going to need, and quietly and efficiently moves them while I'm in mid-edit, without dropping a comma.
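
Here is a toy model of what location and migration transparency would mean for the code that uses the documents. It is a sketch under heavy assumptions: the class `DocumentSpace` and its methods are hypothetical, two dictionaries stand in for the server and the laptop disk, and none of this is drawn from the ODP specification itself.

```python
class DocumentSpace:
    """Toy model of location and migration transparency.

    Callers use read() and write() without saying where a document lives.
    The two dictionaries stand in for a remote server and a laptop disk.
    """

    def __init__(self, server_docs: dict):
        self._server = dict(server_docs)
        self._laptop = {}
        self.online = True

    def _home(self, name: str) -> dict:
        """Find whichever store currently holds the document."""
        if name in self._laptop:
            return self._laptop
        if not self.online:
            raise ConnectionError(f"{name} is on the server and we are offline")
        if name not in self._server:
            raise KeyError(name)
        return self._server

    def read(self, name: str) -> str:
        return self._home(name)[name]

    def write(self, name: str, text: str) -> None:
        self._home(name)[name] = text

    def migrate_to_laptop(self, names) -> None:
        """Move documents while online; callers' read/write calls are unchanged."""
        for name in names:
            if name in self._server:
                self._laptop[name] = self._server.pop(name)


if __name__ == "__main__":
    docs = DocumentSpace({"report.txt": "draft", "budget.txt": "v1"})
    docs.write("report.txt", "draft, with edits")
    docs.migrate_to_laptop(["report.txt"])   # happens behind the scenes
    docs.online = False                      # boarding the plane
    print(docs.read("report.txt"))           # same call as before the move
```

The transparency is only partial, of course: a document that was not migrated becomes very visibly remote the moment the connection drops.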

I presume that Ray Ozzie, founder of Groove Networks and now Microsoft's Chief Software Architect, understands this kind of requirement. He is pushing the S+S story hard.

But there is a conceptual problem with this semi-transparency - the fact that sometimes these aspects are visible and sometimes they aren't. ODP handles this by mandating several parallel viewpoints of a distributed system.

The CBDI method for service architecture and engineering (SAE) inherits this principle, although our viewpoints are not identical to the ODP ones. Meanwhile Microsoft has four perspectives, but these are defined in terms of process: Build, Run, Consume and Monetize.

When you want to think about deployment and location, you use one viewpoint; when you want to think about functionality and semantics, you use a different viewpoint; and when you want to think about commercial terms, costs and liabilities, you use a different one again.

I think this helps to explain the apparent disagreement between Gianpaolo and Phil. The S+S world looks very different from a commercial/organizational viewpoint and a deployment viewpoint. When I raised this with Gianpaolo, he made the valid point that these viewpoints cannot be totally isolated from one another. What happens from a deployment viewpoint inevitably has an impact upon the commercial viewpoint, and vice versa. Technological progress may help us reduce this impact, but it isn't going to disappear altogether any time soon.

But that doesn't mean the viewpoints simply merge into one unmanageable mess. The viewpoints are important precisely because they help us understand how things in one viewpoint relate to things in another viewpoint. And this in turn raises another important challenge. How are architects supposed to manage all this complexity? If you try to optimize the commercial/organizational arrangements alone, you may get unsatisfactory performance or service levels; if you try to optimize the physical deployment alone, you may not get the best commercial deal or organization structure; and if you try to optimize everything simultaneously, your head will explode. Gianpaolo's best advice at present is to do things at a very coarse level of granularity, which reduces the number of permutations to consider. But that's clearly not ideal.

The industry currently lacks decent tools to support this kind of architectural reasoning. We don't even have decent notations - you aren't going to get very far with colour-coded UML diagrams.

But what about enterprise SOA? Some people are working towards a world where all software is rendered as services - whether it is running on an enterprise server safely inside the firewall, or on a third-party server farm in Mumbai or Kiev. (By Day In Bollywood, By Night In The Ukraine.) Some of the same technical and conceptual issues arise here, but the terminology is different. S+S suggests that it only counts as a service if it is remote - this clashes with the enterprise SOA terminology.

Software-and-Services? The name may well generate some misunderstanding, especially if it is taken too literally, but that's probably true of any jargon. Not the name I'd have chosen, but then I'm not in charge of Microsoft's marketing strategy. Gianpaolo and his team have been working hard, producing some interesting material and examples, and the tools and techniques and future challenges coming out of this kind of work will undoubtedly be relevant to Enterprise SOA as well.


References

Gianpaolo Carraro: S+S: Real or have I drunk too much Kool-Aid? :) (Feb 2008)
Phil Wainewright: Microsoft Kool-Aid and the cloud
Gianpaolo Carraro: I think Phil drank the Kool-Aid too but he has not realized it yet :) (Feb 2008)
Ray Ozzie: MIX07 keynote, interview
Broadway Musical: "A Day in Hollywood / A Night in the Ukraine"
Reference Model for Open Distributed Processing (RM-ODP): Wikipedia, specification

Monday, May 21, 2007

Clouds and Clocks 2

Pat Helland regrets ...

Pat Helland has resumed blogging, having recently returned to Microsoft from a sojourn at Amazon. In his latest post SOA and Newton's Universe, he renounces the quasi-Newtonian paradigm of distributed systems to which he adhered for most of his 30-year career, and outlines an alternative paradigm with some resemblance to the Special Theory of Relativity.

The Newtonian paradigm of distributed systems is that we are trying to make many systems appear as the One System. This paradigm may be linked with the idea of the Global Schema or Universal Ontology. Helland contrasts this with an alternative paradigm of distributed systems, in which the systems are entirely dependent on the point of view of the observer, and there is no Universal Ontology. (I'd have wanted to use the term relativistic semantics here, but it has already been bagsied for academic linguistics - see for example Catfood and Buses.)

Helland sees this in terms of a relaxation of consistency. I disagree. Distributed systems-of-systems (including SOA) must follow a consistent logic - but not necessarily a traditional two-valued logic. Flexibility comes from being underdetermined (clouds) rather than overdetermined (clocks).
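
One familiar example of a logic that is consistent without being two-valued is Kleene's strong three-valued logic. I have chosen it here purely for illustration; it is not necessarily the logic that distributed systems ought to follow. In this sketch `None` stands for "not yet determined", and the function names are my own.

```python
from typing import Optional

Truth = Optional[bool]  # True, False, or None for "not yet determined"


def k_not(a: Truth) -> Truth:
    """Negation: an undetermined value stays undetermined."""
    return None if a is None else not a


def k_and(a: Truth, b: Truth) -> Truth:
    """Conjunction: a single False settles the matter, whatever the other side is."""
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True


def k_or(a: Truth, b: Truth) -> Truth:
    """Disjunction: a single True settles the matter, whatever the other side is."""
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False


if __name__ == "__main__":
    values = (True, False, None)
    # De Morgan's law still holds for every combination of the three values.
    print(all(k_not(k_and(a, b)) == k_or(k_not(a), k_not(b))
              for a in values for b in values))
```

The rules are perfectly strict. What changes is that some questions are allowed to remain open, which is the underdetermined (cloud-like) flexibility I have in mind.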

See my earlier posts Beyond Binary Logic and On Clouds and Clocks. See also Philip Boxer on Modelling Structure-Determining Processes.

Tuesday, January 23, 2007

Clouds and Clocks

I've looked at clouds from both sides now ...
The classic distinction is between clouds and clocks. Clocks follow precise rules; they are under "iron control". Clouds are more complex, and are under "plastic control".

In SOA terms, the term "cloud" is generally used to describe what happens outside the enterprise boundaries, and outside the corporate firewall. SOA is commonly expected to start in areas that are under greater control, and then spread to more complex, less well-controlled areas. Numerous pundits are now predicting this trend for the current year.

But the contrast between clouds and clocks may be a misleading one. In physics, one of the central questions of the twentieth century was whether atoms behaved like clocks (predictable mechanisms) or like clouds (statistical mechanisms). In 1965, Karl Popper gave a lecture called "Of Clouds and Clocks", in which he contrasted the assumption of classical atomism (even clouds are composed of tiny clock-like bits) with the modern alternative (even clock-like structures are composed of tiny cloud-like bits).

We can ask a similar question for SOA. Is there really such a strong distinction between (tightly controlled) behaviours inside the enterprise and (loosely controlled or uncontrolled) behaviours outside the enterprise? Or is the cloud (cloudiness) fractal?

Much of my consultancy time is spent with organizations that are just too large and complex to be able to draw simple boundaries, especially in the public sector and defence sector. As soon as you get past tightly controlled pilot projects, everything typically becomes more cloud-like.

Obituary of Karl Popper, 1902–1994, by John Watkins
Printed in Proceedings of the British Academy, Volume 94, pp. 645–684. 1997
