Wednesday, October 26, 2016

The Shelf-Life of Algorithms

@mrkwpalmer (TIBCO) invites us to take what he calls a Hyper-Darwinian approach to analytics. He observes that "many algorithms, once discovered, have a remarkably short shelf-life" and argues that one must be as good at "killing off weak or vanquished algorithms" as creating new ones.

As I've pointed out elsewhere (Arguments from Nature, December 2010), the non-survival of the unfit (as implied by his phrase) is not logically equivalent to the survival of the fittest, and Darwinian analogies always need to be taken with a pinch of salt. However, Mark raises an important point about the limitations of algorithms, and the need for constant review and adaptation, to maintain what he calls algorithmic efficacy.

His examples fall into three types. Firstly, there are algorithms designed to anticipate and outwit human and social processes, from financial trading to fraud. Clearly these need to be constantly modified, otherwise the humans will learn to outwit the algorithms. Secondly, there are algorithms designed to compete with other algorithms. In both cases, these algorithms need to keep ahead of the competition and to avoid themselves becoming predictable. Following an evolutionary analogy, the mutual adaptation of fraud and anti-fraud tactics resembles the co-evolution of predator and prey.

Mark also mentions a third type of algorithm, where the element of competition and the need for constant change is less obvious. His main example of this type is in the area of predictive maintenance, where the algorithm is trying to predict the behaviour of devices and networks that may fail in surprising and often inconvenient ways. It is a common human tendency to imagine that these devices are inhabited by demons -- as if a printer or photocopier deliberately jams or runs out of toner because it somehow knows when one is in a real hurry -- but most of us don't take this idea too seriously.

Where does surprise come from? Bateson suggests that it comes from an interaction between two contrary variables: probability and stability --
"There would be no surprises in a universe governed either by probability alone or by stability alone."
--  and points out that because adaptations in Nature are always based on a finite range of circumstances (data points), Nature can always present new circumstances (data) which undermine these adaptations. He calls this the caprice of Nature.
"This is, in a sense, most unfair. ... But in another sense, or looked at in a wider perspective, this unfairness is the recurrent condition for evolutionary creativity."

The problem with adaptation being based solely on past experience also arises with machine learning, which generally uses a large but finite dataset to perform inductive reasoning, in a way that is non-transparent to the human. This probably works okay for preventative maintenance on relatively simple and isolated devices, but as devices and their interconnections get more complex, we shouldn't be too surprised if algorithms, whether based on human mathematics or machine learning, sometimes get caught out by the caprice of Nature. Or by so-called Black Swans.
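To make the point concrete, here is a deliberately simple sketch (the numbers and the failure scenario are invented): a maintenance alarm "learns" a threshold from past healthy readings, and is then caught out by a failure mode that was absent from its training data.

```python
from statistics import mean, stdev

# Historical temperature readings from a healthy device -- the only
# "experience" this adaptation is based on.
history = [70, 72, 71, 69, 73, 70, 71]
mu, sigma = mean(history), stdev(history)
upper = mu + 3 * sigma  # alarm only on overheating, the failure mode seen so far

def alarm(reading):
    """Flag a reading as anomalous -- but only in the anticipated direction."""
    return reading > upper

print(alarm(95))   # True: the anticipated failure mode is caught
print(alarm(20))   # False: a sensor frozen at a low value slips through
```

The algorithm is not wrong about its data; it is simply blind to circumstances its finite dataset never presented. That is the caprice of Nature in miniature.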

This potential unreliability is particularly problematic in two cases. Firstly, when the algorithms are used to make critical decisions affecting human lives - as in justice or recruitment systems. (See for example, Zeynep Tufekci's recent TED talk.) And secondly, when preventative maintenance has safety implications - from aeroengineering to medical implants.

One way of mitigating this risk might be to maintain multiple algorithms, developed by different teams using different datasets, in order to detect additional weak signals and generate "second opinions". And get human experts to look at the cases where the algorithms strongly disagree.
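The "second opinions" idea can be sketched in a few lines of code. Everything here is illustrative: the models are toy stand-ins for independently developed algorithms, and the disagreement threshold is an assumption.

```python
def flag_disagreements(models, cases, threshold=0.5):
    """Return cases where the spread of model scores exceeds the threshold."""
    flagged = []
    for case in cases:
        scores = [model(case) for model in models]
        if max(scores) - min(scores) > threshold:
            flagged.append((case, scores))
    return flagged

# Toy "algorithms" scoring the probability of imminent failure,
# each developed around a different signal.
model_a = lambda x: 0.9 if x["vibration"] > 5 else 0.1
model_b = lambda x: 0.8 if x["temperature"] > 80 else 0.2
model_c = lambda x: 0.5  # a deliberately conservative second opinion

cases = [{"vibration": 7, "temperature": 60},
         {"vibration": 2, "temperature": 85}]
for case, scores in flag_disagreements([model_a, model_b, model_c], cases):
    print(case, scores)  # these are the cases to route to a human expert
```

The point of the sketch is that the value lies not in averaging the models but in noticing where they pull apart - exactly the cases a single algorithm would decide silently.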

This would suggest that we maybe shouldn't be too hasty to kill off algorithms with poor efficacy, but sometimes keep them in the interests of algorithmic biodiversity.  (There - I'm using the evolutionary metaphor.)

Gregory Bateson, "The New Conceptual Frames for Behavioural Research". Proceedings of the Sixth Annual Psychiatric Institute (Princeton NJ: New Jersey Neuro-Psychiatric Institute, September 17, 1958). Reprinted in G. Bateson, A Sacred Unity: Further Steps to an Ecology of Mind (edited R.E. Donaldson, New York: Harper Collins, 1991) pp 93-110

Mark Palmer, The emerging Darwinian approach to analytics and augmented intelligence (TechCrunch, 4 September 2016)

Zeynep Tufekci, Machine intelligence makes human morals more important (TED Talks, Filmed June 2016)

Tuesday, October 25, 2016

85 Million Faces

It should be pretty obvious why Microsoft wants 85 million faces. According to its privacy policy:
Microsoft uses the data we collect to provide you the products we offer, which includes using data to improve and personalize your experiences. We also may use the data to communicate with you, for example, informing you about your account, security updates and product information. And we use data to help show more relevant ads, whether in our own products like MSN and Bing, or in products offered by third parties. (retrieved 25 October 2016)
Facial recognition software is big business, and high quality image data is clearly a valuable asset.

But why would 85 million people go along with this? I guess they thought they were just playing a game, and didn't think of it in terms of donating their personal data to Microsoft. The bait was to persuade people to find out how old the software thought they were.

The Daily Mail persuaded a number of female celebrities to test the software, and printed the results in today's paper.


Kyle Chayka, Face-recognition software: Is this the end of anonymity for all of us? (Independent, 23 April 2014)

Chris Frey, Revealed: how facial recognition has invaded shops – and your privacy (Guardian, 3 March 2016)

Rebecca Ley, Would YOU dare ask a computer how old you look? Eight brave women try out the terrifyingly simple new internet craze (Daily Mail, 25 October 2016)

Friday, September 02, 2016

Single Point of Failure (Comms)

Large business-critical systems can be brought down by power failure. My previous post looked at Airlines. This time we turn our attention to Telecommunications.

Obviously a power cut is not the only possible cause of business problems. Another single point of failure could be a single rogue employee.

Gavin Clarke, Telecity's engineers to spend SECOND night fixing web hub power outage (The Register, 18 November 2015)

Related Post: Single Point of Failure (Airlines) (August 2016)

Monday, August 08, 2016

Single Point of Failure (Airlines)

Large business-critical systems can be brought down by power failure. Who knew?

In July 2016, Southwest Airlines suffered a major disruption to service, which lasted several days. It blamed the failure on "lingering disruptions following performance issues across multiple technology systems", apparently triggered by a power outage.
In August 2016 it was Delta's turn.

Then there were major problems at British Airways (Sept 2016) and United (Oct 2016).

The concept of "single point of failure" is widely known and understood. And the airline industry is rightly obsessed by safety. They wouldn't fly a plane without backup power for all systems. So what idiot runs a whole company without backup power?

We might speculate what degree of complacency or technical debt can account for this pattern of adverse incidents. I haven't worked with any of these organizations myself. However, my guess is that some people within the organization were aware of the vulnerability, but this awareness somehow didn't penetrate the management hierarchy. (In terms of orgintelligence, a short-sighted board of directors becomes the single point of failure!) I'm also guessing it's not quite as simple and straightforward as the press reports and public statements imply, but that's no excuse. Management is paid (among other things) to manage complexity. (Hopefully with the help of system architects.)

If you are the boss of one of the many airlines not mentioned in this post, you might want to schedule a conversation with a system architect. Just a suggestion.

American Airlines Gradually Restores Service After Yesterday's Power Outage (PR Newswire, 15 August 2003)

British Airways computer outage causes flight delays (Guardian, 6 Sept 2016)

Delta: ‘Large-scale cancellations’ after crippling power outage (CNN Wire, 8 August 2016)

Gatwick Airport Christmas Eve chaos a 'wake-up call' (BBC News, 11 April 2014)

Simon Calder, Dozens of flights worldwide delayed by computer systems meltdown (Independent, 14 October 2016)

Jon Cox, Ask the Captain: Do vital functions on planes have backup power? (USA Today, 6 May 2013)

Jad Mouawad, American Airlines Resumes Flights After a Computer Problem (New York Times, 16 April 2013)

Marni Pyke, Southwest Airlines apologizes for delays as it rebounds from outage (Daily Herald, 20 July 2016)

Alexandra Zaslow, Outdated Technology Likely Culprit in Southwest Airlines Outage (NBC News, Oct 12 2015)

Updated 14 October 2016.

Sunday, August 07, 2016

Why does my bank need more personal data?

I recently went into a High Street branch of my bank and moved a bit of money between accounts. I could have done more, but I didn't have any additional forms of identification with me.

At the end, the cashier asked me for my nationality. British, as it happens. Why do you want to know? The cashier explained that this enabled a security control: if I ever bring my passport into a branch as a form of identification, the system can check that my passport matches my declared nationality.

Really? Really? If this is really a security measure, it's a pretty feeble one. Does my bank imagine I'm going to say I'm British and then produce a North Korean passport? Like a James Bond film?

After she had explained how the bank would use my nationality data, she then asked for my National Insurance number. I declined, choosing not to quiz her any further, and left the branch planning to write a stiff letter to the head of data protection at the bank's head office.

As a data expert, I am always a little suspicious of corporate motives for data collection. So the thought did occur to me that my bank might be planning to use my personal data for some purpose other than that stated.

Of course, my bank is perfectly entitled to collect data for marketing purposes, with my consent. But in this case, I was explicitly told that the data were being collected for a very narrowly defined security purpose.

So there are two possibilities. Either my bank doesn't understand security, or it doesn't understand data protection. (Of course there will be individuals who understand these things, but the bank as an organization appears to have failed to embed this understanding into its systems and working practices.) I shall be happy to provide advice and guidance on these topics.

New White Paper - TotalData™

My latest white paper for @GlueReply has been posted on the Reply website.

It outlines four dimensions of TotalData - reach, richness, assurance and agility - and presents a Value Chain from Raw Data to the Data-Fueled Business.

TotalData™: Start making better use of Data (html) (pdf)

(Now I need to write some more detailed stuff, based on a few client projects.)

Saturday, June 04, 2016

As How You Drive

I have been discussing Pay As You Drive (PAYD) insurance schemes on this blog for nearly ten years.

The simplest version of the concept varies your insurance premium according to the quantity of driving - Pay As How Much You Drive. But for obvious reasons, insurance companies are also interested in the quality of driving - Pay As How Well You Drive - and several companies now offer a discount for "safe" driving, based on avoiding events such as hard braking, sudden swerves, and speed violations.
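A back-of-envelope sketch of how the two versions of the concept combine: a base premium scaled by quantity (mileage), plus surcharges for quality (risky events). All the rates and weights here are invented for illustration - real insurers' pricing models are proprietary and far more elaborate.

```python
def monthly_premium(miles, hard_brakes, swerves, speed_violations,
                    base=20.0, per_mile=0.03,
                    event_weights=(0.50, 0.75, 2.00)):
    """Pay As How Much You Drive (mileage) plus Pay As How Well You Drive
    (event surcharges). All figures are illustrative assumptions."""
    w_brake, w_swerve, w_speed = event_weights
    surcharge = (hard_brakes * w_brake
                 + swerves * w_swerve
                 + speed_violations * w_speed)
    return base + miles * per_mile + surcharge

# Same mileage, different driving styles:
print(monthly_premium(600, 0, 0, 0))   # 38.0 -- quantity only
print(monthly_premium(600, 8, 2, 3))   # 49.5 -- same miles, riskier style
```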

Researchers at the University of Washington argue that each driver has a unique style of driving, including steering, acceleration and braking, which they call a "driver fingerprint". They claim that drivers can be quickly and reliably identified from the braking event stream alone.
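To see why braking alone might suffice, here is a minimal sketch of the general idea - not the researchers' actual method: summarize each driver's braking events as a small feature vector, then match an unknown trace to the nearest known profile. The features (mean braking duration and mean pedal pressure) and the data are assumptions for illustration.

```python
from statistics import mean

def features(brake_events):
    """brake_events: list of (duration_seconds, pedal_pressure) tuples."""
    durations = [d for d, _ in brake_events]
    pressures = [p for _, p in brake_events]
    return (mean(durations), mean(pressures))

def identify(profiles, trace):
    """Return the known driver whose profile is closest to the trace."""
    f = features(trace)
    def dist(name):
        return sum((a - b) ** 2 for a, b in zip(profiles[name], f))
    return min(profiles, key=dist)

profiles = {
    "driver_a": features([(0.8, 30), (0.9, 28), (0.7, 32)]),  # gentle, early braker
    "driver_b": features([(0.3, 70), (0.4, 65), (0.2, 75)]),  # sharp, late braker
}
print(identify(profiles, [(0.85, 29), (0.75, 31)]))  # closer to driver_a
```

The real paper uses richer signals and proper machine learning, but the underlying intuition is the same: braking style is habitual enough to act as a signature.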

Bruce Schneier posted a brief summary of this research on his blog without further comment, but a range of comments were posted by his readers. Some expressed scepticism about the reliability of the algorithm, while others pointed out that driver behaviour varies according to context - people drive differently when they have their children in the car, or when they are driving home from the pub.

"Drunk me drives really differently too. Sober me doesn't expect trees to get out of the way when I honk."

Although the algorithm produced by the researchers may not allow for this kind of complexity, there is no reason in principle why a more sophisticated algorithm couldn't allow for it. I have long argued that JOHN-SOBER and JOHN-DRUNK should be understood as two different identities, with recognizably different patterns of behaviour and risk. (See my post on Identity Differentiation.)

However, the researchers are primarily interested in the opportunities and threats created by the possibility of using the "driver fingerprint" as a reliable identification mechanism.

  • Insurance companies and car rental companies could use "driver fingerprint" data to detect unauthorized drivers.
  • When a driver denies being involved in an incident, "driver fingerprint" data could provide relevant evidence.
  • The police could remotely identify the driver of a vehicle during an incident.
  • "Driver fingerprint" data could be used to enforce safety regulations, such as the maximum number of hours driven by any driver in a given period.

While some of these use cases might be justifiable, the researchers outline various scenarios where this kind of "fingerprinting" would represent an unjustified invasion of privacy, observe how easy it is for a third party to obtain and abuse driver-related data, and call for a permission-based system for controlling data access between multiple devices and applications connected to the CAN bus within a vehicle. (CAN is a low-level protocol, and does not support any security features intrinsically.)
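The gatekeeper idea behind such a permission-based system can be sketched as follows. This is a toy illustration only: CAN itself has no such layer, and the application names, signal names and permission table here are all hypothetical.

```python
# Hypothetical permission table: which signals each connected application
# is allowed to read from the vehicle bus.
PERMISSIONS = {
    "insurance_dongle": {"odometer", "speed"},
    "diagnostics_app":  {"odometer", "speed", "brake_pressure", "steering"},
}

def read_signal(app, signal, bus):
    """Mediate access to bus data: refuse signals the app has no grant for."""
    allowed = PERMISSIONS.get(app, set())
    if signal not in allowed:
        raise PermissionError(f"{app} may not read {signal}")
    return bus[signal]

bus = {"speed": 48.0, "brake_pressure": 12.5,
       "odometer": 102345, "steering": -3.0}

print(read_signal("insurance_dongle", "speed", bus))
# read_signal("insurance_dongle", "brake_pressure", bus) would raise
# PermissionError -- the braking stream stays out of the dongle's reach.
```

Under such a scheme, an insurance dongle could still meter mileage without ever seeing the braking stream from which a "driver fingerprint" could be extracted.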


Miro Enev, Alex Takakuwa, Karl Koscher and Tadayoshi Kohno, Automobile Driver Fingerprinting, Proceedings on Privacy Enhancing Technologies, 2016 (1): 34–51

Andy Greenberg, A Car’s Computer Can ‘Fingerprint’ You in Minutes Based on How You Drive (Wired, 25 May 2016)

Bruce Schneier, Identifying People from their Driving Patterns (30 May 2016)

See also John H.L. Hansen, Pinar Boyraz, Kazuya Takeda, Hüseyin Abut, Digital Signal Processing for In-Vehicle Systems and Safety. Springer Science and Business Media, 21 Dec 2011

Wikipedia: CAN bus, Vehicle bus

Related Posts

Identity Differentiation (May 2006)

Pay As You Drive (October 2006) (June 2008) (June 2009)

Monday, May 30, 2016

Globally Integrated Enterprise 2

In my post on the Globally Integrated Enterprise (June 2006), I reported a comment by James Martin about the growth of the Indian CEO. A few years later, Megha Bahree reported Eight Indian CEOs At Big U.S. Companies (Forbes, December 2009). Now the CEOs of both Microsoft and Google were born in India. See India-born CEOs are taking the U.S. by storm (CNN MoneyInvest, August 2015).

@juliapowles notes that the relationship between Google and Microsoft started to improve in September 2015, shortly after Sundar Pichai became Google’s chief executive. Coincidence?

Meanwhile, the relationship between Google and Oracle remains tense. Google has just won the latest battle in the ongoing legal war over its use of Java code in the Android operating system.

In case you were wondering, Oracle does not have an Indian CEO. Unusually, it has two CEOs - an Israeli woman and an American man - as well as Larry Ellison remaining as CTO and executive chairman. (Not exactly normalized, eh Larry?)

Ms Catz has been leading the Oracle legal battle against Google. No doubt she has Mr Hurd's full support ...

Diane Brady, Oracle Hunger Games: Larry Ellison Creates Co-CEOs (Bloomberg, 19 September 2014)

Julia Powles, Google and Microsoft have made a pact to protect surveillance capitalism (Guardian, 2 May 2016)

Nicky Woolf, Google wins six-year legal battle with Oracle over Android code copyright (Guardian, 26 May 2016)

Claire Zillman, With co-CEOs, companies flirt with disaster (Fortune, 20 September 2016)

Wednesday, May 04, 2016

Beyond Bimodal

Ten years ago (March 2006) I attended the SPARK workshop in Las Vegas, hosted by Microsoft. One of the issues we debated extensively was the apparent dichotomy between highly innovative, agile IT on the one hand, and robust industrial-strength IT on the other hand. This dichotomy is often referred to as bimodal IT.

In those days, much of the debate was focused on technologies that supposedly supported one or other mode. For example SOA and SOAP (associated with the industrial-strength end) versus Web 2.0 and REST (associated with the agile end).

But the interesting question was how to bring the two modes back together. Here's one of the diagrams I drew at the workshop.

Business Stack

As the diagram shows, the dichotomy involves a number of different dimensions which sometimes (but not always) coincide.
  • Scale
  • Innovation versus Core Process
  • Different rates of change (shearing layers or pace layering)
  • Top-down ontology versus bottom up ontology ("folksonomy")
  • Systems of engagement versus systems of record
  • Demand-side (customer-facing) versus supply side
  • Different levels of trust and security

Even in 2006, the idea that only industrial-strength IT can handle high volumes at high performance was already being seriously challenged. There were some guys from MySpace at the workshop, handling volumes which were pretty impressive at that time. As @Carnage4Life put it, My website is bigger than your enterprise.

Bimodal IT is now back in fashion, thanks to heavy promotion from Gartner. But as many people are pointing out, the flaws in bimodalism have been known for a long time.

One possible solution to the dichotomy of bimodalism is an intermediate mode, resulting in trimodal IT. Simon Wardley has characterized the three modes using the metaphor of Pioneers, Settlers, and Town Planners. A similar metaphor (Commandos, Infantry and Police) surfaced in the work of Robert X Cringely sometime in the 1990s. Simon reckons it was 1993.

Trimodal doesn't necessarily mean three-speed. Some people might interpret the town planners as representing ‘slow,’ traditional IT. But as Jason Bloomberg argues, Simon's model should be interpreted in a different way, with town planners associated with commodity, utility services. In other words, the town planners create a robust and agile platform on which the pioneers and settlers can build even more quickly. This is consistent with my 2013 piece on hacking and platforms. Simon argues that all three (Pioneers, Settlers, and Town Planners) must be brilliant.

Characterizing a mode as "slow" or "fast" may be misleading, because (despite Rob England's contrarian arguments) people usually assume that "fast" is good and "slow" is bad. However, it is worth recognizing that each mode has a different characteristic tempo, and differences in tempo raise some important structural and economic issues. See my post on Enterprise Tempo (Oct 2010).

Updated - corrected and expanded the description of Simon's model. Apologies to Simon for any misunderstanding on my part in the original version of this post.

Jason Bloomberg, Bimodal IT: Gartner's Recipe For Disaster (Forbes, 26 Sept 2015)

Jason Bloomberg, Trimodal IT Doesn’t Fix Bimodal IT – Instead, Let’s Fix Slow (Cortex Newsletter, 19 Jan 2016)

Jason Bloomberg, Bimodal Backlash Brewing (Forbes, 26 June 2016)

Rob England, Slow IT (28 February 2013)

Bernard Golden, What Gartner’s Bimodal IT Model Means to Enterprise CIOs (CIO Magazine, 27 January 2015)

John Hagel, SOA Versus Web 2.0? (Edge Perspectives, 25 April 2006)

Dion Hinchcliffe, How IT leaders are grappling with tech change: Bi-modal and beyond (ZDNet, 14 January 2015)

Dion Hinchcliffe, IT leaders inundated with bimodal IT meme (ZDNet, 1 May 2016)

Dare Obasanjo, My website is bigger than your enterprise (March 2006)

Richard Veryard, Notes from the SPARK workshop (March 2006), Enterprise Tempo (October 2010), A Twin-Track Approach to Government IT (March 2011)

Richard Veryard, Why hacking and platforms are the future of NHS IT (The Register, 16 April 2013)

Richard Veryard and Philip Boxer, Metropolis and SOA Governance (Microsoft Architecture Journal, July 2005)

Simon Wardley, Bimodal IT - the new old hotness (13 November 2014)

Simon Wardley, On Pioneers, Settlers, Town Planners and Theft (13 March 2015)

Lawrence Wilkes and Richard Veryard, Extending SOA with Web 2.0 (CBDI Forum for IBM, 2007)

Updated 27 June 2016

Wednesday, November 18, 2015

Update on Deconfliction

The obscure word #deconfliction has started to appear in the news, referring to the coordination or lack of coordination between American and Russian operations in the Middle East, especially Syria.

The Christian Science Monitor suggests that the word "deconfliction" sounds too cooperative, and quotes the New York Times.

“Defense Secretary Ashton B. Carter sharply took issue with suggestions, particularly in the Arab world, that the United States was cooperating with Russia, and he insisted that the only exchanges that the Pentagon and the Russian military could have on Syria at the moment were technical talks on how to steer clear of each other in the skies above the country.”

But that's exactly what deconfliction is - "how to steer clear of each other" - especially in the absence of tight synchronization and strong coordination.

The Guardian quotes Gary Rawnsley, professor of public diplomacy at Aberystwyth University, who says such jargon is meaningless and is designed to confuse the public. But I think this is unfair. The word has been used within military and other technical circles for many decades, with a fairly precise technical meaning. Obviously there is always a problem (as well as a risk of misunderstanding) when technical jargon leaks into the public sphere, especially when used by such notorious obfuscators as Donald Rumsfeld.

In the current situation, the key point is that cooperation and collaboration require something more like a dimmer switch rather than a simple on-off switch. The Americans certainly don't want total cooperation with the Russians - either in reality or in public perception - but they don't want zero cooperation either. Meanwhile Robbin Laird of SLD reports that the French and the Russians have established "not only deconfliction but also coordinated targeting ... despite differences with regard to the future of Syria". In other words, Franco-Russian coordination going beyond mere deconfliction, but stopping short of full alignment.

Thus the word "deconfliction" actually captures the idea of minimum viable cooperation. And this isn't just a military concept. There are many business situations where minimum viable cooperation makes a lot more sense than total synchronization. We could always call it loose coupling.

Helene Cooper, A Semantic Downgrade for U.S.-Russian Talks About Operations in Syria (New York Times, 7 October 2015)

Jonathan Marcus, Deconflicting conflict: High-stakes gamble over Syria (BBC News, 6 October 2015)

Robbin Laird, The RAF Unleashed: The UK and the Coalition Step up the Fight Against ISIS (SLD, 6 December 2015)

Ruth Walker, Feeling conflicted about deconfliction (Christian Science Monitor, 22 October 2015)

Matthew Weaver, 'Deconflict': buzzword to prevent risk of a US-Russian clash over Syria (Guardian 1 October 2015)

Ben Zimmer, In Conflict Over Russian Role in Syria, ‘Deconfliction’ Draws Critics (Wall Street Journal, 9 October 2015)

More posts on Deconfliction

Updated 7 December 2015