Friday, April 09, 2021

Near Miss

A serious aviation incident in the news today. A plane took off from Birmingham last year with its take-off performance calculated from a significantly understated weight, because the weight of the passengers had been incorrectly estimated. This is being described as an IT error.

As Cathy O'Neil's maxim reminds us, algorithms are opinions embedded in code. The opinion in this case was the assumption that the prefix Miss referred to a female child. According to the official report, published this week, this is how the prefix is used in the country where the system was programmed.

On this particular flight, 38 adult women were classified as Miss, so the algorithm estimated their weight as 35 kg instead of 69 kg, understating the total passenger weight by 38 × 34 kg, or nearly 1,300 kg.

The calculation error was apparently compounded by several human factors.

  • A smaller discrepancy had been spotted and corrected on a previous flight. 
  • The pilot noticed that there seemed to be an unusually high number of children on the flight, but took no action because the pandemic had disrupted normal expectations of passenger numbers.
  • The software was being upgraded, but the status of the fix at the time of the flight was unclear. There were other system-wide changes being implemented at the same time, which may have complicated the fix.
  • Guidance to ground staff to double-check the classification of female passengers was not properly communicated or followed, possibly due to weekend shift patterns.

As Dan Nguyen points out, there have been previous incidents resulting from incorrect assumptions about passenger weight. But I think we need to distinguish between factual errors (what is the average weight of an adult passenger?) and classification errors (what exactly does the Miss prefix signify?).

There is an important lesson for data management here. You may have a business glossary or data dictionary that defines an attribute called Prefix and provides a list of permitted values. But if different people (different parts of your organization, different external parties) understand and use these values to mean different things, there is still scope for semantic confusion unless you make the meanings explicit.
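
To make the point concrete, here is a minimal sketch in Python (the system names are invented and the standard weights are simply the figures quoted above; this is not the airline's actual code) of how the same permitted value can carry two different meanings:

# Hypothetical illustration only: two systems accept the same permitted value
# "Miss" but attach different meanings to it, so any rule that uses the prefix
# as a proxy for age can silently mis-weight passengers.

STANDARD_ADULT_FEMALE_KG = 69   # illustrative standard weights, as quoted above
STANDARD_CHILD_KG = 35

# System A: "Miss" is simply an adult female who does not use "Mrs" or "Ms".
# System B (where the load-sheet software was written): "Miss" means a female child.
PREFIX_MEANS_CHILD = {
    "system_a": {"Mrs": False, "Ms": False, "Miss": False},
    "system_b": {"Mrs": False, "Ms": False, "Miss": True},
}

def estimated_weight_kg(prefix, system):
    """Standard weight the given system assigns to a female passenger with this prefix."""
    is_child = PREFIX_MEANS_CHILD[system].get(prefix, False)
    return STANDARD_CHILD_KG if is_child else STANDARD_ADULT_FEMALE_KG

# The same 38 passengers produce very different totals depending on whose
# interpretation of "Miss" is embedded in the code.
passengers = ["Miss"] * 38
print(sum(estimated_weight_kg(p, "system_a") for p in passengers))  # 2622 kg
print(sum(estimated_weight_kg(p, "system_b") for p in passengers))  # 1330 kg

The glossary could list Miss as a permitted value in both systems and still tell you nothing about which interpretation is embedded in the software; spelling out the meaning (adult female versus female child) is what removes the ambiguity.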



AAIB Bulletin 4/2021 (April 2021) https://www.gov.uk/government/publications/air-accident-monthly-bulletin-april-2021

Tui plane in ‘serious incident’ after every ‘Miss’ on board was assigned child’s weight (Guardian, 9 April 2021)

For further discussion and related examples, see Dan Nguyen's Twitter thread https://twitter.com/dancow/status/1380188625401434115



2 comments:

  1. Amazing - and yet not - how such a simple classification/cultural mismatch can scale up. While the same issue _could_ affect individuals just as easily (eg medical software prescribes the wrong quantity of a drug), the fascinating thing here is a) how much the classification needs to scale up before it becomes a significant problem (eg light aircraft vs transatlantic airliner), and b) as it scales up, at what point does the "significance" demand a fundamentally different approach to risk management, and what funding/skills difference does that entail?

    Putting in manual tests and checks (like the pilot and the ground staff) is one thing, but who's to say 'the human touch' is more or less fallible than other check processes? Would heavily-specced software QA checks have caught this, or was the mis(s)-classification an inherent result of the company _needing_ to classify lighter passengers to save fuel?

    And if manually-driven testing processes aren't sufficient, would something like chaos testing be able to cover it? Generate random passenger data millions of times, simulate the resulting flights, and check whether fuel load and other parameters stay within acceptable limits (a rough sketch of this idea follows at the end of this comment).

    In this case, there doesn't seem to be much detail on what level of error is acceptable in the passenger modelling. Maybe the software error was actually acceptable in this case? (I know little of airplane safety...)

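    A rough sketch of that kind of randomised check, in Python. The standard weights (including an assumed 88 kg for adult males), the 300 kg tolerance, and the under-12 cut-off are all invented for illustration; real take-off performance monitoring is far more involved. Even so, a check like this would flag the "Miss means child" rule almost immediately:

    import random

    ADULT_FEMALE_KG, ADULT_MALE_KG, CHILD_KG = 69, 88, 35
    MAX_ACCEPTABLE_ERROR_KG = 300   # made-up tolerance, purely for illustration

    def loadsheet_weight(prefixes):
        # Weight as computed by the system under test, which treats "Miss" as a child.
        return sum(CHILD_KG if p in ("Miss", "Master") else
                   ADULT_FEMALE_KG if p in ("Mrs", "Ms") else ADULT_MALE_KG
                   for p in prefixes)

    def actual_weight(prefixes, ages):
        # Reference weight based on the passengers' real ages rather than their titles,
        # treating under-12s as children for the sake of the example.
        return sum(CHILD_KG if age < 12 else
                   ADULT_FEMALE_KG if p in ("Mrs", "Ms", "Miss") else ADULT_MALE_KG
                   for p, age in zip(prefixes, ages))

    random.seed(1)
    for trial in range(100_000):
        ages = [random.randint(2, 80) for _ in range(random.randint(50, 200))]
        prefixes = [random.choice(["Master", "Miss"]) if age < 12 else
                    random.choice(["Mr", "Mrs", "Ms", "Miss"]) for age in ages]
        shortfall = actual_weight(prefixes, ages) - loadsheet_weight(prefixes)
        assert shortfall <= MAX_ACCEPTABLE_ERROR_KG, \
            f"trial {trial}: load sheet understates weight by {shortfall} kg"
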
  2. Thanks Scribe

    Whether this counts as a software error or something else is debatable, and I'm not convinced that improved software processes would have picked this up.

    In this particular case, it seems that the inaccuracy had some observable consequences (difference in the speed of the plane) but did not prevent the safe completion of the flight. What I do know about aviation safety is that near misses are taken a lot more seriously than in other forms of transport, which is why we are now able to read about this incident. (Atul Gawande has written about what healthcare has learned from aviation.)

    Meanwhile, I imagine that if the algorithm had more personal data instead of a simple (and possibly misleading) classification, up to and including counting how many drinks you had at the airport after checking in, it would be able to produce a much more accurate estimate, and this would allow greater fuel efficiency without compromising safety. There is always a plausible argument for greater surveillance, isn't there, Scribe?
