Showing posts with label safety. Show all posts

Friday, July 29, 2022

Testing Multiples

From engineering to medicine, professionals are often forced to rely on tests that are not always completely accurate. In this post, I shall look at the tests that millions of people were obliged to use during the pandemic, to check whether they had COVID-19. The two most common tests were Lateral Flow and PCR. Lateral Flow was quicker and more convenient, while PCR took longer (because the sample had to be sent to a lab) and was supposedly more accurate.

There was also a difference in the data collected from these tests. Whereas all the results from the PCR tests should have been available in the labs, the results from the lateral flow tests were only reported under certain circumstances. There was no obligation to report a negative test unless you needed access to something, and people sometimes chose not to report positive tests because of the restrictions that might follow. And of course people only took the tests when they had to, or wanted to. When people had to pay for the tests, this obviously made a big difference.

To compensate for these limitations, some random screening was carried out, which was designed to produce more reliable and representative datasets. However, these datasets were much smaller.

 

So what can we do with this kind of data? Firstly, it tells us something about the disease - whether it is distributed evenly across the country or concentrated in certain places, and how quickly it is spreading. If we can combine the test results with other information about the test subjects, we may be able to get some demographic information - for example, how the disease is affecting people of different age, gender or race, and how it is affecting different job categories. And if we have information from the health service, we can estimate how many of those testing positive end up in hospital.

This kind of information allows us to make predictions - for example, future demand for hospital beds, possible shortages of key workers. It also allows us to assess the effects of various protective measures - for example, to what extent do mask-wearing, social distancing and working from home reduce the rate of transmission.

Besides telling us about the disease, the data should also be able to tell us something about the tests. And the accuracy of the predictions provides a feedback loop, which may enable us to reassess either the test data or the predictive models.

 

In her book The Body Multiple, Annemarie Mol discusses the differences between two alternative tests for atherosclerosis, and describes how clinicians deal with cases where the two tests appear to provide conflicting results, as well as cases where there may be other reasons to question the test results. Instead of having a single view of the disease, she talks about its multiplicity or manyfoldedness.

But questioning the test results in a particular case, or highlighting particular issues with a given test, does not mean denying the overall value of the test. Most of the time we can continue to regard a test as useful, even as we are considering ways of improving it.

If and when we introduce a new or improved test, we may then wish to translate data between tests. In other words, if test A produced result X, then we would have expected test B to produce result Y. While this kind of translation may be useful for statistical purposes, we need to be careful about its use in individual cases.
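As a minimal sketch of why such a translation is safer for statistics than for individuals, suppose some subjects had taken both tests. The paired counts below are entirely hypothetical (invented for illustration, not real COVID-19 data), and the function names are my own.

```python
from collections import Counter

# HYPOTHETICAL paired results: each subject took both test A and test B.
# Keys are (result of A, result of B); the counts are invented.
paired_counts = Counter({
    ("positive", "positive"): 80,
    ("positive", "negative"): 20,
    ("negative", "positive"): 5,
    ("negative", "negative"): 895,
})

def conditional_b_given_a(counts, a_result):
    """Estimate P(B result | A result) from the paired counts."""
    total = sum(n for (a, _), n in counts.items() if a == a_result)
    return {b: n / total for (a, b), n in counts.items() if a == a_result}

def expected_b_positives(counts, a_positive, a_negative):
    """Translate a batch of test A results into an expected number of B positives."""
    p_pos = conditional_b_given_a(counts, "positive")["positive"]
    p_neg = conditional_b_given_a(counts, "negative")["positive"]
    return a_positive * p_pos + a_negative * p_neg

# Aggregate use: a batch of 1,000 A-positives and 9,000 A-negatives.
print(expected_b_positives(paired_counts, 1000, 9000))

# Individual use: one A-positive subject is only *probably* B-positive.
print(conditional_b_given_a(paired_counts, "positive"))
```

For a large batch the expected count is a reasonable estimate, but for any one person the translation yields only a probability, which is why it needs care in individual cases.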

For many people, the second discourse (questioning the tests) appears to undermine the first discourse (using the test data to understand the disease). If we can't always trust the data, can we ever trust the data? During the COVID pandemic, many rival interpretations of the data emerged; some people chose interpretations that confirmed their preconceptions, while others turned away from any kind of data-driven reasoning.

 

The COVID pandemic became a politically contentious field, so what if we look at other kinds of testing? In safety engineering, components and whole products are subjected to a range of tests, which assess the risk of certain kinds of failure. Obviously there are manufacturers and service providers with a commercial interest in how (and by whom) these tests are carried out, and there may be regulators and researchers looking at how these tests can be improved, or how various forms of cheating can be detected, but ordinary consumers don't generally spend hours on YouTube complaining about their accuracy and validity.

Meanwhile even basic corporate reporting may be subject to this kind of multiplicity, as illustrated in my recent post on Data Estimation (July 2022).

So there is a level of complexity here, which not all data users may feel comfortable with, but which data professionals may not feel comfortable about hiding. In a traditional report, these details are often pushed into footnotes, and in an online dashboard there may be symbols inviting the user to drill down for further detail. But is that good enough?


Annemarie Mol, The Body Multiple: Ontology in Medical Practice (Duke University Press 2002)

Wikipedia: COVID-19 testing

Related posts: Data-Driven Reasoning - COVID (April 2022), Data Estimation (July 2022)

Friday, April 09, 2021

Near Miss

A serious aviation incident in the news today. A plane took off from Birmingham last year with insufficient fuel, because the weight of the passengers was incorrectly estimated. This is being described as an IT error.

As Cathy O'Neil's maxim reminds us, algorithms are opinions embedded in code. The opinion in this case was the assumption that the prefix Miss referred to a female child. According to the official report, published this week, this is how the prefix is used in the country where the system was programmed.

In this particular flight, 38 adult women were classified as Miss, so the algorithm estimated their weight as 35 kg instead of 69 kg.
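A back-of-envelope check of what that misclassification implies, using only the three figures quoted above. This is simple arithmetic on those numbers, not the discrepancy stated in the AAIB report, and the function name is my own.

```python
# Figures quoted in the post; everything else is derived from them.
ADULT_FEMALE_KG = 69   # weight that should have been used
CHILD_KG = 35          # weight actually used for each "Miss"
MISCLASSIFIED = 38     # adult women recorded as "Miss"

def load_shortfall_kg(n_passengers, correct_kg, assumed_kg):
    """Mass missing from the estimate when n passengers are given the wrong weight."""
    return n_passengers * (correct_kg - assumed_kg)

shortfall = load_shortfall_kg(MISCLASSIFIED, ADULT_FEMALE_KG, CHILD_KG)
print(f"Estimated passenger load understated by {shortfall} kg")
# prints "Estimated passenger load understated by 1292 kg"
```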

The calculation error was apparently compounded by several human factors.

  • A smaller discrepancy had been spotted and corrected on a previous flight. 
  • The pilot noticed that there seemed to be an unusually high number of children on the flight, but took no action because the pandemic had disrupted normal expectations of passenger numbers.
  • The software was being upgraded, but the status of the fix at the time of the flight was unclear. There were other system-wide changes being implemented at the same time, which may have complicated the fix.
  • Guidance to ground staff to double-check the classification of female passengers was not properly communicated and followed, possibly due to weekend shift patterns.

As Dan Nguyen points out, there have been previous incidents resulting from incorrect assumptions about passenger weight. But I think we need to distinguish between factual errors (what is the average weight of an adult passenger) and classification errors (what exactly does the Miss prefix signify).

There is an important lesson for data management here. You may have a business glossary or data dictionary that defines an attribute called Prefix and provides a list of permitted values. But if different people (different parts of your organization, different external parties) understand and use these values to mean different things, there is still scope for semantic confusion unless you make the meanings explicit.
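One way of making the meanings explicit might look like the following sketch. The glossary entries, the `implies_adult` field and the age cut-off are all hypothetical, chosen for illustration rather than taken from any airline system.

```python
from dataclasses import dataclass
from typing import Optional

ADULT_FROM_AGE = 12  # HYPOTHETICAL cut-off; a real operator defines its own

@dataclass(frozen=True)
class PrefixDefinition:
    code: str
    meaning: str                   # the definition all parties sign up to
    implies_adult: Optional[bool]  # None = prefix says nothing reliable about age

# HYPOTHETICAL glossary: the meaning of each value is written down, not assumed.
GLOSSARY = {
    "Mr": PrefixDefinition("Mr", "Adult male", True),
    "Mrs": PrefixDefinition("Mrs", "Adult female", True),
    "Ms": PrefixDefinition("Ms", "Adult female", True),
    "Miss": PrefixDefinition("Miss", "Female of any age", None),
}

def passenger_category(prefix, age=None):
    """Return 'adult' or 'child', refusing to guess when the prefix is ambiguous."""
    definition = GLOSSARY.get(prefix)
    if definition is None:
        raise ValueError(f"Prefix {prefix!r} is not in the glossary")
    if age is not None:
        # An explicit age always beats an inference from the prefix.
        return "adult" if age >= ADULT_FROM_AGE else "child"
    if definition.implies_adult is None:
        raise ValueError(f"Prefix {prefix!r} does not determine age; supply an age")
    return "adult" if definition.implies_adult else "child"

print(passenger_category("Miss", age=30))
```

The design choice here is to fail loudly on an ambiguous value instead of silently applying one party's interpretation.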



AAIB Bulletin 4/2021 (April 2021) https://www.gov.uk/government/publications/air-accident-monthly-bulletin-april-2021

Tui plane in ‘serious incident’ after every ‘Miss’ on board was assigned child’s weight (Guardian, 9 April 2021)

For further discussion and related examples, see Dan Nguyen's Twitter thread https://twitter.com/dancow/status/1380188625401434115



Monday, April 22, 2019

When the Single Version of Truth Kills People

@Greg_Travis has written an article on the Boeing 737 Max Disaster, which @jjn1 describes as "one of the best pieces of technical writing I’ve seen in ages". He explains why normal airplane design includes redundant sensors.

"There are two sets of angle-of-attack sensors and two sets of pitot tubes, one set on either side of the fuselage. Normal usage is to have the set on the pilot’s side feed the instruments on the pilot’s side and the set on the copilot’s side feed the instruments on the copilot’s side. That gives a state of natural redundancy in instrumentation that can be easily cross-checked by either pilot. If the copilot thinks his airspeed indicator is acting up, he can look over to the pilot’s airspeed indicator and see if it agrees. If not, both pilot and copilot engage in a bit of triage to determine which instrument is profane and which is sacred."

and redundant processors, to guard against a Single Point of Failure (SPOF).

"On the 737, Boeing not only included the requisite redundancy in instrumentation and sensors, it also included redundant flight computers—one on the pilot’s side, the other on the copilot’s side. The flight computers do a lot of things, but their main job is to fly the plane when commanded to do so and to make sure the human pilots don’t do anything wrong when they’re flying it. The latter is called 'envelope protection'."

But ...

"In the 737 Max, only one of the flight management computers is active at a time—either the pilot’s computer or the copilot’s computer. And the active computer takes inputs only from the sensors on its own side of the aircraft."

As a result of this design error, 346 people are dead. Travis doesn't pull his punches.

"It is astounding that no one who wrote the MCAS software for the 737 Max seems even to have raised the possibility of using multiple inputs, including the opposite angle-of-attack sensor, in the computer’s determination of an impending stall. As a lifetime member of the software development fraternity, I don’t know what toxic combination of inexperience, hubris, or lack of cultural understanding led to this mistake."

He may not know what led to this specific mistake, but he can certainly see some of the systemic issues that made this mistake possible. Among them is the widespread idea that software provides a cheaper and quicker fix than getting the hardware right, together with what he calls cultural laziness.

"Less thought is now given to getting a design correct and simple up front because it’s so easy to fix what you didn’t get right later."

Agile, huh?
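To make the point about multiple inputs concrete, here is a generic sketch of cross-checking two sensors before acting on either. It is an illustration of the redundancy principle Travis describes, not a reconstruction of Boeing's software or of any certified design; the tolerance, the stall threshold and all names are invented.

```python
from dataclasses import dataclass

@dataclass
class AoAReading:
    source: str     # e.g. "left" or "right"
    degrees: float  # measured angle of attack

# HYPOTHETICAL values; a real system would derive these from engineering data.
MAX_DISAGREEMENT_DEG = 5.0
STALL_THRESHOLD_DEG = 15.0

def cross_checked_aoa(left, right, tolerance=MAX_DISAGREEMENT_DEG):
    """Return an agreed angle of attack, or None if the two sensors disagree.

    With only two sensors we can detect a fault but cannot tell which
    sensor is wrong, so disagreement is reported rather than resolved.
    """
    if abs(left.degrees - right.degrees) > tolerance:
        return None
    return (left.degrees + right.degrees) / 2

def should_automation_intervene(left, right, threshold=STALL_THRESHOLD_DEG):
    """Act only when both sensors agree that the angle of attack is too high."""
    aoa = cross_checked_aoa(left, right)
    if aoa is None:
        return False  # disagreement: leave the decision to the pilots
    return aoa > threshold

# One faulty sensor reading 20 degrees against a healthy one reading 4.
print(should_automation_intervene(AoAReading("left", 20.0), AoAReading("right", 4.0)))
```

A single-input version of `should_automation_intervene` would have acted on the faulty reading alone.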


Update: CNN finds an unnamed Boeing spokesman to defend the design.

"Single sources of data are considered acceptable in such cases by our industry".

OMG, does that mean that there are more examples of a Single Source of Truth (SSOT) elsewhere in the Boeing design!?




How a Single Point of Failure (SPOF) in the MCAS software could have caused the Boeing 737 Max crash in Ethiopia (DMD Solutions, 5 April 2019) - provides a simple explanation of Fault Tree Analysis (FTA) as a technique to identify SPOF.

Mike Baker and Dominic Gates, Lack of redundancies on Boeing 737 MAX system baffles some involved in developing the jet (Seattle Times 26 March 2019)

Curt Devine and Drew Griffin, Boeing relied on single sensor for 737 Max that had been flagged 216 times to FAA (CNN, 1 May 2019) HT @marcusjenkins

George Leopold, Boeing 737 Max: Another Instance of ‘Go Fever’? (29 March 2019)

Mary Poppendieck, What If Your Team Wrote the Code for the 737 MCAS System? (4 April 2019) HT @CharlesTBetz with reply from @jpaulreed

Gregory Travis, How the Boeing 737 Max Disaster Looks to a Software Developer (IEEE Spectrum, 18 April 2019) HT @jjn1 @ruskin147

And see my other posts on the Single Source of Truth.


Updated 2 May 2019

Monday, January 15, 2018

Bus Safety Announcement

Transport for London (TfL) reckons around 3000 people are injured every year by slips, trips and falls on London buses. So it is running trials of an automated system that announces the departure of the bus from the stop.
"Please hold on, the bus is about to move"

or as Bon Jovi might say
"We've gotta hold on ready or not."

The problem is that these alerts often come after the bus is already halfway down the road.
"Whoa, we're half-way there."

As BBC News explains, the timing of the alert is based on the average amount of time a bus would spend at a bus stop, and is often hopelessly inaccurate. Passengers have taken to social media in droves to complain or mock. Many have wondered whether it was such a problem in the first place, and whether an alert would help to alleviate the problem. Others have pointed out the potential value of such an alert for certain categories of passenger - such as the elderly or visually impaired - but of course this only works if the alert comes at the right time.
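A small sketch of why an alert based on average dwell time goes wrong, compared with one triggered by a signal from the bus itself. The average, the stop timings and the choice of a doors-closed signal are all hypothetical; I have no knowledge of how the TfL trial is actually implemented.

```python
AVERAGE_DWELL_SECONDS = 20.0  # HYPOTHETICAL fleet-wide average time at a stop

def timer_based_alert_time(arrival_time):
    """Fire a fixed interval after arrival, whatever the bus actually does."""
    return arrival_time + AVERAGE_DWELL_SECONDS

def event_based_alert_time(doors_closed_time):
    """Fire when an on-board signal (here, doors closing) says departure is imminent."""
    return doors_closed_time

def alert_lag(alert_time, departure_time):
    """Positive means the alert came after the bus had already moved off."""
    return alert_time - departure_time

# Three HYPOTHETICAL stops: (arrival, doors closed, departure), in seconds.
stops = [(0.0, 6.0, 8.0), (100.0, 118.0, 120.0), (200.0, 245.0, 247.0)]

for arrival, doors_closed, departure in stops:
    timer_lag = alert_lag(timer_based_alert_time(arrival), departure)
    event_lag = alert_lag(event_based_alert_time(doors_closed), departure)
    print(f"timer: {timer_lag:+.0f}s  event: {event_lag:+.0f}s")
```

With these invented timings the timer is right only at the stop that happens to match the average; at a quick stop it fires after departure, and at a long stop far too early.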



I haven't spoken to anyone at TfL about this, but I can imagine what happened. In order to get a trial up and running quickly, they didn't have time (or permission) to link the alert with any of the systems on board the bus that could have sent a more accurate event signal. So we have a stand-alone system, knocked up quickly, as an experimental solution to a problem that most people hadn't previously recognized. In the trimodal scheme, this is a classic Pioneer project.
"For love we'll give it a shot."

So if the trial isn't laughed into touch, then maybe the Settlers can take over and do the alert properly.
"Take my hand, we'll make it. I swear."

And the Town Planners can come up with a joined-up long-term vision for passenger comfort and safety. Altogether now ...
"Whoa, livin' on a prayer."



Update

The wording of the announcement has changed, but the timing hasn't. It now says:
"Please hold on while the bus is moving"

What can I say to that?
"Standing on the ledge, I show the wind how to fly. When the world gets in my face, I say, have a nice day."





Londoners hit out at 'mistimed' bus safety alerts (BBC News, 14 January 2018)

Nadia Khomami, Please hold on: TfL urged to get a grip over annoying bus warnings (Guardian, 15 January 2018)

Eleanor Rose, TfL anger London commuters again with replacement bus announcement that is 'still annoying as hell' (Evening Standard, 26 January 2018)

Londoners baffled by 'bonkers' bus safety announcements warning them 'the bus is about to move' (Evening Standard, 15 January 2018)


For more on Trimodal IT, see my post Beyond Bimodal (May 2016)

Tuesday, November 10, 2009

How Dashboards Work

There are three ways of understanding a dashboard.

Symbolic. A dashboard simplifies and codifies What-Is-Going-On.

Imaginary. A dashboard creates an illusion of an accurate and comprehensive picture of What-Is-Going-On.

Real. The dashboard always falls short of representing What-Is-Going-On. There is always a shadow - something left over that eludes simple capture and representation, often remaining invisible or inaccessible.

Dashboard design and development often focuses on the symbolic - defining the contents of the dashboard as an aggregation of services and data feeds, defining the events that are to be displayed (for example as warning lights), and setting simple policies that set thresholds of attention for predictable events.

Some people seem to view this as a key element of systems thinking. For example Bas de Baar, who blogs under the name "Project Shrink", identifies Systems Thinking As A Technique To Find Project Problems and claims "If I have the right metrics I can ignore everything around me and focus just on the dashboard." He appears to justify this claim by comparing project managers with fish (Swimming Upstream The Information Flow). Fish may be reasonably well adapted to many environments, but they cannot deal with the complex threats posed by the much more intelligent dolphin, as I pointed out in my blog on Lean versus Complex.

However, an emphasis on the dashboard as a simple collection of metrics overlooks the way the dashboard is used within a socio-technical system. Isaac Asimov wrote a story called Reason in which a robot controlled a dashboard perfectly, while refusing to believe in any system beyond the dashboard, but of course that's science fiction.

If we imagine that the purpose of a dashboard is to support prompt and appropriate action in a wide range of normal and abnormal operating conditions, then it should support as much (human, organizational) intelligence as is required to maintain the viability and safety of the system in a given complex environment.

One of the lessons from the Three Mile Island disaster was that when something goes seriously wrong, all the red lights on the dashboard start flashing at the same time, and unless the people in charge of the system have some way of making sense of the emergency (emerging situation), they may make things worse rather than better. (Deming calls this tampering or meddling.)
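The combination of simple threshold policies and a single underlying fault can be sketched as follows. The metric names, limits and readings are invented for illustration and are not taken from any real plant.

```python
# HYPOTHETICAL threshold policies: metric name -> (limit, direction of breach).
POLICIES = {
    "coolant_pressure": (150.0, "below"),
    "core_temperature": (600.0, "above"),
    "coolant_flow": (40.0, "below"),
}

def active_alarms(readings, policies=POLICIES):
    """Apply each threshold policy independently, as a simple dashboard would."""
    alarms = []
    for metric, (limit, direction) in policies.items():
        value = readings[metric]
        breached = value > limit if direction == "above" else value < limit
        if breached:
            alarms.append(metric)
    return alarms

normal = {"coolant_pressure": 155.0, "core_temperature": 550.0, "coolant_flow": 50.0}
# One underlying fault (say, a loss of coolant) shifts every reading at once.
emergency = {"coolant_pressure": 90.0, "core_temperature": 700.0, "coolant_flow": 10.0}

print(active_alarms(normal))
print(active_alarms(emergency))
```

In the emergency case every policy fires, and nothing in the resulting list says that the alarms share a cause. That sense-making is left entirely to the people in charge.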

The safe operation of the nuclear power plant is not just about the design of the dashboard, or about the training of the operators, but about the whole system producing good outcomes in all circumstances. In the Springfield Nuclear Plant, the risk comes not just from idiot operators (Homer Simpson) but from corrupt managers (Montgomery Burns).

Many dashboards are designed merely to aggregate and push information into a social system that the dashboard designer doesn't bother to understand. Prior to Three Mile Island, this was often true even in safety-critical situations. I trust that the safety-critical world has now learned this lesson, but dashboards in other contexts may not be so conscientiously designed and tested.

In management information systems, a dashboard focuses executive attention onto specific aspects of the business. But there seems to be an important difference between a random collection of KPIs or guiding ratios, and a joined-up view that helps the executive reason about the business as a whole system. The standard dashboard is lacking at least two elements of organizational intelligence: Sense-Making (helping the executive see how the different items are interrelated) and Double-Loop Learning (not just getting better at meeting fixed targets, but setting more appropriate targets).

So a dashboard needs to be designed to perform a clear function within an action-based command and control system, rather than merely a simplistic reporting function.

However, there are two traps here. Firstly, the hubris of the designer, imagining that a complete understanding of a complex system is possible, or expecting to produce a perfect shadow-free dashboard. Secondly, the blinkered vision of the operator, staring at the dashboard and failing to look out of the window.

In any case, management-by-dashboard seems a pretty unauthentic and disengaged way of running a company. Perhaps it is ironic that HP, one of the leading technology companies of our time, boasts its co-founder Dave Packard as one of the earliest modern practitioners of the opposite technique - "management by walking around". (HP Timeline 1940s)



This post is an edited version of my contributions to a discussion on the Lenscraft LinkedIn group. Thanks to Aidan Ward, Geoff Elliott and Hans Lodder for their stimulating comments.

Related posts: Does Cameron's Dashboard App Improve the OrgIntelligence of Government? (January 2013), Big Data and Organizational Intelligence (November 2018)