Friday, November 15, 2019

Beyond Trimodal - Citizens and Tourists

I've been hearing a lot about citizens recently. The iPaaS vendors are all talking about citizen integration, and at the Big Data London event this week I heard several people talking about citizen data science.

Gartner's view is that development can be divided into two modes - Mode 1 (conventional) and Mode 2 (self-service), with the citizen directed towards Mode 2. For a discussion of how this applies to Business Intelligence, see my post From Networked BI to Collaborative BI (April 2016).

There is a widespread view that Gartner's bimodal approach is outdated, and that at least three modes are required - a trimodal approach. Simon Wardley's version of the trimodal approach characterizes the roles as pioneers, settlers and town planners. My initial idea on citizen integration was that the citizen-expert spectrum could be roughly mapped onto this approach: if the town planners have set things up properly, this enables easy integration by pioneers and settlers. See my recent posts on DataOps - Organizing the Data Value Chain and Strategy and Requirements for the API ecosystem.

But even if this works for citizen integration, it doesn't seem to work for data science. In her keynote talk this week, Cassie Kozyrkov of Google discussed how TensorFlow had developed from version 1.x (difficult to use, and only suitable for data science pioneers) to version 2.x (much easier to use, and suitable for the citizen). And in his talk on the other side of the hall, Chris Williams of IBM Watson also talked about advances in tooling that make data science much easier.
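To give a rough sense of the shift Kozyrkov was describing, here is a minimal sketch contrasting the two styles. The layer sizes and data are placeholders of my own, not anything from her talk.

```python
import tensorflow as tf

# TensorFlow 1.x style (pioneer territory): define a static graph, then run it
# in a session. Shown as comments because it only runs on a 1.x installation.
#
#   x = tf.placeholder(tf.float32, shape=[None, 4])
#   y = tf.layers.dense(x, 1)
#   with tf.Session() as sess:
#       sess.run(tf.global_variables_initializer())
#       predictions = sess.run(y, feed_dict={x: some_data})

# TensorFlow 2.x style (citizen-friendly): eager execution and Keras by default.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(4,)),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
# model.fit(features, targets, epochs=10)   # train on your own data
```

Same underlying engine, but the second version hides the graph and the session entirely, which is arguably what makes it usable by the citizen.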


That doesn't mean that everyone can do data science, at least not yet, nor that existing data science skills will become obsolete. Citizen data scientists are those who take data science seriously, who are able and willing to acquire some relevant knowledge and skills, but who perform data science in support of their main job role rather than as an occupation in its own right.

We may therefore draw a distinction between two types of user - the citizen and the tourist. The tourist may have a casual interest but no serious commitment or responsibility. An analytics or AI platform may well provide some self-service support for tourists as well as citizens, but this support will need to be highly constrained in scope and power.

Now if we add the citizen and the tourist to the pioneer, settler and town planner, we get a pentamodal approach. The tourists may visit the pioneers and the towns, but probably aren't very interested in the settlements, whereas the citizens mainly occupy the settlements - in other words, the places built by the settlers.

I wonder what Simon will make of this idea?


Andy Callow, Exploring Pioneers, Settlers and Town Planners (3 January 2017)

Jen Underwood, Responsible Citizen Data Science. Yes, it is Possible (9 July 2019)


For further discussion and references on the trimodal approach, see my post Beyond Bimodal (May 2016)

Thursday, November 07, 2019

On Magic Numbers - Privacy and Security

People and organizations often adopt a metrical approach to sensemaking, decision-making and policy. They attach numbers to things, perhaps using a weighted scorecard or other calculation method, and then make judgements about status, priority or action based on these numbers. This is sometimes called triage.

In the simplest version, a single number is produced. More complex versions may involve producing several numbers (sometimes called a vector). For example, if an item can be represented by a pair of numbers, these can be used to position the item on a 2x2 quadrant. See my post Into The Matrix.
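As a purely illustrative sketch of these mechanics - the factors, weights and midpoint below are invented for the example, not taken from any published methodology - a weighted scorecard might reduce an item to a pair of numbers and place it on a quadrant like this:

```python
# Illustrative weighted scorecard: the factors, weights and cut-off are
# invented for this example, not taken from any published methodology.
ITEM = {"impact": 4, "likelihood": 2, "detectability": 3, "cost_to_fix": 1}

IMPACT_WEIGHTS = {"impact": 0.7, "detectability": 0.3}
EFFORT_WEIGHTS = {"likelihood": 0.6, "cost_to_fix": 0.4}

def weighted_score(item, weights):
    """Simple weighted sum of the selected factors."""
    return sum(item[factor] * weight for factor, weight in weights.items())

# Two numbers (a vector) rather than a single score...
x = weighted_score(ITEM, IMPACT_WEIGHTS)   # 3.7
y = weighted_score(ITEM, EFFORT_WEIGHTS)   # 1.6

# ...which place the item in one of four quadrants around an arbitrary midpoint.
MIDPOINT = 2.5
quadrant = ("high" if x >= MIDPOINT else "low",
            "high" if y >= MIDPOINT else "low")
print(x, y, quadrant)                       # 3.7 1.6 ('high', 'low')
```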

In this post, I shall look at how this approach works for managing risk, security and privacy.



A typical example of security scoring is the Common Vulnerability Scoring System (CVSS), which assigns numbers to security vulnerabilities. These numbers may determine or influence the allocation of resources within the security field.
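For example, CVSS v3.x maps its 0-10 base score onto qualitative severity bands. Here is a minimal sketch of that banding, with an invented resourcing rule bolted on to show how the number might drive allocation:

```python
def cvss_band(score: float) -> str:
    """Qualitative severity bands under the CVSS v3.x rating scale."""
    if score == 0.0:
        return "None"
    if score <= 3.9:
        return "Low"
    if score <= 6.9:
        return "Medium"
    if score <= 8.9:
        return "High"
    return "Critical"

# Invented resourcing rule, purely to show the number driving allocation.
ACTION = {"None": "ignore", "Low": "backlog", "Medium": "next sprint",
          "High": "this week", "Critical": "drop everything"}

score = 5.3
print(score, cvss_band(score), ACTION[cvss_band(score)])   # 5.3 Medium next sprint
```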

Scoring systems are sometimes used within the privacy field as part of Privacy by Design (PbD) or Data Protection Impact Assessment (DPIA). The resultant numbers are used to decide whether something is acceptable, unacceptable or borderline. And in 2013, two researchers at ENISA published a scoring system for assessing the severity of data breaches: scores less than 2 indicated low severity, while scores higher than 4 indicated very high severity.
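A rough sketch of the general shape of such a calculation, using the two thresholds just mentioned. The criteria, their scores and the simple additive combination are placeholders of my own, not the actual ENISA methodology:

```python
# Invented criterion scores for a hypothetical breach; the ENISA paper defines
# its own criteria and combination rule, which are more involved than this.
criteria = {
    "type_of_data":     2,
    "identifiability":  0.75,
    "circumstances":    0.5,
}

severity = sum(criteria.values())   # placeholder combination rule

if severity < 2:
    band = "low"                    # threshold from the ENISA paper
elif severity > 4:
    band = "very high"              # threshold from the ENISA paper
else:
    band = "medium / high"          # intermediate bands not detailed here

print(round(severity, 2), band)     # 3.25 medium / high
```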

The advantage of these systems is that they are (relatively) quick and repeatable, especially across large diverse organizations with variable levels of subject matter expertise. The results are typically regarded as objective, and may therefore be taken more seriously by senior management and other stakeholders.

However, these systems are merely indicative, and the scores may not always provide a reliable or accurate view. For example, I doubt whether any Data Protection Officer would be justified in disregarding a potential data breach simply on the basis of a low score from an uncalibrated calculation.

Part of the problem is that these scoring systems rely on a highly simplistic algebra, assuming you can break a complex situation into a number of separate factors (e.g. vulnerabilities), and then add them back together with some appropriate weightings. The weightings can be pretty arbitrary, and may not be valid for your organization. More importantly, as Marc Rogers argues (as reported by Shaun Nichols), the more sophisticated attacks rely on combinations of vulnerabilities, so assessing each vulnerability separately completely misses the point.

Thus although two minor bugs may have low CVSS ratings, interaction between them could allow a high-severity attack. "It is complex, but there is nothing in the assessment process to deal with that," Rogers said. "It has lulled us into a false sense of security where we look at the score, and so long as it is low we don't allocate the resources."
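A toy illustration of Rogers' point - the findings and their scores below are invented, not real CVEs:

```python
# Two hypothetical findings, each individually rated low under CVSS-style scoring.
findings = {
    "info leak in error page": 3.1,   # illustrative scores, not real CVEs
    "weak session token":      3.7,
}

def triage(score, threshold=7.0):
    """Naive per-finding triage: only scores above the threshold get resources."""
    return "prioritise" if score >= threshold else "defer"

for name, score in findings.items():
    print(name, score, triage(score))   # both come out as "defer"

# But the per-finding view has no way to express that the info leak makes the
# weak token practically exploitable: the *combination* is a high-severity
# attack path, even though max(scores) and even sum(scores) stay well below
# the point where anyone would allocate resources.
print("combined picture invisible to the scorecard:", max(findings.values()))
```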

One organization that has moved away from the scorecard approach is the Electronic Frontier Foundation. In 2014, they released a Secure Messaging Scorecard for evaluating messaging apps. However, they later decided that the scorecard format dangerously oversimplified the complex question of how various messengers stack up from a security perspective, so they archived the original scorecard and warned people against relying on it.




Nate Cardozo, Gennie Gebhart and Erica Portnoy, Secure Messaging? More Like A Secure Mess (Electronic Frontier Foundation, 26 March 2018)

Clara Galan Manso and Sławomir Górniak, Recommendations for a methodology of the assessment of severity of personal data breaches (ENISA 2013)

Shaun Nichols, We're almost into the third decade of the 21st century and we're still grading security bugs out of 10 like kids. Why? (The Register, 7 Nov 2019)

Wikipedia: Common Vulnerability Scoring System (CVSS)

Related posts: Into The Matrix (October 2015), False Sense of Security (June 2019)