Tuesday, February 24, 2009

Modelling Complex Classification

Andrea Westerinen (Microsoft) has posted some modelling guidelines, and says she was told that some people's heads exploded when reading them. She identifies three fundamental modelling concepts, which she draws from the work of Guarino and Welty.
  • Essence - these are properties that are true for all instances of a class and that "define" the semantics of the class, such as the "property" of "being human"
  • Identity - properties that determine the equality of instances
  • Unity - properties that define where the boundary of an instance is, or distinguish the parts of a "whole"

In my work on information modelling (for example in my 1992 book) I have long emphasized the importance of understanding semantic identity (how does something count as being "the same again") and semantic unity (which I tend to call membership - how does something count as inside or outside). 

But I have been critical of the assumption that we always define a class in terms of essential properties. This is known as monothetic classification, and can be contrasted with polythetic classification, which defines a class in terms of characteristic properties, none of which need be possessed by every member. As I teach in my information modelling workshops, many important objects of business attention are not amenable to simple monothetic classification (for example, how does a commercial firm decide who counts as a COMPETITOR, or how do the police decide who counts as a SUSPECT) and require a more complex logic.
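To make the contrast concrete, here is a minimal Python sketch. It assumes one simple reading of polythetic classification, namely "has at least some threshold number of the characteristic properties". The property names, the example firms and the threshold of three are all hypothetical illustrations, not taken from any real model.

```python
from typing import Set


def is_member_monothetic(properties: Set[str], essential: Set[str]) -> bool:
    """Monothetic: an instance belongs only if it has EVERY essential property."""
    return essential <= properties


def is_member_polythetic(
    properties: Set[str], characteristic: Set[str], threshold: int
) -> bool:
    """Polythetic (simplified): an instance belongs if it has at least
    `threshold` of the characteristic properties. No single property
    is required of every member."""
    return len(properties & characteristic) >= threshold


# Hypothetical characteristic properties for the class COMPETITOR.
COMPETITOR_TRAITS = {
    "same_market",
    "same_customers",
    "substitute_product",
    "similar_pricing",
}

# Hypothetical firms, each described by the properties it exhibits.
firm_a = {"same_market", "substitute_product", "similar_pricing"}
firm_b = {"same_customers", "substitute_product", "similar_pricing"}
firm_c = {"same_market"}

for name, firm in [("A", firm_a), ("B", firm_b), ("C", firm_c)]:
    print(
        name,
        "monothetic:", is_member_monothetic(firm, COMPETITOR_TRAITS),
        "polythetic:", is_member_polythetic(firm, COMPETITOR_TRAITS, 3),
    )
```

Under these assumptions, firms A and B both count as competitors on the polythetic test even though each lacks a different property, while none of the three firms passes the monothetic test. A real business rule would probably weight the properties or let the threshold vary with context, which is exactly the more complex logic at issue.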

If you are just building transaction-based systems and services, you may be able to fudge these semantic issues. But if you want information services to support business intelligence and decision-support as well as transaction systems, then you have to get the semantics right. 

Of course I can see why Guarino, Welty and Westerinen want to insist on monothetic classification (which they call rigidity). But then how can they model the more fluid and fuzzy (and I think more interesting) business requirements? 

(Sometimes the practitioners of "ontology" like to think that they are dealing with supremely abstract and generalized stuff, but if they make too many simplifying assumptions of that kind then their ontologies aren't sufficiently generalized after all.)

 


Rodney Needham, Polythetic Classification: Convergence and Consequences (Man, 10:3, September 1975), pp. 349-369. 

Richard Veryard, Information Modelling: Practical Guidance (Prentice-Hall, 1992), pp. 99-100.

Stanford Encyclopedia of Philosophy: Wittgenstein on Family Resemblance

Wikipedia: Family Resemblance

