Saturday, March 07, 2009

Modelling Identity and Context

"Language is Strange", says Don Ferguson. He's talking about the curious difference in Spanish between ser and estar, two verbs that both translate as "to be".

  • ser is used for permanent characteristics (identity). For example, Don is American, I am British.
  • estar is used for transient characteristics (context). I don't know exactly where Don is at the moment, but I'm in London. Don is now working for CA; he was working for Microsoft, and IBM before that.

This is an important distinction to make when designing information systems and information services. It affects the way databases and database keys can be designed (if it can change you shouldn't use it as a permanent identifier) and it affects the set of operations that have to be designed (if it can change, you have to be able to change it).
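To make the point about keys and operations concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `Person` class, its field names, the integer key and the `people` dictionary are hypothetical and do not come from any real system. The permanent attributes are read-only and the record is keyed on one of them; the transient attributes each get an operation to change them.

```python
class Person:
    """Hypothetical record separating permanent from transient attributes."""

    def __init__(self, key, nationality, employer, location):
        # Permanent ("ser") attributes: set once, exposed read-only.
        self._key = key
        self._nationality = nationality
        # Transient ("estar") attributes: expected to change over time.
        self.employer = employer
        self.location = location

    @property
    def key(self):
        return self._key

    @property
    def nationality(self):
        return self._nationality

    # If it can change, there has to be an operation to change it.
    def change_employer(self, employer):
        self.employer = employer

    def relocate(self, location):
        self.location = location


# Illustrative data only; the key value 1 and the location are made up.
people = {}
don = Person(key=1, nationality="American", employer="Microsoft",
             location="unspecified")
people[don.key] = don

# The employer changes, but the record is still found under the same key.
don.change_employer("CA")
```

Had the dictionary been keyed on employer or location instead, every change of job or address would have meant re-keying the record, which is exactly why a changeable attribute makes a poor permanent identifier.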

The question is: how do you decide whether something is permanent or transient? As Don points out, some of the distinctions in Spanish may seem counter-intuitive to an English-speaker. Location is always assumed to be transient, even for fairly solid things like the American Embassy, while origin (as in "I'm from Massachusetts") is assumed to be permanent.

When you are modelling a large complex enterprise, you really don't want to be working this kind of thing out on a case-by-case basis. So it is generally good enough to adopt some broad patterns (e.g. LOCATION is always going to be transient) even if it occasionally looks wrong. (It's usually better to err on the side of allowing something that isn't always going to be needed, rather than banning something someone might want.)
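One way to picture the broad-patterns approach is as a small lookup table of defaults, sketched below in Python. The category names, the `Permanence` enum and the `classify` function are all hypothetical, chosen only to illustrate the idea. Anything not covered by a pattern falls back to transient, on the assumption that allowing a change nobody needs is cheaper than forbidding one somebody wants.

```python
from enum import Enum


class Permanence(Enum):
    PERMANENT = "permanent"
    TRANSIENT = "transient"


# Hypothetical enterprise-wide defaults, decided once per category
# rather than case by case.
DEFAULT_PATTERNS = {
    "ORIGIN": Permanence.PERMANENT,
    "NATIONALITY": Permanence.PERMANENT,
    "LOCATION": Permanence.TRANSIENT,
    "EMPLOYER": Permanence.TRANSIENT,
}


def classify(category):
    """Return the default permanence for an attribute category.

    Unknown categories are treated as transient, erring on the side
    of allowing change.
    """
    return DEFAULT_PATTERNS.get(category.upper(), Permanence.TRANSIENT)
```

Under these defaults the location of an embassy is modelled as changeable, which occasionally looks wrong but costs little.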

The "good enough" strategy is known as "satisficing". It may not be as perfect as full optimization, but it's a lot quicker. Clearly Spanish is good enough.

1 comment:

Donald said...

Hmm. I basically agree with your suggestions but not completely. I would argue that there is one operator in the model, "is." There should be a modifier (permanent, temporary). I think this is how database constraints and UML OCL work. Spanish (and English) are poorly designed domain-specific languages (just kidding).

I am actually interested in trying to define DSLs for various sub-domains of SOA. Let me know if you are interested.