Friday, March 07, 2008

Event Modelling

In a post entitled Right-Sizing the Event Message, Nick Malik (Microsoft) states two opposing design principles for event modelling, and suggests a technique for producing a reasonable trade-off. In this post, I'm going to suggest an alternative technique, ending with a discussion of the principles.

Principle of Sufficiency: event messages need to have sufficient information in order for the subscriber to decide if it should consume or discard the event without requesting further information from another system

Principle of Simplicity: event messages should be as small as possible (but no smaller)

Nick's Example: You have two CRM systems, separated by geography. A CRM event message therefore needs to contain some geographical information, to enable each CRM system to determine whether to act upon the event or ignore it. But what other information should we include, that might be relevant to any of the would-be consumers of the message?

For the moment, let's accept Nick's statement of the problem. I want to talk about the solution first, before raising some more general issues about these two principles and Nick's example.

Nick's Technique: Follow the structure of the associated data warehouse. The event message should be based on the "fact table" and the first tier "dimension tables" within a "star schema". (Nick's example looks more like a snowflake schema to me, but never mind.) If this structure has already been defined, then easy-peasy - otherwise, follow the same technique as you'd use for modelling a data warehouse. In other words, treat "events" as "facts".
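To make "events as facts" concrete, here is a minimal sketch of an event message shaped like a fact row with its first-tier dimensions attached. The class and field names (CrmEvent, CustomerDim, GeographyDim and their attributes) are my own illustrative assumptions, not taken from Nick's post or from any real warehouse schema.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class CustomerDim:
    """First-tier dimension: who the event is about (hypothetical fields)."""
    customer_id: str
    segment: str


@dataclass(frozen=True)
class GeographyDim:
    """First-tier dimension: where the event applies (hypothetical fields)."""
    country_code: str
    region: str


@dataclass(frozen=True)
class CrmEvent:
    """The 'fact': one occurrence plus its first-tier dimensions only."""
    event_id: str
    event_type: str
    occurred_at: str  # ISO 8601 timestamp
    customer: CustomerDim
    geography: GeographyDim


event = CrmEvent(
    event_id="evt-001",
    event_type="AddressChanged",
    occurred_at="2008-03-07T09:30:00Z",
    customer=CustomerDim(customer_id="C-42", segment="Enterprise"),
    geography=GeographyDim(country_code="FR", region="Europe"),
)

# Flatten to a plain nested dict, ready to serialise as a message payload.
payload = asdict(event)
```

Stopping at the first tier is what keeps the message bounded: second-tier (snowflaked) dimensions would be looked up by key rather than carried in every event.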

Nick's technique is certainly worth considering, because there are some strong parallels between data mining and (complex) event (stream) processing. Many event stream processing tools allow you (roughly speaking) to build and execute SQL-like enquiries in near-real-time against a continuous stream of events.

Nick is effectively assuming that each consumer (consuming system) operates its own event filter. But in principle, it should be possible for a consuming system to publish its own event-interest-profile ("This system is interested in European-CRM-events"). There might then be some filter or attenuator that reduces the "CRM-events" stream to the "European-CRM-events" stream, or possibly splits the "CRM-events" stream into several geography-specific event streams. (This filter or attenuator might be a stand-alone utility or event processing engine, or it might be embedded in the messaging platform.) Thus each system "owns" a filter, but it delegates the execution of this filter.
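As a rough sketch of what "owning a filter but delegating its execution" might look like, each consuming system below publishes an interest profile as a predicate, and a separate splitter applies those predicates to the shared stream. StreamSplitter, publish_profile, region_profile and the subscriber names are all hypothetical; a real event processing engine or messaging platform would expose its own mechanism.

```python
from typing import Callable, Dict, Iterable, List

Event = Dict[str, str]
InterestProfile = Callable[[Event], bool]


def region_profile(region: str) -> InterestProfile:
    """Build a profile meaning 'this system is interested in <region> events'."""
    return lambda event: event.get("region") == region


class StreamSplitter:
    """Runs the filters on behalf of the subscribers that own them."""

    def __init__(self) -> None:
        self._profiles: Dict[str, InterestProfile] = {}

    def publish_profile(self, subscriber: str, profile: InterestProfile) -> None:
        self._profiles[subscriber] = profile

    def split(self, events: Iterable[Event]) -> Dict[str, List[Event]]:
        """Split one event stream into one stream per subscriber."""
        streams: Dict[str, List[Event]] = {name: [] for name in self._profiles}
        for event in events:
            for name, profile in self._profiles.items():
                if profile(event):
                    streams[name].append(event)
        return streams


splitter = StreamSplitter()
splitter.publish_profile("european-crm", region_profile("Europe"))
splitter.publish_profile("american-crm", region_profile("Americas"))

crm_events = [
    {"event_id": "evt-001", "region": "Europe"},
    {"event_id": "evt-002", "region": "Americas"},
    {"event_id": "evt-003", "region": "Europe"},
]
streams = splitter.split(crm_events)
```

The consuming systems never see events outside their profile, yet none of them contains any filtering code.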

But I can go one step further. If I decouple the event filter from the operational response to the event, I can then change one without changing the other. For example, let's suppose we have just opened an office in South Africa. Someone decides to direct the South African CRM events to the European CRM system. This is a coordination-level decision, which should (if we are lucky) be handled purely at the event processing / orchestration level, without making any change to the (operational) CRM system itself.

So we now have two architectural layers: a coordination/orchestration layer, for which the principle of sufficiency is paramount, and an operational layer, for which the principle of simplicity is paramount. We use some form of attenuation, possibly but not necessarily a filter, to convert rich (sufficient) events to simple events.
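The two layers can be sketched as follows, assuming a simple country-to-system routing table at the coordination level and a field-stripping attenuator between the layers. Every name here (coordinate, attenuate, handle_crm_event, SIMPLE_FIELDS, the routing entries) is an illustrative assumption rather than a prescribed design.

```python
from typing import Dict, Optional, Tuple

RichEvent = Dict[str, str]
SimpleEvent = Dict[str, str]

# Fields the operational layer actually needs (hypothetical choice).
SIMPLE_FIELDS = ("event_id", "event_type", "customer_id")


def attenuate(rich: RichEvent) -> SimpleEvent:
    """Convert a rich (sufficient) event into a simple one."""
    return {field: rich[field] for field in SIMPLE_FIELDS}


def coordinate(
    rich: RichEvent, routing: Dict[str, str]
) -> Optional[Tuple[str, SimpleEvent]]:
    """Coordination layer: pick the target system from the rich event."""
    target = routing.get(rich["country_code"])
    if target is None:
        return None  # no system has been assigned this geography yet
    return target, attenuate(rich)


def handle_crm_event(simple: SimpleEvent) -> str:
    """Operational layer: knows nothing about geography or routing."""
    return f"{simple['event_type']} for {simple['customer_id']}"


routing = {"FR": "european-crm", "US": "american-crm"}
za_event = {
    "event_id": "evt-010",
    "event_type": "ContactAdded",
    "customer_id": "C-77",
    "country_code": "ZA",
}

before = coordinate(za_event, routing)  # South Africa not yet routed
routing["ZA"] = "european-crm"          # the coordination-level decision
after = coordinate(za_event, routing)   # now delivered to the European system
```

Directing South African events to the European system is a one-line change to the routing table; handle_crm_event is untouched.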

Furthermore, this solution allows us to handle much more complex selection criteria than simple filtering. For example, for a customer who travels a lot, it may be complicated to work out which CRM system to use. In which system or systems do we handle an event involving a Japanese employee of an American corporation visiting France?
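I don't claim to know the right answer to that question, but the point is that the rule can live in the coordination layer as ordinary logic rather than as a single-field filter. The sketch below assumes one possible policy, purely for illustration: deliver the event to the systems covering the employer's country and the country being visited, and ignore nationality. The field names, system names and the policy itself are all hypothetical.

```python
from typing import Dict, Set

# Hypothetical mapping from country to the CRM system that covers it.
COUNTRY_TO_SYSTEM = {"US": "american-crm", "FR": "european-crm"}


def select_systems(event: Dict[str, str]) -> Set[str]:
    """Return every CRM system that should receive this event.

    Assumed policy: the employer's country and the visited country both
    matter; the traveller's nationality does not.
    """
    countries = (event.get("employer_country"), event.get("visit_country"))
    return {COUNTRY_TO_SYSTEM[c] for c in countries if c in COUNTRY_TO_SYSTEM}


travel_event = {
    "event_id": "evt-020",
    "nationality": "JP",
    "employer_country": "US",
    "visit_country": "FR",
}
targets = select_systems(travel_event)
```

Under this policy the event goes to both systems; a different policy would change only select_systems, not the CRM systems themselves.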

Based on this rearchitected solution, we can reinterpret the two principles we started with: sufficiency from a coordination perspective, simplicity from an operational perspective, with efficient and effective transduction between the two perspectives.

Finally, let's wonder why on earth there are two CRM systems at all. The situation may have resulted from previous stove-pipe management, or from a history of merger and acquisition, but we might reasonably expect to see an evolutionary development path in which the two systems become one, or at least share an increasing quantity of common services. This provides another strong reason to move as much as possible of the logic that is going to change as the two systems evolve into one out of the CRM systems and into the orchestration layer.
