In previous posts, I looked at Reach (the range of data sources and destinations), Richness (the complexity of data) and Agility (the speed and flexibility of response to new opportunities and changing requirements). Assurance is about Trust.
In 2002, Microsoft launched its Trustworthy Computing Initiative, which covered security, privacy, reliability and business integrity. If we look specifically at data, this means two things:
- Trustworthy data - the data are reliable and accurate.
- Trustworthy data management - the processor is a reliable and responsible custodian of the data, especially in regard to privacy and security.
This is of course a data assurance nightmare - the data are out of control, and it may be easier for hackers to get the data out than it is for legitimate users. And good luck handling any data subject access request!
But in most organizations, you can't eliminate this behaviour simply by telling people they mustn't. If your data strategy is to address this issue properly, you need to look at the causes of the behaviour and understand what level of reliability and accessibility you have to give people before they will be willing to rely on your version of the truth rather than theirs.
DalleMule and Davenport have distinguished two types of data strategy, which they call offensive and defensive. Offensive strategies are primarily concerned with exploiting data for competitive advantage, while defensive strategies are primarily concerned with data governance, privacy and security, and regulatory compliance.
As a rough approximation then, assurance can provide a defensive counterbalance to the offensive opportunities offered by reach, richness and agility. But it's never quite as simple as that. A defensive data quality regime might install strict data validation, to prevent incomplete or inconsistent data from reaching the database. In contrast, an offensive data quality regime might install strict labelling, with provenance data and confidence ratings, to allow incomplete records to be properly managed, enriched if possible, and appropriately used. This is the basis for the NetCentric strategy of Post Before Processing.
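The contrast between the two regimes can be sketched in code. This is a minimal illustration, not taken from the post: the record structure, field names and confidence formula are all hypothetical, chosen only to show the difference between rejecting incomplete data at the gate and accepting it with provenance labels and a confidence rating.

```python
# Hypothetical record: a payment instruction with some required fields.
REQUIRED_FIELDS = ("account", "sort_code", "amount")

def defensive_ingest(record: dict) -> dict:
    """Defensive regime: strict validation. Incomplete or inconsistent
    records never reach the database."""
    missing = [f for f in REQUIRED_FIELDS if record.get(f) is None]
    if missing:
        raise ValueError(f"rejected: missing fields {missing}")
    return record

def offensive_ingest(record: dict, source: str) -> dict:
    """Offensive regime ('post before processing'): accept the record,
    but label it with provenance and a confidence rating so downstream
    consumers can manage, enrich, or discount it appropriately."""
    missing = [f for f in REQUIRED_FIELDS if record.get(f) is None]
    return {
        **record,
        "_provenance": source,                                # where it came from
        "_missing": missing,                                  # what still needs enriching
        "_confidence": 1 - len(missing) / len(REQUIRED_FIELDS),  # crude completeness score
    }
```

The defensive function would refuse a record with a missing sort code outright; the offensive one would keep it, flagged as two-thirds complete, leaving the decision about how to use it to the consumer.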
Because of course there isn't a single view of data quality. If you want to process a single financial transaction, you obviously need to have a complete, correct and confirmed set of bank details. But if you want aggregated information about upcoming financial transactions, you don't want any large transactions to be omitted from the total because of a few missing attributes. And if you are trying to learn something about your customers by running a survey, it's probably not a good idea to limit yourself to those customers who had the patience and loyalty to answer all the questions.
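To make the aggregation point concrete, here is a small hypothetical example (the records and field names are invented for illustration): a transaction missing an attribute is unusable for individual processing, but excluding it from an aggregate would badly understate the total.

```python
# Hypothetical upcoming transactions; the first is large but incomplete.
transactions = [
    {"amount": 1_000_000, "counterparty": None},       # missing attribute
    {"amount": 250, "counterparty": "Acme Ltd"},
    {"amount": 4_750, "counterparty": "Widget Co"},
]

# Processing view: only complete records can be acted on individually.
processable = [t for t in transactions
               if all(v is not None for v in t.values())]

# Reporting view: dropping the incomplete record would understate the
# total by a million, so aggregate everything with a known amount.
total_upcoming = sum(t["amount"] for t in transactions
                     if t["amount"] is not None)
```

The same records pass through two different quality filters: one fit for transaction processing, one fit for aggregate reporting.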
Besides data quality, your data strategy will need to have a convincing story about privacy and security. This may include certification (e.g. ISO 27001) as well as regulation (GDPR etc.). You will need to have proper processes in place for identifying risks, and for ensuring that relevant data projects follow privacy-by-design and security-by-design principles. You may also need to look at the commercial and contractual relationships governing data sharing with other organizations.
All of this should add up to establishing trust in your data management - reassuring data subjects, business partners, regulators and other stakeholders that the data are in safe hands. And hopefully this means they will be happy for you to take your offensive data strategy up to the next level.
Next post: Developing Data Strategy
Leandro DalleMule and Thomas H. Davenport, What’s Your Data Strategy? (HBR, May–June 2017)
Richard Veryard, Microsoft's Trustworthy Computing (CBDI Journal, March 2003)
Wikipedia: Trustworthy Computing