Wednesday, March 04, 2020

Economic Value of Data

How far can general principles of asset management be applied to data? In this post, I'm going to look at some of the challenges of putting monetary or non-monetary value on your data assets.

Why might we want to do this? There are several reasons why people might be interested in the value of data.
  • Establish internal or external benchmarks
  • Set measurable targets and track progress
  • Identify underutilized assets
  • Prioritize work and allocate resources
  • Threat modelling and risk assessment (especially in relation to confidentiality, privacy, security)
Non-monetary benchmarks may be good enough if all we want to do is compare values - for example, this parcel of data is worth a lot more than that parcel, this process/practice is more efficient/effective than that one, this initiative/transformation has added significant value, and so on.

But for some purposes, it is better to express the value in financial terms, especially for the following:
  • Cost-benefit analysis – e.g. calculate return on investment
  • Asset valuation – estimate the (intangible) value of the data inventory – e.g. relevant for flotation or acquisition
  • Exchange value – calculate pricing and profitability for traded data items

There are (at least) five entirely different ways to put a monetary value on any asset.
  • Historical Cost: The total cost of the labour and other resources required to produce and maintain an item.
  • Replacement Cost: The total cost of the labour and other resources that would be required to replace an item.
  • Liability Cost: The potential damages or penalties if the item is lost or misused. (This may include regulatory action, reputational damage, or commercial advantage to your competitors, and may bear no relation to any other measure of value.)
  • Utility Value: The economic benefits that may be received by an actor from using or consuming the item.
  • Market Value: The exchange price of an item at a given point in time. The amount that must be paid to purchase the item, or the amount that could be obtained by selling the item.

But there are some real difficulties in doing any of this for data. None of these difficulties are unique to data, but I can't think of any other asset class that has all of these difficulties multiplied together to the same extent.

  • Data is an intangible asset. There are established ways of valuing intangible assets, but these are always somewhat more complicated than valuing tangible assets.
  • Data is often produced as a side-effect of some other activity. So the cost of its production may already be accounted for elsewhere, or is a very small fraction of a much larger cost.
  • Data is a reusable asset. You may be able to get repeated (although possibly diminishing) benefit from the same data.
  • Data is an infinitely reproducible asset. You can sell or share the same data many times, while continuing to use it yourself. 
  • Some data loses its value very quickly. If I’m walking past a restaurant, this information has value to the restaurant. Ten minutes later I'm five blocks away, and the information is useless. And even before this point, suppose there are three restaurants and they all have access to the information that I am hungry and nearby. As soon as one of these restaurants manages to convert this information, its value to the remaining restaurants becomes zero or even negative. 
  • Data combines in a non-linear fashion. Value (X+Y) is not always equal to Value (X) + Value (Y). Even within more tangible asset classes, we can find the concepts of Assemblage and Plottage. For data, one version of this non-linearity is the phenomenon of information energy described by Michael Saylor of MicroStrategy. And for statisticians, there is also Simpson’s Paradox.
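
To make the non-linearity concrete, here is a minimal sketch of Simpson's Paradox in Python. The counts follow the kidney-stone example commonly used to illustrate the paradox; treat them as illustrative, and note that the names `counts`, `rate` and `pooled_rate` are mine. Treatment A looks better within each parcel of data, but treatment B looks better once the two parcels are merged, so the conclusion you can draw from the combined data is not simply the sum of the conclusions from each part.

```python
# Success counts as (successes, trials) for two treatments, split by stone size.
# Illustrative figures modelled on the commonly quoted kidney-stone example.
counts = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def rate(successes, trials):
    """Success rate for one group."""
    return successes / trials


def pooled_rate(treatment):
    """Success rate after merging both parcels of data for one treatment."""
    successes = sum(s for s, _ in counts[treatment].values())
    trials = sum(t for _, t in counts[treatment].values())
    return rate(successes, trials)


# Compare the treatments within each parcel of data.
for size in ("small", "large"):
    a = rate(*counts["A"][size])
    b = rate(*counts["B"][size])
    print(f"{size:>6}: A={a:.1%}  B={b:.1%}  better={'A' if a > b else 'B'}")

# Compare the treatments on the merged data.
a_all, b_all = pooled_rate("A"), pooled_rate("B")
print(f"pooled: A={a_all:.1%}  B={b_all:.1%}  better={'A' if a_all > b_all else 'B'}")
```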

The production costs of data can be estimated in various ways. One approach is to divide up the total ICT expenditure, estimating roughly what proportion of the whole to allocate to this or that parcel of data. This generally only works for fairly large parcels - for example, this percentage to customer transactions, that percentage to transport and logistics, etc. Another approach is to work out the marginal or incremental cost: this is commonly preferred when considering new data systems, or decommissioning old ones. We can also compare the effort consumed in different data domains, or count the number of transformation steps from raw data to actionable intelligence.
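
The top-down allocation can be sketched in a few lines of Python. This is only an illustration: the function name `allocate_cost`, the total spend and the share estimates are all hypothetical, and in practice the hard part is agreeing the shares, not doing the arithmetic.

```python
def allocate_cost(total_cost, shares):
    """Split a total cost across data parcels in proportion to estimated shares.

    `shares` maps parcel name -> rough weight; the weights need not sum to 1.
    """
    total_weight = sum(shares.values())
    if total_weight <= 0:
        raise ValueError("shares must contain at least one positive weight")
    return {
        name: total_cost * weight / total_weight
        for name, weight in shares.items()
    }


# Hypothetical figures, for illustration only.
total_ict_spend = 2_000_000
estimated_shares = {
    "customer transactions": 40,
    "transport and logistics": 25,
    "everything else": 35,
}

allocation = allocate_cost(total_ict_spend, estimated_shares)
for parcel, cost in allocation.items():
    print(f"{parcel}: {cost:,.0f}")
```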

As for the value of the data, there are again many different approaches. Ideally, we should look at the use-value or performance value of the data - what contribution does it make to a specific decision or process, or what aggregate contribution does it make to a given set of decisions and processes. 
  • This can be based on subjective assessments of relevance and usefulness, perhaps weighted by the importance of the decisions or processes where the data are used. See Bill Schmarzo's blogpost for a worked example.
  • Or it may be based on objective comparisons of results with and without the data in question - making a measurable difference to some key performance indicator (KPI). In some cases, the KPI may be directly translated into a financial value. 
However, comparing performance fairly and objectively may only be possible for organizations that are already at a reasonable level of data management maturity.
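
As a rough sketch of the objective comparison, the following Python compares a KPI with and without a given parcel of data and converts the difference into money. Everything here is an assumption for illustration: the KPI (a conversion rate), the sample figures, and the linear conversion in `financial_value`. A naive difference of means like this ignores other factors that may differ between the two groups, which is one reason a fair comparison needs a reasonable level of data management maturity.

```python
from statistics import mean


def kpi_uplift(with_data, without_data):
    """Difference in mean KPI between runs that used the data and runs that did not."""
    return mean(with_data) - mean(without_data)


def financial_value(uplift, value_per_kpi_unit, volume):
    """Translate a KPI uplift into money, assuming a simple linear conversion."""
    return uplift * value_per_kpi_unit * volume


# Hypothetical KPI: conversion rate per campaign, with and without the data parcel.
with_data = [0.052, 0.048, 0.055]
without_data = [0.041, 0.043, 0.039]

uplift = kpi_uplift(with_data, without_data)

# Hypothetical conversion: each extra conversion is worth 30, across 100,000 visits.
value = financial_value(uplift, value_per_kpi_unit=30, volume=100_000)
print(f"uplift={uplift:.4f}, estimated value={value:,.0f}")
```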

In the absence of this kind of metric, we can look instead at the intrinsic value of the data, independently of its potential or actual use. This could be based on a weighted formula involving such quality characteristics as accuracy, alignment, completeness, enrichment, reliability, shelf-life, timeliness, uniqueness, usability. (Gartner has published a formula that uses a subset of these factors.)
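
A weighted formula of this kind might look like the sketch below. To be clear, this is not Gartner's formula: the choice of characteristics, the weights, the 0 to 1 rating scale and the function name `intrinsic_score` are all my own assumptions for illustration.

```python
def intrinsic_score(ratings, weights):
    """Weighted average of quality ratings, each rating on a 0..1 scale."""
    missing = set(weights) - set(ratings)
    if missing:
        raise KeyError(f"no rating for: {sorted(missing)}")
    total_weight = sum(weights.values())
    return sum(ratings[name] * w for name, w in weights.items()) / total_weight


# Hypothetical weights and ratings for one parcel of data.
weights = {"accuracy": 3, "completeness": 2, "timeliness": 2, "uniqueness": 1}
ratings = {"accuracy": 0.9, "completeness": 0.7, "timeliness": 0.5, "uniqueness": 0.8}

score = intrinsic_score(ratings, weights)
print(f"intrinsic score: {score:.2f}")
```

A score like this is only useful for comparing parcels rated on the same scale with the same weights; it has no monetary meaning on its own.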

Arguably there should be a depreciation element to this calculation. Last year's data is not worth as much as this year's data, and the accuracy of last year's data may not be so critical, but the data is still worth something.
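
One possible way to model this, offered only as a sketch, is exponential decay towards a residual floor, so that old data loses value steadily but never drops to nothing. The half-life, the floor and the function name `depreciated_value` are assumptions of mine; different kinds of data would need very different parameters, and some may not decay smoothly at all.

```python
def depreciated_value(initial_value, age_years, half_life_years, floor_fraction=0.0):
    """Value after exponential decay towards a residual floor.

    The part of the value above the floor halves every `half_life_years`.
    """
    if half_life_years <= 0:
        raise ValueError("half_life_years must be positive")
    if not 0.0 <= floor_fraction <= 1.0:
        raise ValueError("floor_fraction must be between 0 and 1")
    floor = initial_value * floor_fraction
    decay = 0.5 ** (age_years / half_life_years)
    return floor + (initial_value - floor) * decay


# Hypothetical parcel: worth 100 today, two-year half-life, 20% residual value.
for age in (0, 1, 2, 5):
    print(age, round(depreciated_value(100, age, half_life_years=2, floor_fraction=0.2), 1))
```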

An intrinsic measure of this kind could be used to evaluate parcels of data at different points in the data-to-information process. For example, it could show the increase in enrichment and usability from 1. to 2. and from 2. to 3., and therefore give a measure of the value added by the data engineering team that does this for us.
    1. Source systems
    2. Data Lake – cleansed, consolidated, enriched and accessible to people with SQL skills
    3. Data Visualization Tool – accessible to people without SQL skills
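
The stage-by-stage comparison above can be sketched as follows. The ratings are hypothetical, and the unweighted mean in `stage_score` is a placeholder for whatever intrinsic formula you actually adopt.

```python
# Hypothetical 0..1 ratings for two quality characteristics at each stage.
stages = [
    ("Source systems", {"enrichment": 0.2, "usability": 0.1}),
    ("Data Lake", {"enrichment": 0.7, "usability": 0.5}),
    ("Data Visualization Tool", {"enrichment": 0.8, "usability": 0.9}),
]


def stage_score(ratings):
    """Unweighted mean of the ratings; a real formula would probably weight them."""
    return sum(ratings.values()) / len(ratings)


def added_value(stages):
    """Score gained at each step from one stage to the next."""
    scores = [(name, stage_score(ratings)) for name, ratings in stages]
    return [
        (f"{prev_name} -> {name}", score - prev_score)
        for (prev_name, prev_score), (name, score) in zip(scores, scores[1:])
    ]


steps = added_value(stages)
for step, gain in steps:
    print(f"{step}: {gain:+.2f}")
```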

If any of my readers know of any useful formulas or methods for valuing data that I haven't mentioned here, please drop a link in the comments.

Heather Pemberton Levy, Why and How to Value Your Information as an Asset (Gartner, 3 September 2015)

Bill Schmarzo, Determining the Economic Value of Data (Dell, 14 June 2016)

Wikipedia: Simpson's Paradox, Value of Information

Related posts: Information Algebra (March 2008), Does Big Data Release Information Energy? (April 2014), Assemblage and Plottage (January 2020)


  1. I still believe the best approach to managing data as an asset is one based on the Asset Management standard, ISO 55000. This standard lays out the ‘what’ of asset management very clearly and can be used for intangible assets with a little work. The standard talks about maintenance of assets, which needs a little interpretation for data, for example. The framework is designed to interoperate with other ISO frameworks including Quality Assurance and Information Security and adopts a business-centred approach to data, which makes it ideal for asset management purposes.
    Value measurements are difficult. You’ve described many methods above and I think there are several others in Doug Laney's book, Infonomics (apologies, I lent my copy out, so I can’t check). The method that I’ve had exec buy-in for is contained in the HBR ( but the key point here is that the value of any given data asset depends on the perspective of the person answering the question. Data is worth whatever a party is prepared to pay for it, and that holds true for any of the ways of putting a monetary value on it.

  2. An example of market value serves to illustrate the point. Let’s say you’re an international insurance company specialising in marine cargo, and you have a data set containing details of all marine vessels entering and leaving Rotterdam over the last 3 years, along with how many containers were reported missing for each vessel. You ‘sell’ the data set on a data marketplace such as Dawex. You might have 3 other parties interested.
    • Party 1 is a broking firm doing due diligence on a new client, the port operator. The data set is going to be worth a small amount as it will only be used as one of many pieces of information for a single deal. They might offer £100 for it.
    • Party 2 is another insurance company that also specialises in marine cargo. For them, the data set has a high value because it can provide insight on a competitor. They might offer £5000 for it.
    • Party 3 sells sensors for shipping containers that link into an IoT network. For them the data is of medium value. They might offer a straight exchange for similar data they have about East Tilbury port and get their sales people to talk to the vendor in the hopes of selling the insurance company some sensors.
    Similarly if you were calculating utility or replacement costs, you would probably arrive at different values if you were the CFO of the insurance company vs if you were a data scientist using the data to train an algorithm.
    In short, I think there are too many variables involved to have a simple valuation process. It might be possible in years to come but it's a subjective exercise requiring experienced people unless it can be left to market forces.

  3. Thanks for your comments, Iain.

    I completely agree that the use-value of an asset is context-dependent. If B can make better use of a given asset than A, in other words the use value for B is higher than the use value for A, then it makes sense for A to sell the asset to B, as long as the exchange price is anywhere between the two use values. The actual exchange price may depend on the relative negotiating skill of A and B, or on the presence of a market in which many similar assets are being exchanged.

    The question for B is this - if this asset is available for this price (e.g. market price), is it worth my purchasing the asset for this price? This can only properly be answered if B has some notion of how much value they will be able to extract from the asset. As you say, this remains a fairly subjective judgement in most cases.
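
    The bargaining logic in this comment can be written down as a small Python sketch. The function names and figures are hypothetical, and it treats the asset as a conventional one that the seller gives up on sale.

```python
def price_range(use_value_seller, use_value_buyer):
    """Return (low, high) bounds on prices at which both sides gain, or None.

    A trade only makes sense if the buyer can extract more value than the seller.
    """
    if use_value_buyer <= use_value_seller:
        return None
    return (use_value_seller, use_value_buyer)


def worth_buying(asking_price, estimated_use_value):
    """B's question: does my estimated use value exceed the asking price?"""
    return estimated_use_value > asking_price


# Hypothetical use values for seller A and buyer B.
print(price_range(use_value_seller=1_000, use_value_buyer=4_000))
print(worth_buying(asking_price=2_500, estimated_use_value=4_000))
```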

  4. There is an important difference between data and most other assets. A can continue to extract value from the data even after selling to B. A can also sell the same data to C and D. In your example of marine data, the question is whether exclusive access to the data has a significantly higher use value to one of the potential purchasers than non-exclusive access. In other words, is the data worth enough to Party 2 to pay you NOT to sell it to Parties 1 and 3 as well?