Saturday, September 04, 2021

Metadata as a Process

Data sharing and collaboration between different specialist areas require agreement and transparency about the structure and meaning of the data. This is one of the functions of metadata.

I've been reading a paper (by Professor Paul Edwards and others) about the challenges this poses in interdisciplinary scientific research. They identify four characteristic features of scientific metadata, noting that these features can be found within a single specialist discipline as well as cross-discipline.

  • Fragmented - many people contributing, no overall control
  • Divergent - multiple conflicting versions (often in Excel spreadsheets)
  • Iterative - rarely right first time, lots of effort to repair misunderstandings and mistakes
  • Localized - each participant is primarily focused on their own requirements rather than the global picture

They make two important distinctions, which will be relevant to enterprise data management as well.

The first is between product and process. Instead of trying to create a static, definitive set of data definitions and properties that completely eliminates the need for any human interaction between the data creator and the data consumer, assume that an ongoing channel of communication will be required to resolve issues as they emerge. (Some of the more advanced data management tools can support this.)

The second is between precision and lubrication. Tight coupling between two systems requires exact metadata, but interoperability might also be achievable with inexact metadata plus something else to reduce any friction. (Metadata as the new oil, perhaps?)
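To make these two distinctions a little more concrete, here is a minimal sketch of my own (it is not taken from the paper). It assumes a hypothetical alias table, ALIASES, mapping local column names onto shared field names, and a hypothetical map_record function. Fields that match are mapped despite inexact naming (lubrication), while fields that don't match become questions for the data creator rather than hard failures (process).

```python
from dataclasses import dataclass, field

# Hypothetical alias table: local column names mapped to shared field names.
ALIASES = {
    "temp_c": "temperature",
    "temperature": "temperature",
    "site": "station_id",
    "station_id": "station_id",
}


@dataclass
class MappingResult:
    mapped: dict = field(default_factory=dict)     # shared field name -> value
    questions: list = field(default_factory=list)  # items to raise with the data creator


def map_record(record: dict) -> MappingResult:
    """Map what can be mapped; turn the rest into questions rather than errors."""
    result = MappingResult()
    for name, value in record.items():
        key = name.strip().lower()  # tolerate inexact spelling of field names
        if key in ALIASES:
            result.mapped[ALIASES[key]] = value
        else:
            result.questions.append(f"What does field '{name}' mean?")
    return result


result = map_record({"Temp_C": 21.5, "site": "A12", "qc_flag": 3})
print(result.mapped)
print(result.questions)
```

A tightly coupled interface would reject the whole record because of the unknown qc_flag field. Here the record still flows, and the open question goes back through whatever communication channel exists between creator and consumer.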

Finally, they observe that metadata typically falls into the category of almost standards.

Everyone agrees they are a good idea, most have some such standards, yet few deploy them completely or effectively.

Does that sound familiar? 

J Bates, The politics of data friction (Journal of Documentation, 2017)

Paul Edwards, A Vast Machine (MIT Press 2010). I haven't read this book yet, but I found a review by Danny Yee (2011)

Paul Edwards, Matthew Mayernik, Archer Batcheller, Geoffrey Bowker and Christine Borgman, Science Friction: Data, Metadata and Collaboration (Social Studies of Science 41/5, October 2011) pp 667-690

Martin Thomas Horsch, Silvia Chiacchiera, Welchy Leite Cavalcanti and Björn Schembera, Research Data Infrastructures and Engineering Metadata. In Data Technology in Materials Modelling (Springer 2021) pp 13-30

Jillian Wallis, Data Producers Courting Data Reusers: Two Cases from Modeling Communities (International Journal of Digital Curation 9/1, 2014) pp 98-109
