Provenance, a form of structured metadata designed to record the origin or source of information, can be instrumental in deciding whether information is to be trusted, how it can be integrated with other diverse information sources, and how to establish attribution of information to authors throughout its history.
The PROV set of specifications, produced by the World Wide Web Consortium (W3C), is designed to promote the publication of provenance information on the Web, and offers a basis for interoperability across diverse provenance management systems. The PROV provenance model is deliberately generic and domain-agnostic, but extension mechanisms are available and can be exploited for modelling specific domains.
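To make the "generic core plus extension" idea concrete, the following is a minimal sketch in plain Python, not the W3C serializations and not any particular PROV library. The class `ProvSketch`, its methods, and the `ex:` identifiers are hypothetical names chosen for illustration; only the element kinds (entity, activity, agent), the relation names, and the `prov:type` attribute come from PROV itself. The printed lines are loosely modelled on PROV Notation and are not guaranteed to be valid PROV-N.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """An entity, activity, or agent, keyed by a qualified name such as 'ex:chart1'."""
    kind: str
    id: str
    attributes: dict = field(default_factory=dict)


@dataclass
class Relation:
    """A binary relation between two declared elements, e.g. wasGeneratedBy."""
    kind: str
    subject: str
    target: str


@dataclass
class ProvSketch:
    """Hypothetical in-memory container for a handful of provenance records."""
    elements: dict = field(default_factory=dict)
    relations: list = field(default_factory=list)

    def declare(self, kind, id, attributes=None):
        self.elements[id] = Element(kind, id, dict(attributes or {}))
        return id

    def relate(self, kind, subject, target):
        # Simplification made by this sketch: both ends must already be declared.
        for end in (subject, target):
            if end not in self.elements:
                raise KeyError(f"undeclared element: {end}")
        self.relations.append(Relation(kind, subject, target))

    def statements(self):
        """Render each record as a simplified, PROV-N-flavoured line of text."""
        lines = []
        for e in self.elements.values():
            attrs = ", ".join(f'{k}="{v}"' for k, v in e.attributes.items())
            suffix = f", [{attrs}]" if attrs else ""
            lines.append(f"{e.kind}({e.id}{suffix})")
        for r in self.relations:
            lines.append(f"{r.kind}({r.subject}, {r.target})")
        return lines


doc = ProvSketch()
doc.declare("entity", "ex:dataset1")
# Domain-specific subtype attached through the generic prov:type attribute.
doc.declare("entity", "ex:chart1", {"prov:type": "ex:Chart"})
doc.declare("activity", "ex:plot1")
doc.declare("agent", "ex:alice", {"prov:type": "prov:Person"})
doc.relate("used", "ex:plot1", "ex:dataset1")
doc.relate("wasGeneratedBy", "ex:chart1", "ex:plot1")
doc.relate("wasAssociatedWith", "ex:plot1", "ex:alice")
doc.relate("wasAttributedTo", "ex:chart1", "ex:alice")
doc.relate("wasDerivedFrom", "ex:chart1", "ex:dataset1")

for line in doc.statements():
    print(line)
```

The core vocabulary stays domain-agnostic; the only domain-specific part of this sketch is the hypothetical `ex:Chart` value carried by `prov:type`, which is one way the extension points can be used.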
Paolo Missier, Khalid Belhajjame and James Cheney gave a tutorial on these specifications at the EDBT conference on 2013-03-20 in Genova, Italy. Starting from intuitive, informal examples that present idiomatic provenance patterns, the tutorial progressively introduced the relational model of provenance and the constraints model for validating provenance documents, and concluded with example applications that show the extension points in use.
The tutorial is in three parts, each about 30 minutes long, and consists of the following material:
- Part I: The PROV Data Model and PROV Notation (Paolo Missier)
- Part II: Constraints of the Provenance Model (James Cheney)
  - Reference document: PROV-CONSTR
- Part III: Known extensions and applications (Khalid Belhajjame)
  - Reference document: PROV Implementations
There is also a short paper describing the motivation, structure and content of the tutorial, published in the EDBT’13 proceedings: "The W3C PROV family of specifications for modelling provenance metadata", by Paolo Missier, Khalid Belhajjame, and James Cheney.