Recording authorship, curation and digital creation with the PAV ontology

PAV is a lightweight ontology for tracking Provenance, Authoring and Versioning. PAV supplies terms for distinguishing between the different roles of the agents contributing content in current web-based systems: contributors, authors, curators and digital artifact creators. The ontology also provides terms for tracking provenance of digital entities that are published on the web and then accessed, transformed and consumed.

PAV version 2.1.1 was released on 2013-03-27, making PAV an extension of the W3C provenance ontology PROV-O and thus enabling interoperability between PAV and PROV-compliant tools such as ProvToolbox.
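
To illustrate what "extension of PROV-O" can mean in practice, here is a minimal sketch in Turtle. It assumes that PAV declares pav:createdBy as a subproperty of prov:wasAttributedTo and pav:importedFrom as a subproperty of prov:wasDerivedFrom; check the published ontology for the authoritative mappings. The ex: namespace and the resources in it are hypothetical.

```turtle
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix pav:  <http://purl.org/pav/> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix ex:   <http://example.org/> .   # hypothetical namespace

# Mappings assumed to be declared by the PAV ontology (not defined here)
pav:createdBy    rdfs:subPropertyOf prov:wasAttributedTo .
pav:importedFrom rdfs:subPropertyOf prov:wasDerivedFrom .

# A statement written with PAV terms...
ex:post pav:createdBy    ex:alice ;
        pav:importedFrom ex:spreadsheet .

# ...which an RDFS-aware PROV consumer could then also read as:
ex:post prov:wasAttributedTo ex:alice ;
        prov:wasDerivedFrom  ex:spreadsheet .
```

The PROV statements are less specific than the PAV ones: they say that Alice had some responsibility for the post, but not which role she played.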



Note: PAV does not define any classes, and the PAV properties do not put any explicit restrictions on their domain/ranges. Therefore the classes above, like “another resource”, are only for illustration of typical use. The diagram above does not show data properties attached to resources, like pav:createdOn.


Here’s an example of using PAV:

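@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix pav: <http://purl.org/pav/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix prov: <http://www.w3.org/ns/prov#> .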
@prefix : <> .
pav:createdBy :alice ;
pav:createdWith :wordpress ;
pav:importedFrom <; ;
pav:importedBy :csv2html ;
pav:authoredBy :bob ;
pav:curatedBy :charlie ;
pav:authoredOn "2012-12-24T15:15:15Z"^^xsd:dateTime ;
pav:importedOn "2013-03-27T10:06:17Z"^^xsd:dateTime .
:alice foaf:name "Alice" .
:bob foaf:name "Bob" .
:charlie foaf:name "Charlie" .
:csv2html a prov:SoftwareAgent ;
foaf:homepage <; .
:wordpress a prov:SoftwareAgent ;
foaf:homepage <; .



Tutorial on the W3C PROV family of specifications

Provenance, a form of structured metadata designed to record the origin or source of information, can be instrumental in deciding whether information is to be trusted, how it can be integrated with other diverse information sources, and how to establish attribution of information to authors throughout its history.

The PROV set of specifications, produced by the World Wide Web Consortium (W3C), is designed to promote the publication of provenance information on the Web, and offers a basis for interoperability across diverse provenance management systems. The PROV provenance model is deliberately generic and domain-agnostic, but extension mechanisms are available and can be exploited for modelling specific domains.
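
As a rough illustration of the generic model and one possible extension mechanism, the following Turtle sketch uses PROV-O terms to relate an entity, an activity and an agent, and specialises prov:Entity with a domain-specific class. The ex: namespace, the ex:Dataset class, the resources and the timestamp are all hypothetical and not taken from the tutorial.

```turtle
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix ex:   <http://example.org/> .   # hypothetical namespace

# Domain-specific extension: a hypothetical class specialising prov:Entity
ex:Dataset rdfs:subClassOf prov:Entity .

# Core pattern: an entity derived from another entity by an activity
ex:cleanedData a ex:Dataset, prov:Entity ;
    prov:wasDerivedFrom  ex:rawData ;
    prov:wasGeneratedBy  ex:cleaning ;
    prov:wasAttributedTo ex:alice .

ex:rawData a ex:Dataset, prov:Entity .

# The activity used the raw data and was carried out by an agent
ex:cleaning a prov:Activity ;
    prov:used              ex:rawData ;
    prov:wasAssociatedWith ex:alice ;
    prov:endedAtTime       "2013-03-20T12:00:00Z"^^xsd:dateTime .

ex:alice a prov:Agent .
```

A consumer that knows only PROV can still follow the derivation chain here, while a domain-aware tool can additionally rely on the ex:Dataset typing.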

Paolo Missier, Khalid Belhajjame and James Cheney gave a tutorial at the EDBT conference on 2013-03-20 in Genova, Italy. The tutorial provided an account of these specifications. Starting from intuitive and informal examples that present idiomatic provenance patterns, it progressively introduced the relational model of provenance along with the constraints model for validation of provenance documents, and concluded with example applications that show the extension points in use.

Tutorial material

The tutorial is in three parts, each about 30 minutes long, and consists of the following material:

There is also a short paper describing the motivation, structure and content of the tutorial, published in the EDBT’13 proceedings: The W3C PROV family of specifications for modelling provenance metadata, Paolo Missier, Khalid Belhajjame, and James Cheney