PAV is a lightweight ontology for tracking Provenance, Authoring and Versioning. PAV supplies terms for distinguishing between the different roles of the agents contributing content in current web-based systems: contributors, authors, curators and digital artifact creators. The ontology also provides terms for tracking the provenance of digital entities that are published on the web and then accessed, transformed and consumed.
Note: PAV does not define any classes, and the PAV properties do not put any explicit restrictions on their domain/ranges. Therefore the classes above, like “another resource”, are only for illustration of typical use. The diagram above does not show data properties attached to resources, like pav:createdOn.
Here’s an example of using PAV:
This example shows how the blog post http://example.com/blog.html was createdBy Alice. The blog post was createdWith the software WordPress. The content of the blog was importedFrom http://example.com/data.csv (presumably a CSV file), and this was importedBy the script csv2html.
Although Alice is the creator (as she made the blog post), http://example.com/blog.html is authoredBy Bob: he made the original data and is therefore also the author of (the content of) the blog post. The post was curatedBy Charlie, who perhaps edited the CSV (or HTML) to include the correct column headers. We also notice that the blog was importedOn March 2013, while the content was authoredOn December 2012. We don’t know when Charlie curated it, although this could have been provided with curatedOn.
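The statements described above can be sketched as plain subject/property/value triples. This is only an illustration in Python, not the page's original example: the identifiers for Alice, Bob, Charlie, WordPress and csv2html are hypothetical placeholders, and the month-precision date strings are an assumption, since the text only gives the month and year.

```python
# Minimal sketch: the blog post example as (subject, property, value) triples.
PAV = "http://purl.org/pav/"
EX = "http://example.com/"  # the agent and software IRIs below are hypothetical

blog = EX + "blog.html"

triples = [
    (blog, PAV + "createdBy", EX + "alice"),
    (blog, PAV + "createdWith", EX + "wordpress"),
    (blog, PAV + "importedFrom", EX + "data.csv"),
    (blog, PAV + "importedBy", EX + "csv2html"),
    (blog, PAV + "importedOn", "2013-03"),   # assumed month-precision literal
    (blog, PAV + "authoredBy", EX + "bob"),
    (blog, PAV + "authoredOn", "2012-12"),   # assumed month-precision literal
    (blog, PAV + "curatedBy", EX + "charlie"),
    # No curatedOn statement: we do not know when Charlie curated the post.
]


def describe(subject, statements):
    """Group the PAV statements about one subject by property local name."""
    result = {}
    for s, p, o in statements:
        if s == subject and p.startswith(PAV):
            result.setdefault(p[len(PAV):], []).append(o)
    return result


summary = describe(blog, triples)
print(summary["createdBy"])    # who made the blog post
print(summary["authoredBy"])   # who authored its content
print("curatedOn" in summary)  # whether a curation date was recorded
```

Because every statement hangs directly off the blog post, a single lookup by subject is enough to see the creator, author, curator and import details together.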
Additional PAV properties allow specifying further attributions, such as contributors and the provider, as well as other kinds of sources, such as direct downloading, verification against source material, and derivation when further refinements have been made. Data can be given a version number, linked to a previous version to indicate its lineage, and annotated with when its source was last updated.
The PAV approach
The goal of PAV is to provide a lightweight, straightforward way to give the essential information about authorship, provenance and versioning, and therefore these properties are described directly on the published resource. As such, PAV does not define any classes or restrict domains or ranges, as all properties are applicable to any online resource.
This “flat” approach means that PAV is easy to use and query without a deep understanding of provenance models, at the small cost that more complex relationships are not expressed. For instance, pav:authoredBy allows multiple authors, but if there are several we won’t know who wrote what, or when they did so. Such details can be included alongside PAV using other PROV statements.
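To make the trade-off concrete, here is a small Python sketch of a resource with two authors. The resource, the two author IRIs and the timestamp are hypothetical. The query is a one-line filter, but nothing in the flat statements connects the date to either author.

```python
# Minimal sketch: flat PAV statements are easy to query but do not pair
# individual authors with individual authoring times.
PAV = "http://purl.org/pav/"

doc = "http://example.com/report"  # hypothetical resource
statements = [
    (doc, PAV + "authoredBy", "http://example.com/bob"),    # hypothetical author
    (doc, PAV + "authoredBy", "http://example.com/dana"),   # hypothetical author
    (doc, PAV + "authoredOn", "2012-12-01T00:00:00Z"),      # hypothetical date
]


def values(subject, prop, triples):
    """Return all values of one property on one subject, in listed order."""
    return [o for s, p, o in triples if s == subject and p == prop]


authors = values(doc, PAV + "authoredBy", statements)
dates = values(doc, PAV + "authoredOn", statements)

print(authors)  # both authors are found with a single lookup
print(dates)    # one date, with no indication of which author it belongs to
```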
Combining PAV with other PROV extensions
Here’s an example combining PAV with another PROV extension, the Provenance Vocabulary.
This example shows how http://example.com/data.csv was downloaded from http://example.org/originalData with HTTP 1.1 and requesting Accept: text/csv by content negotiation. The PAV term pav:retrievedFrom gives the short-cut to the original data, while prv:retrievedBy gives details of the transport and content-negotiation used in the download.
By combining vocabularies in this way, it is possible to query the PAV statements first to get a general overview of the provenance, and then explore other PROV statements for more specific details, whose structure might not be known in advance.
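The two-step pattern (PAV for the overview, other statements for the details) might look like the following Python sketch. Only pav:retrievedFrom and prv:retrievedBy come from the example above. The prv namespace IRI is assumed here, and the properties on the access node (protocol, acceptHeader) are hypothetical stand-ins, not actual Provenance Vocabulary terms.

```python
# Minimal sketch: read the PAV short-cut first, then walk any other
# statements without knowing their structure in advance.
PAV = "http://purl.org/pav/"
PRV = "http://purl.org/net/provenance/ns#"  # assumed namespace IRI
EX = "http://example.com/terms#"            # hypothetical detail properties

data = "http://example.com/data.csv"
access = "_:access1"  # blank node standing for the download event

triples = [
    (data, PAV + "retrievedFrom", "http://example.org/originalData"),
    (data, PRV + "retrievedBy", access),
    (access, EX + "protocol", "HTTP/1.1"),
    (access, EX + "acceptHeader", "text/csv"),
]


def overview(subject, statements):
    """Step 1: collect only the PAV statements about the subject."""
    return {p[len(PAV):]: o for s, p, o in statements
            if s == subject and p.startswith(PAV)}


def explore(node, statements, seen=None):
    """Step 2: follow non-PAV statements outward from a node, guarding cycles."""
    seen = set() if seen is None else seen
    if node in seen:
        return []
    seen.add(node)
    found = []
    for s, p, o in statements:
        if s == node and not p.startswith(PAV):
            found.append((s, p, o))
            found.extend(explore(o, statements, seen))
    return found


print(overview(data, triples))
for statement in explore(data, triples):
    print(statement)
```

The overview needs to know nothing beyond the PAV namespace, while the exploration step simply follows whatever links are present.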
If your ontology extends or uses PAV, you can use either:
for the latest version (which might at some point include additional properties), or
for the latest patch version of 2.1 (i.e. no new terms will be added later).
Extracting PROV-O statements
As PAV is meant as a lightweight ontology, the inferred PROV-O statements are not usually included explicitly. Any OWL or RDFS reasoner should be able to infer the PROV-O statements as long as the PAV ontology (http://purl.org/pav/) is imported.
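The kind of inference involved can be illustrated with the RDFS sub-property rule: if a statement uses property p, and p is a sub-property of q, then the same statement holds with q. The Python sketch below applies that rule by hand. The mapping table is an assumed, illustrative subset of how PAV terms specialise PROV-O terms; the authoritative axioms are in the ontology itself, and a real reasoner does considerably more than this.

```python
# Minimal sketch of RDFS sub-property inference over PAV statements.
PAV = "http://purl.org/pav/"
PROV = "http://www.w3.org/ns/prov#"

# Assumed, illustrative subset of rdfs:subPropertyOf axioms.
SUB_PROPERTY_OF = {
    PAV + "createdBy": [PROV + "wasAttributedTo"],
    PAV + "authoredBy": [PAV + "contributedBy"],
    PAV + "contributedBy": [PROV + "wasAttributedTo"],
    PAV + "importedFrom": [PROV + "wasDerivedFrom"],
}


def super_properties(prop):
    """Return every property reachable from prop via subPropertyOf."""
    found, stack = [], [prop]
    while stack:
        for parent in SUB_PROPERTY_OF.get(stack.pop(), []):
            if parent not in found:
                found.append(parent)
                stack.append(parent)
    return found


def infer(statements):
    """Return the new triples implied by the sub-property axioms."""
    inferred = set()
    for s, p, o in statements:
        for q in super_properties(p):
            inferred.add((s, q, o))
    return inferred - set(statements)


blog = "http://example.com/blog.html"
bob = "http://example.com/bob"  # hypothetical agent IRI
explicit = [(blog, PAV + "authoredBy", bob)]

for triple in sorted(infer(explicit)):
    print(triple)
```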
As an example of PAV interoperability with PROV, we built a Taverna workflow which uses the OWL reasoner Pellet to infer PROV statements, and then visualizes them as SVG using the PROV toolbox. Here’s the diagram (as PNG) visualizing the PAV example from the beginning of this page: