Metadata Provenance Retrieval Workflows

Copyright © 2013 Open Geospatial Consortium.

In this design, the WPS is the central component that manages airport map generation and contains all the business logic. An advantage is that the WFS and FPS components remain general and do not require any modification specific to the ePIB setup.

6.1.3 Metadata Provenance Retrieval

Dataset metadata can be defined as data about a dataset. One class of metadata of particular interest in OWS-9 Aviation is data provenance, or lineage, which records the salient characteristics of the evolution of shared datasets, typically due to quality additions or enhancements by different users. Such metadata is meant to tell a potential data user about the quality of the data and whether it is fit for purpose, and to provide guidance on its use and re-use. The value of metadata grows rapidly with the number of different responsible parties involved in sharing and updating information resources in distributed data environments.

In the metadata creation, discovery, access, and management sub-tasks of the OWS-9 Aviation thread, we implemented an aviation/weather metadata profile of ISO 19139 and tested its handling and processing by a community of users of OWS-9 software services. The common metadata (e.g., license terms) about aviation datasets (AIXM/WXXM) and services published by authorized distributors was used to discover and access shared resources (e.g., data, services, styling information) and to answer the following questions:

- Who was the creator/publisher?
- Who has permission to use the resources, and what are the terms?
- When was the data last modified? How was the data transformed?
- Where can it be accessed? How can it be processed, and by which services?
- Which applications, software configurations, or tool settings were used to process the data?

Six distinct categories of metadata are summarized as follows:

1. Lineage/provenance - identifies the original sources from which the data was derived and details the processing steps through which the data has gone to reach its current form. The traceability of the data can help determine its accuracy and its relevance to the tasks of different users.
2. Quality - determines fitness for use, conformance to design specifications, and fulfillment of the requirements of a coordinated effort.
3. Accuracy - the degree of correspondence between the data and the real world, which depends on the quality and precision of the instruments used to capture the data.
4. Currency - measures the degree of temporal relevance of the data to the present real world.
5. Precision - represents the exactness of measurement or description and is determined by the input. Data can always be output at lower, but not higher, resolution.
6. Scale - the ratio of distance on a map to the equivalent distance on the Earth's surface.

Assuming that the proposed Data_Quality extension to the CIM model is adopted, publishing provenance data involves several steps. More detailed information on the proposed Data_Quality extension can be found in the discussion paper OGC 07-038, OGC Cataloging of ISO Metadata (CIM) Using the ebRIM Profile of CS-W. Different components of the model must be published by different actors, as shown in the sequence diagram in Figure 3. In OWS-9, the Data Management Service (DMS) was a primary user of the metadata provenance workflow. The activities involved in this workflow are summarized below the figure.

[Sequence diagram. Participants: WFS, DMS, Registry, Stakeholder. Messages: 1.0 Register via Harvest Method; 2.0 Publish Service Info; 3.0 FindWFS; 4.0 FindDataSet; 5.0 Publish Source Declaration; 6.0 Source Identifier; 7.0 Publish Provenance; 8.0 Query Provenance; 9.0 Format Results; 10.0 Provenance Data. Phases: WFS Registration Process; DMS Service Registration; DMS Processing Declaration; DMS Processing Event; Stakeholder Review/Browse Provenance.]

Figure 3 Metadata Provenance Workflow

1. WFS Registration Process - DataMetadata was published along with DataQuality and Lineage objects using CSW-ebRIM with the CIM model.
2. DMS Service Registration - This is a one-time step to publish service identification information (e.g., the Organization object).
3. DMS Processing Declaration - For each DMS processing option, the DMS declared its intent to capture provenance for a given WFS/Dataset pair by publishing a Source object describing the DMS option (e.g., the filtering approach) and associating it with the Lineage object for the selected WFS.
4. DMS Processing Event - For each DMS request processed, provenance details were published to a ProcessStep instance related to the appropriate DMS Source entry, specified by identifier.
5. Stakeholder Review/Browse Provenance - Stakeholders could query provenance data in a variety of ways using OGC Filter queries. The Registry formatted the results if the response content needed to be something other than XML.
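
The five workflow activities above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions, not the OWS-9 implementation: the classes Lineage, Source and ProcessStep loosely follow the objects named in the text, while ProvenanceRegistry, its method names, and every identifier and field in the example are hypothetical stand-ins for the CSW-ebRIM registry and its requests. A plain Python predicate stands in for an OGC Filter query.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable, Dict, List, Optional, Set


@dataclass
class ProcessStep:
    """Provenance details for one processed DMS request (activity 4)."""
    description: str
    timestamp: datetime


@dataclass
class Source:
    """A declared DMS processing option, e.g. a filtering approach (activity 3)."""
    identifier: str
    description: str
    steps: List[ProcessStep] = field(default_factory=list)


@dataclass
class Lineage:
    """Lineage object published together with a WFS dataset (activity 1)."""
    dataset_id: str
    sources: Dict[str, Source] = field(default_factory=dict)


class ProvenanceRegistry:
    """Hypothetical in-memory stand-in for the registry."""

    def __init__(self) -> None:
        self._lineage: Dict[str, Lineage] = {}
        self._organizations: Set[str] = set()

    def register_dataset(self, dataset_id: str) -> None:
        # Activity 1: dataset metadata is published with its Lineage object.
        self._lineage[dataset_id] = Lineage(dataset_id)

    def register_service(self, organization: str) -> None:
        # Activity 2: one-time publication of service identification.
        self._organizations.add(organization)

    def declare_source(self, dataset_id: str, source_id: str,
                       description: str) -> Source:
        # Activity 3: attach a Source to the Lineage of the chosen dataset.
        source = Source(source_id, description)
        self._lineage[dataset_id].sources[source_id] = source
        return source

    def record_event(self, dataset_id: str, source_id: str, description: str,
                     timestamp: Optional[datetime] = None) -> ProcessStep:
        # Activity 4: each processed request adds a ProcessStep to its Source.
        source = self._lineage[dataset_id].sources[source_id]
        step = ProcessStep(description, timestamp or datetime.now(timezone.utc))
        source.steps.append(step)
        return step

    def query_steps(
        self, dataset_id: str,
        predicate: Callable[[ProcessStep], bool] = lambda step: True,
    ) -> List[ProcessStep]:
        # Activity 5: stakeholders browse provenance; the predicate plays
        # the role that an OGC Filter expression plays in a real query.
        lineage = self._lineage[dataset_id]
        return [step
                for source in lineage.sources.values()
                for step in source.steps
                if predicate(step)]


# Example run with made-up identifiers.
registry = ProvenanceRegistry()
registry.register_dataset("wfs-a/airports")
registry.register_service("Example DMS Organization")
registry.declare_source("wfs-a/airports", "dms-filter", "attribute filtering")
registry.record_event("wfs-a/airports", "dms-filter", "filtered request 1")
for step in registry.query_steps("wfs-a/airports"):
    print(step.timestamp.isoformat(), step.description)
```

Keeping each ProcessStep under its Source, and each Source under the dataset's Lineage, mirrors the association chain described in activities 3 and 4, so a query for one dataset returns every recorded processing event regardless of which DMS option produced it.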

6.1.4 Data Management Service