Comparison of models OGC Soil Data Interoperability Experiment

16 Copyright © 2016 Open Geospatial Consortium. The ANZSoilML, INSPIRE Soil and eSoTerML models are implementation ready see Annex B for details, with GML schema for the logical models available at no cost. Production ANZSoilML services have been deployed in both Australia and New Zealand. INSPIRE Soil and ISO SoilML have been implemented for the European Union GS Soil project http:www.eurogeosurveys.orgprojectsgssoil. However, the ISO SoilML cannot be implemented as is. There is no normative schema or register of soil properties and no mechanism for providing the property values. Properties are expected to be assigned to classes in other application and XML schema that extend ISO 28258:2013 – for example, the GS Soil project created a ‘provider specific’ extension of the schema for portal services [1]. Table 1 — Comparison of existing soil information models Green cells F denote aspects of a soil description covered by classes and formally defined properties; yellow cells S denote aspects of a soil description covered by classes and soft-typed properties; white cells X denote aspects of a soil description that are not covered, or handled by a soft-typing mechanism. Model Soil Information ANZSoilML eSoTer INSPIRE Soil ISO SoilML IUSSISO SoilML OGC Soil Data IE Site Registration F S F F F F Identity F S F F F F Location F S F F F F Observation datetime F S F x x x Author S S x x x x Soil Formation F x x x x x ClimateWeather F x x x x x Landform F S x x x x Land use F S x x x x Vegetation F S x x x x Human F x x x x x Parent Material F S F x x x Soil Description F S x x x S Surface character F S x x x S Horizon boundary F S F F F F Primary Constituents F S S x x S Soil colour matrix F S S x x S Mottling F S S x x S Redox S S S x x S Carbonates F S S x x S Gypsum S S S x x S Soluble Salts S S S x x F Field pH F S S x x F Copyright © 2016 Open Geospatial Consortium. 17 Model Soil Information ANZSoilML eSoTer INSPIRE Soil ISO SoilML IUSSISO SoilML OGC Soil Data IE Odour S S S x x S Organic Matter S S S x x S Soil Texture F S S x x F Soil Structure F S S x x S Consistence F S S x x S Soil-water status F S F x x F Bulk density F S S x x F Porosity F S S x x S Concentrations F S S x x S Roots F S F x x S Contamination S S F x x S Bedrock F S S x x S Soil Sampling F S F F F F Location F S F F F F Method F S S F F S Archive F x x F F F Observations F S S F F F Method F S S F F S Result F S F F F F Soil Classification F F F F F F Soil F F F F F F Horizon F F F x x F Vocabularies F SKOS x F Codelist x x F SKOS Soil Mapping F F F F F F Table 2 — Comparison of model accessibility and implementation Model Accessibility Implementation ANZSoilML Public; free of charge Production services eSoTer Public; free of charge Unknown INSPIRE Soil Public; free of charge GS Soil Project ISO SoilML Private; charge for access GS Soil Project IUSSISO SoilML No model No 18 Copyright © 2016 Open Geospatial Consortium. 9 Soil Interoperability Experiment Information Models

9.1 Introduction

As there was no suitable starting model selected from existing candidates, a compromise model was developed using classes and patterns from the existing models wherever possible. The initial focus for modeling was on supporting the IE use cases but the IE Participants were mindful of allowing extension to support soil mapping use cases. With only a 6-month time frame available for the Soil IE, and with the need to be able to map multiple data providers’ data to the model, implement services and develop demonstration clients, minimal soil data modeling was carried out. To constrain the model’s scope, the selection of properties to be assigned to soil classes was based on the specifications of the GlobalSoilMap consortium [4], with the intention to allow the model to expand in a modular fashion. Three models were produced by the IE. 1. A UML conceptual model defining the classes and properties needed to comprehensively describe a soil resource according to the needs of the IE use cases. This model is technology independent. 2. A UML logical model converting the conceptual model into an ISO19109:2005 compliant application schema and formally importing models from other OGC specifications: ISO 19156:2011 – Observations and Measurements OM and OGC 15-043r3 – the Timeseries Profile of Observations and Measurements. 3. An XML physical model defining the encoding of data using Geography Mark-up Language according to the ISO19136:2007.

9.1.1 Modeling Principles

To constrain modeling activity and help ensure that the model defined was not disruptive conforming closely to existing OGC standards or parochial recognizing that soil scientists must integrate data from many environmental domains, the group defined a set of guiding principles, including: — Open publication of the results UML, XSD once approved through the OGC process; — Open development of the model, subject to appropriate agreements with the OGC; — Re-use of existing models wherever possible for example Observations and Measurements, Timeseries Profile of Observations and Measurements, GeoSciML; and Copyright © 2016 Open Geospatial Consortium. 19 — If an existing model does not meet project needs, then work with the relevant community to address issues before resorting to an extension or branch of the model. The motivation for the principles was twofold: — To ensure consistency of data types across domains - soil is a function of geology, climate, topography, and biology, and pedologists must aggregate these data during analysis; and — To ease process of deployment – a client developed for geology or climate observations encoded as OM can be used for soil observations without modification other than dealing with domain specific values. There was a simplification of models from the conceptual to logical to physical viewpoints when classes explicitly defined at a higher level could be realized without loss of information by a single class. This is an equivalent practice to OM’s virtual typing strategy.

9.1.2 Treatment of Mandatory Properties

In most cases cardinality constraints on properties and associations were removed in the logical and physical model. This allowed the definition of a strict conceptual model that was very clear about the properties and associations needed to comprehensively and meaningfully describe a resource, but permitted a terse digital encoding of data when necessary. Conceptually, a distinction is made between properties that are mandatory without them the data has no meaning and mandatory but voidable they should be provided but may not be for some reason. In the translation to the logical and physical models, mandatory properties must remain so, but voidable properties become optional. It is up to the community deploying the services to define implementation profiles that define which properties must be present in that context. Providers can then deploy services that can lie on an appropriate continuum of use case support from comprehensive encodings where all data are provided and reasons are provided if data are missing to terse encodings where only the bare minimum of data required for a specific application are transmitted. Applications may also switch between the two, requesting a terse encoding to display property values, but using a comprehensive data service where appropriate, such as to explore meta-data about how, when and to what quality a value was generated, for dataset exchange or when sending data on to web processing services. Profiles identifying which aspects of the logical model services are expected to provide will require further community agreement.