Units Categories and Measurements

Copyright © 2012 Open Geospatial Consortium.

Additional attributes are attractive, since they enrich the semantic data available without imposing a great deal of processing overhead. The interpolation type metadata corresponds to a timeseries implementation of the CF conventions' cell methods. If a timeseries has a single, consistent interpolation type, then the cell methods variables can be used to accurately characterise the time bounds of the interpolation type (see Section 8.7).

8.3 Units

Units of measure in WaterML are specified using the Unified Code for Units of Measure (UCUM; Schadow & McDonald, 2009). NetCDF conventions use the UDUNITS-2 unit database (Unidata Program Center, 2011). Both systems provide a mechanism for specifying units unambiguously. In most cases, the differences between UCUM and UDUNITS-2 are syntactic and the units can be readily translated. One exception is that NetCDF attributes referring to longitude and latitude use units of degrees_east and degrees_north, attaching an additional level of interpretation to the unit. In practice, this extra requirement is unlikely to represent a difficulty, as it should be obvious when a location is being used. The WaterML standard allows units to change within a timeseries through per-point metadata, whereas a NetCDF variable has a single unit for the entire variable. NetCDF-encoded WaterML, therefore, needs to have a single default unit for all measurements.
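One way to satisfy the single-unit requirement is to convert every point to the default unit before encoding. The following is a minimal sketch under stated assumptions: the conversion table, the function name normalise, and the sample points are hypothetical and are not taken from WaterML, UCUM or UDUNITS-2 tooling.

```python
# Hypothetical conversion factors from a few length unit codes to metres.
# A real encoder would derive these from a units library, not a literal table.
FACTORS_TO_METRE = {"m": 1.0, "cm": 0.01, "mm": 0.001}


def normalise(points, default_unit):
    """Convert (value, unit) pairs to plain values in default_unit.

    Raises KeyError if a unit is missing from the conversion table, so an
    unknown unit is never silently written with the wrong scale.
    """
    scale = FACTORS_TO_METRE[default_unit]
    return [value * FACTORS_TO_METRE[unit] / scale for value, unit in points]


# A timeseries whose unit changes from point to point (per-point metadata).
points = [(1.25, "m"), (130.0, "cm"), (1275.0, "mm")]

# The values that would be stored in a variable whose single unit is "m".
values = normalise(points, "m")
```

The per-point unit is lost after conversion, so an encoder that needs to preserve the originally reported unit would have to record it separately.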

8.4 Categories and Measurements

Simple measurements in WaterML are UML double precision numbers, represented by XML schema doubles in the XML encoding. Categories represent qualitative data that, in WaterML, are described by terms drawn from a controlled vocabulary. Category data can be encoded either as an array of characters (or a string in the enhanced model) or by creating an enumeration and providing a dictionary (see Section 8.1.4). The simplest approach is to use arrays of characters and provide the category name for each data point. However, classic model character arrays are of a fixed size and tend to be wasteful. Instead, existing conventions such as the Argo float conventions tend to encode category data as enumerations, with the meaning of each enumeration value given in a reference document. This approach is considerably more space-efficient and emphasises the controlled nature of the terminology. Since the category vocabulary is known ahead of time, the enumeration/dictionary approach is preferred.
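The space argument can be illustrated with a short sketch. The vocabulary, the sample observations and the function names below are hypothetical; the sketch only contrasts a fixed-width character array with one-byte enumeration codes plus a dictionary, and does not reproduce the encoding defined in Section 8.1.4.

```python
# Hypothetical controlled vocabulary, known before any data are encoded.
VOCABULARY = ["dry", "low", "normal", "flood"]


def build_dictionary(vocabulary):
    """Map each vocabulary term to a small integer code (its list index)."""
    return {term: code for code, term in enumerate(vocabulary)}


def encode(categories, dictionary):
    """Encode category terms as one byte per data point."""
    return bytes(dictionary[term] for term in categories)


def decode(codes, vocabulary):
    """Recover the category terms from their integer codes."""
    return [vocabulary[code] for code in codes]


observations = ["low", "normal", "normal", "flood", "dry"]
dictionary = build_dictionary(VOCABULARY)
codes = encode(observations, dictionary)

# A fixed-width character array must reserve the longest term for every point.
char_width = max(len(term) for term in VOCABULARY)
char_array_bytes = char_width * len(observations)
enum_bytes = len(codes)
```

A term outside the vocabulary raises a KeyError during encoding, which reflects the controlled nature of the terminology: only dictionary entries can be written.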

8.5 Locations