12.1 Scope of Work

The Data Filtering Module of the DMS acts on the data to be sent to the client so that the data conform to a set of rules defined by the client at session negotiation; these rules can potentially be redefined at any time during the session. The different Information Services already offer filtering options. As long as queries comply with the filter encoding rules [OGC-FES], the client can already specify a set of rules to be applied at the source of the data, so that the delivered information is more specific to the client’s needs. However, this filtering has limits. Filtering with FES is value based only: it does not allow the structure of the document to be modified, whole sections that are of no interest to the consumer to be removed, and so on. The approach here is to define new means of filtering that allow the client to tailor the information it receives. During session negotiation, the client shall have the opportunity to define a filtering policy at the DMS. This filtering policy is then applied to all data that transit through the DMS to the client.

12.1.1 Extraction

The purpose of the Extraction module is to allow the client to specify the subset of the information it actually intends to receive. The client defines this subset at the DMS during session negotiation, if the feature is available. Several options can be considered for specifying the filtering policy. One of them is to use a “schemask”. A schemask is a subset of the original document schema from which some elements have been deliberately omitted because the client does not need them in the data it finally receives. The schemask gives the client great flexibility. However, since no restrictions on its use are foreseen, it allows the creation of datasets that do not comply with the original schemas they were built from, which could become problematic for clients that perform XML validation.

The purpose is to allow the client to create a mask specifying a schema that includes only what the client needs. This has the following advantages:

- The client knows upon reception that the dataset contains only what it needs, and therefore does not have to parse useless parts of the document to find the relevant information.
- Reducing the dataset to the minimum amount of data needed by the client can reduce the amount of information transmitted. Depending on how sparse the schemask proposed by the client is, this reduction can be considerable, which can be a very important asset in a low-bandwidth environment, e.g. SatCom media.
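The idea of a schemask as "the original schema minus the elements the client does not need" can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the tiny schema, its element names (`Report`, `title`, `body`, `annotation`) and the helper `make_schemask` are hypothetical and are not part of the DMS or of any Information Service.

```python
import xml.etree.ElementTree as ET

# Namespace of XML Schema, in ElementTree's "{uri}" notation.
XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical original schema, invented for this illustration.
SCHEMA = """<schema xmlns="http://www.w3.org/2001/XMLSchema">
  <element name="Report">
    <complexType>
      <sequence>
        <element name="title"/>
        <element name="body"/>
        <element name="annotation" minOccurs="0"/>
      </sequence>
    </complexType>
  </element>
</schema>"""


def declared_name(decl):
    """Local name of an element declaration, taken from `name` or `ref`."""
    return (decl.get("name") or decl.get("ref", "")).split(":")[-1]


def make_schemask(schema_text, omitted):
    """Return the schema tree with the named element declarations removed."""
    root = ET.fromstring(schema_text)
    # Snapshot the tree first so that removals do not disturb the iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == XS + "element" and declared_name(child) in omitted:
                parent.remove(child)
    return root


schemask = make_schemask(SCHEMA, {"annotation"})
remaining = [declared_name(e) for e in schemask.iter(XS + "element")]
print(remaining)  # ['Report', 'title', 'body']
```

Here the omitted element was optional (`minOccurs="0"`), so data filtered with this schemask would still be valid against the original schema. Omitting a mandatory element is what would produce the non-compliant datasets mentioned above.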

12.1.2 Recommended Approach

At session negotiation, when the client and the DMS together define the overall data management policy, the client shall define its extraction policy by providing schemasks. There can be different schemasks for different types of data, types of features, etc. The schemasks shall then be used by the DMS whenever it receives data intended for the client. The DMS shall parse the data received and remove any element, together with its descendants, that is not present in the schemask.

As an example, the following XML represents a simplified version of the schema defining an AIXM:AirspaceTimeSlice:

    <element name="AirspaceTimeSlice" type="aixm:AirspaceTimeSliceType">
      <complexType>
        <complexContent>
          <sequence>
            <element ref="gml:validTime"/>
            <element ref="aixm:interpretation"/>
            <element ref="aixm:sequenceNumber" minOccurs="0"/>
            <element ref="aixm:correctionNumber" minOccurs="0"/>
            <element ref="aixm:featureLifetime" minOccurs="0"/>
            <element name="designator" type="aixm:CodeAirspaceDesignatorType" nillable="true" minOccurs="0"/>
            <element name="name" type="aixm:TextNameType" nillable="true" minOccurs="0"/>
            <element name="geometryComponent" type="aixm:AirspaceGeometryComponentPropertyType" nillable="true" minOccurs="0" maxOccurs="unbounded"/>
            <element name="activation" type="aixm:AirspaceActivationPropertyType" nillable="true" minOccurs="0" maxOccurs="unbounded"/>
            <element name="annotation" type="aixm:NotePropertyType" nillable="true" minOccurs="0" maxOccurs="unbounded"/>
            <element name="extension" minOccurs="0" maxOccurs="unbounded">
              <complexType>
                <sequence>
                  <element ref="aixm:AbstractAirspaceExtension"/>
                </sequence>
              </complexType>
            </element>
          </sequence>
        </complexContent>
      </complexType>
    </element>

In the following listing, which represents a potential schemask, the client has decided that the annotation and extension elements are of no use for the intended usage and has therefore removed them. The schemask could look like this:

    <element name="AirspaceTimeSlice" type="aixm:AirspaceTimeSliceType">
      <complexType>
        <complexContent>
          <sequence>
            <element ref="gml:validTime"/>
            <element ref="aixm:interpretation"/>
            <element ref="aixm:sequenceNumber" minOccurs="0"/>
            <element ref="aixm:correctionNumber" minOccurs="0"/>
            <element ref="aixm:featureLifetime" minOccurs="0"/>
            <element name="designator" type="aixm:CodeAirspaceDesignatorType" nillable="true" minOccurs="0"/>
            <element name="name" type="aixm:TextNameType" nillable="true" minOccurs="0"/>
            <element name="geometryComponent" type="aixm:AirspaceGeometryComponentPropertyType" nillable="true" minOccurs="0" maxOccurs="unbounded"/>
            <element name="activation" type="aixm:AirspaceActivationPropertyType" nillable="true" minOccurs="0" maxOccurs="unbounded"/>
          </sequence>
        </complexContent>
      </complexType>
    </element>

Then, when the DMS receives a dataset that includes an airspace time slice, it shall automatically remove all the elements, and their descendants, that are not present in the schemask. The level of detail of the schemask shall be the level of filtering required by the client, and no more. If the client wants to remove elements that are, for instance, grandchildren of the root element, the schemask shall not go any further than three levels down the tree. Every element that is present in the schemask implies the presence of all its descendants. Any element that is absent implies the absence of all its descendants.

The combined use of XPath and XSLT at the DMS should enable the filtering possibilities proposed above. The DMS shall derive an XSL stylesheet from the schemask; this stylesheet shall then be used for all data received at the DMS and intended for the client. The advantage of XSLT is its flexibility: the stylesheet can easily be modified, e.g. at the client’s request, to add/retrieve/modify filtering rules at the DMS.
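The removal rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the DMS implementation: it prunes the tree directly rather than through a generated XSL stylesheet, the schemask is represented as a hypothetical nested dictionary (`SCHEMASK`) rather than as an XML schema, and the sample data, including its values, is invented and namespace-free for brevity.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an ElementTree '{namespace}' prefix, if any."""
    return tag.rsplit("}", 1)[-1]


def prune(element, mask):
    """Remove, in place, every child of `element` that `mask` does not name.

    `mask` maps each allowed child name to the mask of that child. An empty
    mask means the schemask gives no further detail, so the whole subtree
    is kept (presence implies the presence of all descendants).
    """
    if not mask:
        return
    for child in list(element):  # copy: children are removed while looping
        name = local_name(child.tag)
        if name in mask:
            prune(child, mask[name])
        else:
            element.remove(child)  # removes the child and all descendants


# Hypothetical schemask for the children of AirspaceTimeSlice:
# annotation and extension are deliberately absent.
SCHEMASK = {
    "validTime": {}, "interpretation": {}, "sequenceNumber": {},
    "correctionNumber": {}, "featureLifetime": {}, "designator": {},
    "name": {}, "geometryComponent": {}, "activation": {},
}

# Invented sample data, not taken from any real AIXM dataset.
DATA = """<AirspaceTimeSlice>
  <interpretation>BASELINE</interpretation>
  <designator>XX01</designator>
  <name>EXAMPLE AREA</name>
  <annotation><Note><text>free text</text></Note></annotation>
  <extension><SomeExtension/></extension>
</AirspaceTimeSlice>"""

root = ET.fromstring(DATA)
prune(root, SCHEMASK)
kept = [local_name(child.tag) for child in root]
print(kept)  # ['interpretation', 'designator', 'name']
```

Because an empty mask keeps the whole subtree, the schemask only needs to be as deep as the deepest element the client wants to remove, which matches the "level of filtering required, and no more" rule.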

12.1.3 Densification