TOWARDS AUTOMATIC VALIDATION AND HEALING OF CITYGML MODELS FOR GEOMETRIC AND SEMANTIC CONSISTENCY
N. Alam
a
, D. Wagner
a
, M. Wewetzer
b
, J. von Falkenhausen
b
, V. Coors
a
, M. Pries
b
a
HFT Stuttgart – University of Applied Sciences, Faculty C, Schellingstraße 24, 70174 Stuttgart, Germany
nazmul.alam, detlev.wagner, volker.coorshft-stuttgart.de
b
Beuth Hochschule für Technik Berlin – University of Applied Sciences, Department II, Luxemburger Straße 10,
13353 Berlin, Germany mark.wewetzer| julius.von.falkenhausen | margitta.priesbht-berlin.de
KEY WORDS: Healing, Validation, 3D City models, CityGML, Geosemantics ABSTRACT:
A steadily growing number of application fields for large 3D city models have emerged in recent years. Like in many other domains, data quality is recognized as a key factor for successful business. Quality management is mandatory in the production chain
nowadays. Automated domain-specific tools are widely used for validation of business-critical data but still common standards defining correct geometric modeling are not precise enough to define a sound base for data validation of 3D city models. Although
the workflow for 3D city models is well-established from data acquisition to processing, analysis and visualization, quality management is not yet a standard during this workflow. Processing data sets with unclear specification leads to erroneous results and
application defects. We show that this problem persists even if data are standard compliant. Validation results of real-world city models are presented to demonstrate the potential of the approach. A tool to repair the errors detected during the validation process
is under development; first results are presented and discussed. The goal is to heal defects of the models automatically and export a corrected CityGML model.
1. INTRODUCTION
Application and analysis of geo data is moving from traditional GIS applications with 2D map data towards deployment of real
3D data. Virtual 3D city models become more and more available for urban areas. More sophisticated tools for data
analysis and information extraction are under development. Quality assessment becomes mandatory because reliable and
reproducible processing results can only be obtained with correct original data.
Different views on the term “correctness” exist, existing standards such as ISO 19107 or CityGML
spectification provide a good starting point. This is not sufficient for an unambiguous definition of modeling
guidelines. Consequently, a discussion of the definition of guidelines for modelers and users and methods to check the data
set for compliance with these specifications are necessary. A general overview of the concept of data quality in the
geographic domain is included in Kresse Fadaie 2004, which offers a comprehensive summary of the relevant
standards, notably of the ISO 19100 series. The paper of Akca et al. 2010 has a focus on geometric accuracy with respect to
the generation process of a model from Lidar data. Discussing the problems of polygonal models, Krämer et al. 2007 define
quality measurements for 3D city models. Some simple algorithms for quality assessment and healing of geometries are
presented. Campen et al. 2012 provide an extensive collection of typical
defects of polygonal 3D models and existing techniques for processing and repair with respect to different fields of
application. A detailed analysis of completeness and separation issues in city models is presented by Zhao et al. 2012. They
consider typical properties of semi-automatic generated models and their insufficiencies and develop a generalization method.
However, other geometric errors are not investigated. Limited research was done regarding healing of 3D city models
so far. Approaches to repair triangle meshes such as Liepa 2003 and Attene Falcidieno 2006 exist but can only be tied
loosely to our approach. We map CityGML features to an internal data structure which is designed to maintain links to the
semantic properties of the original model. Using volumetric techniques, as suggested by Nooruddin Turk 2003 requires
conversion to a voxel representation which creates difficulties in maintaining model-inherent semantics. An alternative
approach is presented recently by Ledoux 2013. A top-down approach is described as favorable because it enables repairing a
model in one single step. The implementation shows that a hierarchical processing of the model is necessary before the
actual volume-based approach for healing solid defects can be performed. We present an overview of our research results
leading to the definition of certain quality criteria for CityGML models and the development of an automated validation tool. A
quality report is the result of this processing step. It includes detailed descriptions of all detected errors. This information is
used as input of a healing process which tries to repair as many errors automatically as possible. The healing procedures are
described in detail and experiences with the tool are discussed.
2. AUTOMATIC VALIDATION
2.1 Validation Rules
A validated a data set is expected to be clean, correct and useful for a given application. This implies that different sets of
validation rules exist, depending on the intended application. We separate the validation process in two general steps: first a
schema validation for CityGML data to assure schema conformant input to the second step, geometric and semantic
validation of the data set. Only the second step is discussed here ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences,
Volume II-2W1, ISPRS 8th 3DGeoInfo Conference WG II2 Workshop, 27 – 29 November 2013, Istanbul, Turkey
This contribution has been peer-reviewed. The double-blind peer-review was conducted on the basis of the full paper. 1
because XML schema validation is a standard procedure for which sophisticated tools are available.
For the basics of geodata validation we refer to the explanations in Wagner et al. 2013.
The Special Interest Group 3D has developed guidelines for modeling of 3D city models. The goal is to clearly specify valid
alternatives and recommend one of them for general usage. This should lead to city models with known specifications in contrast
to the situation today where only the modeler knows how certain features are reproduced. These recommendations are the
base for geometric validation rules which have been developed and implemented as part of the research project
CityDoctor
at
University of Applied Sciences Stuttgart
, Germany. The geometric model as described by Gröger Plümer 2011 is
used. In addition, some geometric-semantic rules resulting from
CityGML requirements are included as plausibility checks for consistency of the data set.
A short listing of the checks is given in the following, more detailed explanations are given in Wagner et al. 2013. The
algorithms and the underlying data structure are suitable for CityGML LODs 1 and 2.
Polygon checks
1. A linear ring must consist of a minimum of 4 points 2. First and last point of a linear ring are identical.
3. All points of a linear ring
R
are different, with exception of first and last point.
4. Two edges can intersect only in one start-end point. Other points of intersection or touching are not
allowed to account for rounding errors or polygons which are not perfectly planar, a small tolerance is
allowed.
5. All points of the polygon must be located in a plane a small tolerance is allowed.
NB: since we consider only outer rings, polygons with holes are not processed currently they occur only rarely in LOD
1 or 2. Solid checks
6. The minimum number n of polygons to define a solid is four. They must be situated in different planes.
7. A valid intersection of two polygons of a solid either contains a common edge, a common point of a linear
ring, or is empty. Common edges and points must be elements of both polygons.
8. Each edge of a linear ring defining a polygon is used by exactly one neighboring polygon.
9. Consistent orientation of polygons of a solid such that common edges according to check 8 are used in
opposite direction. 10. The normal vectors of the polygons must point
towards the outside of the solid. 11. All parts of a solid must be connected.
12. The graph
G
S
= V
P
,E
P
of polygons and edges which are meeting in point
p
i
is connected for all
p
. Each vertex
v
∈
V
P
represents exactly one polygon which contains
p
. Two vertices are connected with an edge
e
∈
E
P
if the polygons represented by these vertices have a common edge that is bounded by
p
. Semantic checks
13. Orientation of
RoofSurface
,
WallSurface
and
GroundSurface
elements Figure 1. Umbrella check
14.
measuredHeight
in same range as height of building geometry
15.
numberOfStoreysAboveGround
plausible for height of the building geometry
16.
numberOfStoreysBelowGround
plausible for height of underground geometry of the building
17. Relationship of
Building
and
BuilidingPart
Figure 2. Alternative modeling scenarios for BuildingParts
3. HEALING
For each error detected during the validation process a specific error object contains all necessary parameters for healing. Our
approach assumes that all errors should be healed hierarchically, according to the dependency of the respective
checks. An iterative approach assures that after an error is healed, the geometry is checked repeatedly for new errors which
might have been introduced during the last healing step. This enables to manipulate to original model in a controlled and
reproducible way.
Figure 3. Complete Healing workflow ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences,
Volume II-2W1, ISPRS 8th 3DGeoInfo Conference WG II2 Workshop, 27 – 29 November 2013, Istanbul, Turkey
This contribution has been peer-reviewed. The double-blind peer-review was conducted on the basis of the full paper. 2
For the cases where problems can’t be solved by the healing algorithms after a user-defined maximum number of iterations,
an error object is returned. Healing is done in two phases. Firstly all the polygons are healed and then if polygons pass the
validation process solid-errors are healed. In figure 3 the healing process is illustrated.
3.1
Geometry Healing POLYGON HEALING
In this phase one error is healed at a time. That means each iteration heals an error and checks the result for validation.
CP_CLOSE: The first and the last vertex of a polygon must be same, therefore healing would be just to copy the first vertex at
the end of the pointlist. If a Linear Ring contains four vertices in a sequence {P
1
, P
2
, P
3
, P
4
} where the last point and the first point are not same. The healed pointlist would look like {P
1
, P
2
, P
3
, P
4
, P
1
}.
CP_NUMPOINTS: A Linear Ring must contain a minimum number of four vertices in the sequence where the first and the
last point are same. A closed Linear Ring with less than 4 vertices is either a line or a point but not a valid polygon, in this
case healing is to delete these polygons. But if a polygon contains 3 different vertices and the first and the last vertex are
not same then it will be healed by the previous healing process and then the number of vertices of the healed polygon will be 4.
CP_DUPPOINT: In a Linear ring only the first vertex is allowed to repeat at the end of the point sequence. No other
vertex is allowed to repeat within the sequence at any position. In Figure 4 first Linear Ring X contains the point sequence of
{P
1
, P
2
, P
3
, P
4
, P
5
, P
3
, P
1
} where P
3
is repeating twice. In second Linear Ring Y point sequence is {P
1
, P
2
, P
3
, P
4
, P
4
, P
1
} where P
4
is repeating twice. In third Linear Ring Z point sequence is {P
1
, P
2
, P
3
, P
4
, P
2
, P
1
} where P
2
is repeating twice. In all the Linear Rings one or more vertex is repeating more than once
excluding the last point so there are Duplicate Point Error in the Linear Rings. These are healed in two different ways. For
Linear Ring Y vertex P
4
comes twice back to back so only one of the instance are kept and the other one is deleted from the
pointlist. But for X and Y deleting an instance of vertex will result change in the shape of polygon. So here loops are
searched within the pointlist. For X the loops will be {P
3
, P
4
, P
5
, P
3
} and {P
3
, P
1
, P
2
, P
3
} and for Z it will be {P
2
, P
3
, P
4
, P
2
} and {P
2
, P
1
, P
2
}. So the polygons will be split into multiple polygons according to the newly found loops.
Figure 4. Healing CP_DUPPOINT error
CP_SELFINT: Two edges are allowed to intersect only at start and end point of the edge and any other intersection will
considered as an error. In Figure 5 first polygon contains point sequence {P
1
, P
2
, P
3
, P
4
, P
1
} where edge P
2
, P
3
and edge P
4
, P
1
intersects at a point which doesn’t belong to the point sequence and in second polygon point sequence is {P
1
, P
2
, P
3
, P
4
, P
5
, P
1
} where edge P
2
, P
3
, edge P
4
, P
5
and edge P
5
, P
1
intersects at a point which doesn’t belong to the point sequence. So, both has Self-intersection Error of Edges. There are two
healing options one is to rearrange the point sequence which works sometimes fine with simple polygon and another one is to
extract the intersection points create new vertices with those and place the new vertices in between each intersecting edges. So
the first polygon would be {P
1
, P
2
, P
x
, P
3
, P
4
, P
x
, P
1
} where P
x
is the new vertex. This is not a valid polygon but there is no self-intersection error any more, the double point errors will be
healed by the next iteration by its healing process.
Figure 5. Healing CP_SELFINT error
CP_PLAN: This is very common error and difficult to heal. All vertices of a polygon must lie within the same plane regarding a
user specified tolerance. If a polygon contains point sequence of {P
1
, P
2
, …., P
n
, P
1
}, all of the vertices will lie within a plane formed by any three vertices from the point sequence and
normal of all vertices on the surface must be parallel. In Figure 6 the polygon has a point sequence of {P
1
, P
2
, P
3
, P
4
, P
1
} where i.e. vertex P
3
doesn’t lie within the plane formed by P
1
, P
2
, P
4
. Sometimes the error is very small like less than a 1 mm. Those
are most probably caused by the measurement issue or floating number. In this case a little adjustment of vertices might heal
the polygon. Another healing option is to triangulate the polygon and split it
into multiple triangular polygons. But again it is very difficult to decide how to triangulate because there is always more than
one possibility and only one is correct. So a little bit of customization according to the error pattern of the model helps
a lot. For example while repairing a vertical non planner polygon of wall surface only vertical triangles are accepted as
newly triangulated polygons.
Figure 6. Healing CP_PLAN error The healing of non-planar surfaces of a building with the first
method is divided in three phases: Healing of the GroundSurface:
Figure 7. Healing of the Ground and WallSurface ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences,
Volume II-2W1, ISPRS 8th 3DGeoInfo Conference WG II2 Workshop, 27 – 29 November 2013, Istanbul, Turkey
This contribution has been peer-reviewed. The double-blind peer-review was conducted on the basis of the full paper. 3
The healing of the GroundSurface is identical for the LoD1 and the LoD2. We are identifying the GroundSurface for a LoD1
geometry as the surface with the smallest z-coordinate and the least deviation in respect to direction of the normal vector
of the xy-plane
. All points belonging to the linear ring of the GroundSurface are being projected on a plane, parallel to
and passing through the minimal z-value of the ring. See figure 7 for an example. The blue, non-planar polygon is being
projected on the green, planar one. Healing of the WallSurfaces:
As above we are not distinguishing between LoD1 and LoD2 during the healing process of WallSurfaces. We assume that
each WallSurface shares a common edge with the GroundSurface and each Surface of a LoD1 geometry, adjacent
to the GroundSurface is a WallSurface. Let
be the i-th WallSurface,
the common edge with the GroundSurface and the normal vector of the least squares plane through all
points of the linear ring of . If
is smaller than a given
, then all points of the linear ring are being projected into the plane, spanned by the directional vector of
and . With this approach we omit walls with a given angle of slope. An example is shown in figure 7.
Healing of the RoofSurfaces: There are two algorithms for healing the RoofSurfaces. The first
one handles LoD1 roofs and LoD2 flat roofs. The RoofSurface of a LoD1 building is determined similar to its GroundSurface.
It’s the polygon with the least deviation according to and maximal z-
value. Additionally it’s not adjacent to the GroundSurface. We will projecting all points of the linear ring
of the RoofSurface into a plane parallel to , passing through
the average z-value of all points in the ring, as you can see in figure 8 .
Figure 8. Healing of the LOD1 and LOD2 RoofSurface The second one handles all other LoD2 roof types and
calculates the least square fitting plane for all RoofSurfaces. Each point of a linear ring of the corresponding RoofSurface is
projected along the z-axis into the according least square fitting plane of its linear ring. This procedure will be repeated until all
RoofSurfaces are planar or a maximum number of iteration is reached. Note that the points are only projected along the z-
axis. Hence healed the WallSurfaces remaining planar, even if shared points of RoofSurfaces are moved. See figure 8 for an
example.
SOLID HEALING
In this phase one error is healed at a time. That means each iteration heals an error and checks the result for validation.
CS_NUMFACES: To form a solid minimum four surface is required. Any solid having less than four valid polygons has
insufficient number of face error. If a solid has less than 4 polygons then it is not possible to repair, only exception is a
triangular pyramid with one missing triangle. For all other cases the solid would be invalid and deleted from the model by the
healing process.
CS_SELFINT: Polygons of a solid must meet each other only through edges. Any other intersection of polygons will be
considered as a self-intersection error of solid. We assume that the polygons of a solid, in the sense of CP_PLANAR, are
planar, hence the user defined tolerance for the intersection algorithm should not be greater than the one used for
CP_PLANAR. This might lead otherwise to false positives andor irrational results. The Tolerance i.e. is used to determine
if the intersection of two polygons is a line or only a point the length of the line segment is below the tolerance. Furthermore
each intersection is classified by its type: partially embedded edge, fully embedded edge, partially embedded polygons, fully
embedded polygon, normal intersection and undefined intersection. In figure 9 for partially and fully embedded and
edge errors like A and B overlapping edge P
1
, P
3
and edge P
2
, P
4
are merged into 3 edges P
1
, P
2
, P
2
, P
3
and P
3
, P
4
and the pointlists of respective polygons are rearranged like H. And
for partially embedded polygon errors like Y the overlapping regions are trimmed out from the overlapped polygons and new
polygons are created from the overlapping regions like R. For fully embedded polygon errors like X the overlapping region is
only trimmed out from the bigger polygon.
Figure 9. Healing CS_SELFINT error For a normal intersection like figure 10
healing doesn’t brings a valid result. If the intersecting polygons are split into multiple
polygons an overused edge error occurs which is very difficult to heal. But still it solves the issue with self-intersection.
Figure 10. Healing complex CS_SELFINT error
CS_OUTEREDGE: Every edge of a solid will bound exactly two polygons. Any edge of the solid bounding less causes
incorrect number of polygons with edge error and there is a hole somewhere in the solid. Firstly all the error edges are searched
for loops. And new polygons are formed with the newly found loops. The incomplete parts of the loop are left out without
healing. CS_OVERUSEDEDGE: Any edge of the solid bounding more
than two polygons causes a topological error. In figure 10 if the self-intersection error is healed then there will be an edge
sharing 4 polygons. This type of error are not possible to heal automatically the possible options are to manually edit the solid
or delete the polygons. CS_FACEORIENT: Each edge must bound two polygon and
the orientation of the edge must be opposite in the polygons. In figure 11a two polygons P {P
1
, P
2
, P
4
, P
3
, P
1
} anti-clockwise oriented and Q {P
6
, P
4
, P
3
, P
5
, P
6
} clockwise oriented, are bound by the edge P
4
, P
3
. But the order should be in one polygon P
4
, P
3
and in another polygon P
3
, P
4
. If all or most ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences,
Volume II-2W1, ISPRS 8th 3DGeoInfo Conference WG II2 Workshop, 27 – 29 November 2013, Istanbul, Turkey
This contribution has been peer-reviewed. The double-blind peer-review was conducted on the basis of the full paper. 4
of the edges of a polygon have wrong orientation then it is wrong oriented and healing would be to reverse the order of the
pointlist. So if pointlist Q is wrong oriented then the healing process will correct the pointlist in {P
6
, P
5
, P
3
, P
4
, P
6
} and the orientation will be anti-clockwise.
Figure 11. Healing CS_FACEORIENT error
CS_FACEOUT: If every polygon of a solid is wrong oriented then the solid will be valid by face orientation check because
each edge will find an opposite pair. So every surface normal of a solid must direct outwards. Even after healing the face
orientation error it is not guaranteed that all the polygons will face outward. In figure 11b the red polygon has a face out error.
Healing of this error is to reverse the orientation of the polygon. CS_CONCOMP: If a solid contains a disconnected polygon
then it has an error and it will be detected by the outer edge check. But if the polygons defined in a solid forms two valid
solid like figure 12 then it will pass all the checks until connected component check. Healing of this error is to convert
each disconnected solid into a solid data structure and delete the original solid.
Figure 12. Healing CS_CONCOMP error
CS_UMBRELLA: Healing this error is still in progress but one option is to split the adjacent polygons into groups where
they are connected by edges and then create new vertices for each group with same coordinates then move those vertices a
little bit away from each other like figure 13.
Figure 13. Healing CS_UMBRELLA error
4. RESULT AND DISCUSSION