www.viamichelin.com , RouteNet plan
www.routenet.com ,
OpenRouteService http:openrouteservice.org
and Naver – in South Korea
http:maps.naver.com . Brussels in Belgium was
used for most of the tests, but Seoul in Korea was chosen for one of the underground experiments.
When calculating the shortest path in Brussels both Figure 34 Bing and Google do not use a gallery as a short cut, whereas the
others do so; Bing does not show the route, but Google Maps does provide a name, but not a route. The conclusion must be
that some networks do include interior pathways accessed by above ground entrances in their shortest route calculations;
others did not have indoor networks at the time of the experiment.
The underground Myondong shopping centre in Seoul lies beneath a wide and very busy road, with entrances at either side
of the road. This made a good test of routing software: would the calculated path suggest going down, through the shopping
centre, and then up on the other side of the road? Only Naver was able to provide pedestrian routing in Seoul, and it did
successfully find the route; definitely a case of “Why did the chicken survive crossing the road? Because it’s route planner
did know the underground path
.”
Figure 35: Brussels Central Station to Ravensteingallerij using route planner a Bing ,b Google Maps, c Mappy, d Via
Michelin, e RouteNet and f OpenRouteService. From Vanclooster et al 2012.
A further test was conducted in Brussels to see if underground entrances and exits were part of the route planner’s database.
The test route went from the main railway station via an underground connection to the Ravensteingallerij see Figure
35. Only OpenRouteService used all available entrances and underground passageways to progress directly to the gallery.
The other networks were less successful. In many cases, for all route finders, the address translating process to convert to geo-
location on the map were rather simplified. For instance all the route planners used one entranceaddress point for route
planning, no matter what the destination of the query, while railway stations are well known for having several entrances,
owing to their considerable size, These tests show that the data for the route networks in these
Brussels and Seoul examples were incomplete when the tests took place, not that the route finder algorithms per se were
inadequate. All data providers add as much extra spatial data and information into their networks as they can, but update and
discovery take time, and this leads to inadequacy in some locations. Until the last few years most commercial data
providers did not use any VGI or CGI spatial data to update their networks. Now, more are doing so as it becomes very clear
that organisations such as OSM can provide accurate useful information. For example, Google Maps – definitely in the CGI
camp for data acquisition – has started to make increased efforts to use this huge human resource to improve its mapping.
4.2 Crowdsourcing Indoor Geodata
The inside of buildings provides a fairly difficult VGI survey environment because of the non-operation of most easily
understood GPS handheld devices. Other means are becoming possible using 3-axis accelerometers and the 3-axis
magnetometers available in many smart phones and even a piezometer implanted in a Nike running shoe Xuan et al,
2011. But, the kit and skills required are not yet commonplace. Added to the sudden change of scale, and therefore desired
locational accuracy, when moving across the interface from exterior survey to interior mapping there is the question, as
discussed by Vanclooster et al 2012, of address matching with possible multiple entrances to the same building. Lee 2009 has
considered the address interface problem and comments on the orderliness of some exterior addresses and the disorganised
nature of many interiors. One of the most prolific authors dealing with 3D city models
published in the last few year has been Goetz Goetz and Zipf, 2011 and 2012; Goetz, 2012 who discusses the CityGML
standard for storing and exchanging 3D city models, and the progress of modelling, mostly with German examples. As he
points out, different regions in the world have different desires and therefore different standards that need to be met in their 3D
models. According to Haklay 2010 and others OSM is often able to exceed official or commercial data sources in terms of
quality and quantity.
Figure 36: Levels of Detail LoD0 – LoD4 defined in CityGML specification
CityGML defines a number of Levels of Detail LoD for modelling purposes: LoD0 is the plan view of a 2.5D terrain
model, LoD1 is a simple extruded block rendition, LoD2 is textured with roof structures, LoD3 is a detailed architectural
model, and LoD4 is a full “walkable” 3D model. As we have seen, most routing systems focus on a 2D network
specification and cannot cope easily with a 3D model. Goetz proposes:
“the development of a web-based 3D routing system based on a new HTML extension. The visualization of rooms as
well as the computed routes is realized with XML3D. Since this emerging technology is based on WebGL and will
likely be integrated into the HTML5 standard, the developed system is already compatible with most common
browsers such as Google Chrome or Firefox. Another key difference of the approach presented is that all utilized
data is actually crowdsourced geodata from OSM
” Goetz, 2012.
Volume XL-1W1, ISPRS Hannover Workshop 2013, 21 – 24 May 2013, Hannover, Germany
412
This would appear to be a very useful contribution to both mapping and routing services, and the use of OSM to provide
much of the ground truth data would prove invaluable in terms of both time and money.
Figure 37: Increase in use of the “building” blue and the “indoor” key for OSM additions from 2007 up to 2011. From
Goetz and Zipf 2011. There are now about 45M tagged buildings in OSM with
approaching 0.5M added every week Goetz, 2012; not all are fully rendered, but many are well above LoD1. This rate of
building increase is currently higher than that for roads, which, although standing at about the same total, increases by perhaps
0.2M per week. See Figure 37, which shows the increase in usage of building and interior keys between 2007 and 2011.
Describing and modelling buildings requires a common understanding of terms and importance. Ontologies have been
developed to allow not only questioning, searching and joining, but also to allow routing and navigation. Yuan and Zizhang
2008 proposed an alternative navigation ontology, but one that did not concern itself about colour or other non-essentials to
navigation. Goetz’s ontology is kept simple, but allows realistic visualization of the building as well.
It would appear from the foregoing that much of the present work on buildings, both exterior and interior is being performed
by OSM crowdsource enthusiasts; many of them very skilled. Who should be the crowdsourced agents for making the
building models? Rosser et al 2012 say that perhaps the buildings occupants are those best placed to make the most
satisfactory models; to them at least, and who has a better claim?
As time passes and the number of building edits shown in Figure 34 increases exponentially, it is likely that much of the
activity will turn to editing and the improvement of the quality of OSM submitted buildings, where in some cases exuberance
may have dominated accuracy. The open source tools to make and use the models have been generated, the skills are present in
some quantity, and the ontologies are defined to allow the buildings to be more than a pretty object; they can become full
network models. 4.3
Indoor Google
Google in its various mapping guises has for many years been the mainstay for providing base maps for the crowdsource
community. The main commercial spatial data providers were for many years Navteq, TeleAtlas and Google. Historically,
Netherlands based TeleAtlas and North American Navteq were used in many navigation applications.
Since Nokia
http:en.wikipedia.orgwikiNavteq bought
Navteq in 2008 and TomTom acquired TeleAtlas, also in 2008 http:en.wikipedia.orgwikiTele_Atlas
, there has been a clear separation where each navigation specialist was responsible for
its own data sets. But in 2011 Garmin bought Navigon AG to capture the iOS and Android navigation apps market where
Navigon was dominant. This was an important step for Garmin because all portable navigation devices had been losing ground
to smartphone navigation apps, a market that Garmin had not entered before. Interestingly in June 2011 Navigon had
introduced a PoI package derived from the crowdsourced OSM project.
Figure 38: Interiors of Guildford Cathedral, showing the indoor online possibilities of Google’s Street View. A full walk about
is very possible. Google Guildford Cathedral
. Google changed from TeleAtlas data in 2009 to individually
conducted “Street View” data gathering for their US dataset. Reasons for this move were said to be the lack of accuracy and
coverage in the United States from the TeleAtlas data according to
http:blumenthals.comblog20091012google-replaces- tele-atlas-data-in-us-with-google-data
. By doing this Google confirmed its intention to remain a major contender for
providing spatial information. Until 2011 neither Nokia, TomTom, nor Google had showed much interest in acquiring a
ubiquitous indoor data capability. Google announced in October 2011 that Street View would now move inside buildings.
Throughout 2011 Google was trialling its data collection and photography systems of interiors with the help of stores and
offices. Meanwhile Street View was being expanded to many new countries; by May 2012 Israel was on the photographic
map and by October Street View coverage had broadened to 11 countries, including the US, UK, Sweden, Italy and Singapore,
and special collections have been launched for South Africa, Japan, Spain, France, Brazil, Mexico, Israel and other countries.
At the same time Street View had been enhanced to provide indoor tours of a number of places, for instance: Russias
Volume XL-1W1, ISPRS Hannover Workshop 2013, 21 – 24 May 2013, Hannover, Germany
413
Catherine Palace and Ferapontov monastery, Taiwans Chiang Kai-shek Memorial Hall, Vancouvers Stanley Park, the interior
of Kronborg Castle in Denmark, and Guildford Cathedral in Figure 38.
This interest on Google’s part in interiors and encouraging both Street View but also crowdsourced participation may herald the
start of complete routing systems that do understand 3D buildings, their interiors, the underground passages, and the car
parks that go to make up the modern city. It will then be very interesting to apply Vanclooster’s routing tests again and see
whether the 3D information gathered by VGI, CGI, or PGI means, allows accurate shortestfastest guidance through the
urban maze.
5 WHITHER SPATIAL ONTOLOGIES?
Shanahan 1995 is quoted in Bhatt 2008 as defining spatial ontology as:
“If we are to develop a formal theory of common sense, we need a precisely defined language for talking about shape,
spatial location and change. The theory will include axioms, expressed in that language, that capture domain-
independent truths about shape, location and change, and will also incorporate a formal account of any non-
deductive forms of common sense inference that arise in reasoning about the spatial properties of objects and how
they vary over time
.” The idea that common sense is vital and will be used while
reasoning with and about spatial objects is to be applauded. The understanding and semantics of geographical concepts vary
both between user communities – VGI or PGI based – and need ontological formal languages and structures to represent the
concepts used by any information community. Stock 2008 argues that, as human semantics can be extremely informal in
nature, “Perhaps NSDIs have been too formal and do not account for this human flexibility?”
Berners-Lee 2005 tried to estimate the effort involved in ontology creation. The table in figure 39 shows his results.
Figure 39: Berners-Lee 2005 Total cost of ontologies. This table assumes ontologies are evenly spread across orders of
magnitude, committee size is reasonably represented by logcommunity, time is community2, and costs are shared
evenly over the community. The scale column represents the community size from which the committee size and costs are
calculated. The general conclusion is that the cost of large efforts is huge, but it is borne by and benefits an even larger
group and so the cost per individual is very, very small indeed. A large effort is required to build an ontology for a particular
domain. This is a disincentive for groups who might create ontologies to suit their own circumstances. Also, why should
there be a single ontology, and why in English, which might act as a constraint to some non-English semantic ontologies
http:en.wikipedia.orgwikiLinguistic_relativity ? Perhaps
variety and semantic translation and interoperability would be a better approach? The figures in the table argue strongly in
favour of large group participation and might be thought to be a good prospect for professional crowdsourcing, as with standards
and software open source projects? GIS has used standard forms with attached labels since it
appeared as a discipline Evans, 2008. Researchers consider a good representation of the world assumes: there is only one type
of object in a given space; several in one space would be identical, and that everyone would use the same descriptive
terms and labels. In practice this is not so; we are all our own experts with different mental formulations of the world around
us brought about by different experiences and upbringing in society.
Despite this problem with diverse groups OSM has built an open ontology from scratch Dodge, 2011. Many of the people
most actively involved in the ontological development of OSM, whilst skilful and self-motivated, do not have a cartographic
background. This often leads to lively debates about the ontology for OSM and some new thoughts about how maps
should look and work. The difficulties are usually ironed out by social negotiation to determine the best understanding of the
objects in question. As time passes so OSM’s ontology is becoming more complex and useful, but partly as the result of
considerable mental anguish amongst the contributors and editors.
5.1
Vagueness
Much time and effort is involved in defining ontologies of spatial objects; possibly even more in recognising, defining and
categorising the objects in a formal manner so that they can enter an ontological description. A major problem of the real
world is that although an object is definitely present it can rarely be described by a single term, or multiple occurrences by the
same exact definition. Our desire to classify is confounded by vagueness of description.
A traditional example of this problem is: when does a pond become a river, become a lake, become a sea Third, 2008?
Scale is one factor, form is another, salinity might be a third, and so on. Our descriptors are unclear and suit our thinking,
perhaps. This problem of vagueness could be overcome by choosing to ignore insignificant parts, but insignificant is also
vague and undefined in most cases. Humans tend to skip over insignificant
deviations. A serious problem in developing an ontology is to define the terms to be used within it.
Bennett 2008 discussed the standpoint theory of vagueness and showed how apparently impossibly difficult semantic
problems – is this a river or an estuary? Is this a heap or merely a pile? – could be expressed using supervaluation semantics
Bennett, 2008 to enable vague human language to be logically interpreted by a set of possible precise interpretations
called precisifications, providing a very general framework within which vagueness could be analysed within a formal
representation and handled by computer algorithm. Implementation of this process is not at all trivial, despite the
human ability to make instant judgements. For instance processing geometric shapes to provide qualitative shape
Volume XL-1W1, ISPRS Hannover Workshop 2013, 21 – 24 May 2013, Hannover, Germany
414
definitions is geographers f
for instance picking a relev
possibilities w 5.2
Moving f SDIs?
Bennett et al whether peopl
the expansion questions to b
now being dev ontology mat
available, or u external resou
Janowicz et al Enablement L
the Semantic W et al 2010 s
Resource Desc OSM data set
service for sea Stock et al 2
register contai rather than an
contain a voc could use, pro
assist interpret dealing with f
enable flexibl form an impo
expression wi ISO’s FTC w
and by adding compliant regi
richness of in language or U
Stock’s appro benefits: easy
SDI where it providers per
management WSML and s
able to conclu flexibility in s
including chan language inter
5.3
Standard
International s by crowd sou
awareness am problems invo
been addresse complexity o
groups: OGC
www.isotc21 df
. Percivall 201
“Fusion is data or in
quite complex from previous
Cole and Kin vant finite set o
was difficult
from Ontology
2007 require le recognised an
n of the seman be asked. Du e
veloped to perfo tching based
using lexical and urces and pri
l 2010 though Layer
for an SD Web was an im
suggested using cription Frame
t, within the Se arching and que
2010 has prop ining a Feature
n ontology to cabulary of term
ovide what Sto tation of the re
feature types. T le response to
ortant link betw ithin an SDI. S
within the OGC g stored querie
istry. The FTC nformation an
UML rather than ach, stressing t
navigation to c has been buil
rhaps as in the of the gap be
static content o ude that non-on
solving comple nge through tim
rpretations.
ds Equals Qua
standards exist, urced data col
mongst many olving geograp
d and mostly m f the standard
www.opengeo 1.orgOutreach
0 while writin the act or proc
nformation reg . Along with m
academic quan g, 1975, Benn
of regions from
y Feature Type
ed: more forma nd approved th
ntic structures et al 2012 rep
orm automated on shared up
d structural info ior matches
ht a shared and t DI, integrating
mportant missing g the LinkedG
ework RDF im
emantic Web, erying geodata.
osed the idea Type Catalogu
support SDIs ms that both c
ock calls a ligh egistry contents
This formulation operations on
ween implemen She suggested t
C’s Web Catalo es allow interro
s could store v d be expresse
n in formal onto the use of the F
content, simple lt or is being u
case of OSM etween web se
ontology. By 20 ntology approa
ex geographical me, vagueness,
ality?
, and are used, llectors, partly
crowdsource e phical descript
met, and partly ds generated b
ospatial.orgstan hISO_TC_211_
g about standar cess of combini
garding one o many generatio
ntitative revolu nett discovered
m an infinite ran
e Catalogue Ba
al semantics, te e formal results
to allow locat port many tool
and semi-autom pper ontologie
ormation, user i Noy et al, 2
transparent Sem query services
g element. Hah GeoData project
mplementation o to provide a c
of using a sem ue
FTC, ISO 19 as the FTC w
reators and da htweight
ontolo s, and a structur
n would, Stock a feature type
ntation and sem the incorporati
ogue Service C
ogation of the variable amount
ed either in na ology language.
FTC, provides er management
updated by mu , and an end t
ervice eg OW 011 Stock et al
aches provided l research prob
and varying n
, but not neces y owing to lac
enthusiasts tha tion standards
due to the nece by the internat
ndardsgml and
_Standards_Gui rds said:
ing or associat or more entit
ons of utions
d that nge of
ased
esting s, and
ational ls are
mated es, if
input, 2005.
mantic from
hmann t as a
of the entral
mantic 9110
would atasets
ogy to ure for
k said, e, and
mantic ion of
CSW OGC
ts and natural
many of an
ultiple to the
WL-S, were
more blems,
natural
ssarily ck of
at the have
essary ational
d ISO ide.p
ting ties
co to
for en
Wha natio
scree relia
appr Beck
Fig Typi
of F upda
junct rathe
utilit abstr
onsidered in an improve one’s
r detection, id ntity.”
at is a map but a onal or region
en, and assume able
same prob opriate for the p
k 2008.
gure 40: Top, u same juncti
ical mapping of Figure 40, gain
ate, showing a m tion. The map
er smaller scal ty lines. It pr
raction, not real explicit or imp
s capability or dentification, o
a fusion process nal mapping ag
they: are corre blem, and tha
purpose of the
tility survey da on on the finali
f a real world s ned from groun
multitude of uti on the right is
e with general rovides a clea
lity. It also mee plicit knowledg
r provide a new or characteriza
s? Most people agencies, either
ect whatever th
at the level of map. Look at F
ata at a junction ised map Beck
situation is reve nd survey and
ility lines criss- a plan of the a
lisation of both ar interpretatio
ets standards, m ge framework
w capability ation of that
see maps from r on paper or
hat means, are f abstraction is
Figure 40, from
n, Bottom, the k, 2008
ealed in the left manual paper
crossing a road area, drawn at a
h junction and on, but is an
may be accurate, m
r e
s m
t r
d a
d n
,
Volume XL-1W1, ISPRS Hannover Workshop 2013, 21 – 24 May 2013, Hannover, Germany
415
but is not precise, nor is it complete. There are many ways of storing data, here on paper or in a GIS, and several ways of
structuring the data using a variety of syntactic and schematic models. What is needed, as proposed by Stock and others listed
earlier in this text, is the ability to integrate models based on global schema and perhaps to use Bennett’s standpoint theory to
help remove vagueness from the mix. Mary McRae, until 2010 working with OASIS Organization
for the Advancement of Structured Information Standards
, is well known for a presentation containing the slide “Standards
are like parachutes: they work best when theyre open”, possibly derived from LE Modesitt Jr, or was it Frank Zappa, or
Elvis? The point is well made, however. Open standards are essential if people are to be willing to use them. They must be,
according to OSGeo, freely and publicly available, non- discriminatory, with no license fees, agreed through formal
consensus, vendor neutral, and data neutral. Note that open standards does not mean open source. Standards are usually
documents; sources tend to be software. See the paper from OSGeo on this subject – Open Source and Open Standards at
http:wiki.osgeo.orgwikiOpen_Source_and_Open_Standards .
The OGC recommends many geospatial standards to users, including GML Geography Markup Language as its base. See
http:en.wikipedia.orgwikiGeography_Markup_Language .
GML is the XML grammar for expressing geospatial features. It offers a system for data modelling and as such is used as a basis
for scientific applications and for international interoperability. It is at the heart of INSPIRE Infrastructure for Spatial
Information in the European Community, the European initiative, and for GEOSS Global Earth Observation System of
Systems
. See http:inspire.jrc.ec.europa.euindex.cfm
and http:www.earthobservations.orggeoss.shtml
. New standards from OGC are continuing, with community-specific application
schemas introduced to extend GML. GML 3.0 is a generic XML defining points, lines, polygons and coverages. It extends GML
to model data related to a city using CityGML for representation, storage and exchange of virtual 3-D models.
There was a workshop in January 2013: CityGML in National Mapping
. See http:www.geonovum.nlcontentprogramme-
workshop-national-mapping .
Other standards include: the Web Feature Service WFS, for requesting geographical features; the Web Map Service WMS,
for requesting maps using layers, to be drawn by the server and exported to the client as images; and the Styled Layer
Descriptor
SLD which provides symbolisation and colouring for feature and coverage data.
The OGC is an international organisation, but so is ISO, and both generate standards for the geodata community to use. The
OGC was started with mainly industrial, commercial and some research membership, whereas ISO, in the form of Technical
Committee 211 ISOTC211, was instituted as a completely independent body with members delegated by participating
nations, usually but not always from appropriate national standards bodies. Both operated in the same field but,
fortunately, for many years OGC and ISO have cooperated closely in standards specification to the benefit of both.
Recently ISOTC211 has published the text for standard 19157, Geographic information — Data quality
, to specify standards for ensuring that quality of geographic information can be
implemented, measured and maintained. The data quality elements consist of: completeness of features; logical
consistency , adherence to data structure rules; positional
accuracy within a spatial reference system; thematic accuracy
of quantitative and qualitative attributes; temporal quality of a time measurement; usability element based on user
requirements; and lineage: provenance of the data. Interestingly, all but logical consistency need to be verified by some ground
truth action. The standard proposes that different methods of sampling see Figure 41 will be necessary for different data
types.
Figure 41: Choosing a sampling strategy for quality checking by logical feature or by areal selection ISOTC19157.
There is flexibility as to which strategy is chosen based on either a complete population survey, probabilistic or
judgemental sampling procedure. The choice depends on the data being tested.
6
ACCURACY, QUALITY, AND TRUST
Figure 42: MODIS top and GLC-2000 bottom satellite images covering a 20km x 20km area around the UK town of
Milton Keynes, showing automatic classification by different but standard approaches, to indicate similar land cover classes.
In both images deep pink is cropland or natural vegetation, red is urban and built up areas.
Volume XL-1W1, ISPRS Hannover Workshop 2013, 21 – 24 May 2013, Hannover, Germany
416
Quality check taken from Fri
satellite image There is consi
classification classification,
automated alg 2000 image at
time luminos combination w
The differenc estimates urba
the percentage products agree
meet ISO1915 this and the g
field. Fritz 2 survey was
EuroGEOSS, G
The Geo-Wik now looking f
“Volunteers cover disag
actually see the land cov
recorded in used in the f
global land
The volunteer photos, tag the
Wiki. Figure project. Prese
contributors re checking exerc
Figure 43: ground tru
This technique studies, and en
contributions. up, the averag
with a median since 2009, wi
earlier tend to do the vast ma
It is possible t are, but som
Usually this is ground survey
This is probab additional aeri
with imagery interpretation;
king can, in pra itz et al 2011
es of the area iderable differe
that has be shown at th
gorithm trained t the bottom of
sity set at a with expert kn
ce is remarka an area at abou
e of pixels on es is less than 3
57 standards general problem
2011 reported to be carried
GEOBENE, CC ki Project at
htt for
s . . . to revie greement and
e in Google Ea ver maps are co
a database, al future for the
cover map .”
r can downloa em with approp
43 shows part ence in the tab
eceive, other th cise.
left High Scor uth checking, a
regist e is used in a nu
ntirely depende Thus far April
e contribution i n of around 30.
ith later peaks a have complete
ajority of the wo to assess how a
me true version s a national map
yors and consid bly as reliable a
ial photography y is not one
there are actice, be very
illustrates the p a near Milton
ence in the resu een used. Th
he top, was g d using calibrat
f the figure was a relatively h
nowledge, to c ably large. Th
ut 50 greater n which each o
30. This pres The only succ
m is through gr that a crowdso
d out from 2 C-TAME and C
tp:www.geo-w ew hotspot map
determine, bas rth and their lo
orrect or incorr long with uploa
creation of a n ad an App fo
priate text and u t of the High
ble is the only han taking part
re table for sate and right peak
tration date. umber of enviro
nt on crowdsou l 2013 346 vol
is nearly 400 ce Volunteers hav
and troughs. Th d more work, a
ork. accurate the cro
n must be us p, created over
derable editing as any base can
y or similar im of determinin
no easily m difficult. Figur
problem well fo Keynes in the
ults of the auto he MODIS i
generated usin ion data. The G
s created using high threshold
lassify urban a he MODIS pr
than the GLC-2 of these land
sent result woul cessful resoluti
round checks i ourced ground
011, supporte CC_AFS.
wiki.orgindex.p ps of global la
sed on what th ocal knowledge
rect. Their inpu aded photos, to
new and improv r their phone,
upload them to Score
table fo y incentive the
t in a useful qu
ellite classificati s and troughs o
onmental resour urced VGI
lunteers have si ells field checke
ve been signing hose registering
and as always a
owdsourced pro ed for compar
many decades over those dec
be, other than magery. The pro
ng accuracy bu measured auto
re 42, or two
e UK. omatic
image ng an
GLC- night
d, in areas.
roduct 2000;
cover ld not
on to in the
truth ed by
php is
and hey
e, if ut is
o be ved
take Geo-
or this VGI
quality
ion of
rce igned
ed, g up
few oducts
arison. using
cades. using
oblem ut of
omatic poss
dip i wher
defin param
comi attrib
cons 6.1
The node
down to ha
samp tryin
was 921,
1,662 The
accu beca
truth Moo
disco were
subs
F ver
the It is
all fo majo
edite cons
95.3 3,549
and A may
the e hypo
main Moo
been chan
Olym inade
Moo datas
VGI ibilities of line
into a churning re the monster
nitional problem meters for accu
ing sections th bute accuracy
istency, as reco
Accuracy – O
OSM global d es OSM-Stats,
nload but is cu andle, so subse
pling. Anderka ng to assess the
run on 20th N 816; uploaded
2,862,568; way global stats ar
uracy of OSM ause of OSM siz
h sets only cove oney et al 2012
over how much e using the UK
et.
Figure 44: From rsions of edited
number of data axis, for
interesting to n our datasets hav
ority of ways h ed a very few
idered finished 3, 232,707 fr
9,831, Germa Austria 89.0
not indicate th editor – and a
othesis could b nly in rural ar
oney does not co n ferociously ed
nging landscap mpic stadium si
equate survey in oney 2011 se
sets. A lack of contributors co
e-point-area com g ocean of atte
rs of the deep ms rather than
uracy have been here will be d
y, completene ommended in th
SM compared
database contai 2012. The hist
rrently close to ets tend to be
a et al 2011 quality of Wik
ovember 2012 d GPS points
ys - 157,983,53 re huge, so eve
has been forc ze, and partly b
red some nation looked at the
h dynamic chan K, Ireland, Au
m table 1 in Moo ways on the ho
asets falling into r Germany, Aus
note that a very ve five or fewer
have been gath times at very
d. The percent rom 244,192,
any 93.1, 10 , 1,085,003 fro
hey are accurate any others – m
be that low v reas that do n
onsider this pos dited by many p
e, such as wa ite in Figure 33
n the first place es a number o
cartographical, ould be a majo
omparison. Wha empts to determ
p are often the n those of mea
n defined in ISO discussions of
ess, currency, he standard.
d with OS
ins several bill story dump file
o 500 GB in siz used rather th
had similar p kipedia. An OS
to find exact f - 3,182,076,
32, and relation erybody who h
ced to take a because their na
nal areas, not th annotation pro
ange the editing ustria and Ger
oney 2012, the orizontal axis co
o that category stria, and UK-E
y large percent r versions. The
hered in the f few sessions
tages are as fo UK 95.4, 3
0,445,536 from om 1,219,045.
te or complete; merely lost int
version number not change mu
ssibility. Some people. This cou
as the case fo 3, or it might be
e of risks to the
, surveying, an or issue, differe
at follows is a mine accuracy,
e semantic and asurement. The
O 19157, and in positional and
and logical
lion points and is available for
ze and difficult han a complete
problems when SM stats report
figures: users - ,380; nodes -
ns - 1,667,558. has studied the
subset, partly ational mapping
he entire world. cess in OSM to
g caused. They rmany as their
e number of ompared with
on the vertical Eire.
tage of ways in e overwhelming
field, uploaded, and then been
ollows: Ireland 3,384,643 from
m 11,226,308, Of course this
it may be that terest. Another
r counts occur uch over time.
few areas have uld be a rapidly
or the London e that there was
crowdsourced nd GIS skills of
ent tags may be a
, d
e n
d l
d r
t e
n t
- -
. e
y g
. o
y r
n g
, n
d m
, s
t r
r .
e y
n s
d f
e
Volume XL-1W1, ISPRS Hannover Workshop 2013, 21 – 24 May 2013, Hannover, Germany
417
used to descri features, and t
vandalism by structures wer
In the UK Hak much VGI la
checked four 2 by OSM volun
OS ITN Integ are accurate to
Figure 45: compared w
similar meth disp
OSM has capt quarter of th
attributes. Ple minor roads,
method outlin 3.8m buffer fo
an 8m buffer f to approximat
within this bu the roads abo
were always m each square k
than in a prev Figure 46, in 2
VGI contribut
Figure 46: A blocks, calcu
Me ibe the same w
there is the pos some contribu
re not in place to klay et al 2010
abour was req 25sqkm tiles in
nteers for comp grated Transpo
o 1m.
: Accuracy mea with OS-ITN, f
hod is proposed placement comp
tured about 30 hese were spag
enty of editing main roads an
ned in Figure 4 or minor roads,
for motorways, te the true road
uffer 80 to 86 ove 85. The V
more than 5 an kilometre of hi
vious general su 2008 where nea
ting crowdsourc
Average positio ulated for node
eridian 2 data. S web resources o
ssibility of rogu utors. Mooney
o find and elim 0 were concern
quired to map n the urban Lon
parison with the ort Network
d
asurement meth from Haklay et
d in ISO 19157 pared with a ref
of the linewor ghetti without
remains to be nd motorways
45 to measure a , a 5.6m buffer
placed around d width. He fou
6 of the time, VGI contributi
nd up to 20 co is urban Londo
urvey of VGI in arly 90 was m
cers.
onal accuracy le points, compar
See Haklay et a or to annotate s
ue edits or delib states that, in
inate rogue edit ned to discover
an area well ndon area comp
e generally exc data set where n
hod of OSM wa al 2010. A ve
for measuring l ference set.
rk for England, a complete s
done. Haklay for testing, an
accuracy. He u for main roads
d the ITN centre und OSM road
, and more than ion was large;
ontributors map on tiles, many
n England, show mapped by 3 or
evels for 1km ro ring OSM with
al 2010. spatial
berate 2011,
tors. r how
, and pleted
cellent nodes
ay ery
line , but a
set of y used
nd the used a
s, and e line,
ds fell n half
there apping
more wn in
fewer
oad OS
Meri spec
20m are c
node most
areas posit
than lies a
sugg chan
Hakl linke
mate Figu
incom 10m±
latter stand
F Bhat
large with
topo and
conc varie
and foun
comp comp
Kouk done
In th mult
OSM split
were were
cons of IT
OSM were
OSM urba
perce order
of ro town
idian 2 data w ification indica
buffer accurac correct. A comp
es, and to asse t accurate with,
s. Haklay say tional error sm
15 metres. The at about 7m. T
gest that measu nce.
lay 2010 als ed to VGI contr
erial in poorer a ure 47 shows er
mplete ones. P ±6.5m 1SD f
r, indicating po dard of OSM m
Figure 47: Distri comparison for
Engl ttacharya 2012
ely confirmed t OSM but
graphic map T accurate betw
cluded; automa ed from 70 to
building data c nd the general
pleteness was pleteness was th
koletsos et al 2 e by Haklay, us
his later article ti-stage automa
M roads. Simila into small 1km
e localised. Roa e matched usin
traints. Results TN roads could
M roads with I e much poorer
M matched ITN an and less th
entage being d r of matcher an
oads in ITN an n [20,734km v
was used as th ates it has been
y along roads, plex algorithm
ss errors. The , as might be e
s over 70 o maller than 12 m
e mode of the e he shape of the
urements better o showed that
ribution in that areas, and the le
ror levels for c Positional accur
for the former a oorer socio-eco
mapping.
ibution of error r complete and
and, from Hakl 2, in a master
the previous fi his reference
TOP10NL, whi ween scales of
atic matching o 102 presum
completeness v quality of O
high, and in a he most signific
2012 performed ing OSM and I
e, a rather mo ated process w
r to the cells in m block areas, s
ad entities, with g a combinatio
were promisin d be matched w
TN. As before : 60 of ITN
N. Matching err han 4 rural
different, and nd matchee, is
nd OSM were 18,368km], mo
the comparison n generalised
but that junctio was used to ch
major urban expected, larger
of the OSM n metres and ano
error distributio e distribution o
r than 7m wer t socio-econom
t OSM voluntee evel of complet
complete areas uracy was foun
and nearly 12m onomic areas do
rs for the OSM incomplete OS
lay et al 2010 r’s thesis in th
indings. Bhatta set was th
ich is consider f 1:5,000 and
was possible; mably more de
varied from 89 OSM was quit
agreement with cant quality fac
ed similar exper ITN for rural an
ore developed was used to m
n Figure 46 the so that compari
h feature tags of on of geometri
ng in that for ur with OSM road
e, in the rural matched OSM
rors were smal . The reason
dependent on because the ab
not the same ore ITN in coun
n to OSM. Its to lie within a
onnode centres hoose matching
areas were the r errors in rural
nodes show a other 80 less
on in Figure 47 of errors would
re obtained by mic factors are
ers provide less teness is lower.
compared with nd to be under
m±7.7m for the o have a lower
to Meridian M areas in
, he Netherlands,
acharya worked e Netherlands
red both useful 1:25,000. He
completeness etail in OSM?;
to 97. He e good where
h Haklay, that ctor.
riments to those nd urban areas.
feature based atch ITN with
e data sets were ison and results
f name and type ic and attribute
rban areas 93 ds, and 81 of
areas matches M, but 88 of
ll, between 2 for the match
the positional bsolute quantity
more OSM in ntry [2,936km v
s a
s g
e l
a s
7 d
y
e s
. h
r e
r
, d
s l
e s
; e
e t
e .
d h
e s
e e
f s
f h
l y
n v
Volume XL-1W1, ISPRS Hannover Workshop 2013, 21 – 24 May 2013, Hannover, Germany
418
1,953km], an as they simpl
shows how thi
Figure 48: Go OSM data. O
What Koukole has surveyed
publicly availa might be exam
reference set. data
not foun either.
Anand et al with feature r
They conclude to be sufficien
feature type in matching and
standards has likely to be pre
Figure 49: Pi
of the similar p
Al-Bakri et al on XML sche
view of both that Path in O
level. Similarl at different lev
quite awkward be gained from
OS ITN and O perform better
nd some roads i ly did not exis
is can happen.
ogle backgroun OSM has extra
from Kouko etsos calls OSM
something that able mapping. N
mples of this k Similarly ther
nd in OSM, an 2010 used sim
ich informal O ed that simple
nt and that, agre nformation was
enriching the meant that me
esent in most V ieces of the OS
arities and differ position. From
2012 suggest ema, as in Figu
schemas for O OS is Footway
ly Rail, though vels. This make
d. Nevertheless m trying to ma
OSM roads to se r than a purely g
in both data set st in the other
nd, not quite reg data in this part
oletsos et al 20 M
commission the OS does n
New estates and ind of extra ma
re are cases o nd sometimes
milar methods OSM, and to ke
geometric matc eeing with Kou
going to be ver data. Introduct
etadata are now VGI data sets.
and OSM sche
rences, both in Al-Bakri et al
t ways of simila ure 49, which s
OS MasterMap in OSM and o
the same term es calculation o
s, there should atch schemas D
ee if an ontolog geometric one t
ts were unmatc data set. Figu
gistered with O ticular area. Ta
12. data
is when not yet have on
d areas of rebui ap data over th
f OS commiss just plain erro
to try to matc eep the best of
ching was not g ukoletsos 2012
ry important, bo tion of internat
w specified and emas showing s
terminology an 2012.
arity matching b shows a very p
and for OSM. occurs at a diff
m in both schem of similarity ma
be some benef Du et al 2011
gical approach w to road matchin
chable ure 48
S and aken
OSM n their
ilding he OS
sioned ors in
ch OS both.
going 2, the
oth in ational
quite some
nd in based
partial Note
fferent mas, is
atches fits to
used would
ng. Figu
in attr
Onto simil
matc of nu
Geom matc
spec level
foun – fou
Failu spec
90 met
an on Du
geos onto
geos ensu
OW form
OGC Java
data to tr
http: the
105 prod
with meth
need and
proc ure 50: OS top
ncluding schem ribute type and n
ologies by them larity and over
ching approach ulls in OSM att
metric error, ching were used
ific pairs of OS l could then be
nd. The output f und 111 match
ures were due ific road conne
. Du concluded tadata to allow
ntological appro et al 2012 c
patial ontologi logies, and then
patial ontology
ure consistency WL 2, W3C [2
mats to facilitate C W3C standar
implementatio from disparate
ranslate, merge sourceforge.n
open source c test road match
duced 92 new ge around 90 i
hod has insuffi ds further work
in solving prob ess. The princip
p and OSM bo ma information,
name can be dif road is sele
mselves were n rlap. Du consi
Figure 50 to t tribute fields an
network conn d to assign a pr
S and OSM road e used to determ
from the open s hes in OSM o
to very differe ections, but th
d that greater su w better ontolog
oach, no formal continued work
ies by first co n merging these
y that could b y. The OWL
2009], develop e information s
rd web ontolog on for Du’s exa
e sources. The e, amend and
netprojectsgeoo code. Rather dis
hes drawn from eometries, abou
n the previous icient worth to
to refine the p blems of incon
ple is good and ottom data for
from Du et al ifferent. The red
lected. not adequate to
idered a proba try to circumve
nd feature type nections, and
robability to a d data. A thresh
mination if a m source based so
out of 116 road ent topological
he success rate uccess required
gy matching. A al ontologies we
k on automate onverting inpu
e structures into be checked an
2 Web Ontol ped as one o
sharing and int gy language. It
ample to integr authors develo
regenerate geo ontomergingfi
sappointingly t m OS and OSM
ut 85 of the in paper. This is
o be continued process of linki
nsistency during the goal is pos
the same area, 2011. Note
d OSM airport o infer concept
abilistic feature ent the problem
dissimilarities. feature type
match between hold probability
match had been oftware – QGIS
ds in OS ITN. l structures for
was still over more complete
Although using ere generated.
ed merging of ut data sets to
o a new flexible nd modified to
logy Language f the standard
tegration, is an was used in a
ate road vector ped algorithms
ometries. Visit les
to fetch he output from
M data sets only nput, compared
not to say the d; rather that it
ing information g the matching
sible. t
e m
. e
n y
n S
. r
r e
g f
o e
o e
d n
a r
s t
h m
y d
e t
n g
Volume XL-1W1, ISPRS Hannover Workshop 2013, 21 – 24 May 2013, Hannover, Germany
419
Are VGI contributors ever going to desire to exploit fully the geospatial standards available; or indeed to have any interest in
them? How far can loose knit organisations like OSM impose structure and completeness on their members; and, should they?
The evidence from this review would suggest that the majority of the work is done by competent committed people who do try
to produce completed maps, with full attribute sets, and careful editing. They may be unpaid, but they appear to be anything but
uncaring And, they are fast. There seems to be no question that timely update of features that interest them is a major strength
of the VGI mapping process. The numbers of VGI participants who are involved in any
particular project can be absolutely staggering, even if the distribution of effort means only relatively few do most; a few
thousand
or so perhaps? Raymond 2001 discusses what he calls Linus’ Law Linus Torvald, of Linux. He says in the
Cathedral and the Bazaar pp14-18: “Given enough eyeballs,
all bugs are shallow .” In open source developments, many
programmers are involved in the development, scrutiny, and testing the code, so software becomes increasingly better
without formal quality assurance procedures. In mapping, perhaps this can be translated into the number of enthusiastic
contributors that work on a given theme in a given area Haklay et al, 2010.
Most bazaar dwellers including most of us, dear friends are either not aware or at most partially aware of the standards we
are trying to implement, but the web software tries its best to prod us into good behaviour, so no problem really? Will they
map the bits others can’t or don’t want to reach? Almost certainly. Will they do things we, the supposed professionals
have not thought of? Oh yes, but not always for the best, perhaps. Those rural areas are a real problem if one is thinking
of backing up a national mapping organisation with VGI help. Volunteers are colourful and bumptious and the results a bit
prickly, but if their enthusiasm adds value – by no means necessarily only or at all in monetary terms – and is reasonably
accurate, then why not? Volunteers have a good editing record as seen in all the
community projects using OSM. Is VGI a major opportunity for national mapping agencies? Yes
6.2
Quality in VGI
Figure 51: from Grira et al 2009, volunteer freedom of interaction compared with authoritarian constraint.
The relationship between the actors in mapping is illustrated on a theoretical basis in Figure 51, taken from Grira et al 2009.
The graph in Figure 51 can be used to position the trends, tools, products, and processes in geospatial activity. Grira invoked
two axes: one from authoritarian to volunteered, which might be called a power axis, where many people, volunteers
individually have little power, but a few organisations, national mapping have considerable authority but also considerable
constraint on their actions. The other axis is one of capability: where full capability is exhibited by volunteers in the form of
mashups developed for a disaster mapping site perhaps, using open source software, considerable ingenuity and a large user
base; and the other end of the same axis would, amongst other consist of internet surfers viewing contributed maps. The
national mapping agency lies to the right in the diagram, including both the highly trained surveyor in the field
influencing changes in a map, and in the constrained and limited corner print shop, where the map is printed correctly, or perhaps
not. Grira considered there were three gaps threatening the success
of volunteered efforts in relation to institutions: − A normative gap – the difference between quality expected
by end users and quality standards developed by organisations, for instance ISO 19115 on metadata
compliance. Very, very few volunteers will have looked at, let alone read and understood the specification.
− A technological gap – GIS or map servers provide geographic information; it may well be an uphill struggle
for users or volunteers to upload feedback and comments about data quality. In the age of VGI, institutions that
supply mapping will need to reconsider their role in relation to users.
− A cognitive gap – volunteers need to be able to use less static types of metadata, and introduce their own
terminology. At the same time producers must realise that user perception of the meaning of quality is often confused
with that of accuracy see section on trust later in this paper.
Figure 52: From Cooper 2011. African based 2D VGI taxonomy, and the advent of the crowdsourced SDI.
Cooper et al 2011 took a far more practical approach to the problem of a 2D taxonomy of VGI, with one axis called
Determination of Specifications instead of Power, and the
other axis Type of Data instead of Capability, and the entries are real web sites rather than theoretical activities, but the principles
of construction are the same.
At the top right in Figure 52 is an as yet probably non-existent, possibly national, perhaps international, crowdsourced SDI
where users contribute data according to tight specifications from the SDI custodian, who would then subject the VGI to full
quality assurance processes. This concept has become a distinct possibility as open source software platforms become both more
Volume XL-1W1, ISPRS Hannover Workshop 2013, 21 – 24 May 2013, Hannover, Germany
420
integrated and users more sophisticated. In the same quadrant OSM can be found: a crowdsourced product but with less
draconian specifications. In the lower right quadrant are regional or national point location databases for Points of
Interest
, such as Precint Web, a South African crime mapping site. The lower left quadrant contains sites which require
minimal custodial activity and contain location data – such as the Google Panoramio site that holds photos for Google Earth,
without much location editing or content checking, and hence fairly prone to error. The top left quadrant is effectively empty
as fundamental data sets Base data are very unlikely to be held by and for an individual user.
National mapping agencies provide data of generally higher quality than VGI output, as they have better trained staff, better
kit, they understand and are concerned about standards, and survey is their job of work. But, national responsibility for
national coverage at a particular range of scales might result in significant delays before they can update data in certain areas.
Most surveys keep a record of known changes that have occurred on map sheets and hope to update the maps when these
changes have reached a specified threshold level. This requires the agency both to notice change has taken place, presumably
by aerial photo survey or ground detection, and is quite costly. If at the very least the public could be persuaded to form part of
the notification process, then VGI might be the best available method to document changes when they happen and
simultaneously result in revision requests being submitted to the relevant agency. A stage more helpful, but risky perhaps to the
fundamental SDI, would be VGI data gathering and mapping initiatives, heading towards the fully crowdsourced SDI region
in Figure 52. Poor quality data might result from VGI, but it might also be produced by commercial contractors. In all cases
– VGI, internal or contracted – systems are needed to check the data quality, presumably to ISO 19157 and other standards. The
provision of metadata to ISO 19115 standard?, peer pressure and review should keep VGI contributors on the straight and
narrow. If it does not, then software tools are being developed to assess the quality aspects of imported data and to ensure their
consistency with the national depository. These tools are not yet fully functional see previous sections in
this paper but are showing promise in the research laboratory. ISO 19158 on Quality Assurance of Data Supply Jakobsson,
2011, now issued as a published standard 2012, is a major attempt to ensure validation of data input to measurable quality
standards. The standard allows three different levels of accreditation to be certified, and these would pass with the data
to all user generated output. It will be particularly useful for assessing data transfer potential between countries for the
INSPIRE programme within Europe. Whether the ISO 19158 specification is framed sufficiently widely to be able to be used
to certify VGI data – even data as robust as OSM data seems to be – is an interesting question. Yet this goal must be achieved if
VGI data is to enter the marble halls of NSDIs. Perhaps NSDIs should recognise levels of trustworthiness for
data, rather than set absolute standards? The question of how to assign trustworthiness indicators to datasets that do not meet all
standards requirements and whether this might be a solution where data is needed but better is not yet available, is
considered later in this paper. 6.3
Trust in VGI
User trust is different in form from agency trust as the user has different expectations and is concerned about different aspects
of VGI than the agency. The former wants thematic accuracy most of all with some, possibly mostly, topological regard for
locational consistency, whereas the agency tends to reverse this order, or even insist on both. Goodchild 2009 considered
“uncertainty regarding the quality of user-generated content is often cited as a major obstruction to its wider use
”, and considered that crowdsourcing ventures should always publish
assessments of quality. With the advent of the appropriate trustworthy ISO standards perhaps this has now become
possible? Stark 2010 tried an address matching method to asses
accuracy, quality and trust. The Swiss canton of Solothurn had over 90,000 geocoded addresses acting as the reference set
based on national cadastral survey data. If people using an address matching service are going to trust it then the address
location displayed must point reasonably closely to the mailbox location on map or photo. Goodchild 2007 thought failure in
achieving good locational matches to be a major vulnerability to acceptance of a VGI approach to generating location data.
Figure 53: Part of canton of Solothurn, from Stark 2012. The cadastral map at the top shows correct house locations. At the
bottom are the attempts by Google Maps, Bing and Yahoo Maps to locate addresses on an air photo background, with the
reference locations displayed. Stark used Google Maps, Bing and Yahoo Maps to locate the
addresses, and then compared these locations with the cadastral survey derived information see Figure 53. Over 95 of the
addresses were found for all three open WMS geocoders but the locations given were quite suspect in some cases. Figure 53
demonstrates a particularly poor area, where Bing and Yahoo
Volume XL-1W1, ISPRS Hannover Workshop 2013, 21 – 24 May 2013, Hannover, Germany
421
Maps linearly road. Google
probably beca shows the dist
Figure 54 Dis the whole can
axis; percent The modal dis
perhaps less geolocators, b
pattern to tho Google Maps
compared with The effect of
terms of trust was two years
but more exam of this paper’
navigation sys house, and on
1975 on his user developin
consequences? Kessler et al
Flikr photos. gazetteers use
types for nam building block
Figure 55: K Pink dots
Photo location of “Manhattan
in, or of, Manh y interpolated a
tried harder an ause it used the
tribution of loca
stribution of dis nton. Error in m
tage in each dist From S
stance error for than 5m – an
but the distribu ose visible in
s error tail h the other two.
mislocations s by the user po
s ago and the s mples continue
’s authors disc stem did not kn
nly one of six cul-de-sac was
ng a lack of t ?
2009 investig An example
e place names, med geographi
k for spatial sea
Kessler et al 200 s show the vary
Ma n errors have be
n” photos becau hattan, Some ph
address location nd was mostly
cadastral surve ation errors for
stance errors in metres is shown
tance category Stark 2010.
the entire canto nd roughly the
tions of error s the small exa
is very signi .
such as these i opulation. One
situation has no to crop up from
overed his new now the numbe
x houses built s recognised; de
trust Small fa gated similar lo
is given in geographic foo
ic places and rch engines.
09 Flickr geolo ying locations o
anhattan .
een made in pi use they may ha
hotos for exam ns along the c
correct for this ey source. Figu
the three system
address locatio along the horiz
along vertical a
on data set is sm same for all
show a very si mple in Figur
ficantly suppr
is very damagi e might say tha
ow improved; i m other sources
w Nov 2012 er or location o
between 1830 efinitely a case
ailures can hav ocation problem
Figure 55. D otprints, and fe
are a fundam
ocation failure t f photos tagged
cking the corre ave been either
ple have been losest
s area ure 54
ms.
on for zontal
axis. mall –
three imilar
re 53. ressed
ing in at this
it has, s. One
car’s of his
0 and e of a
ve big ms for
Digital eature
mental
test. d
ect set taken
taken on th
in te solvi
using quali
cross reaso
and u noise
new Gaze
reinf pape
reaso Bror
Sens inter
6.4 Do