GIS-05_Databases.ppt 1456KB Mar 29 2010 04:55:20 AM

GIS DATABASES
an overview

Contents
– the basics of data storage
– overview of databases
• the database approach
• types of databases
• databases in GIS

– design considerations
– development of an ARC/INFO database

2

Conceptual, logical and physical ...

Conceptual

Logical


Physical
3

A storage hierarchy ...
– files/tables
• records
• fields (types …)
– databases
– information systems
– decision support systems (DSS)
(in order of increasing complexity)

– approaches to storage
• application/file based
• databases

4

Application based approach
[Diagram: each application works on its own data store]
Tax/Rates Assessment: Assessment Data
Permits: Permit Data
Sewer Maintenance: Sewer Data

Applications using data stored as application-specific data

5

Database approach
[Diagram: the Tax/Rates Assessment, Permits and Sewer Maintenance applications all reach Assessment Data, Permit Data and Sewer Data through one Database Management System]

Database approach and use of shared data: implications for GIS
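
A minimal sketch of the database approach, using Python's standard-library sqlite3 module as a stand-in DBMS. The table and column names (parcel, assessed_value, sewer_connected) and the two "application" functions are hypothetical; they only mirror the Tax/Rates Assessment and Sewer Maintenance applications named on the slide.

```python
import sqlite3

# One shared store managed by the DBMS (in-memory for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE parcel ("
    " parcel_id INTEGER PRIMARY KEY,"
    " assessed_value REAL,"
    " sewer_connected INTEGER)"
)
conn.executemany(
    "INSERT INTO parcel VALUES (?, ?, ?)",
    [(1, 120000.0, 1), (2, 95000.0, 0), (3, 143000.0, 1)],
)


def assessment_total(db):
    """Tax/Rates Assessment application: total assessed value."""
    return db.execute("SELECT SUM(assessed_value) FROM parcel").fetchone()[0]


def unsewered_parcels(db):
    """Sewer Maintenance application: parcels without a connection."""
    rows = db.execute(
        "SELECT parcel_id FROM parcel"
        " WHERE sewer_connected = 0 ORDER BY parcel_id"
    )
    return [r[0] for r in rows]


# Sewer Maintenance updates the shared record once ...
conn.execute("UPDATE parcel SET sewer_connected = 1 WHERE parcel_id = 2")

# ... and every application reads the same, current data.
print(assessment_total(conn))
print(unsewered_parcels(conn))
```

In the application based approach each function would keep its own copy of the parcel list, and the update would have to be repeated in every copy.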

6


Database … a definition
• A collection of interrelated data stored
together with controlled redundancy to
serve one or more applications in an
optimal fashion.
• A common and controlled approach is used
in adding new data and in modifying and
retrieving existing data within the database.
7

Databases… objectives/advantages
– centralised data storage and management …
global view of data … data dictionary
• standardisation of all aspects of data management
• reduced duplication
• multiple access / retrieval flexibility
• integrity constraints … validation enforced
• ...

– database management system (DBMS)
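
A short sketch of "integrity constraints … validation enforced", again with sqlite3 as a stand-in DBMS. The permit table and its three permitted status values are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The rule lives in the schema, so every application is bound by it.
conn.execute(
    "CREATE TABLE permit ("
    " permit_id INTEGER PRIMARY KEY,"
    " status TEXT NOT NULL"
    " CHECK (status IN ('open', 'approved', 'refused')))"
)


def try_insert(db, permit_id, status):
    """Return True if the DBMS accepts the row, False if it rejects it."""
    try:
        db.execute("INSERT INTO permit VALUES (?, ?)", (permit_id, status))
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_insert(conn, 1, "approved")   # satisfies the CHECK rule
rejected = try_insert(conn, 2, "pending?")   # violates the CHECK rule
```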

8

Database/s… data dictionary
– the most critical (?) element of a database
– data about data… metadata
– essential for system development
– uses include
• design - entities and data relationships
• data capture - entry/validation
• operations - program documentation
• maintenance (impact assessment of proposed
changes, estimation of effort, cost …)
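
As a rough illustration of "data about data", most DBMSs let a program read the catalogue that describes their own tables. The sketch below does this with sqlite3; the parcel table is hypothetical, and the describe helper is not part of any library.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE parcel ("
    " parcel_id INTEGER PRIMARY KEY,"
    " zoning TEXT NOT NULL,"
    " area_m2 REAL)"
)


def table_names(db):
    """List the tables recorded in SQLite's own catalogue."""
    rows = db.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [r[0] for r in rows]


def describe(db, table):
    """Return (column name, declared type, NOT NULL?) for each column.

    The table name is pasted into the statement, which is acceptable
    only because this sketch controls the name itself.
    """
    # PRAGMA table_info yields: cid, name, type, notnull, dflt_value, pk
    info = db.execute(f"PRAGMA table_info({table})").fetchall()
    return [(name, ctype, bool(notnull)) for _, name, ctype, notnull, _, _ in info]


print(table_names(conn))
print(describe(conn, "parcel"))
```

A full data dictionary would add what the catalogue does not hold: definitions, sources, owners and validation rules.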
9

Data dictionary…
types of information (general)

10

GIS Metadata

DBMS … key modules
– a data description/definition module
• defines/creates/restructures
• enforces rules

– a query module
• retrieval for queries, ad-hoc queries, simple reports

– a report writing program
– a high level language interface
– ...
12

Database… stages of development
– information systems plan for organisation
– system specification … user needs analysis
– conceptual design … data modelling
• hardware and software independent

– physical design … database design
– database implementation
– monitoring/audit

13

Database… stages of development


14

Organisational strategy and IT
Land Information System (LIS) (i)
– Problems/issues:
• rationalisation of land related information in
government agencies
• the removal/reduction of duplication
• introduction of economies in data capture,
maintenance and storage
• better (and wider) access to data

solutions ...
15

Organisational strategy and IT
Land Information System (LIS) (ii)

– Solutions:
• better data distribution mechanism (data format and
location transparent to user)
• knowledge of data distribution built into the data
dictionary
• reduction of data duplication
• uniform query language (SQL)
• coding and data interchange standardisation
(… SDTS)

16

Database types: a history
Evolution of database technology

18

Database types - hierarchical (i)
– lends itself to GIS use as data are often
hierarchical in structure, e.g. municipality
within province within country
– records divided into logically related fields …
connected in a tree-like arrangement
– master field in each group of records …
pointers … updates require pointers to be
modified
– fast preset queries … ad hoc queries difficult or
impossible
19

Database types - hierarchical (ii)
[Diagram: levels of the hierarchy, from top to bottom]
COUNTRY (USA)
States
Counties
Boundaries
Nodes
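
The pointer-based tree described for the hierarchical model can be sketched in a few lines of Python. The Record class, its methods and the state and county names are illustrative assumptions, not taken from any particular hierarchical DBMS.

```python
class Record:
    """One record in the tree, holding explicit parent and child pointers."""

    def __init__(self, name, level, parent=None):
        self.name = name
        self.level = level
        self.parent = parent
        self.children = []

    def add(self, name, level):
        child = Record(name, level, parent=self)
        self.children.append(child)  # a pointer that updates must maintain
        return child


def path_to_root(rec):
    """Preset query: follow parent pointers upward, no searching needed."""
    names = []
    while rec is not None:
        names.append(rec.name)
        rec = rec.parent
    return names


def find_all(root, level):
    """Ad hoc query: there is no shortcut, the whole tree is walked."""
    found, stack = [], [root]
    while stack:
        rec = stack.pop()
        if rec.level == level:
            found.append(rec.name)
        stack.extend(rec.children)
    return sorted(found)


usa = Record("USA", "country")
ohio = usa.add("Ohio", "state")
texas = usa.add("Texas", "state")
franklin = ohio.add("Franklin", "county")
travis = texas.add("Travis", "county")

print(path_to_root(franklin))
print(find_all(usa, "county"))
```

The first query only touches one branch; the second has to visit every record, which is why ad hoc queries are costly in this model.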

20

Hierarchical Structure for a Cadastral database


Database types - network (i)
– similar to hierarchical but have multiple
connections between files to accommodate
many to many (M:M) relationships
– access to a particular file without searching the
entire hierarchy above that file
– linked records … quick preset searches … large
overhead in pointer management
– modification after creation difficult
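
A minimal sketch of the network model's linked records, assuming a many to many relationship between owners and land parcels. The Node class and the owner and lot names are hypothetical.

```python
class Node:
    """A record that holds direct pointers to every related record."""

    def __init__(self, name):
        self.name = name
        self.links = set()


def connect(a, b):
    """Create the link in both directions (the pointer overhead)."""
    a.links.add(b)
    b.links.add(a)


def disconnect(a, b):
    """Removing a relationship means repairing both pointer sets."""
    a.links.discard(b)
    b.links.discard(a)


def related(node):
    """Quick preset search: read the pointers, no scan of other files."""
    return sorted(n.name for n in node.links)


smith, jones = Node("Smith"), Node("Jones")
lot1, lot2 = Node("Lot 1"), Node("Lot 2")

# M:M: Smith owns two lots, and Lot 2 has two owners.
connect(smith, lot1)
connect(smith, lot2)
connect(jones, lot2)

print(related(smith))
print(related(lot2))
```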
23

Database types - network (ii)

24

Database types - network (iii)

25

Database types - relational (i)
– model developed from mathematics
– records and fields in a 2-dimensional table
– no pointers etc … any field can be used to link
one table to another
– normalisation … reduced redundancy / stable structure
– ad hoc queries (SQL) … modifications easy
– not very efficient for GIS … SQL3
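
A small sketch of the relational points above: two tables with no pointers between them, linked at query time on a shared field, and an ad hoc SQL query that was not planned when the tables were designed. It uses sqlite3; the parcel and zoning tables and their contents are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE parcel (parcel_id INTEGER PRIMARY KEY,
                         zone_code TEXT, area_m2 REAL);
    CREATE TABLE zoning (zone_code TEXT PRIMARY KEY, description TEXT);
    INSERT INTO parcel VALUES (1, 'R1', 650.0), (2, 'C2', 1200.0),
                              (3, 'R1', 700.0);
    INSERT INTO zoning VALUES ('R1', 'Residential'), ('C2', 'Commercial');
    """
)

# Ad hoc query: total parcel area per zoning description.
# The link is the shared zone_code field, resolved by the join.
rows = conn.execute(
    "SELECT z.description, SUM(p.area_m2)"
    " FROM parcel AS p JOIN zoning AS z ON p.zone_code = z.zone_code"
    " GROUP BY z.description ORDER BY z.description"
).fetchall()

print(rows)
```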

26

Database types - relational (ii)

27

Database types - relational (iii)

28

Hierarchical structure
Network structure

Relational structure
(part…)

Centralised vs distributed
– a database does not necessarily mean a
centralised arrangement i.e. all data in one
physical place

30

GIS and distributed databases ...
– trend towards open systems
• special hardware and software can be used widely
… specific applications optimised
• system/network communication is easier

– modular implementation from an overall design
… incremental change
– unlimited capacity (nodes) … lower risks

31

Approaches to GIS system design
– develop a proprietary system
– develop a hybrid system: proprietary graphics +
commercial DBMS for attribute data (e.g.
ARC/INFO)
– use commercial DBMS and develop spatial
functions and graphics display used in
geographic analysis (e.g. siroDBMS, System9)
– develop a spatial DBMS from scratch
32

Approaches to GIS system design

33

(1) Separate spatial and attribute data (joined by software linkages)

(2) Integrated spatial and attribute data
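
Approach (1) can be sketched as geometry held outside the DBMS and attributes held inside it, with a shared feature id acting as the software linkage. Everything named here (the geometry dictionary, the parcel_attr table, the owners) is a hypothetical illustration and not the ARC/INFO storage format.

```python
import sqlite3

# Spatial side: polygon vertices kept outside the DBMS, keyed by feature id.
geometry = {
    101: [(0, 0), (4, 0), (4, 3), (0, 3)],
    102: [(4, 0), (6, 0), (6, 3), (4, 3)],
}

# Attribute side: a DBMS table keyed by the same feature id.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE parcel_attr (feature_id INTEGER PRIMARY KEY, owner TEXT)"
)
conn.executemany(
    "INSERT INTO parcel_attr VALUES (?, ?)", [(101, "Smith"), (102, "Jones")]
)


def polygon_area(ring):
    """Shoelace formula for a simple polygon given as (x, y) vertices."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def area_by_owner(owner):
    """The linkage: select ids by attribute, then fetch geometry by id."""
    ids = [
        r[0]
        for r in conn.execute(
            "SELECT feature_id FROM parcel_attr WHERE owner = ?", (owner,)
        )
    ]
    return sum(polygon_area(geometry[i]) for i in ids)


print(area_by_owner("Smith"))
print(area_by_owner("Jones"))
```

In approach (2) the geometry would sit in the same database as the attributes, so the id lookup across two stores disappears.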

GIS databases … some problems (i)
– centralised risk
• centralisation demands better quality control, otherwise there is a
higher potential for disaster

– cost
• large DBMSs are expensive to design, implement and operate
• piecemeal design is difficult

– complexity
• need to keep track of complex hardware and software
• need to keep track of graphical as well as attribute data and the
links

35

GIS databases … some problems (ii)

Cascading effects of change in a GIS database (ESRI 1989)

36

GIS Design

GIS database design guide

38

Objectives of design
– a good design results in a database which:
• contains necessary data but no redundant data
• organises data so that different users access the same
data
• accommodates different views of the data
• distinguishes applications which maintain data from
those that use it
• appropriately represents, codes and organises
geographic features

39

Design methodology (for ARC/INFO)
– conceptual model
• model the users’ view
• define entities and their relationships

– logical model
• identify representation of entities
• match to ARC/INFO data model
• organise into geographic data sets

– physical model
40

Design methodology (for ARC/INFO)
– 1. Model the users’ view
– 2. Define entities and their relationships
– 3. Identify representation of entities
– 4. Match to ARC/INFO data model
– 5. Organise into geographic data sets


41

1. Model the users’ view
– create a model of work performed by users for
which ‘location’ is a factor
• identify organisational functions
• identify the data which supports the functions

– organise data into sets of geographic features
• data function matrix
– high level classification of data
– interdependence of data and function
– difference between users and creators of data
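
One way to picture a data function matrix is as a table of functions against data sets, marked "C" where a function creates or maintains the data and "U" where it only uses it. The entries below are invented for illustration; the example matrix from the original slide is not reproduced here.

```python
# Rows are organisational functions, columns are data sets.
# "C" = creates/maintains the data, "U" = only uses it.
matrix = {
    "Tax/Rates Assessment": {"Parcels": "U", "Assessment Data": "C"},
    "Permits": {"Parcels": "U", "Permit Data": "C", "Sewer Data": "U"},
    "Sewer Maintenance": {"Parcels": "U", "Sewer Data": "C"},
    "Land Development Management": {"Parcels": "C", "Permit Data": "U"},
}


def functions_with(data_set, mark):
    """Functions whose cell for this data set carries the given mark."""
    return sorted(f for f, row in matrix.items() if row.get(data_set) == mark)


def creators(data_set):
    return functions_with(data_set, "C")


def users(data_set):
    return functions_with(data_set, "U")


print(creators("Parcels"))
print(users("Parcels"))
```

Reading down a column shows who depends on a data set; a column with no "C" would flag data that nobody is responsible for maintaining.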

42

Land development management function

43

Data function matrix …an example

44

2. Define entities and their relationships
– entities: distinguishable objects which have a
common set of properties
• identify and describe entities
• identify and describe the relationships among these
entities
• document the process
– diagrams
– data dictionary

• Normalise the data

45

Entity/relationship definition

46

Diagramming … entities

47

Normalisation
– First Normal Form (1NF)
– Second Normal Form (2NF)
– Third Normal Form (3NF)

ASR - Assessor

48

Underlying entities...

Parcel

Zoning Owner Ownership
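
A hedged sketch of how a flat record might be split into the four entities named above (Parcel, Zoning, Owner, Ownership). The flat layout and its values are assumptions, since the worked tables from the slides are not available. In the flat form the key is (parcel, owner); the zone code depends on the parcel alone and the owner name on the owner alone (partial dependencies, removed for 2NF), and the zone description depends on the zone code (a transitive dependency, removed for 3NF).

```python
# Flat, unnormalised rows:
# (parcel_id, zone_code, zone_description, owner_id, owner_name)
flat = [
    (1, "R1", "Residential", 10, "Smith"),
    (1, "R1", "Residential", 11, "Jones"),
    (2, "C2", "Commercial", 10, "Smith"),
]

parcel = {}        # parcel_id -> zone_code
zoning = {}        # zone_code -> description
owner = {}         # owner_id  -> name
ownership = set()  # (parcel_id, owner_id) pairs, the M:M link

for parcel_id, zone_code, zone_desc, owner_id, owner_name in flat:
    parcel[parcel_id] = zone_code
    zoning[zone_code] = zone_desc
    owner[owner_id] = owner_name
    ownership.add((parcel_id, owner_id))


def rebuild():
    """Join the four tables back together to show nothing was lost."""
    return sorted(
        (p, parcel[p], zoning[parcel[p]], o, owner[o]) for p, o in ownership
    )


print(rebuild() == sorted(flat))
```

Each fact is now stored once: renaming a zone or an owner changes one entry instead of every row that mentions it.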

3. Identify representation of entities
– determine the most effective spatial
representation for geographic features
– consider whether:
• a feature might be represented on a map
• the shape of a feature might be significant in
performing geographic analysis
• the feature will have different representations at
different map scales
• textual attributes of the feature will be displayed on
map products
• ...

53

4. Match to ARC/INFO data model
– determine the appropriate ARC/INFO
representation for entities
• points, lines, polygons

– ensure complex feature classes are supported
• route comprised of sections which in turn are based
on arcs
• a region is composed of polygons
• event is a point or a line which occurs along a route

– others (e.g. GRID, TIN)
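
The route, section and event idea can be sketched as follows. This is a simplified illustration that assumes each section spans one whole arc and that measures are plain distances; the class and function names are hypothetical and do not reflect ARC/INFO's internal tables.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Arc:
    arc_id: int
    length: float


@dataclass(frozen=True)
class Section:
    arc: Arc
    from_measure: float  # route measure at the start of this arc


def build_route(arcs):
    """A route is an ordered list of sections, each based on an arc."""
    sections, measure = [], 0.0
    for arc in arcs:
        sections.append(Section(arc, measure))
        measure += arc.length
    return sections


def locate_event(route, measure):
    """Return (arc_id, distance along that arc) for a point event."""
    for s in route:
        if s.from_measure <= measure <= s.from_measure + s.arc.length:
            return s.arc.arc_id, measure - s.from_measure
    raise ValueError("measure lies outside the route")


route = build_route([Arc(1, 100.0), Arc(2, 250.0), Arc(3, 50.0)])
print(locate_event(route, 120.0))
```

The event itself stores only a route and a measure; its position on the underlying arcs is worked out when needed.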
54

Matching to ARC/INFO data model
Columns: Entity | Spatial type | ARC/INFO | Related to | Coverage | Attribute files | Anno. | LUT

55

5. Organise into geographic data sets
– to identify and name the geographic data sets
that will contain the various entities:
• define the contents of geographic data sets
(coverages, grids etc)
• name workspaces, geographic data sets, entities and
attributes
• complete entity definitions
• add cartographic text and lookup tables

56

5(i) Define the content of geographic data sets
– data sets supported: coverage, grid, tin, image
and drawing
– coverages: several entities can be grouped into a
single coverage
– DBMS: stored in a separate database
management system

57

5 (ii) Geographic datasets, entities and attributes

– coverage definitions
• high level summary of the data physically stored in
the database
• required for defining the coverage structure

– file naming conventions in ARC/INFO

58

5 (iii) Complete entity definitions
– background information: coverage name, data
source, agency, number of records etc.
– attribute definition
• attribute name, type, field width
• validation rules/ permitted values
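
An attribute definition of this kind (name, type, field width, permitted values) can be written down as data and used to check values on entry. The AttributeDef class, the validate function and the ZONE_CODE example are hypothetical, meant only as a sketch.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class AttributeDef:
    name: str
    dtype: type
    width: int
    permitted: Optional[Tuple] = None  # None means any value is allowed


def validate(defn, value):
    """Return a list of rule violations (empty if the value is acceptable)."""
    problems = []
    if not isinstance(value, defn.dtype):
        problems.append(f"{defn.name}: expected {defn.dtype.__name__}")
    if len(str(value)) > defn.width:
        problems.append(f"{defn.name}: wider than {defn.width}")
    if defn.permitted is not None and value not in defn.permitted:
        problems.append(f"{defn.name}: value not permitted")
    return problems


ZONE = AttributeDef("ZONE_CODE", str, 4, ("R1", "R2", "C2"))

print(validate(ZONE, "R1"))
print(validate(ZONE, "INDUSTRIAL"))
```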

59

5 (iv) Cartographic text & code tables
– annotation (text, placing rules etc)
– look up tables
• pre defined set of values
• description/ labels
• means of creating displays based on attribute values
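
A look up table can be as simple as a mapping from a stored code to its label and display symbol. The codes, labels and colours below are illustrative assumptions.

```python
# Look up table: code -> (label, display colour).
ZONE_LUT = {
    "R1": ("Residential", "yellow"),
    "C2": ("Commercial", "red"),
}
DEFAULT = ("Unknown", "grey")  # used for codes missing from the table


def symbolise(features):
    """Attach a label and colour to each (feature_id, code) pair."""
    return [(fid, *ZONE_LUT.get(code, DEFAULT)) for fid, code in features]


styled = symbolise([(1, "R1"), (2, "C2"), (3, "X9")])
print(styled)
```

Changing how a class is drawn then means editing one row of the table, not every feature that carries the code.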

60

Robinson (Ch 14): Scale and GIS databases
– (past) map’s scale greatly influenced map
content and data resolution
– GIS data are ‘scaleless’ … scale is still a critical
factor with digital databases - because of the
ways in which we create digital databases
– scale and resolution (Tab 14.1)

61

Robinson (Ch 14): Scale and resolution issues
– symbolisation and display problems
– handling databases of different scales
• join problems (e.g. urban rural)
• merge problems (different themes)
• scale levels
– in general
– large scale data (AM/FM etc.)

62

Robinson (Ch 15): Managing large GIS
– data organisation
• partitioning
• spatial indexes
• metadata

– data compression
• run length encoding (RLE)
• quadtree encoding
• others ...
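
Both compression schemes named above can be sketched briefly in Python. The function names are hypothetical, and the quadtree version assumes a square grid whose side is a power of two; it is a teaching sketch, not an optimised encoder.

```python
def rle_encode(row):
    """Run length encoding: store each value once with its repeat count."""
    runs = []
    for value in row:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return [(v, n) for v, n in runs]


def rle_decode(runs):
    """Expand (value, count) pairs back into the original row."""
    out = []
    for value, count in runs:
        out.extend([value] * count)
    return out


def quadtree(grid, x=0, y=0, size=None):
    """Quadtree encoding of a square grid with a power-of-two side.

    A uniform block collapses to its single value; a mixed block becomes
    a list of its four quadrants in the order [NW, NE, SW, SE].
    """
    if size is None:
        size = len(grid)
    first = grid[y][x]
    if all(
        grid[r][c] == first
        for r in range(y, y + size)
        for c in range(x, x + size)
    ):
        return first
    half = size // 2
    return [
        quadtree(grid, x, y, half),
        quadtree(grid, x + half, y, half),
        quadtree(grid, x, y + half, half),
        quadtree(grid, x + half, y + half, half),
    ]


row = [0, 0, 0, 1, 1, 0]
grid = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 1],
]
print(rle_encode(row))
print(quadtree(grid))
```

Both schemes gain most where neighbouring cells share a value, which is common in thematic raster layers.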

63