Course: IKI81404T : Perancangan Sistem Informasi
PERANCANGAN SISTEM INFORMASI
Session 8 Data Storage Design
Based on System Analysis & Design, 2nd Edition
Authors: Alan Dennis & Barbara Haley Wixom
Publisher: John Wiley & Sons
PowerPoint Course Material for SCELE Graduate Program Information Technology
Faculty of Computer Science – UNIVERSITY OF INDONESIA
Objectives
Become familiar with several file and database
formats
Understand several goals of data storage
Be able to optimize a relational database for
data storage and data access
Become familiar with indexes
Be able to estimate the size of a database
Key Definitions
The data storage function manages how data is
stored and handled by programs that run the
system
Goals of data storage design
Efficient data retrieval (good response time)
Access to the information users need
DATA STORAGE FORMATS
Types of Data Storage Formats
Files: electronic lists of data optimized to
perform a particular transaction
Database: a collection of groupings of
information that relate to each other in some
way.
A Database Management System (DBMS) is
software that creates and manipulates databases.
[Figure: Appointment File]
[Figure: Appointment Database]
File Attributes
Files contain information formatted for a
particular transaction
Typically organized sequentially
Pointers used to associate records with other
records
Linked Lists are files with records linked
together using pointers
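The pointer idea above can be sketched in a few lines. This is only an illustration under assumed names: `Record`, `next_pos`, and the sample appointment data are hypothetical, and the "file" is modelled as a Python list in which each record stores the position of the next record.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    key: int
    data: str
    next_pos: Optional[int] = None  # "pointer": position of the next record in the file


def traverse(file_records, head):
    """Follow the pointers from the head record and return the data in linked order."""
    out = []
    pos = head
    while pos is not None:
        rec = file_records[pos]
        out.append(rec.data)
        pos = rec.next_pos
    return out


# Records are stored in arrival order; the pointers link them in alphabetical order.
appointments = [
    Record(1, "Smith", None),  # position 0, last in the chain
    Record(2, "Adams", 2),     # position 1, head of the chain
    Record(3, "Jones", 0),     # position 2
]
ordered = traverse(appointments, head=1)
```

The physical order of the records never changes; only the pointers determine the order in which they are read.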
File Types
Master files – store core, important information
Look-up files – store static values
Transaction files – store information that updates a master file
Audit files – record before and after versions of data
History (archive) files – store past information
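How these file types interact can be sketched as follows. The field names (`customer_id`, `balance`, `amount`) and the function `apply_transactions` are assumptions made for illustration, not part of the course material: a transaction file updates a master file, and an audit file keeps the before and after version of each changed record.

```python
def apply_transactions(master, transactions):
    """Apply each transaction to the master file and return the audit records."""
    audit = []
    for txn in transactions:
        key = txn["customer_id"]
        before = dict(master[key])  # copy of the record before the update
        master[key]["balance"] += txn["amount"]
        audit.append({"key": key, "before": before, "after": dict(master[key])})
    return audit


# Hypothetical master file keyed by customer id, and a small transaction file.
master_file = {101: {"name": "Adams", "balance": 50}}
transaction_file = [
    {"customer_id": 101, "amount": 25},
    {"customer_id": 101, "amount": -10},
]
audit_file = apply_transactions(master_file, transaction_file)
```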
Database Types
Legacy database
• Hierarchical (depict parent-child relationships using inverted trees)
• Network (depict nonhierarchical associations using pointers)
Relational database
Object database
Multidimensional database
[Figure: Hierarchical Database Example]
[Figure: Network Database Example]
Relational Database Concepts
Popular; easy for developers to use
Primary and foreign keys used to identify and
link tables
Referential integrity ensures correct and valid
table synchronization
Structured Query Language (SQL) – standard
language for accessing data
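A minimal sketch of these concepts, using Python's standard sqlite3 module with an in-memory database. The `patient` and `appointment` tables and their columns are hypothetical. A primary key identifies each row, a foreign key links the tables, SQL retrieves the joined data, and referential integrity rejects an appointment for a patient that does not exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE patient (patient_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE appointment ("
    " appointment_id INTEGER PRIMARY KEY,"
    " patient_id INTEGER NOT NULL REFERENCES patient(patient_id),"
    " appt_date TEXT NOT NULL)"
)
conn.execute("INSERT INTO patient VALUES (1, 'Adams')")
conn.execute("INSERT INTO appointment VALUES (10, 1, '2024-01-15')")

# SQL query that follows the foreign key to link the two tables.
rows = conn.execute(
    "SELECT p.name, a.appt_date"
    " FROM appointment a JOIN patient p ON p.patient_id = a.patient_id"
).fetchall()

# Referential integrity: patient 99 does not exist, so the insert is refused.
try:
    conn.execute("INSERT INTO appointment VALUES (11, 99, '2024-01-16')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```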
[Figure: Relational Database Example]
Object Database Concepts
Built around objects consisting of both data
and processes
Objects are encapsulated (self-contained)
Object classes – major object categories
OODBMS – used primarily for applications with
multimedia or complex data
Hybrid OODBMS – both object and relational
features
[Figure: Object Database Example]
Multidimensional Database Concepts
Stores data for easy aggregation and
manipulation across many dimensions
Used for data warehouses and data marts
Summary data is pre-calculated and stored for
fast access
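The idea of pre-calculated summary data can be sketched as below. The sales facts, the three dimensions (year, region, product), and the use of "*" to mean "all values" are assumptions for illustration only; real multidimensional products store and index their cubes differently.

```python
from collections import defaultdict
from itertools import product

# Hypothetical fact rows: (year, region, product, sales amount).
sales = [
    ("2023", "East", "Widget", 100),
    ("2023", "West", "Widget", 150),
    ("2024", "East", "Gadget", 200),
    ("2024", "East", "Widget", 50),
]


def build_cube(facts):
    """Pre-calculate totals for every combination of dimension values ('*' = all)."""
    cube = defaultdict(int)
    for year, region, item, amount in facts:
        # Each fact contributes to 2 x 2 x 2 = 8 summary cells.
        for key in product((year, "*"), (region, "*"), (item, "*")):
            cube[key] += amount
    return dict(cube)


cube = build_cube(sales)
east_total = cube[("*", "East", "*")]  # answered by a lookup, no scan of the facts
```

The cost is paid when the cube is built and stored; queries then become simple lookups, which is the trade the slide describes.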
[Figure: Multidimensional Database Example]
OPTIMIZING DATA STORAGE
Dimensions of Data Storage Optimization
Conflicting goals:
Storage efficiency (minimizing storage space)
Speed of access (minimizing time to retrieve
desired information)
Storage Efficiency
Minimize null values and redundancy
Reduce update anomalies
Normalization process optimizes the data
storage design for storage efficiency
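A small sketch of what normalization buys, under assumed data: the flat order rows and the `normalize` helper are hypothetical. The customer name is repeated in every order row of the flat design, so a name change would have to touch several rows (an update anomaly). After splitting, the name is stored once.

```python
# Hypothetical unnormalized rows: the customer name is repeated on every order.
flat_orders = [
    {"order_id": 1, "customer_id": 7, "customer_name": "Adams", "total": 30},
    {"order_id": 2, "customer_id": 7, "customer_name": "Adams", "total": 45},
    {"order_id": 3, "customer_id": 9, "customer_name": "Jones", "total": 12},
]


def normalize(rows):
    """Split the flat rows into a customer table and an order table."""
    customers = {}
    orders = []
    for r in rows:
        # The name depends only on customer_id, so it is stored once per customer.
        customers[r["customer_id"]] = {"customer_name": r["customer_name"]}
        orders.append(
            {"order_id": r["order_id"], "customer_id": r["customer_id"], "total": r["total"]}
        )
    return customers, orders


customers, orders = normalize(flat_orders)
customers[7]["customer_name"] = "Adams-Lee"  # a single update, no anomaly
```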
[Figure: Optimizing Data Storage Efficiency]
[Figure: Normalization Steps]
Optimizing Access Speed
Techniques available to increase access speed
after optimizing for efficiency
Denormalization
Clustering
• Intrafile
• Interfile
Indexing
Denormalization
Add redundancy back to data storage design to
reduce the number of joins performed in a
query
Ideal for frequently queried but rarely updated
data
Look-up tables
1:1 relationships
Add parent attributes to child
Star schema design data models
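One of the cases listed above, adding a parent attribute to the child, might look like the sketch below. It uses Python's standard sqlite3 module; the `customer`, `orders`, and `orders_denorm` tables are hypothetical. Copying the customer name into each order row lets the query skip the join, at the price of redundant data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
INSERT INTO customer VALUES (7, 'Adams');
INSERT INTO orders VALUES (1, 7, 30.0), (2, 7, 45.0);
-- Denormalized copy: the parent's name is repeated in each child row.
CREATE TABLE orders_denorm AS
  SELECT o.order_id, o.customer_id, c.name AS customer_name, o.total
  FROM orders o JOIN customer c ON c.customer_id = o.customer_id;
""")

# Normalized design: a join is needed to show the customer name.
with_join = conn.execute(
    "SELECT o.order_id, c.name FROM orders o"
    " JOIN customer c ON c.customer_id = o.customer_id ORDER BY o.order_id"
).fetchall()

# Denormalized design: the same answer from a single table.
without_join = conn.execute(
    "SELECT order_id, customer_name FROM orders_denorm ORDER BY order_id"
).fetchall()
```

If a customer name changed, every matching row in `orders_denorm` would need updating, which is why the slide recommends this for data that is queried often but updated rarely.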
Clustering
Reduce the number of times storage must be
accessed by physically placing like records
close together.
Intrafile clustering – similar records in a table are
stored together
Interfile clustering – combine records from more
than one table that are typically retrieved together
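The effect of intrafile clustering can be illustrated with a toy model. Everything here is an assumption for illustration: storage is modelled as fixed-size pages of two records each, and `pages_touched` counts how many pages must be read to fetch all records of one payment type.

```python
PAGE_SIZE = 2  # records per storage page (hypothetical value)


def pages_touched(records, wanted):
    """Count the distinct pages read to fetch every record whose type equals wanted."""
    return len(
        {pos // PAGE_SIZE for pos, rec_type in enumerate(records) if rec_type == wanted}
    )


# Payment types in arrival order, and the same records stored with like values together.
unclustered = ["cash", "card", "check", "cash", "card", "cash"]
clustered = sorted(unclustered)  # intrafile clustering on payment type

before = pages_touched(unclustered, "cash")
after = pages_touched(clustered, "cash")
```

In this toy layout the "cash" records are spread over every page before clustering and over fewer pages afterwards, which is the reduction in storage accesses the slide refers to.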
Indexing
An index is a minitable that contains values from one or
more fields in a table and the location of the
values within the table
Similar to the index of a book.
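The "minitable" can be sketched as a mapping from field values to row locations. The order data and `build_index` are hypothetical, and a real DBMS would typically use a disk-based structure such as a B-tree rather than an in-memory dictionary.

```python
from collections import defaultdict

# Hypothetical table; a row's location is its position in the list.
orders = [
    {"order_id": 1, "payment_type": "cash"},
    {"order_id": 2, "payment_type": "card"},
    {"order_id": 3, "payment_type": "cash"},
]


def build_index(table, field):
    """Map each value of the field to the positions of the rows that contain it."""
    index = defaultdict(list)
    for pos, row in enumerate(table):
        index[row[field]].append(pos)
    return dict(index)


payment_index = build_index(orders, "payment_type")

# Look up the locations in the index, then read only those rows.
cash_orders = [orders[pos]["order_id"] for pos in payment_index.get("cash", [])]
```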
[Figure: Payment Type Index]
Guidelines for Creating Indexes
Use indexes sparingly for transaction systems
Use many indexes to improve response times in
decision support systems
For each table
Create a unique index based on the primary key
Create an index based on the foreign key
Create an index for fields used frequently for
grouping, sorting, or criteria
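The per-table guidelines might translate into SQL like the sketch below, run here through Python's standard sqlite3 module. The tables, columns, and index names are hypothetical. The primary-key indexes are created explicitly only to mirror the guideline; many DBMSs create that index automatically when a primary key is declared.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (customer_id INTEGER NOT NULL, name TEXT);
CREATE TABLE orders (order_id INTEGER NOT NULL, customer_id INTEGER, order_date TEXT);

-- Unique index based on each table's primary key.
CREATE UNIQUE INDEX idx_customer_pk ON customer (customer_id);
CREATE UNIQUE INDEX idx_orders_pk ON orders (order_id);

-- Index based on the foreign key.
CREATE INDEX idx_orders_customer ON orders (customer_id);

-- Index for a field assumed to be used often for sorting and criteria.
CREATE INDEX idx_orders_date ON orders (order_date);
""")

# List the indexes recorded in SQLite's schema table.
index_names = sorted(
    row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'index'")
)
```

Each index must be maintained on every insert, update, and delete, which is why the guideline is to use them sparingly in transaction systems.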
Volumetrics – Estimating Data Storage Size
Raw data – sum of the average widths of all
fields in a table.
Calculate overhead requirements based on
DBMS vendor recommendations
Estimate initial number of records
Estimate growth rate of records
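The four steps above can be combined into a rough calculation. This is a sketch under stated assumptions, not a vendor formula: overhead is modelled as a percentage added to the raw record size, growth is compounded yearly, and the field widths, rates, and record counts are made-up example values.

```python
def estimate_table_size(avg_field_widths, overhead_rate, initial_records, growth_rate, years):
    """Estimate table size in bytes after a number of years.

    avg_field_widths: average width in bytes of each field (raw data)
    overhead_rate:    DBMS overhead as a fraction of raw size (assumed multiplicative)
    initial_records:  number of records at installation
    growth_rate:      yearly growth in record count as a fraction (assumed compounding)
    """
    raw_record = sum(avg_field_widths)  # raw data per record
    record_size = raw_record * (1 + overhead_rate)  # add overhead
    records = initial_records * (1 + growth_rate) ** years  # projected volume
    return record_size * records


# Hypothetical table: three fields, 30% overhead, 10,000 records, 20% yearly growth.
size_now = estimate_table_size([4, 40, 8], 0.30, 10_000, 0.20, 0)
size_3y = estimate_table_size([4, 40, 8], 0.30, 10_000, 0.20, 3)
```

Repeating the calculation for every table and summing the results gives an estimate for the whole database.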
[Figure: Data Storage Size Estimator]
Summary
Files are electronic lists of data generally of five
types: master, look-up, transaction, audit, and
history.
A database is a collection of groupings of
information, and a DBMS is software that creates
and manipulates databases.
There are a number of methods for optimizing data
access speed and data storage efficiency, though
the designers may have to make tradeoffs between
these goals.