Database_Management.ppt 333KB Mar 29 2010 04:55:20 AM

Database Management
Foundations of Information Systems

© The McGraw-Hill Companies, Inc.., 1998

MS PowerPoint 4.0
Presentation

MS PowerPoint 4.0
Presentation

MS PowerPoint 4.0
Presentation

MS PowerPoint 4.0
Presentation

Chapter Objectives
• Specify the elements of the data hierarchy.
• Understand what can be accomplished with files and
what the limitations of a file-based environment are.

• Understand the advantages of a database
environment and the role of a database management
system (DBMS).
• Specify the three levels at which data are defined in
databases.
• Compare the three data models.
• Understand the operation of relational databases and
theirquery language SQL.

MS PowerPoint 4.0
Presentation

MS PowerPoint 4.0
Presentation

MS PowerPoint 4.0
Presentation

Chapter Objectives, cont.
• Understand the role of the data dictionary.

• Explain the components of information
resource management.
• Describe the developmental trends in
database management, including distributed
databases, data warehouses, and objectoriented and other rich database.

Hierarchy of Data
• Data are the Principal Resources of an
Organization
• Data Stored in Computer Systems form
a Hierarchy
– Extending from a Single Bit to a Database
– The Major Record Keeping entity of a firm
– Each higher rung of this hierarchy is
organized from the components below it

Hierarchy of Data








Data are logically organized into:
1. Bits (Characters)
2. Fields
3. Records
4. Files
5. Databases

Hierarchy of Data
• Bit (Character)
– The smallest unit of data representation
– Value of a Bit may be 0 or 1
– 8 Bits make a Byte which can represent a
character or a special symbol in a
character code

Hierarchy of Data

• Field
– A grouping of characters
– A data field represents an attribute,
characteristic, or quality of some entity,
object, person, place, or event.

Hierarchy of Data
• Record
– A collection of attributes that describe a
real world entity
– A Record consists of fields with each field
describing an attribute of the entity

Hierarchy of Data
• File
– A group of related records
– Classified by the application for which
they are used
• Example: Employee File


– Primary Key in a File is the Field (or
Fields) whose value identifies a records
among others in a data file

Hierarchy of Data
• Database
– An integrated collection of logically related records or
files
– Consolidates records previously stored in separate files
into a common pool of data records that provides data
for may applications
– The Data is managed by systems software called
Database Management Systems (DBMS)
– The Data stored in a Database is independent of the
application programs using it and of the types of
secondary storage devices

Hierarchy
Hierarchy of
of Data

Data Organization
Organization in
in
Computer
Computer Storage
Storage
Example

Component of Data
Organization

SUPPLIERS

Database

Logical
Components

File


PARTS

SHIPMENTS
SUPPLIERS
NO. NAME

STREET
ADDRESS

13

Gasket Co.

50 Oak

Tifflin

OH 44883

13


Gasket Co.

50 Oak

Tifflin

OH 44883

CITY ST ZIP

.
.
3251 Reliable Supp. 11.Cedar Teaneck NJ 07666

Record

Physical
(Storage)
Components


Field
(attribute)
Byte
Bit

Reliable Suppliers
01000001 (represents “A” in the
ASCII-8 character code)
0

File Environment and Its
Limitations
• File Organization
– Data files are organized so as to facilitate
access to records and to ensure efficient
storage
• A Tradeoff between these two requirements
generally exists
– If rapid access is required, then more storage is

required.

File Environment and Its
Limitations
• Access to a record for reading it is the essential
operation on data
• 1. Sequential Access
– Records are accessed in the order they are stored
– The main access mode only in Batch Systems where files are
used and updated at regular intervals

• 2. Direct Access
– Required by On-Line Processing
– A Record can be accessed without including the records
between it and the beginning of the file
– The Primary Key serves to identity the needed record

File Environment and Its
Limitations






File Organization Methods
1. Sequential Organization
2. Indexed Sequential Organization
3. Direct Organization

File Environment and Its
Limitations
• Sequential Organization
– Records are physically stored in a specified order according to a key
field in each record
– Advantages
• Fast
• Efficient for dealing with large volumes data that need to be processed
periodically (batch system)

– Disadvantages
• Requires that all new transactions be sorted into the proper sequence for
sequential access processing
• Locating, Storing, Modifying, Deleting, or Adding Records in the file
requires rearranging the file
• The method is too slow to handle applications requiring immediate
updating or responses

File Environment and Its
Limitations
• Indexed Sequential Organization
– Records are physically stored in
sequential order on a magnetic disk or
other direct access storage device based
on the key field of each record
– Each file contains an index that references
one or more key field of each data record
to its storage location address

File Environment and Its
Limitations
• Direct Organization
– Provides the fastest direct access to records
– Records do not have to be arranged in any particular
sequence on storage media
– Computers must keep track of the storage location
of each record using a variety of direct organization
methods to retrieve data
– New transactions’ data does not have to be sorted
– Processing that requires immediate responses or
updating is easily performed

File Environment and Its
Limitations
• Limitations of a File Oriented
Environment
– Fair amount of data redundancy
– Example of Three Systems





Supplier
Shipment
Inventory
Note: These Databases may have
inconsistencies

Database Environment
• A Database is an organized collection of
interrelated data that serves a number of
applications in an enterprise.
– Stores both the values of various entities
and their relationships
– Managed by a Database Management
System DBMS
• A system software that provides assistance in
managing databases shared by multiple users

Database Environment
• DBMS
• Helps organize data for effective access
by a variety of users with different access
needs and for efficient storage
• Makes it possible to create, access,
maintain, and control databases
• Data can be integrated and presented on
demand

Database Environment
• Advantages of a Database Management
Approach
• Avoiding uncontrolled data redundancy
and preventing inconsistency
• Program-Data independence
• Flexible access to shared data
• Benefits of centralized control of data

A
A Database
Database Environment
Environment
Reports
End Users

Application
Program A
.
.
.

Application
Program X

DBMS
Database

Query

Update

Reports

Levels of Data Definition in
Databases
• User view of a DBMS is the basis for
modeling steps
• Data models define the logical
relationships among the elements to
support the basic process
• Logical framework to base the physical
design of databases and the development
of application programs

Levels of Data Definition in
Databases
• DBMS defines a database
• Schema
– Overall logical view of the relationships between data in a
database

• Subschema
– Logical view of data relationships needed to support specific
end user applications to access the database

• Physical
– How data is physically arranged, stored, and accessed on the
magnetic disks and other secondary storage devices of a
computer system

Levels of Data Definition in
Databases
• DBMS provides the language, DDL
Data Definition Language to define the
databases on the three levels (Schema,
Subschema, and Physical), and the
DML Data Manipulation Language to
access records, change values of
attributes, and delete or insert records

Data Models or How to
Represent Relationships
Between Data

• Data Model is a method for organizing databases
on the logical, schema, and subschema levels.
• The main concern is how to represent
relationships among database records
• Relationships are based on logical data structures
or models
• DBMS are designed to provide end users with
quick, easy access to information in the
databases






Data Models or How to Represent
Relationships
Between Data
Data Model Structures
Hierarchical
Network
Relational

Data Models or How to Represent
Relationships
Between Data: Hierarchical






Used by early mainframe DBMS
Relationships between records form a hierarchy or tree-like
structure.
Records are dependent and arranged in multilevel structures of
one root record and any number of subordinate levels.
Relationships among the records are one-to-many as each data
element is related only to one element above it.
The data element or record at the highest level is the root
element. Any data element can be accessed by moving
progressively downward from the root and along the branches
of the tree to locate the record.

Data Models or How to Represent
Relationships
Between Data: Hierarchical
• Advantages
– Ease with which data can be stored and retrieved in
structured, routine types of transactions
– Ease with which data can be extracted for reporting purposes
– Routine types of transaction processing are fast and efficient.

• Disadvantages
– Hierarchical one-to-many relationships must be specified in
advance and are not flexible.
– Cannot easily handle ad hoc requests for information
– Modifying a hierarchical database structure is complex
– Redundancy

CLASS
COURSE
NUMBER

MIS 101

A
A Hierarchical
Hierarchical Database
Database
SECTION

ROOM
NUMBER

1

525

MEETING
DAYS

MEETING
HOURS

MWF

10-11 a.m.

PROFESSOR
PROFESSOR
NUMBER

221

PROFESSOR
NUMBER

STUDENT
NUMBER

Turkel

02579
02311
01171

STUDENT
NAME

Young
Volkonsky

Johnson

STUDENT
STUDENT

STUDENT

Data Models or How to Represent
Relationships
Between Data: Network
• Can represent more complex logical
relationships and is used by mainframe
DBMS packages
• Allows many-to-many relationships among
records meaning that the network model can
access a data element by following several
paths due to multiple relationships

Data Models or How to Represent
Relationships
Between Data: Network
• Advantages
– More flexible than the hierarchical model
– Ability to provide sophisticated logical relationships
among the records

• Disadvantages
– Network many-to-many relationships must be specified in
advance
– User is limited to retrieving data that can be access using
the established links between records
– Cannot easily handle ad hoc requests for information
– Requires knowledge of a programming language

A
A Network
Network Database
Database
PROFESSOR

STUDENT

PROFESSOR PROFESSOR
NUMBER
NAME

221

STUDENT
NUMBER

Turkel

01171

Links to Records of
Other Classes Taught
by This Professor

STUDENT
NAME

Johnson
Links to Records of
Other Classes Taken
by This Student
Links to Records of
Other Students
Taking This Class

COURSE
NUMBER

MIS 101

SECTION

ROOM
NUMBER

1

525

MEETING
DAYS

MWF

MEETING
HOURS

10-11 a.m.

Data Models or How to Represent
Relationships
Between Data: Relational
• Most popular Database Structure
• Used by most microcomputer DBMS packages as
well as many minicomputer and mainframe
systems
• Data elements within the database are stored in the
form of simple tables which are related if they
contain common fields
• DBMS packages based on the relational model can
link data elements from various tables to provide
information to users

Data Models or How to Represent
Relationships
Between Data: Relational
• Advantages
– Flexible in that ad hoc information request can be handled
– Easy for programmers
– End users can use Relational Structures with little effort or
training.
– Easier to maintain than the hierarchical and network
models

• Disadvantages
– Cannot process large amounts of business transactions as
quickly and efficiently as the hierarchical and network
models

Relational Databases








A collection of tables that is relatively easy to use and
understand offering flexibility for the data and ability to
modify
All records in a relational database must have a unique primary
key
Relational Systems support three principal operations on tables
without any predefined access paths
Select- from a specified row that satisfies a given condition
Project- selects the attribute values
Join- builds a new table from two specified tables

Relational Databases
• The power of the relational model
derives from the join operation.
– Records are related through a join
operation rather than links so that a
predefined access path is not needed
– The join operation is time consuming
requiring access to many records stored
(on disk) to find the needed records

SQL- A Relational Query
Language
• International standard access language fro defining and
manipulating data in databases
• Data-Definition-and-Management Language for DBMS
and some nonrelational ones
• SQL is used an independent query language to define
objects in a database, enter, and access the data
• Embedded SQL is for programming in procedural
languages (host) such as C, COBOL, PL/1, PL/SQL to
access from an application program
• For the end user environment, SQL is “hidden” by more
friendly user interfaces

SQL- A Relational Query
Language
• Principal Facilities of SQL
• Data Definition
• Data Manipulation

Designing a Relational
Database
• Database design progresses from the design of the
logical of the schema and the subschema to the physical
level.
• Logical Design or Data Modeling is to design the
schema and the subschema.
• A relational database consists of tables (relations)
describing the attributes of a particular class of entities.
• Logical Design begins with identifying the entity classes
to be represented in the database and establishing
relationships or interactions between the entities
represented to access the data.

Designing a Relational
Database
• Entity-Relationship (E-R) Diagrams
are used to perform data modeling.
– Diagrams include rectangles for entities
and diamonds for relationships including
attributes

Designing a Relational
Database
• Normalization is the simplification of the
logical view of data in relational databases
meaning that all fields contain single data
elements, distinct records, and each table
describes only a single class of entities.
• The objective of normalization is to prevent
replication of data with all of its negative
consequences.

Designing a Relational
Database
• Physical Design is after the Logical Design of the
database.
• Fields are specified as to their length and the
nature of the data (numeric, characters).
• A principal objective is to minimize the number of
time consuming disk accesses that are necessary in
order to answer typical database queries.
• Frequently, indexes are provided to ensure fast
access for queries.

The Data Dictionary
• A software module and database
containing descriptions and definitions
of the structure, data elements,
interrelationships, and other
characteristics of an organization’s
database

The Data Dictionary
• Schema, Subschemas, and Physical Schema
• Which applications and users may retrieve and/or modify
the specific data
• Cross reference information such as which programs use
what data and the users receive particular reports
• Where individual data elements originate, and who is
responsible for maintaining the data
• Standard naming conventions for database entities
• Integrity rules for the data
• Where the data are stored in geographically distributed
databases

The Data Dictionary
• Contains all the data definitions and
information necessary to identify data
ownership
• Ensures security and privacy of the
data and the information used during
the development and maintenance of
applications which rely on the database

Managing the Data Resource
of an Organization
• Database technology enables organizations to
control their data as a resource, but it does not
automatically produce organizational control of data
• Components of Information Resources Management:
Organizational and Technical Means
• Ensure that a firm systematically accumulates data
in its databases
• Maintains the data over time
• Provides the appropriate access

Managing the Data Resource
of an Organization
• Principal Components of Information
Resource Management
• Organizational Processes
– Information planning and data modeling

• Enabling Technologies
– DBMS and a Data Dictionary

• Organizational Functions
– Data and database administration

Managing the Data Resource
of an Organization
• Database Administration Functional
Units to Manage the Data
• Data Administrator (DA)
• Database Administrator (DBA)

Managing the Data Resource of an
Organization:
Data Administrator







Person who has the central responsibility for an organization’s
data
Establish policies and specific procedures for collecting,
validating, sharing, and inventorying data to be stored and
databases and for making information accessible to members of
the organization and possible persons outside of it
Data administration is a policy making function and the DA
should have access to senior corporate management.
Key person involved in the strategic planning of the data
resource
Often defines the principal data entities, their attributes, and the
relationship among them

Managing the Data Resource of an
Organization:
Database Administrator
• A specialist responsible for maintaining standards for the
development, maintenance and security of an organization’s
databases
• Creating the databases and carrying out the policies of the data
administrator
• In large organizations, the DBA function is performed by a group
of professionals; in a small firm, a programmer/ analyst may
perform the DBA function, while one of the managers acts as the
DA.
• Schema and subschemas of the database are most often defined by
the DBA, who has the requisite technical knowledge to detail the
physical layout of the databases with a view toward optimizing
system performance for the expected pattern of database usage

Managing the Data Resource of
an Organization
• Joint Responsibilities of the DA and DBA
• Maintaining the Data Dictionary
• Standardizing names and other aspects of data
definition
• Providing Backup
• Provide security and privacy for the data stored
in the database
• Establish a disaster recovery plan for the
databases

Developmental Trends in
Database Management
• Distributed Databases
• Data Warehousing
• Rich Databases (includes objectoriented databases)

Developmental Trends in
Database Management
• Distributed Databases






Spread across several physical locations
Data are placed where they are used most often
Entire database is available to each authorized user
Local work groups (LAN)
Departments at regional offices (WAN)
• Branch offices, manufacturing plants, and other work sites

– Databases includes segments of both common operational
and user as well as data generated only at the user’s own
site

Developmental Trends in
Database Management
• Data Warehouse Databases
– Stores data from current and previous years that has been extracted
from operational and management databases
– Central source of data
– Standardized and integrated for use by managers and other end user
professionals
– Objective of a corporate data warehouse
• To continually select data from the operational databases
• Transform the data into a uniform format
• Open the warehouse to the ends through a friendly and consistent interface

– Data mining
• Automated discovery of potentially significant relationships among
categories of data

Developmental Trends in
Database Management
• Systems supporting a data warehouse
• Extract and Prepare Data
– The first subsystem extracts the data from the operational systems
– Many are older legacy systems that get “scrubbed” by removing
errors and inconsistencies

• Store Data in the Warehouse
– The second support component is the DBMS that will manage the
warehouse data.

• Provide Access and Analysis Capabilities
– The third subsystem is made up of the query tools to access the data
– Includes OLAP (OnLine Analytical Processing) and other DSS tools
supporting data analysis

Developmental Trends in
Database Management
• Object-Oriented and Other Rich Databases
– With expanded capabilities of information technology, the
content of databases is becoming richer
– Traditional databases include largely numerical data or short
fragments of text organized into well structured records
– As processing and storage capabilities expand with growing
telecommunications, the knowledge is supported with rich
data
– Geographic Information Systems
– Object-Oriented Databases
– Hypertext and Hypermedia Databases
– Image Databases and Other Text Databases

A
ASystem
System with
with aa Distributed
Distributed Database
Database
Site 1

Site 3

.
.
.
Database
Fragment 1

Telecommunications
Network

Site 2

Database
Fragment 3
Users have
access to the
entire database
over the network

.
.
.
Database
Fragment 2

Key
Key Terms
Terms
Bit
Bit
Byte
Byte
Field
Field
Record
Record
File
File
Primary
Primary Key
Key
Database
Database
Access
Access (to
(to aa record)
record)
Sequential
SequentialAccess
Access
Direct
DirectAccess
Access
Sequential
Sequential File
File
Indexed-Sequential
Indexed-Sequential
File
File
Direct
Direct File
File
Database
Database Management
Management
System
System (DBMS)
(DBMS)
Program-Data
Program-Data
Independence
Independence
Schema
Schema
Subschema
Subschema

Physical
Physical Schema
Schema
Data
Data Definition
Definition
Language
Language (DDL)
(DDL)
Data
Data Manipulation
Manipulation
Language
Language (DML)
(DML)
Data
Data Model
Model
Hierarchical
Hierarchical Model
Model
Network
Model
Network Model
Relational
Relational Model
Model
Structured
Structured Query
Query
Language
Language (SQL)
(SQL)
Logical
Logical Design
Design (data
(data
modeling)
modeling)
Entity-Relationship
Entity-Relationship
(E-R)
(E-R) Diagram
Diagram
Normalization
Normalization
Physical
Physical Design
Design
Data
Data Dictionary
Dictionary
Data
DataAdministrator
Administrator
(DA)
(DA)

Database
Database
Administrator
Administrator (DBA)
(DBA)
Distributed
Distributed Database
Database
Object-Oriented
Object-Oriented
Database
Database
Data
Data Warehouse
Warehouse
OLAP
OLAP Software
Software
Data
Mining
Data Mining
Geographical
Geographical
Information
Information System
System
(GIS)
(GIS)