PPT UEU Basis Data Pertemuan 12
Normalisation to BCNF Database Systems Lecture 12
In This Lecture
- More normalisation
- Brief review of relational algebra
- Lossless decomposition
- Boyce-Codd normal form (BCNF)
- Higher normal forms
Normalisation so Far
- First normal form • Third normal form
- All data values are • In 2NF plus no non-key atomic
attribute depends transitively on a
- Second normal form
candidate key
- In 1NF plus no non-key attribute is partially dependent on a
Lossless decomposition
- To normalise a relation, we used projections
- If R(A,B,C) satisfies AB then we can project it on A,B and A,C without losing information
- Lossless decomposition: R = AB
- Reminder of projection:
(R) ⋈ AC
(R)
A B C R A B
AB
(R)
Relational algebra reminder: selection
A B R C D 1 c c 2 y d e 3 z a a x
A B C D 1 c c 3 z a a x
C=D
(R)
Relational algebra reminder: product
A B R1
1 2 y x
A C
1 2 v w
R2 A B A C 1 x 1 w 1 x 2 v 1 x 3 u While I am on the subject… SELECT A,B FROM R1, R2, R3 WHERE (some property holds) translates into relational algebra
Relational algebra: natural join R1 ⋈R2 =
R1.A = R2.A
(R1R2)
A B R1
1 2 y x
A C
1 2 v w
R2 A B C 1 x w 2 y v
R1
⋈
R2 When is decomposition lossless:
Module Lecturer
Module Lecturer Text DBS nza CBDBS nza UW RDB nza UW APS rcb B
R
Module Lecturer DBS nza RDB nza APS rcb
Module,Lecturer
R
Module Text DBS CB RDB UW APS UW
Module,Text
R
DBS B When is decomposition is not lossless: no fd
S S S
First,Last First,Age First Last Age First Last First Age John Smith
20 John Smith John
20 John Brown
30 John Brown John
30 Mary Smith Mary Mary Smith
20
20 Tom Brown Tom
10 Tom Brown
10 When is decomposition is not lossless: no fd First Age John Smith
S
20
John Smith
30 John Brown Mary Smith Tom Brown
10 Last John Brown
John
First,Age
20
First,Last S ⋈ First,Last S
30 Tom Normalisation Example
20 Mary
First Age John
S
First,Last
First Last John Smith
20
- We have a table representing orders in an online store
- Each entry in the table represents an item on a partic>Columns
- Order • Product • Customer • Address • Quantity • UnitPrice
{Order} {Customer} {Customer}{Address} {Product} {UnitPrice}
Functional Dependencies
- Each order is for a single customer
- Each customer has a single address
- Each product has a single price
Normalisation to 2NF
- Second normal form means no partial dependencies on candidate keys
- {Order} {Customer,
Address}
FD we project over {Order, Customer,
Address} (R1) and {Order, Product, Quantity,
UnitPrice} (R2)
- {Product}
- To remove the first
Normalisation to 2NF
- To remove this we
- R1 is now in 2NF,
project over but there is still a
{Product, UnitPrice} (R3) partial FD in R2 and
{Product} {UnitPrice} {Order, Product, Quantity}
(R4) Normalisation to 3NF
- R has now been split • To remove into 3 relations - R1,
{Order} {Customer} {Address}
R3, and R4
- R3 and R4 are in 3NF
we project R1
- R1 has a transitive FD
over on its key
- {Order, Customer}
Normalisation
- 1NF:
- {Order, Product, Customer, Address, Quantity,
UnitPrice}
- 2NF:
- {Order, Customer, Address}, {Product, UnitPrice}, and
{Order, Product, Quantity}
The Stream Relation
- Consider a relation,
Stream, which stores information about times for various streams of courses
- For example: labs for
- Each course has several streams
- Only one stream (of any course at all) takes place at any given time
The Stream Relation
Student Course Time John Databases 12:00 Mary Databases 12:00 Richard Databases 15:00 Richard Programming 10:00 Mary Programming 10:00 FDs in the Stream Relation
- Stream has the following non-trivial FDs
- {Student, Course}
{Time}
- {Time} {Course}
Anomalies in Stream
- INSERT anomalies
- You can’t add an empty stream
- UPDATE anomalies
- Moving the 12:00 class to 9:00 means changing two rows
- A relation is in Boyce- • The same as 3NF except
- If there is only
- B is contained in A (the
- A contains a candidate key of the relation,
- Stream is not in
- Lossless: Data should not be lost or created when splitting relations up
- Dependency preservation: I
- Normalisation to 3NF is always lossless and dependency preserving
- Normalisation to
- BCNF is as far as we
- Higher normal forms are based on other
- Fourth normal form removes multi-valued
- Normalisation
- Removes data redundancy
- Solves INSERT, >It leads to more tables in the database
- Often these need to be joined back together, which is expensive to do
- This makes it easier
- However
- You might want to
- Database speeds are
- There are going to be
- {ID} {Name, Address, Postcode, CardType, CardNumber}
- {Address} {Postcode}
- Physical DB Is>RAID arrays for recovery and speed
- Indexes and query effici>Query optimisa
- Query trees
Student Course Time John Databases 12:00 Mary Databases 12:00 Richard Databases 15:00 Richard Programming 10:00
Boyce-Codd Normal Form
Codd normal form in 3NF we only worry (BCNF) if for every FD A about non-key Bs B either
candidate key then 3NF
FD is trivial), or
and BCNF are the same
Student Course Time John Databases 12:00 Mary Databases 12:00 Richard Databases 15:00 Richard Programming 10:00 Mary Programming 10:00 Rebecca Programming 13:00
Stream and BCNF
Conversion to BCNF
Student Course Course Time
Student Course Time
Decomposition Properties
BCNF is lossless, but Higher Normal Forms
1NF Relations
can go with FDs
2NF Relations
3NF Relations
sorts of dependency
BCNF Relations
4NF Relations
Denormalisation
UPDATE, and DELETE anomalies
Denormalisation
Address
denormalise if
Number Street City Postcode
Not normalised since
unacceptable (not just a bit slow)
{Postcode} {City}
Address1
very few INSERTs,
Normalisation in Exams
Given a relation with scheme {ID, Name, Address, Postcode, CardType, CardNumber}, the candidate key {ID}, and the following functional dependencies:
Normalisation in Exams
(ii) Show how this relation can be converted to third normal form. You should show what functional dependencies are being removed, explain why they need to be removed, and give the relation(s) that result.
(4 marks) (iii) Give an example of a relation that is in third normal Next Lecture