COP 5725 Fall 2012 Database Management Systems

  University of Florida, CISE Department
  Prof. Daisy Zhe Wang
  Adapted Slides from Prof. Jeff Ullman

Chap 2. Introduction to Database Design

  • Database Design
  • E/R Diagrams
  • Weak entity sets

  Database Design Steps

  • Requirements Analysis
  • Conceptual Database Design
    • – E/R model

  • Logical Database Design
    • – E.g., relational data model
    • – Output: logical/conceptual schema

  • Schema Refinement
  • Physical Database Design
  • Application and Security Design

  Purpose of E/R Model

  • The E/R model allows us to sketch database designs.
    • – Kinds of data and how they connect.

  • Designs are pictures called entity-relationship diagrams.

  • Later: convert E/R designs to relational DB designs.

  Entity Sets

  • Entity = “thing” or object.
    • – A student

  • Entity set = collection of similar entities.
    • – Students in UF

  • Attribute = property of (the entities of) an entity set.
    • – Students(sid: string, name: string, login: string, age: integer, gpa:real)
    • – Key, primary key, candidate key
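
  A minimal Python sketch of an entity set and a key check, assuming the Students attributes listed above. The sample students and the helper name is_key are invented for illustration. Note that a key is a constraint on every legal instance; one sample can only show that an attribute is not a key.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Student:
    """One entity of the Students entity set."""
    sid: str    # intended key attribute
    name: str
    login: str
    age: int
    gpa: float


def is_key(entities, attr):
    """Return True if no two entities in this sample share a value for attr."""
    values = [getattr(e, attr) for e in entities]
    return len(values) == len(set(values))


# Hypothetical sample data, not taken from the slides.
students = [
    Student("s1", "Ann", "ann@cise", 19, 3.4),
    Student("s2", "Bob", "bob@cise", 20, 3.4),
    Student("s3", "Ann", "ann2@cise", 21, 3.8),
]

sid_is_key = is_key(students, "sid")    # no duplicate sid in the sample
name_is_key = is_key(students, "name")  # two students are named Ann
```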

  E/R Diagrams

  • In an entity-relationship diagram:
    • – Entity set = rectangle.
    • – Attribute = oval, with a line to the rectangle representing its entity set.
    • – Key = underlined attribute(s).

  Example: a Multi-attribute Key

  [Figure: entity set Courses with attributes dept, number, hours, and room; dept and number together form the key.]

  • Note that hours and room could also serve as a key, but we must select only one key.

  A more fun Running Example

  [Figure: entity set Beers with attributes name and manf.]

  • Entity set Beers has two attributes, name and manf (manufacturer).
  • Each Beers entity has values for these two attributes, e.g. (Bud, Anheuser-Busch).
  • The key is the underlined attribute.

Relationships

  • A relationship connects two or more entity sets.
  • It is represented by a diamond, with lines to each of the entity sets involved.

  Example

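  [Figure: entity sets Bars (name, addr, license), Beers (name, manf), and Drinkers (name, addr), connected by the relationships Sells (Bars and Beers), Likes (Drinkers and Beers), and Frequents (Drinkers and Bars).]

  • Bars sell some beers.
  • Drinkers like some beers.
  • Drinkers frequent some bars.

  Relationship Set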

  • For the relationship Sells , we might have a relationship set like:

  Bar          Beer
  Joe’s Bar    Bud
  Joe’s Bar    Miller
  Sue’s Bar    Bud
  Sue’s Bar    Pete’s Ale
  Sue’s Bar    Bud Lite

  Attributes on Relationships

  [Figure: relationship Sells between Bars and Beers, with attribute price attached to Sells.]

  Price is a function of both the bar and the beer, not of one alone.
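
  One way to picture this, as a hedged Python sketch: the relationship set for Sells is a set of (bar, beer) pairs, and price is a mapping keyed by the whole pair. The pairs come from the Sells table in these slides; the prices and the helper beers_sold_by are made up for illustration.

```python
# Relationship set for Sells: a set of (bar, beer) pairs from the slides.
sells = {
    ("Joe's Bar", "Bud"),
    ("Joe's Bar", "Miller"),
    ("Sue's Bar", "Bud"),
    ("Sue's Bar", "Pete's Ale"),
    ("Sue's Bar", "Bud Lite"),
}

# Attribute on the relationship: keyed by the (bar, beer) pair, so the same
# beer can cost different amounts at different bars. Prices are hypothetical.
price = {
    ("Joe's Bar", "Bud"): 2.50,
    ("Joe's Bar", "Miller"): 2.75,
    ("Sue's Bar", "Bud"): 3.00,
    ("Sue's Bar", "Pete's Ale"): 4.00,
    ("Sue's Bar", "Bud Lite"): 3.00,
}


def beers_sold_by(bar):
    """Return the set of beers related to the given bar by Sells."""
    return {beer for b, beer in sells if b == bar}
```

  Because the key of price is the pair, neither the bar alone nor the beer alone determines the value.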

Multiway Relationships

  [Figure: three-way relationship Preferences connecting Bars (name, addr, license), Drinkers (name, addr), and Beers (name, manf).]

  A Ternary Relationship Set

  Bar          Drinker   Beer
  Joe’s Bar    Ann       Miller
  Sue’s Bar    Ann       Bud
  Sue’s Bar    Ann       Pete’s Ale
  Joe’s Bar    Bob       Bud
  Joe’s Bar    Bob       Miller
  Joe’s Bar    Cal       Miller
  Sue’s Bar    Cal       Bud Lite

  Roles

  • Sometimes an entity set appears more than once in a relationship.
  • Label the edges between the relationship and the entity set with names called roles.

Example

  [Figure: relationship Married connecting Drinkers to itself, with edges labeled husband and wife.]

  Relationship Set

  Husband   Wife
  Bob       Ann
  Joe       Sue
  …         …

Example

  [Figure: relationship Buddies connecting Drinkers to itself, with edges labeled 1 and 2.]

  Relationship Set

  Buddy1   Buddy2
  Bob      Ann
  Joe      Sue
  Ann      Bob
  Joe      Moe
  …        …

  Many-Many Relationships

  • Focus: binary relationships, such as Sells between Bars and Beers .

  • In a many-many relationship, an entity of either set can be connected to many entities of the other set.
    • – E.g., a bar sells many beers; a beer is sold by many bars.

Many-Many Relationships (II)

  • many-many: Bar and Beer; friend and friend

Many-One / One-Many Relationships

  • many → one: Drinker → Favorite Beer; Beer → Manufacturer; Child → Father

One-One Relationships

  • one → one: Manufacturer → BestSellingBeer; Husband and Wife

  Representing “Multiplicity”

  • In E/R, a many-one relationship is represented by an arrow entering the “one” side (i.e., at most one).
  • Show a one-one relationship by arrows entering both entity sets.
  • Rounded arrow = “exactly one,” i.e., at most one and at least one.

  The book uses a different notation to represent multiplicity (i.e., key/participation constraints). Use the one we learn in class.
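
  The "at most one" reading of the arrow can be checked on a sample relationship set. A minimal Python sketch follows; the function names and the sample pairs are assumptions made for illustration, not part of the course notation.

```python
def is_many_one(pairs):
    """True if every left entity is related to at most one right entity."""
    target = {}
    for left, right in pairs:
        # setdefault records the first partner and returns it on later visits.
        if target.setdefault(left, right) != right:
            return False
    return True


def is_one_one(pairs):
    """True if the relationship is many-one in both directions."""
    reversed_pairs = [(right, left) for left, right in pairs]
    return is_many_one(pairs) and is_many_one(reversed_pairs)


# Hypothetical sample relationship sets.
likes = {("Ann", "Bud"), ("Ann", "Miller"), ("Bob", "Bud")}
favorite = {("Ann", "Bud"), ("Bob", "Bud"), ("Cal", "Miller")}
best_seller = {("Anheuser-Busch", "Bud"), ("Pete's", "Pete's Ale")}
```

  Here likes is many-many (Ann likes two beers), favorite is many-one from drinkers to beers but not one-one (Bud is the favorite of two drinkers), and best_seller is one-one.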

Example (I)

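  [Figure: relationships Likes and Favorite between Drinkers and Beers.]

Example (II)

  [Figure: relationship Best-seller between Manfs and Beers.]

Example (III)

  [Figure: relationship Favorite connecting Bars (name, addr, license), Drinkers (name, addr), and Beers (name, manf).]

  Weak Entity Sets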

  • Occasionally, entities of an entity set need “help” to identify them uniquely.

  [Figure: weak entity set Players (name, number) connected to Teams (name) by the many-one relationship Plays-on.]

  • Double diamond for the supporting many-one relationship with total participation.
  • Double rectangle for the weak entity set.

  Weak Entity-Set Rules

  • A weak entity set has one or more many-one relationships to other (supporting) entity sets.
    • – Not every many-one relationship from a weak entity set need be supporting.
    • – Many-one with total participation

  • The key for a weak entity set is its own underlined attributes and the keys for the supporting entity sets.
    • – E.g., (player) number and (team) name is a key for Players in the previous example.
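
  A small Python sketch of this rule, assuming the Players and Teams example: the key of a player combines its own number with the key of its supporting team. The class layout, the key() method, and the sample names are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Team:
    name: str  # key of the supporting entity set


@dataclass(frozen=True)
class Player:
    number: int  # the weak entity set's own underlined attribute
    name: str
    team: Team   # supporting relationship Plays-on: exactly one team

    def key(self):
        """Own key attribute plus the key of the supporting entity set."""
        return (self.number, self.team.name)


# Hypothetical sample data.
gators = Team("Gators")
bulls = Team("Bulls")
players = [
    Player(7, "Ann", gators),
    Player(7, "Bob", bulls),    # same number, different team
    Player(12, "Cal", gators),
]
keys = {p.key() for p in players}
```

  The number alone does not identify a player in this sample, but the (number, team name) pair does.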

Subclasses

  [Figure: Ales isa Beers. Beers has attributes name and manf; Ales has the additional attribute color.]
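
  As a rough analogy only, class inheritance in Python behaves like the isa link: an ale has every attribute of a beer plus color. The sample values below are hypothetical, and object-oriented inheritance is just one possible reading of E/R subclasses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Beer:
    name: str
    manf: str


@dataclass(frozen=True)
class Ale(Beer):
    """Subclass: inherits name and manf from Beer and adds color."""
    color: str


# Hypothetical sample entity.
ale = Ale(name="Pete's Ale", manf="Pete's", color="amber")
```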

  

Conceptual Design with E/R Model

  • Choices in developing E/R diagram
    • – Avoid redundancy
    • – Entity vs. attribute
    • – Entity vs. relationship
    • – Relationships (connectivity, N-ways)

  Entity vs. Attributes

  [Figure: relationship Sells between Bars and Beers, with attribute price on Sells.]

  • Create an entity set representing values of the attribute.
  • Make that entity set participate in the relationship.

  Entity vs. Attributes (II)

  [Figure: three-way relationship Sells connecting Bars, Beers, and Prices, with an arrow into Prices; Prices has the attribute price.]

  Note convention: arrow from multiway relationship = “all other entity sets together determine a unique one of these.”

Design Techniques

  1. Avoid redundancy.

  2. Don’t use an entity set when an attribute will do.

  3. Limit the use of weak entity sets.

  Avoiding Redundancy

  • Redundancy occurs when we say the same thing in two or more different ways.
  • Redundancy wastes space and (more importantly) encourages inconsistency.
    • – The two instances of the same fact may become inconsistent if we change one and forget to change the other.

Example (I)

  [Figure: Beers (name) connected by ManfBy to Manfs (name, addr).]

  This design gives the address of each manufacturer exactly once.

Example (II)

  [Figure: Beers (name, manf) connected by ManfBy to Manfs (name, addr).]

  This design states the manufacturer of a beer twice: as an attribute and as a related entity.

Example (III)

  [Figure: Beers with attributes name, manf, and manfAddr.]

  This design repeats the manufacturer’s address once for each beer and loses the address if there are temporarily no beers for a manufacturer.
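
  The two problems named here can be seen in a short Python sketch. It contrasts the Example (III) layout (address repeated per beer) with the Example (I) layout (address stored once per manufacturer). The dictionary layout and the address strings are placeholders invented for illustration.

```python
# Example (III) style: the manufacturer's address is repeated on every beer.
beers_redundant = [
    {"name": "Bud", "manf": "Anheuser-Busch", "manfAddr": "Old Address"},
    {"name": "Bud Lite", "manf": "Anheuser-Busch", "manfAddr": "Old Address"},
]

# Change one copy of the fact and forget the other.
beers_redundant[0]["manfAddr"] = "New Address"
addresses = {
    b["manfAddr"] for b in beers_redundant if b["manf"] == "Anheuser-Busch"
}
inconsistent = len(addresses) > 1  # two different addresses for one manufacturer

# Example (I) style: the address is stored exactly once, in Manfs.
manfs = {"Anheuser-Busch": {"addr": "Old Address"}}
manf_by = {"Bud": "Anheuser-Busch", "Bud Lite": "Anheuser-Busch"}


def addr_of(beer):
    """Look up a beer's manufacturer address through ManfBy."""
    return manfs[manf_by[beer]]["addr"]


manfs["Anheuser-Busch"]["addr"] = "New Address"  # a single update suffices
addr_bud = addr_of("Bud")
addr_bud_lite = addr_of("Bud Lite")

# Removing every beer does not lose the manufacturer's address.
manf_by.clear()
address_survives = "Anheuser-Busch" in manfs
```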

  Entity Sets Versus Attributes

  • An entity set should satisfy at least one of the following conditions:
    • – It is more than the name of something; it has at least one nonkey attribute.
    • – Or it is the “many” in a many-one or many-many relationship.

  Example (I)

  [Figure: Beers (name) connected by ManfBy to Manfs (name, addr).]

  • Manfs deserves to be an entity set because of the nonkey attribute addr.
  • Beers deserves to be an entity set because it is the “many” of the many-one relationship ManfBy.

  Example (II)

  [Figure: Beers with attributes name and manf.]

  There is no need to make the manufacturer an entity set, because we record nothing about manufacturers besides their name.

Example (III)

  [Figure: Beers (name) connected by ManfBy to Manfs (name).]

  Since the manufacturer is nothing but a name, and is not at the “many” end of any relationship, it should not be an entity set.

  Don’t Overuse Weak Entity Sets

  • Beginning database designers often doubt that anything could be a key by itself.
    • – They make all entity sets weak, supported by all other entity sets to which they are linked.

  • In reality, we usually create unique IDs for entity sets.
    • – Examples include social-security numbers, automobile VINs, etc.

  When Do We Need Weak Entity Sets?

  • The usual reason is that there is no global authority capable of creating unique IDs.
  • Example: it is unlikely that there could be an agreement to assign unique player numbers across all football teams in the world.

  Key Constraints

  • Key constraints on relations
    • – Superkeys
    • – Keys (minimal) / candidate keys
    • – Primary keys

  • Key constraints on relationships
    • – Many-One/One-Many
    • – One-One
    • – Many-Many
    • – Foreign keys (e.g., Drinker likes Beer)
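
  To make the superkey vs. (minimal) key distinction concrete, here is a hedged Python sketch that tests attribute sets against one sample instance. It reuses the attribute names of the Courses example from earlier in these slides; the rows and both function names are invented. A sample instance can only refute a key, never prove one, so the result describes this sample only.

```python
from itertools import combinations


def is_superkey(rows, attrs):
    """True if no two rows agree on all of the given attributes."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True


def candidate_keys(rows, all_attrs):
    """Return the minimal superkeys of this sample, smallest sets first."""
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for attrs in combinations(all_attrs, size):
            contains_smaller_key = any(set(k) <= set(attrs) for k in keys)
            if not contains_smaller_key and is_superkey(rows, attrs):
                keys.append(attrs)
    return keys


# Hypothetical Courses rows.
courses = [
    {"dept": "COP", "number": 5725, "hours": "MWF 10", "room": "A101"},
    {"dept": "COP", "number": 5615, "hours": "MWF 10", "room": "B202"},
    {"dept": "CIS", "number": 5725, "hours": "TR 2", "room": "A101"},
    {"dept": "COP", "number": 4720, "hours": "TR 4", "room": "A101"},
    {"dept": "CIS", "number": 5615, "hours": "MWF 10", "room": "C303"},
]

keys = candidate_keys(courses, ("dept", "number", "hours", "room"))
```

  In this sample the minimal keys are (dept, number) and (hours, room); any superset of either, such as (dept, number, room), is a superkey but not a key.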

  More IC: Participation Constraints

  • Does every Manf have a Best-seller? If so, this is a participation constraint: the participation of Manf in Best-seller is said to be total (vs. partial).
    • – Every Manf entity must appear in an instance of the Best-seller relationship.
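
  A minimal Python sketch of the total vs. partial distinction, assuming a relationship set stored as (manufacturer, beer) pairs. The function name and the sample data are illustrative assumptions.

```python
def is_total(pairs, entities):
    """True if every entity appears on the left side of at least one pair."""
    participating = {left for left, _ in pairs}
    return set(entities) <= participating


# Hypothetical sample data.
manfs = {"Anheuser-Busch", "Pete's"}

best_seller = {("Anheuser-Busch", "Bud")}
partial_case = is_total(best_seller, manfs)  # Pete's has no best-seller yet

best_seller_full = best_seller | {("Pete's", "Pete's Ale")}
total_case = is_total(best_seller_full, manfs)
```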

  [Figure: relationship Best-seller between Manfs and Beers.]

  IC in ER Model

  • Constraints in the ER Model:
    • – A lot of data semantics can (and should) be captured.
    • – But some constraints cannot be captured in ER diagrams (e.g., functional dependencies, candidate keys).

More Examples

  [Figure: two diagrams of the relationship Manages (attribute since) between Employees (ssn, name, lot) and Departments (did, dname, budget).]

At most, at least, and exactly one!

  [Figure: relationship Best-seller between Manfs and Beers.]

  Comparison: two E/R models

A more complicated example for ISA

  [Figure: Employees (ssn, name, lot) with ISA subclasses Hourly_Emps (hourly_wages, hours_worked) and Contract_Emps (contractid).]

  Aggregation

  [Figure: Employees (ssn, name, lot) participates in Monitors (until) with the aggregate of Sponsors (since) between Projects (pid, started_on, pbudget) and Departments (did, dname, budget).]

  • Aggregation allows us to treat a relationship set as an entity set for purposes of participation in (other) relationships.

  Other Choices in E/R Design

  • Attribute vs. Entity (more examples)
  • Entity vs. Relationship
  • Binary vs. Ternary Relationships (vs. Aggregation)

  Entity vs. Attribute

  • Should price / address be an attribute or an entity?

  1. Entity, if we have several prices per beer (e.g., seasonal) or several addresses per bar.
  2. Entity, if the structure (street, city, etc.) is important, e.g., we want to retrieve bars in a given city.
  3. Otherwise, the simpler representation (i.e., an attribute) is better, to avoid overhead.

  [Figure: relationship Sells (price) between Bars (address) and Beers.]

  Entity vs. Relationship

  • What if a manager gets a discretionary budget (dbudget) that covers all managed depts?

  [Figure: relationship Manages2 (since, dbudget) between Employees (ssn, name, lot) and Departments (did, dname, budget).]

    • – Redundancy: dbudget is stored for each dept managed by the manager.
    • – Misleading: suggests dbudget is associated with the department-mgr combination.

  [Figure: Managers (dbudget) ISA Employees (ssn, name, lot); relationship Manages2 (since) with Departments (did, dname, budget).]

  This fixes the problem!

  Binary vs. Ternary Relationships

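  • If each policy is owned by just 1 employee, and each dependent is tied to the covering policy, the first diagram does not capture these constraints.

  [Figure, bad design: ternary relationship Covers connecting Employees (ssn, name, lot), Policies (policyid, cost), and Dependents (pname, age).]

  • What are the additional constraints in the 2nd diagram?

  [Figure, better design: relationship Purchaser between Employees (ssn, name, lot) and Policies (policyid, cost); relationship Beneficiary between Dependents (pname, age) and Policies.]

  Binary vs. Ternary Relationships (Contd.)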

  • Previous example illustrated a case when two binary relationships were better than one ternary relationship.
  • An example in the other direction: a ternary relation Favorite relates entities Drinkers, Beers, Bars. No combination of binary relationships is an adequate substitute.
  • Aggregation can also be better than both binary and ternary relationships (previous example).
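
  Why binary relationships cannot stand in for a ternary one can be checked directly. The Python sketch below takes the ternary relationship set tabulated earlier in these slides (the Bar, Drinker, Beer table), splits it into its three binary projections, and joins them back. The variable names are illustrative; the rejoined set contains triples that were never in the original.

```python
# Ternary relationship set from the slides: (bar, drinker, beer).
preferences = {
    ("Joe's Bar", "Ann", "Miller"),
    ("Sue's Bar", "Ann", "Bud"),
    ("Sue's Bar", "Ann", "Pete's Ale"),
    ("Joe's Bar", "Bob", "Bud"),
    ("Joe's Bar", "Bob", "Miller"),
    ("Joe's Bar", "Cal", "Miller"),
    ("Sue's Bar", "Cal", "Bud Lite"),
}

# The three binary projections a binary-only design would store.
bar_drinker = {(bar, drinker) for bar, drinker, _ in preferences}
drinker_beer = {(drinker, beer) for _, drinker, beer in preferences}
bar_beer = {(bar, beer) for bar, _, beer in preferences}

# Best possible reconstruction: every triple consistent with all three.
rejoined = {
    (bar, drinker, beer)
    for bar, drinker in bar_drinker
    for drinker2, beer in drinker_beer
    if drinker == drinker2 and (bar, beer) in bar_beer
}

# Triples the binary design implies but the ternary set never stated.
spurious = rejoined - preferences
```

  For instance, Ann frequents Joe's Bar, Ann drinks Bud somewhere, and Joe's Bar serves Bud to someone, so the join yields (Joe's Bar, Ann, Bud) even though that triple is not in the table.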

  

Summary of Conceptual Design

  • Conceptual design follows requirements analysis.
    • – Yields a high-level description of data to be stored.

  • ER model is popular for conceptual design.
    • – Constructs are expressive, close to the way people think about their applications.

  • Basic constructs: entities, relationships, and attributes (of entities and relationships).
  • Some additional constructs: weak entities, ISA hierarchies, etc.
  • Note: There are many variations on the ER model; we showed two (one in slides, one in book).

  Summary of ER (Contd.)

  • Several kinds of integrity constraints can be expressed in the ER model: e.g. key constraints, participation constraints, foreign key constraints.
    • – Some constraints (notably, functional dependencies, candidate keys) cannot be expressed in the ER model.
    • – Constraints play an important role in determining the best database design for an enterprise.

  Summary of ER (Contd.)

  • ER design is subjective. There are often many ways to model a given scenario! Analyzing alternatives can be tricky, especially for a large enterprise. Common choices include:
    • – Entity vs. attribute, entity vs. relationship, binary or n-ary relationship, and whether or not to use ISA hierarchies.