PART 1 • GETTING STARTED

Chapter 1

Introduction 2

Chapter 2

Introduction to Structured Query Language 31

PART 2 • DATABASE DESIGN

Chapter 3

The Relational Model and Normalization 100

Chapter 4

Database Design Using Normalization 137

Chapter 5

Data Modeling with the Entity-Relationship Model 155

Chapter 6

Transforming Data Models into Database Designs 203

PART 3 • DATABASE IMPLEMENTATION

Chapter 7

SQL for Database Construction and Application Processing 246

Chapter 8

Database Redesign 313

PART 4 • MULTIUSER DATABASE PROCESSING

Chapter 9

Managing Multiuser Databases 338

Chapter 10 Managing Databases with SQL Server 2008 R2 373

ONLINE CHAPTER: SEE PAGE 447 FOR INSTRUCTIONS

Chapter 10A Managing Databases with Oracle Database 11g 10A-1

ONLINE CHAPTER: SEE PAGE 448 FOR INSTRUCTIONS

Chapter 10B Managing Databases with MySQL 5.5 10B-1

PART 5 • DATABASE ACCESS STANDARDS

Chapter 11 The Web Server Environment 449

Chapter 12 Database Processing with XML 509

Chapter 13 Database Processing for Business Intelligence Systems 549

ONLINE APPENDICES: SEE PAGE 590 FOR INSTRUCTIONS

Appendix A Getting Started with Microsoft Access 2010 A-1

Appendix B Getting Started with Systems Analysis and Design B-1

Appendix C E-R Diagrams and the IDEF1X Standard C-1

Appendix D E-R Diagrams and the UML Standard D-1

Appendix E Getting Started with the MySQL Workbench Database Design Tools E-1

Appendix F Getting Started with Microsoft Visio 2010 F-1

Appendix G The Semantic Object Model G-1

Appendix H Data Structures for Database Processing H-1

Appendix I Getting Started with Web Servers, PHP, and the Eclipse PDT I-1

Preface xv

PART 1 • GETTING STARTED

Chapter 1: Introduction 2

Chapter Objectives 2 The Characteristics of Databases 3

A Note on Naming Conventions 3 • A Database Has Data and Relationships 4 • Databases Create Information 5

Database Examples 6

Single-User Database Applications 6 • Multiuser Database Applications 6 E-Commerce Database Applications 7 • Reporting and Data Mining Database Applications 7

The Components of a Database System 8

Database Applications and SQL 9 • The DBMS 11 • The Database 12 Personal Versus Enterprise-Class Database Systems 13

What Is Microsoft Access? 13 • What Is an Enterprise-Class Database System? 15 Database Design 16

Database Design from Existing Data 17 • Database Design for New Systems Development 17 • Database Redesign 18

What You Need to Learn 19

A Brief History of Database Processing 20 The Early Years 20 • The Emergence and Dominance of the Relational Model 22 • Post-Relational Developments 23

Summary 25 • Key Terms 26 • Review Questions 26 • Project Questions 28

Chapter 2: Introduction to Structured Query Language 31

Chapter Objectives 31 Components of a Data Warehouse 32 Cape Codd Outdoor Sports 33

The Extracted Retail Sales Data 33 • RETAIL_ORDER Data 35 • ORDER_ITEM Data 35 • SKU_DATA Table 36 • The Complete Cape Codd Data Extract Schema 36 • Data Extracts Are Common 37

SQL Background 37 The SQL SELECT/FROM/WHERE Framework 38

Reading Specified Columns from a Single Table 38 • Specifying Column Order in SQL Queries from a Single Table 39 • Reading Specified Rows from a Single Table 41 • Reading Specified Columns and Rows from a Single Table 42

Submitting SQL Statements to the DBMS 43

Using SQL in Microsoft Access 2010 43 • Using SQL in Microsoft SQL Server 2008 R2 48 • Using SQL in Oracle Database 11g 51 • Using SQL in Oracle MySQL 5.5 54

Contents

SQL Enhancements for Querying a Single Table 56

Sorting the SQL Query Results 56 • SQL WHERE Clause Options 58 • Combining the SQL WHERE Clause and the SQL ORDER BY Clause 63

Performing Calculations in SQL Queries 63

Using SQL Built-in Functions 63 • SQL Expressions in SQL SELECT Statements 66

Grouping in SQL SELECT Statements 68 Looking for Patterns in NASDAQ Trading 72

Investigating the Characteristics of the Data 72 • Searching for Patterns in Trading by Day of Week 73

Querying Two or More Tables with SQL 75

Querying Multiple Tables with Subqueries 75 • Querying Multiple Tables with Joins 78 • Comparing Subqueries and Joins 82

Summary 82 • Key Terms 82 • Review Questions 83 • Project Questions 88 • Marcia’s Dry Cleaning 92 • Morgan Importing 94

PART 2 • DATABASE DESIGN 99

Chapter 3: The Relational Model and Normalization 100

Chapter Objectives 99 Relational Model Terminology 102

Relations 103 • Characteristics of Relations 103 • Alternative Terminology 105 • Functional Dependencies 106 • Finding Functional Dependencies 107 • Keys 110

Normal Forms 112

Modification Anomalies 112 • A Short History of Normal Forms 113 • Normalization Categories 113 • From First Normal Form to Boyce-Codd Normal Form Step-By-Step 114 • Eliminating Anomalies from Functional Dependencies with BCNF 118 • Eliminating Anomalies from Multivalued Dependencies 126 • Fifth Normal Form 130 • Domain/Key Normal Form 130

Summary 131 • Key Terms 131 • Review Questions 132 • Project Questions 134 • Marcia’s Dry Cleaning 135 • Morgan Importing 136

Chapter 4: Database Design Using Normalization 137

Chapter Objectives 137 Assess Table Structure 138 Designing Updatable Databases 139

Advantages and Disadvantages of Normalization 139 • Functional Dependencies 139 • Normalizing with SQL 140 • Choosing Not to Use BCNF 141 • Multivalued Dependencies 142

Designing Read-Only Databases 142

Denormalization 142 • Customized Duplicated Tables 144 Common Design Problems 145

The Multivalue, Multicolumn Problem 145 • Inconsistent Values 147 • Missing Values 148 • The General-Purpose Remarks Column 148

Summary 149 • Key Terms 150 • Review Questions 150 • Project Questions 152 • Marcia’s Dry Cleaning 152 • Morgan Importing 153

Chapter 5: Data Modeling with the Entity-Relationship Model 155

Chapter Objectives 155 The Purpose of a Data Model 156

The Entity-Relationship Model 156

Entities 156 • Attributes 157 • Identifiers 158 • Relationships 158 • Maximum Cardinality 160 • Minimum Cardinality 161 • Entity-Relationship Diagrams and Their Versions 162 • Variations of the E-R Model 162 • E-R Diagrams Using the IE Crow’s Foot Model 163 • Strong Entities and Weak Entities 164 • ID-Dependent Entities 164 • Non-ID-Dependent Weak Entities 165 • The Ambiguity of the Weak Entity 166 • Subtype Entities 167

Patterns in Forms, Reports, and E-R Models 168

Strong Entity Patterns 169 • ID-Dependent Relationships 173 • Mixed Identifying and Nonidentifying Patterns 179 • The For-Use-By Pattern 182 • Recursive Patterns 183

The Data Modeling Process 185

The College Report 186 • The Department Report 187 • The Department/Major Report 189 • The Student Acceptance Letter 189

Summary 191 • Key Terms 192 • Review Questions 193 • Project Questions 195 • Marcia’s Dry Cleaning 201 • Morgan Importing 202

Chapter 6: Transforming Data Models into Database Designs 203

Chapter Objectives 203 Create a Table for Each Entity 204

Selecting the Primary Key 204 • Specifying Candidate (Alternate) Keys 206 • Specify Column Properties 206 • Verify Normalization 208

Create Relationships 209

Relationships Between Strong Entities 209 • Relationships Using ID-Dependent Entities 212 • Relationships with a Weak Non-ID-Dependent Entity 217 • Relationships in Mixed Entity Designs 217 • Relationships Between Supertype and Subtype Entities 219 • Recursive Relationships 219 • Representing Ternary and Higher-Order Relationships 221 • Relational Representation of the Highline University Data Model 224

Design for Minimum Cardinality 225

Actions When the Parent Is Required 227 • Actions When the Child Is Required 228 • Implementing Actions for M-O Relationships 228 • Implementing Actions for O-M Relationships 229 • Implementing Actions for M-M Relationships 230 • Designing Special Case M-M Relationships 230 • Documenting the Minimum Cardinality Design 231 • An Additional Complication 233 • Summary of Minimum Cardinality Design 233

The View Ridge Gallery Database 233

Summary of Requirements 233 • The View Ridge Data Model 234 • Database Design with Data Keys 235 • Minimum Cardinality Enforcement for Required Parents 236 • Minimum Cardinality Enforcement for the Required Child 238 • Column Properties for the View Ridge Database Design Tables 238

Summary 240 • Key Terms 241 • Review Questions 242 • Project Questions 243 • Marcia’s Dry Cleaning 244 • Morgan Importing 244

PART 3 • DATABASE IMPLEMENTATION 245

Chapter 7: SQL for Database Construction and Application Processing 246

Chapter Objectives 246 The View Ridge Gallery Database 247 SQL DDL, DML, and a New Type of Join 247 Managing Table Structure with SQL DDL 248

Creating the View Ridge Database 248 • Using the SQL CREATE TABLE Statement 249 • Variations in SQL Data Types 250 • Creating the ARTIST Table 252 • Creating the WORK Table and the 1:N ARTIST-to-WORK Relationship 254 • Implementing Required Parent Rows 255 • Implementing 1:1 Relationships 256 • Casual Relationships 256 • Creating Default Values and Data Constraints with SQL 256 • Creating the View Ridge Database Tables 258 • The SQL ALTER TABLE Statement 261 • The SQL DROP TABLE Statement 262 • The SQL TRUNCATE TABLE Statement 263

SQL DML Statements 263

The SQL INSERT Statement 263 • Populating the View Ridge Database Tables 264 • The SQL UPDATE Statement 270 • The SQL MERGE Statement 271 • The SQL DELETE Statement 272

New Forms of Join 272

The SQL JOIN ON Syntax 272 • Outer Joins 274

Using SQL Views 277

Using SQL Views to Hide Columns and Rows 280 • Using SQL Views to Display Results of Computed Columns 281 • Using SQL Views to Hide Complicated SQL Syntax 282 • Layering Built-in Functions 284 • Using SQL Views for Isolation, Multiple Permissions, and Multiple Triggers 285 • Updating SQL Views 286

Embedding SQL in Program Code 287

SQL/Persistent Stored Modules (SQL/PSM) 288 • Using SQL Triggers 289 • Using Stored Procedures 295

Summary 298 • Key Terms 299 • Review Questions 299 • Project Questions 303 • Marcia’s Dry Cleaning 306 • Morgan Importing 309

Chapter 8: Database Redesign 313

Chapter Objectives 313 The Need for Database Redesign 314 SQL Statements for Checking Functional Dependencies 314

What Is a Correlated Subquery? 315 How Do I Analyze an Existing Database? 320

Reverse Engineering 320 • Dependency Graphs 322 • Database Backup and Test Databases 322

Changing Table Names and Table Columns 323

Changing Table Names 323 • Adding and Dropping Columns 325 • Changing a Column Data Type or Column Constraints 326 • Adding and Dropping Constraints 326

Changing Relationship Cardinalities and Properties 326

Changing Minimum Cardinalities 327 • Changing Maximum Cardinalities 328 Adding and Deleting Tables and Relationships 331 Forward Engineering(?) 331

Summary 331 • Key Terms 333 • Review Questions 333 • Project Questions 335 • Marcia’s Dry Cleaning 335 • Morgan Importing 336

PART 4 • MULTIUSER DATABASE PROCESSING

Chapter 9: Managing Multiuser Databases 338

Chapter Objectives 338 Database Administration 339

Managing the Database Structure 340 Concurrency Control 341

The Need for Atomic Transactions 342 • Resource Locking 346 • Optimistic Versus Pessimistic Locking 348 • Declaring Lock Characteristics 349 • Implicit and Explicit Commit Transaction 350 • Consistent Transactions 351 • Transaction Isolation Level 352 • Cursor Type 353

Database Security 354 Processing Rights and Responsibilities 354 • DBMS Security 355 • DBMS Security Guidelines 356 • Application Security 358 • The SQL Injection Attack 359

Database Backup and Recovery 359 Recovery via Reprocessing 360 • Recovery via Rollback/Rollforward 360 Managing the DBMS 362 Maintaining the Data Repository 363 Distributed Database Processing 364

Types of Distributed Databases 364 • Challenges of Distributed Databases 365 Object-Relational Databases 366

Summary 367 • Key Terms 368 • Review Questions 369 • Project Questions 371 • Marcia’s Dry Cleaning 371 • Morgan Importing 372

Chapter 10: Managing Databases with SQL Server 2008 R2 373

Chapter Objectives 373 Installing SQL Server 2008 R2 374 The Microsoft SQL Server 2008 R2 Management Studio 376 Creating an SQL Server 2008 R2 Database 376 SQL Server 2008 R2 Utilities 378

SQL CMD and Microsoft PowerShell 379 • Microsoft SQL CLR 379 • SQL Server 2008 R2 GUI Displays 380 • SQL Server 2008 R2 SQL Statements and SQL Scripts 381

Creating and Populating the View Ridge Database Tables 383 Creating the View Ridge Database Table Structure 383 • Reviewing Database Structures in the SQL Server GUI Display 387 • Indexes 391 • Populating the VRG Tables with Data 393 • Creating Views 396

SQL Server Application Logic 404 Transact-SQL 405 • Transact-SQL Cursor Statements 406 • Stored Procedures 408 • Triggers 416

Concurrency Control 431

Transaction Isolation Level 432 • Cursor Concurrency 432 • Locking Hints 432 SQL Server 2008 R2 Security 433 SQL Server Database Security Settings 436 SQL Server 2008 R2 Backup and Recovery 437

Backing Up a Database 438 • SQL Server Recovery Models 439 • Restoring a Database 439 • Database Maintenance Plans 440

Topics Not Discussed in This Chapter 440

Summary 440 • Key Terms 441 • Review Questions 441 • Project Questions 443 • Marcia’s Dry Cleaning 445 • Morgan Importing 445

ONLINE CHAPTER: SEE PAGE 447 FOR INSTRUCTIONS

Chapter 10A: Managing Databases with Oracle Database 11g 10A-1

Chapter Objectives 10A-1 Installing Oracle Database 11g 10A-2

Installing a Loopback Adapter 10A-3 • Oracle and Java 10A-4 • Oracle Database 11g Documentation 10A-4 • The Oracle Universal Installer (OUI) 10A-5

Oracle Database 11g Administration and Development Tools 10A-7

The Oracle Database 11g Configuration Assistant 10A-7 • The Oracle Enterprise Manager 11g Database Control 10A-8

Oracle Tablespaces 10A-10 Oracle Security 10A-13

User Privileges 10A-14 • Creating a User Account 10A-14 • Creating a Role 10A-17 Oracle Application Development Tools 10A-19 Oracle SQL*Plus 10A-19 • Oracle SQL Developer 10A-20 • Oracle Schemas 10A-22

Oracle Database 11g SQL Statements and SQL Scripts 10A-22

Creating and Populating the View Ridge Database Tables 10A-24

Creating the View Ridge Database Table Structure 10A-24 • Transaction COMMIT in Oracle Database 10A-27 • Reviewing Database Structures in the SQL Developer GUI Display 10A-28 • Indexes 10A-31 • Populating the VRG Tables 10A-32 • Creating Views 10A-38

Application Logic 10A-44

Oracle PL/SQL 10A-45 • Stored Procedures 10A-47 • Triggers 10A-54 Concurrency Control 10A-68

Read-Committed Transaction Isolation Level 10A-69 • Serializable Transaction Isolation Level 10A-69 • Read-Only Transaction Isolation 10A-70 • Additional Locking Comments 10A-70

Oracle Backup and Recovery 10A-70

Oracle Recovery Facilities 10A-70 • Types of Failure 10A-71 Topics Not Discussed in This Chapter 10A-72

Summary 10A-72 • Key Terms 10A-73 • Review Questions 10A-73 • Project Questions 10A-75 • Marcia’s Dry Cleaning 10A-76 • Morgan Importing 10A-76

ONLINE CHAPTER: SEE PAGE 448 FOR INSTRUCTIONS

Chapter 10B: Managing Databases with MySQL 5.5 10B-1

Chapter Objectives 10B-1 The MySQL 5.5 DBMS 10B-2 Installing and Updating MySQL 10B-3

Configuring MySQL 10B-4 • MySQL Storage Engines 10B-6 The MySQL GUI Utilities 10B-6 Creating a Workspace for the MySQL Workbench Files 10B-8 Creating and Using a MySQL Database 10B-8

Creating a Database in MySQL 10B-8 • Setting the Active Database in MySQL 10B-12 MySQL Utilities 10B-13

MySQL Command-Line Client 10B-13 • MySQL GUI Displays 10B-14 • MySQL SQL Statements and SQL Scripts 10B-14

Creating and Populating the View Ridge Database Tables 10B-17

Creating the View Ridge Database Table Structure 10B-17 • Reviewing Database Structures in the MySQL GUI Display 10B-20 • Indexes 10B-21 • Populating the VRG Tables with Data 10B-26 • Transaction COMMIT in MySQL 10B-27 • Creating Views 10B-27

MySQL Application Logic 10B-38

MySQL Procedural Statements 10B-38 • Stored Procedures 10B-41 • Triggers 10B-47 • A Last Word on MySQL Stored Procedures and Triggers 10B-61

Concurrency Control 10B-61 MySQL 5.5 Security 10B-62

MySQL Database Security Settings 10B-64 MySQL 5.5 DBMS Backup and Recovery 10B-68

Backing Up a MySQL Database 10B-68 • Restoring a MySQL Database 10B-71 Topics Not Discussed in This Chapter 10B-72

Summary 10B-71 • Key Terms 10B-72 • Review Questions 10B-73 • Project Questions 10B-74 • Marcia’s Dry Cleaning 10B-75 • Morgan Importing 10B-76

PART 5 • DATABASE ACCESS STANDARDS

Chapter 11: The Web Server Environment 449

Chapter Objectives 450 The Web Database Processing Environment 451 The ODBC Standard 453

ODBC Architecture 453 • Conformance Levels 454 • Creating an ODBC Data Source Name 456

The Microsoft .NET Framework and ADO.NET 462 OLE DB 463 • ADO and ADO.NET 466 • The ADO.NET Object Model 467

The JAVA Platform 471

JDBC 471 • JavaServer Pages (JSP) and Servlets 473 • Apache Tomcat 473 Web Database Processing with PHP 474 Web Database Processing with PHP and Eclipse 475 • Getting Started with HTML Web Pages 477 • The index.html Web Page 478 • Creating the index.html Web Page 478 • Using PHP 480 • Challenges for Web Database Processing 487

Web Page Examples with PHP 487 Example 1: Updating a Table 489 • Example 2: Using PHP Data Objects (PDO) 493 • Example 3: Invoking a Stored Procedure 495

Summary 500 • Key Terms 501 • Review Questions 502 • Project Questions 505 • Marcia’s Dry Cleaning 507 • Morgan Importing 508

Chapter 12: Database Processing with XML 509

Chapter Objectives 509 The Importance of XML 510 XML as a Markup Language 511

XML Document Type Declarations 511 • Materializing XML Documents with XSLT 512

XML Schema 516

XML Schema Validation 517 • Elements and Attributes 517 • Flat Versus Structured Schemas 519 • Global Elements 521

Creating XML Documents from Database Data 525

Using the SQL SELECT . . . FOR XML Statement 525 • Multitable SELECT with FOR XML 530 • An XML Schema for All CUSTOMER Purchases 534 • A Schema with Two Multivalued Paths 537

Why Is XML Important? 537 Additional XML Standards 543 The NoSQL Movement 545

Summary 545 • Key Terms 546 • Review Questions 547 • Project Questions 548 • Marcia’s Dry Cleaning 548 • Morgan Importing 548

Chapter 13: Database Processing for Business Intelligence Systems 549

Chapter Objectives 549 Business Intelligence Systems 549 The Relationship Between Operational and BI Systems 550 Reporting Systems and Data Mining Applications 550

Reporting Systems 550 • Data Mining Applications 550 Data Warehouses and Data Marts 551 Components of a Data Warehouse 551 • Data Warehouses Versus Data Marts 554 • Dimensional Databases 555

Reporting Systems 563

RFM Analysis 563 • Producing the RFM Report 564 • Reporting System Components 567 • Report Types 568 • Report Media 568 • Report Modes 569 • Report System Functions 569 • OLAP 572

Data Mining 577 Unsupervised Data Mining 578 • Supervised Data Mining 580 • Three Popular Data Mining Techniques 580 • Market Basket Analysis 580 • Using SQL for Market Basket Analysis 582

Summary 582 • Key Terms 583 • Review Questions 584 • Project Questions 586 • Marcia’s Dry Cleaning 588 • Morgan Importing 589

APPENDICES

ONLINE APPENDICES: SEE PAGE 590 FOR INSTRUCTIONS

Appendix A: Getting Started with Microsoft Access 2010 A-1

Chapter Objectives A-3 What Is the Purpose of This Appendix? A-3 Why Should I Learn to Use Microsoft Access 2010? A-3 What Will This Appendix Teach Me? A-4 What Is a Table Key? A-5 What Are Relationships? A-5 Creating a Microsoft Access Database A-5 The Microsoft Office Fluent User Interface A-8

The Ribbon and Command Tabs A-8 • Contextual Command Tabs A-9 • Modifying the Quick Access Toolbar A-9 • Database Objects and the Navigation Pane A-9

Closing a Database and Exiting Microsoft Access A-10 Opening an Existing Microsoft Access Database A-11 Creating Microsoft Access Database Tables A-13 Inserting Data into Tables—The Datasheet View A-22

Modifying and Deleting Data in Tables in the Datasheet View A-25 Creating Relationships Between Tables A-26 Working with Microsoft Access Queries A-30 Microsoft Access Forms and Reports A-35 Closing a Database and Exiting Microsoft Access 2010 A-36

Key Terms A-37 • Review Questions A-38

Appendix B: Getting Started with Systems Analysis and Design B-1

Chapter Objectives B-3 What Is the Purpose of This Appendix? B-3 What Is Information? B-4 What Is an Information System? B-5 What Is a Competitive Strategy? B-5 How Does a Company Organize Itself Based on Its Competitive Strategy? B-5 What Is a Business Process? B-6 How Do Information Systems Support Business Processes? B-7 Do Information Systems Include Processes? B-7 Do We Have to Understand Business Processes in Order to Create Information Systems? B-8 What Is Systems Analysis and Design? B-8 What Are the Steps in the SDLC? B-9

The System Definition Step B-9 • The Requirements Analysis Step B-10 • The Component Design Step B-11 • The Implementation Step B-11 • The System Maintenance Step B-12

What SDLC Details Do We Need to Know? B-12 What Is Business Process Modeling Notation? B-13 What Is Project Scope? B-14 How Do I Gather Data and Information About System Requirements? B-14 How Do Use Cases Provide Data and Information About System Requirements? B-14 The Highline University Database B-15

The College Report B-17 • The Department Report B-19 • The Department/Major Report B-21 • The Student Acceptance Letter B-22

What Are Business Rules? B-24 What Is a User Requirements Document (URD)? B-25 What Is a Statement of Work (SOW)? B-26

Key Terms B-27 • Review Questions B-28 • Project Questions B-29

Appendix C: E-R Diagrams and the IDEF1X Standard C-1

Chapter Objectives C-3 What Is the Purpose of This Appendix? C-3 Why Should I Learn to Use IDEF1X? C-3 What Will This Appendix Teach Me? C-4 What Are IDEF1X Entities? C-4 What Are IDEF1X Relationships? C-5

Nonidentifying Connection Relationships C-5 • Identifying Connection Relationships C-6 • Nonspecific Relationships C-7

What Are Categorization Relationships? C-7 What Are Domains? C-10

Domains Reduce Ambiguity C-10 • Domains Are Useful C-11 • Base Domains and Typed Domains C-11

Key Terms C-12 • Review Questions C-13

Appendix D: E-R Diagrams and the UML Standard D-1

Chapter Objectives D-3 What Is the Purpose of This Appendix? D-3 Why Should I Learn to Use UML? D-3 What Will This Appendix Teach Me? D-3 How Does UML Represent Entities and Relationships? D-4 UML Entities and Relationships D-5

Representation of Weak Entities D-5 • Representation of Subtypes D-5 What OOP Constructs Are Introduced by UML? D-6 What Is the Role of UML in Database Processing Today? D-7

Key Terms D-8 • Review Questions D-8

Appendix E: Getting Started with the MySQL Workbench Database Design Tools E-1

Chapter Objectives E-3 What Is the Purpose of This Appendix? E-3 Why Should I Learn to Use the MySQL Workbench for Database Design? E-4 What Will This Appendix Teach Me? E-4 What Won’t This Appendix Teach Me? E-4 How Do I Install the MySQL Workbench and the MySQL Connector/ODBC? E-4 How Do I Start the MySQL Workbench? E-5 How Do I Create a Workspace for the MySQL Workbench Files? E-6 How Do I Create Database Designs in the MySQL Workbench? E-6

How Do I Create a Database Model and E-R Diagram in the MySQL Workbench? E-7 Key Terms E-22 • Review Questions E-22 • Exercises E-22

Appendix F: Getting Started with Microsoft Visio 2010 F-1

Chapter Objectives F-3 What Is the Purpose of This Appendix? F-3 Why Should I Learn to Use Microsoft Visio 2010? F-3 What Will This Appendix Teach Me? F-4 What Won’t This Appendix Teach Me? F-4 How Do I Start the Microsoft Visio 2010? F-4 How Do I Create a Database Model Diagram in Microsoft Visio 2010? F-4 How Do I Name and Save a Database Model Diagram in Microsoft Visio 2010? F-9 How Do I Create Entities/Tables in a Database Model Diagram in Microsoft Visio 2010? F-11 How Do I Create Relationships Between Tables in a Database Model Diagram in Microsoft Visio 2010? F-16 How Do I Create Diagrams Using Business Process Modeling Notation (BPMN) in Microsoft Visio 2010? F-33

Key Terms F-35 • Review Questions F-35 • Exercises F-36

Appendix G: The Semantic Object Model G-1

Chapter Objectives G-3 What Is the Purpose of This Appendix? G-3 Why Should I Learn to Use the Semantic Object Model? G-4 What Will This Appendix Teach Me? G-4 What Are Semantic Objects? G-4 What Semantic Objects Are Used in the Semantic Object Model? G-5

What Are Semantic Object Attributes? G-6 • What Are Object Identifiers? G-9 • What Are Attribute Domains? G-10 • What Are Semantic Object Views? G-10

What Types of Objects Are Used in the Semantic Object Model? G-11

What Are Simple Objects? G-12 • What Are Composite Objects? G-13 • What Are Compound Objects? G-16 • How Do We Represent One-to-One Compound Objects as Relational Structures? G-19 • How Do We Represent One-to-Many and Many-to-One Relationships as Relational Structures? G-21 • How Do We Represent Many-to-Many Relationships as Relational Structures? G-22 • What Are Hybrid Objects? G-24 • How Do We Represent Hybrid Relationships in Relational Structures? G-27 • What Are Association Objects? G-30 • What Are Parent/Subtype Objects? G-34 • What Are Archetype/Version Objects? G-37

Comparing the Semantic Object and the E-R Models G-39 Key Terms G-42 • Review Questions G-43

Appendix H: Data Structures for Database Processing H-1

Chapter Objectives H-3 What Is the Purpose of This Appendix? H-3 What Will This Appendix Teach Me? H-3 What Is a Flat File? H-3

Processing Flat Files in Multiple Orders H-4 • A Note on Record Addressing H-5 • How Can Linked Lists Be Used to Maintain Logical Record Order? H-5 • How Are Indexes Used to Maintain a Logical Record Order? H-8 • B-Trees H-9 • Summary of Data Structures H-11

How Can We Represent Binary Relationships? H-12

A Review of Record Relationships H-12 • How Can We Represent Trees? H-14 • How Can We Represent Simple Networks? H-17 • How Can We Represent Complex Networks? H-19 • Summary of Relationship Representations H-20

How Can We Represent Secondary Keys? H-22

How Can We Represent Secondary Keys with Linked Lists? H-22 • How Can We Represent Secondary Keys with Indexes? H-23

Key Terms H-26 • Review Questions H-27

Appendix I: Getting Started with Web Servers, PHP, and the Eclipse PDT I-1

Chapter Objectives I-3 What Is the Purpose of This Appendix? I-3 Which Operating Systems Are We Discussing? I-3 How Do I Install a Web Server? I-4 How Do I Set Up IIS in Windows 7? I-4 How Do I Manage IIS in Windows 7? I-7 How Is a Web Site Structured? I-11 How Do I View a Web Page from the IIS Web Server? I-12 How Is Web Site Security Managed? I-13 What Is the Eclipse PDT? I-19 How Do I Install the Eclipse PDT? I-22 What Is PHP? I-32 How Do I Install PHP? I-32 How Do I Create a Web Page Using the Eclipse PDT? I-39 How Do I Manage the PHP Configuration? I-50

Key Terms I-59 • Review Questions I-59 • Review Exercises I-60

Bibliography 591 Glossary 592

The 12th edition of Database Processing: Fundamentals, Design, and Implementation refines the organization and content of this classic textbook to reflect a new teaching and professional workplace environment. Students and other readers of this book will benefit from new content and features in this edition.

New to This Edition

Content and features new to the 12th edition of Database Processing: Fundamentals, Design, and Implementation include:

• The use of Microsoft Access 2010 to demonstrate and reinforce basic principles of database creation and use. This book has been revised to update all references to Microsoft Access and other Microsoft Office products (e.g., Microsoft Excel) to the recently released Microsoft Office 2010 versions.

• The updating of the book to reflect the use of Microsoft SQL Server 2008 R2, the current version of Microsoft SQL Server. Although most of the topics covered are backward compatible with Microsoft SQL Server 2008 and Microsoft SQL Server 2008 Express edition, all material in the book now uses SQL Server 2008 R2 exclusively, in conjunction with Office 2010. In addition, although we cannot present screenshots, we have tested the SQL Server SQL statements against a Microsoft Community Technology Preview (CTP) version of the forthcoming SQL Server 2011 (code name Denali), so our text material should be compatible when that version is released in the near future.

• The updating of the book to use Oracle MySQL 5.5, which is the current generally available (GA) release of MySQL. Further, we also now use the MySQL Workbench GUI as the main database development tool for MySQL 5.5. The MySQL GUI Tools utilities used in Database Processing: Fundamentals, Design, and Implementation, 11th edition, were declared “end of life” by MySQL on December 18, 2009. The MySQL Workbench 5.2.x now integrates the functionality of the MySQL GUI Tools bundle and is, with a few exceptions, used throughout Database Processing: Fundamentals, Design, and Implementation, 12th edition.

• The use of Microsoft Windows Server 2008 R2 as the server operating system and Windows 7 as the workstation operating system discussed and illustrated in the text. These are the current Microsoft server and workstation operating systems.

• More material in Chapter 3 on normalization is presented in the traditional step-by-step approach (1NF → 2NF → 3NF → BCNF) in response to comments and requests from professors and instructors who prefer to teach normalization using that approach.

• Additional SQL topics in Chapter 7, including the SQL TRUNCATE TABLE statement, the SQL MERGE statement, and a discussion of SQL Persistent Stored Modules (SQL/PSM) as the context for SQL triggers and stored procedures.

• Datasets for example databases such as Marcia’s Dry Cleaning and Morgan Importing have been clearly defined in all chapters for consistency in student responses to Review Questions, Review Projects, and the Marcia’s Dry Cleaning and Morgan Importing projects.

• The addition of online Appendix B, “Getting Started with Systems Analysis and Design.” This new material provides an introduction to systems analysis and design concepts for students or readers who have not had a course on this material. It presents basic methods used to gather the input material needed for data modeling, which is discussed in Chapter 5. This material can also be used as a review for students or readers who are familiar with systems analysis and design concepts, and it helps put data modeling, database design, and database implementation in the context of the systems development life cycle (SDLC).

• The addition of online Appendix F, “Getting Started with Microsoft Visio 2010.” This new material provides an introduction to the use of Microsoft Visio 2010 for data modeling, which is discussed in Chapter 5, and database design, which is discussed in Chapter 6.

• The addition of online Appendix E, “Getting Started with MySQL Workbench Database Design Tools.” Although the use of MySQL 5.5 as a DBMS is covered in Chapter 10B and referenced throughout the text, this new appendix provides the introduction needed to use the MySQL Workbench data modeling tools for database design, which is discussed in Chapter 6.

• The addition of online Appendix I, “Getting Started with Web Servers, PHP, and the Eclipse PDT.” This new material provides a detailed introduction to the installation and use of the Microsoft IIS Web server, PHP, and the Eclipse IDE used for Web database application development as discussed in Chapter 11.

• Although Oracle Database 11g remains the version of Oracle Database discussed in the book, the current release is Oracle Database 11g Release 2, and all Oracle Database 11g material has been updated to reflect use of Release 2 and the current version of the Oracle SQL Developer GUI tool.

Fundamentals, Design, and Implementation

With today’s technology, it is impossible to utilize a DBMS successfully without first learning fundamental concepts. After years of developing databases with business users, we have developed what we believe to be a set of essential database concepts. These are augmented by the concepts necessitated by the increasing use of the Internet, the World Wide Web, and commonly available analysis tools. Thus, the organization and topic selection of the 12th edition is designed to:

• Present an early introduction to SQL queries.

• Use a “spiral approach” (as discussed below) to database design.

• Use a consistent, generic Information Engineering (IE) Crow’s Foot E-R diagram notation for data modeling and database design.

• Provide a detailed discussion of specific normal forms within a discussion of normalization that focuses on pragmatic normalization techniques.

• Use current DBMS technology: Microsoft Access 2010, Microsoft SQL Server 2008 R2, Oracle Database 11g Release 2, and MySQL 5.5.

• Create Web database applications based on widely used Web development technology.

• Provide an introduction to business intelligence (BI) systems.

• Discuss the dimensional database concepts used in database designs for data warehouses and OnLine Analytical Processing (OLAP).

These changes have been made because it has become obvious that the basic structure of the earlier editions (up to and including the 9th edition; the 10th edition introduced many of the changes we used in the 11th edition and retain in the 12th edition) was designed for a teaching environment that no longer exists. The structural changes to the book were made for several reasons:

• Unlike the early years of database processing, today’s students have ready access to data modeling and DBMS products.

• Today’s students are too impatient to start a class with lengthy conceptual discussions on data modeling and database design. They want to do something, see a result, and obtain feedback.

• In the current economy, students need to reassure themselves that they are learning marketable skills.

Early Introduction of SQL DML

Given these changes in the classroom environment, this book provides an early introduction to SQL data manipulation language (DML) SELECT statements. The discussion of SQL data definition language (DDL) and additional DML statements occurs in Chapters 7 and 8. By presenting SQL SELECT statements in Chapter 2, students learn early in the class how to query data and obtain results, seeing firsthand some of the ways that database technology will be useful to them.

The text assumes that students will work through the SQL statements and examples with a DBMS product. This is practical today, because nearly every student has access to Microsoft Access. Therefore, Chapters 1 and 2 and Appendix A, “Getting Started with Microsoft Access 2010,” are written to support an early introduction of Microsoft Access 2010 and the use of Microsoft Access 2010 for SQL queries (Microsoft Access 2010 QBE query techniques are also covered).

If a non–Microsoft Access-based approach is desired, versions of SQL Server 2008 R2, Oracle Database 11g, and MySQL 5.5 are readily available for use. Free versions of the three major DBMS products covered in this book (SQL Server 2008 R2 Express, Oracle Express 10g, and MySQL 5.5 Community Edition) are available for download. Further, the text can be purchased with a licensed educational version of Oracle Database 11g Release 1 Personal Edition (this is a developer license). Alternatively, a trial copy of MySQL 5.5 Enterprise Edition is also available as a download. Thus, students can actively use a DBMS product by the end of the first week of class.

The presentation and discussion of SQL is spread over three chapters so that students can learn about this important topic in small bites. SQL SELECT statements are taught in Chapter 2. SQL DDL and SQL DML statements are presented in Chapter 7. Correlated subqueries and EXISTS/NOT EXISTS statements are described in Chapter 8. Each topic appears in the context of accomplishing practical tasks. Correlated subqueries, for example, are used to verify functional dependency assumptions, a necessary task for database redesign.
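As a minimal sketch of that last point, the following Python script uses the standard-library sqlite3 module as a stand-in DBMS (an assumption; it is not one of the products covered in this book). The EMPLOYEE table, its columns, and the fd_violations helper are hypothetical names chosen for illustration, and the query is one plausible form of a correlated subquery check rather than the exact statement used in Chapter 8. It lists rows that contradict the assumed functional dependency Department → DeptPhone.

```python
import sqlite3


def fd_violations(conn, table, determinant, dependent):
    """Return distinct (determinant, dependent) pairs that contradict the
    assumed functional dependency determinant -> dependent.

    The table and column names are trusted identifiers supplied by the
    caller; they are interpolated into the SQL text, not bound as values.
    """
    sql = f"""
        SELECT DISTINCT t1.{determinant}, t1.{dependent}
        FROM {table} AS t1
        WHERE EXISTS (
            -- Correlated subquery: look for another row with the same
            -- determinant value but a different dependent value.
            SELECT 1
            FROM {table} AS t2
            WHERE t2.{determinant} = t1.{determinant}
              AND t2.{dependent} <> t1.{dependent}
        )
        ORDER BY t1.{determinant}, t1.{dependent}
    """
    return conn.execute(sql).fetchall()


# Hypothetical sample data: Finance appears with two different phone numbers.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMPLOYEE ("
    "EmployeeID INTEGER PRIMARY KEY, Department TEXT, DeptPhone TEXT)"
)
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
    [
        (1, "Accounting", "555-0100"),
        (2, "Accounting", "555-0100"),
        (3, "Finance", "555-0200"),
        (4, "Finance", "555-0299"),
    ],
)

violations = fd_violations(conn, "EMPLOYEE", "Department", "DeptPhone")
print(violations)
```

An empty result means the data are consistent with the assumed dependency. It does not prove that the dependency holds for all future data.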

This box illustrates another feature used in this book: BTW boxes are used to separate comments from the text discussion. Sometimes they present ancillary material; other times they reinforce important concepts.

A Spiral Approach to Database Design

Today, databases arise from three sources: (1) from the integration of existing data from spreadsheets, data files, and database extracts; (2) from the development of new information systems projects; and (3) from the need to redesign an existing database to adapt to changing requirements. We believe that the fact that these three sources exist presents instructors with a significant pedagogical opportunity. Rather than teach database design just once from data models, why not teach database design three times, once for each of these sources? In practice, this idea has turned out to be even more successful than expected.

Design Iteration 1: Databases from Existing Data

Considering the design of databases from existing data, if someone were to e-mail us a set of tables and say, “Create a database from them,” how would we proceed? We would examine the tables in light of normalization criteria and then determine whether the new database was for query only or whether it was for query and update. Depending on the answer, we would denormalize the data, joining them together, or we would normalize the data, pulling them apart. All of this is important for students to know and understand.

Therefore, the first iteration of database design gives instructors a rich opportunity to teach normalization, not as a set of theoretical concepts, but rather as a useful toolkit for making design decisions for databases created from existing data. Additionally, the construction of databases from existing data is an increasingly common task that is often assigned to junior staff members. Learning how to apply normalization to the design of databases from existing data not only provides an interesting way of teaching normalization, it is also common and useful!
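The normalize-or-denormalize decision described above can be sketched in a few statements. The example below again uses Python's standard-library sqlite3 module as a stand-in DBMS, and the ORDER_IMPORT, CUSTOMER, and CUSTOMER_ORDER tables are hypothetical. It assumes that CustomerName determines CustomerPhone in the received data, which a real project would need to verify first. The received table is pulled apart for an updatable design and then joined back together as it would be for a query-only design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- A table as it might arrive from a spreadsheet (hypothetical data).
    CREATE TABLE ORDER_IMPORT (
        OrderID       INTEGER PRIMARY KEY,
        CustomerName  TEXT,
        CustomerPhone TEXT
    );
    INSERT INTO ORDER_IMPORT VALUES
        (1, 'Ann', '555-0101'),
        (2, 'Ann', '555-0101'),
        (3, 'Bob', '555-0102');

    -- Normalize (pull apart): customer facts are stored once per customer.
    CREATE TABLE CUSTOMER AS
        SELECT DISTINCT CustomerName, CustomerPhone FROM ORDER_IMPORT;
    CREATE TABLE CUSTOMER_ORDER AS
        SELECT OrderID, CustomerName FROM ORDER_IMPORT;
    """
)

customers = conn.execute(
    "SELECT CustomerName, CustomerPhone FROM CUSTOMER ORDER BY CustomerName"
).fetchall()

# Denormalize (join together): rebuild the wide rows for read-only reporting.
rejoined = conn.execute(
    """
    SELECT o.OrderID, o.CustomerName, c.CustomerPhone
    FROM CUSTOMER_ORDER AS o
    JOIN CUSTOMER AS c ON c.CustomerName = o.CustomerName
    ORDER BY o.OrderID
    """
).fetchall()

original = conn.execute(
    "SELECT OrderID, CustomerName, CustomerPhone FROM ORDER_IMPORT ORDER BY OrderID"
).fetchall()

print(customers)
print(rejoined == original)
```

CREATE TABLE ... AS SELECT copies data only. In a design intended for update, the new tables would also need primary key and foreign key declarations.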

We prefer to teach and use a pragmatic approach to normalization, and present this approach in Chapter 3. However, we are aware that many instructors like to teach normalization in the context of a step-by-step normal form presentation (1NF, 2NF, 3NF, then BCNF), and Chapter 3 now includes additional material to provide more support for this approach as well.

In today’s workplace, large organizations are increasingly licensing standardized software from vendors such as SAP, Oracle, and Siebel. Such software already has a database design. But with every organization running the same software, many are learning that they can only gain a competitive advantage if they make better use of the data in those predesigned databases. Hence, students who know how to extract data and create read-only databases for reporting and data mining have obtained marketable skills in the world of ERP and other packaged software solutions.

Design Iteration 2: Data Modeling and Database Design

The second source of databases is from new systems development. Although not as common as in the past, many databases are still created from scratch. Thus, students still need to learn data modeling, and they still need to learn how to transform data models into database designs.

The IE Crow’s Foot Model as a Design Standard

This edition uses a generic, standard IE Crow’s Foot notation. Your students should have no trouble understanding the symbols and using the data modeling or database design tool of your choice.

IDEF1X (which was used as the preferred E-R diagram notation in the 9th edition of this text) is explained in Appendix C, “E-R Diagrams and the IDEF1X Standard,” in case your students graduate into an environment where it is used, or if you prefer to use it in your classes. UML is explained in Appendix D, “E-R Diagrams and the UML Standard,” in case you prefer to use UML in your classes.

The choice of a data modeling tool is somewhat problematic. The two most readily available tools, Microsoft Visio and Sun Microsystems MySQL Workbench, are database design tools, not data modeling tools. Neither can represent an N:M relationship as such (as a data model requires); both immediately break it into two 1:N relationships (as database design does). Therefore, the intersection table must be constructed and modeled. This confounds data modeling with database design in just the way that we are attempting to teach students to avoid.
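To make the distinction concrete, the sketch below shows what the database design side looks like once an N:M relationship has been broken into two 1:N relationships. It uses Python's standard-library sqlite3 module as a stand-in DBMS, and the STUDENT, CLASS, and ENROLLMENT tables are hypothetical examples rather than tables taken from this book's projects.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript(
    """
    CREATE TABLE STUDENT (
        StudentID   INTEGER PRIMARY KEY,
        StudentName TEXT NOT NULL
    );
    CREATE TABLE CLASS (
        ClassID   INTEGER PRIMARY KEY,
        ClassName TEXT NOT NULL
    );
    -- Intersection table: one row per (student, class) pairing.
    -- STUDENT 1:N ENROLLMENT and CLASS 1:N ENROLLMENT together
    -- represent the N:M relationship between STUDENT and CLASS.
    CREATE TABLE ENROLLMENT (
        StudentID INTEGER NOT NULL REFERENCES STUDENT (StudentID),
        ClassID   INTEGER NOT NULL REFERENCES CLASS (ClassID),
        PRIMARY KEY (StudentID, ClassID)
    );
    INSERT INTO STUDENT VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO CLASS VALUES (10, 'Database'), (20, 'Networks');
    INSERT INTO ENROLLMENT VALUES (1, 10), (1, 20), (2, 10);
    """
)

# Traverse both 1:N relationships to recover the N:M pairings.
rows = conn.execute(
    """
    SELECT s.StudentName, c.ClassName
    FROM STUDENT AS s
    JOIN ENROLLMENT AS e ON e.StudentID = s.StudentID
    JOIN CLASS AS c ON c.ClassID = e.ClassID
    ORDER BY s.StudentName, c.ClassName
    """
).fetchall()
print(rows)
```

In a data model, STUDENT and CLASS would simply be connected by a single N:M relationship. ENROLLMENT appears only at the database design stage.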

To be fair to Visio, it is true that data models with N:M relationships can be drawn using either the standard Visio drawing tools or the Entity Relationship shapes dynamic connector. For a full discussion of these tools, see Appendix E, “Getting Started with the MySQL Workbench Database Design Tools,” and Appendix F, “Getting Started with Microsoft Visio 2010.”

Good data modeling tools are available, but they tend to be more complex and expensive. Two examples are Visible Systems’ Visible Analyst and Computer Associates’ ERwin Data Modeler. Visible Analyst is available in a student edition (at a modest price).

A 1-year time-limited CA ERwin Data Modeler Community Edition suitable for class use can be downloaded from http://erwin.com/products/detail/ca_erwin_data_modeler_community_edition/. This edition limits the number of objects that can be created to 25 entities per model and disables some other features (see http://erwin.com/uploads/erwin-data-modeler-r8-community-edition-matrix.pdf), but there is still enough functionality to make this product a possible choice for class use.

Database Design from E-R Data Models

As we discuss in Chapter 6, designing a database from data models consists of three tasks: representing entities and attributes with tables and columns; representing maximum cardinality by creating and placing foreign keys; and representing minimum cardinality via constraints, triggers, and application logic.

The first two tasks are straightforward. However, designs for minimum cardinality are more difficult. Required parents are easily enforced using NOT NULL foreign keys and referential integrity constraints. Required children are more problematic. In this book, however, we simplify the discussion of this topic by limiting the use of referential integrity actions and by supplementing those actions with design documentation. See the discussion around Figure 6-28.
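The required-parent case can be sketched as follows. The example uses Python's standard-library sqlite3 module as a stand-in DBMS; the DEPARTMENT and EMPLOYEE tables and the try_insert helper are hypothetical names used only for illustration. The foreign key column is declared NOT NULL and references the parent table, so an EMPLOYEE row with no department, or with a department that does not exist, is rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript(
    """
    CREATE TABLE DEPARTMENT (
        DepartmentName TEXT PRIMARY KEY
    );
    CREATE TABLE EMPLOYEE (
        EmployeeID     INTEGER PRIMARY KEY,
        -- NOT NULL makes the parent required; REFERENCES makes it valid.
        DepartmentName TEXT NOT NULL REFERENCES DEPARTMENT (DepartmentName)
    );
    INSERT INTO DEPARTMENT VALUES ('Accounting');
    """
)


def try_insert(employee_id, department_name):
    """Attempt to insert one EMPLOYEE row and report whether the DBMS accepted it."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute(
                "INSERT INTO EMPLOYEE VALUES (?, ?)",
                (employee_id, department_name),
            )
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


results = [
    try_insert(1, "Accounting"),  # parent exists
    try_insert(2, None),          # violates NOT NULL
    try_insert(3, "Marketing"),   # violates referential integrity
]
print(results)
```

No comparable declaration exists for a required child, which is why that case calls for triggers, application logic, or design documentation.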

Although the design for required children is complicated, it is important for students to learn. It also provides a reason for students to learn about triggers. In any case, the discussion of these topics is much simpler than it was in prior editions because of the use of the IE Crow’s Foot model and the use of ancillary design documentation.

David Kroenke is the creator of the semantic object model (SOM). The SOM is presented in Appendix G, “The Semantic Object Model.” The E-R data model is used everywhere else in the text.

Design Iteration 3: Database Redesign

Database redesign, the third iteration of database design, is both common and difficult. As stated in Chapter 8, information systems cause organizational change. New information systems give users new behaviors, and as users behave in new ways, they require changes in their information systems.

Database redesign is by nature complex. Depending on your students, you may wish to skip it, and you can do so without loss of continuity. Database redesign is presented after the discussion of SQL DDL and DML in Chapter 7, because it requires the use of advanced SQL. It also provides a practical reason to teach correlated subqueries and EXISTS/NOT EXISTS statements.

Active Use of a DBMS Product

We assume that the students will actively use a DBMS product. The only real question becomes “which one?” Realistically, most of us have four alternatives to consider: Microsoft Access, Microsoft SQL Server, Oracle Database, or MySQL. You can use any of those products with this text, and tutorials are presented for Microsoft Access 2010 (Appendix A), SQL Server 2008 R2 (Chapter 10), Oracle Database 11g (Chapter 10A), and MySQL 5.5 (Chapter 10B). Given the limitations of class time, it is probably necessary to pick and use just one of these products. You can often devote a portion of a lecture to discussing the characteristics of each, but it is usually best to limit student work to one of them. The possible exception to this is starting the course with Microsoft Access, and then switching to a more robust DBMS product later in the course.

Using Microsoft Access 2010

The primary advantage of Microsoft Access is accessibility. Most students already have a copy, and, if not, copies are easily obtained. Many students will have used Microsoft Access in their introductory or other classes. Appendix A, “Getting Started with Microsoft Access 2010,” is a tutorial on Microsoft Access 2010 for students who have not used it but who wish to use it with this book.

However, Microsoft Access has several disadvantages. First, as explained in Chapter 1, Microsoft Access is a combination application generator and DBMS. Microsoft Access confuses students because it confounds database processing with application development. Also, Microsoft Access 2010 hides SQL behind its query processor and makes SQL appear as an afterthought rather than a foundation. Furthermore, as discussed in Chapter 2, Microsoft Access 2010 does not correctly process some of the basic SQL-92 standard statements in its default setup. Finally, Microsoft Access 2010 does not support triggers. You can simulate triggers by trapping Windows events, but that technique is nonstandard and does not effectively communicate the nature of trigger processing.

Using SQL Server, Oracle Database, or MySQL

Choosing which of these products to use depends on your local situation. Oracle Database 11g, a superb enterprise-class DBMS product, is difficult to install and administer. However, if you have local staff to support your students, it can be an excellent choice. As shown in Chapter 10A, Oracle’s SQL Developer GUI tool (or SQL*Plus if you are dedicated to this beloved command-line tool) is a handy tool for learning SQL, triggers, and stored procedures. In our experience, students require considerable support to install Oracle on their own computers, and you may be better off using Oracle from a central server.

SQL Server 2008 R2, although probably not as robust as Oracle Database 11g, is easy to install on Windows machines, and it provides the capabilities of an enterprise-class DBMS product. The standard database administrator tool is the Microsoft SQL Server Management Studio GUI tool. As shown in Chapter 10, SQL Server 2008 R2 can be used to learn SQL, triggers, and stored procedures.

MySQL 5.5, discussed in Chapter 10B, is an open-source DBMS product that is receiving increased attention and market share. The capabilities of MySQL are continually being upgraded, and MySQL 5.5 supports stored procedures and triggers. MySQL also has an excellent GUI tool (the MySQL Workbench) and an excellent command-line tool (the MySQL Command Line Client). It is the easiest of the three products for students to install on their own computers. It also works with the Linux operating system, and is popular as part of the AMP (Apache–MySQL–PHP) package (known as WAMP on Windows and LAMP on Linux).

If the DBMS you use is not driven by local circumstances and you do have a choice, we recommend using SQL Server 2008 R2. It has all of the features of an enterprise-class DBMS product, and it is easy to install and use. Another option is to start with Microsoft Access 2010 if it is available, and switch to SQL Server 2008 R2 at Chapter 7. Chapters 1 and 2 and Appendix A are written specifically to support this approach. A variant is to use Microsoft Access 2010 as the development tool for forms and reports running against an SQL Server 2008 R2 database.