A flawed data design

If you were attentive in reading the previous pages, you will remember the breakdown of the data. There are one or more categories, and each category will contain many products. Each product will have many styles, and each style can have a number of substyles.

This might lead you to believe that a simple hierarchical structure of our tables would work just fine. Figure 10-7 shows the one-to-many relationships that would create this hierarchical effect.

categories: category_id, category, description

products: product_id, category_id, product, description, image_src

styles: style_id, product_id, style, description, image_src

substyles: substyle_id, style_id, substyle, description

Figure 10-7: Flawed catalog schema
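As a rough sketch, the hierarchy in Figure 10-7 could be declared as shown here. The example uses Python's built-in sqlite3 module with an in-memory database purely so that it runs anywhere. The column types and the constraints are assumptions made for illustration; they are not the MySQL definitions used by the application.

```python
import sqlite3

# Assumed column types; each child table carries a foreign key to its parent,
# which is what produces the one-to-many chain in the figure.
FLAWED_SCHEMA = """
CREATE TABLE categories (
    category_id INTEGER PRIMARY KEY,
    category    TEXT NOT NULL,
    description TEXT
);
CREATE TABLE products (
    product_id  INTEGER PRIMARY KEY,
    category_id INTEGER NOT NULL REFERENCES categories(category_id),
    product     TEXT NOT NULL,
    description TEXT,
    image_src   TEXT
);
CREATE TABLE styles (
    style_id    INTEGER PRIMARY KEY,
    product_id  INTEGER NOT NULL REFERENCES products(product_id),
    style       TEXT NOT NULL,
    description TEXT,
    image_src   TEXT
);
CREATE TABLE substyles (
    substyle_id INTEGER PRIMARY KEY,
    style_id    INTEGER NOT NULL REFERENCES styles(style_id),
    substyle    TEXT NOT NULL,
    description TEXT
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(FLAWED_SCHEMA)

# List the tables that were created.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(tables)
```

Note that a substyle can only be reached by way of a style, which in turn can only be reached by way of a product.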

Now consider what would happen if we were to add data to these tables. Let’s take the example of t-shirts. There is a category for t-shirts, and a number of products (different clever phrases stenciled on the t-shirts) for this category. Each product will come in a number of styles (colors), and each color will come in a number of substyles (sizes). Figure 10-8 shows what data in the hierarchical table form might look like. (Note that the tables have been simplified.)


categories (category_id, category): 1 t-shirts; 2 shoes

products (product_id, product, category_id): 1 Just Say Oops; 2 I Love Milk

styles (style_id, style, product_id)

substyles (substyle_id, substyle, style_id): 4 XL; 8 XL

Figure 10-8: Sample hierarchical data (only part of the figure's rows are legible)

Take a look at the substyles table in Figure 10-8. It is already starting to get a bit messy. Even with just a couple of t-shirts there is some repeating data. As you can see, size small (S) appears several times, and as we add to the catalog, more rows will be inserted and this table will get even messier.
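The repetition is easy to reproduce with a little test data. The sketch below again uses Python's sqlite3 module as a stand-in for MySQL. The two product names come from Figure 10-8, but the two colors are made up for the example, and the tables are simplified in the same way as the figure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products  (product_id INTEGER PRIMARY KEY, product TEXT, category_id INTEGER);
CREATE TABLE styles    (style_id INTEGER PRIMARY KEY, style TEXT, product_id INTEGER);
CREATE TABLE substyles (substyle_id INTEGER PRIMARY KEY, substyle TEXT, style_id INTEGER);
""")

products = ["Just Say Oops", "I Love Milk"]  # product names from Figure 10-8
colors = ["red", "blue"]                     # hypothetical styles, for illustration only
sizes = ["S", "M", "L", "XL"]                # the t-shirt sizes named in the text

for product in products:
    cur = conn.execute(
        "INSERT INTO products (product, category_id) VALUES (?, 1)", (product,))
    product_id = cur.lastrowid
    for color in colors:
        cur = conn.execute(
            "INSERT INTO styles (style, product_id) VALUES (?, ?)", (color, product_id))
        style_id = cur.lastrowid
        # In the hierarchical design, every color needs its own copy of every size.
        conn.executemany(
            "INSERT INTO substyles (substyle, style_id) VALUES (?, ?)",
            [(size, style_id) for size in sizes])

total_rows = conn.execute("SELECT COUNT(*) FROM substyles").fetchone()[0]
small_rows = conn.execute(
    "SELECT COUNT(*) FROM substyles WHERE substyle = 'S'").fetchone()[0]
print(total_rows, small_rows)  # 16 4
```

Two t-shirts in two colors already need 16 substyle rows, four of which say nothing more than "S". Every new product or color adds another full set of sizes.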

Thinking about the data a bit more carefully, you might notice something interesting: In the case of the t-shirts, the sizes are not dependent on the color of the t-shirts at all. If you think back to Chapter 1, you will remember that a lack of a dependency is a bad thing. So the above data really aren’t properly normalized. In fact, the sizes in which a t-shirt is available depend not on the color but on the category. All t-shirts (category_id=1) will come in S, M, L, or XL. Therefore, in the final schema there will be a relationship between the category table and the substyle table.
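One way to picture that relationship is to let each size row point at a category instead of a color. The sketch below is only an illustration of the idea, written with Python's sqlite3 module. The layout of the substyles table and the helper sizes_for_product are hypothetical, and the final schema may well be organized differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE categories (category_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE products   (product_id INTEGER PRIMARY KEY, product TEXT,
                         category_id INTEGER REFERENCES categories(category_id));
-- Hypothetical layout: sizes hang off the category, not off a color.
CREATE TABLE substyles  (substyle_id INTEGER PRIMARY KEY, substyle TEXT,
                         category_id INTEGER REFERENCES categories(category_id));

INSERT INTO categories VALUES (1, 't-shirts'), (2, 'shoes');
INSERT INTO products VALUES (1, 'Just Say Oops', 1), (2, 'I Love Milk', 1);
INSERT INTO substyles (substyle, category_id)
    VALUES ('S', 1), ('M', 1), ('L', 1), ('XL', 1);
""")

def sizes_for_product(product_id):
    """Return the sizes a product comes in, looked up through its category."""
    rows = conn.execute("""
        SELECT s.substyle
        FROM products p
        JOIN substyles s ON s.category_id = p.category_id
        WHERE p.product_id = ?
        ORDER BY s.substyle_id
    """, (product_id,))
    return [row[0] for row in rows]

print(sizes_for_product(2))  # ['S', 'M', 'L', 'XL']

small_rows = conn.execute(
    "SELECT COUNT(*) FROM substyles WHERE substyle = 'S'").fetchone()[0]
print(small_rows)  # 1
```

With the sizes stored once per category, adding another t-shirt or another color adds no rows to the substyles table.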

Tip

Test your schemas. Before you go live with your own applications, use some test data and see what happens. What seems right in theory could have some serious flaws in practice.

Tip

Before you make a single table in your database, work with pencil and paper to draw out your tables and relationships. Erasing a line there is a lot less trouble than deleting a column.