Chapter 12: Threaded Discussion 317

TABLE 12-2 PROBLEMATIC TOPICS

topic_id  topic_date  topic_author  topic_subject    topic_text
2         08/20/2000  Ellen         Re: Snacks Rule  You betcha
                      Erners        Re: Snacks Rule  Indeed

Now let’s move away from this ill-considered idea and move toward Brad’s sound plan. Think about what information needs to be stored for each post to the mailing list. Start with the obvious stuff. You need a column that stores the subject of the thread (for instance, “Nachos, food of the gods”), one that stores the author’s name, and one that records the date the item was posted. So the table starts with these columns. I’ve thrown in some sample information in Table 12-3 and an auto_increment primary key just to keep it clear.

TABLE 12-3 START OF A USEABLE TABLE

post_id  Subject               Author  Date
1        Nachos rule           Jay
2        Cheetos are the best  Brad
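As a rough sketch of this starting point, the columns described above could be created and filled as follows. This uses Python's built-in sqlite3 module as a stand-in for MySQL (SQLite spells the option AUTOINCREMENT rather than auto_increment), and the table name posts and the column name posted are assumptions made for illustration. The dates are left empty because the sample dates are not legible in Table 12-3.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")

# Hypothetical table holding the columns described in the text.
conn.execute(
    """
    CREATE TABLE posts (
        post_id INTEGER PRIMARY KEY AUTOINCREMENT,
        subject TEXT NOT NULL,
        author  TEXT NOT NULL,
        posted  TEXT            -- date the item was posted
    )
    """
)

# Sample rows from Table 12-3; post_id is assigned automatically.
conn.executemany(
    "INSERT INTO posts (subject, author) VALUES (?, ?)",
    [("Nachos rule", "Jay"), ("Cheetos are the best", "Brad")],
)

rows = conn.execute(
    "SELECT post_id, subject, author FROM posts ORDER BY post_id"
).fetchall()
print(rows)
```

The database hands out post_id values 1 and 2 on its own, which is all the auto-incrementing key is there to do.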

But of course this isn’t enough. Somehow there needs to be a way to track the ancestry and lineage of any specific post. (Look again at Figure 12-1 if you are not sure what I mean.) So how are you going to be able to do this? If you are looking to track the ancestry of any particular thread, it would probably make sense to add a field that indicates the post that started the thread, which we’re calling the root. Take a close look at Table 12-4. Start with the first row. Here the root_id is the same as the post_id. Now look at the third row. Here the root_id (1) matches the post_id of the first row. So you know that the thread to which row three belongs started with post_id 1, “Nachos rule.” Similarly, row 6 must belong to the thread started by row 2.

318 Part IV: Not So Simple Applications

TABLE 12-4 A MORE COMPLETE TABLE

post_id  root_id  Subject                   Author  Date
1        1        Nachos rule               Jay
2        2        Cheetos are the best      Ed      3/12/2000
3        1        Re: Nachos rule           Don
4        1        Re: Nachos rule           Bill
5        5        What about cookies        Evany
6        2        Re: Cheetos are the best  Ed      3/13/2000
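To see what the root_id column buys you, here is a small sketch that loads the rows of Table 12-4 and pulls out every post in one thread. It uses Python's sqlite3 module as a stand-in for MySQL; the table name posts and the helper thread_posts are hypothetical names chosen for this example, and the date column is omitted for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE posts (
        post_id INTEGER PRIMARY KEY,
        root_id INTEGER NOT NULL,   -- post_id of the post that started the thread
        subject TEXT,
        author  TEXT
    )
    """
)

# Rows from Table 12-4.
conn.executemany(
    "INSERT INTO posts VALUES (?, ?, ?, ?)",
    [
        (1, 1, "Nachos rule", "Jay"),
        (2, 2, "Cheetos are the best", "Ed"),
        (3, 1, "Re: Nachos rule", "Don"),
        (4, 1, "Re: Nachos rule", "Bill"),
        (5, 5, "What about cookies", "Evany"),
        (6, 2, "Re: Cheetos are the best", "Ed"),
    ],
)


def thread_posts(conn, root_id):
    """Return (post_id, subject) for every post in the thread with this root."""
    return conn.execute(
        "SELECT post_id, subject FROM posts WHERE root_id = ? ORDER BY post_id",
        (root_id,),
    ).fetchall()


nachos_thread = thread_posts(conn, 1)
print(nachos_thread)
```

Note that this tells you which posts belong to a thread, but not which post each one is replying to.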

Now look at rows 1, 2, and 5. Notice that in these rows the post_id and root_id are identical. At this point you can probably guess that whenever these two are the same, it indicates a root-level subject. Easy enough, right? The following SQL statement would retrieve all of the root-level posts:

    select * from topics where root_id=post_id

However, in this application, you will see us using a self join to get root-level topics.

    select distinct current.topic_id, current.parent_id,
        current.root_id, current.name, current.description,
        current.author, current.author_host,
        current.create_dt, current.modify_dt
    from topics current, topics child
    where current.topic_id = child.root_id

This join has the same effect: it returns every row whose topic_id appears as the root_id of some row, and because each root-level topic carries its own id as its root_id, those are exactly the rows where the topic_id and root_id columns are the same. We use it here because, even though it’s a little slower, it’s more flexible and is easier to adapt in the event there are changes to the system.
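If you want to convince yourself that the two approaches agree, the following sketch runs both against the sample data from Table 12-4. It is only an illustration: it uses Python's sqlite3 module rather than MySQL, trims the topics table down to three columns, and shortens the alias current to cur as a precaution, since CURRENT is a keyword in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE topics (
        topic_id INTEGER PRIMARY KEY,
        root_id  INTEGER NOT NULL,
        name     TEXT
    )
    """
)
conn.executemany(
    "INSERT INTO topics VALUES (?, ?, ?)",
    [
        (1, 1, "Nachos rule"),
        (2, 2, "Cheetos are the best"),
        (3, 1, "Re: Nachos rule"),
        (4, 1, "Re: Nachos rule"),
        (5, 5, "What about cookies"),
        (6, 2, "Re: Cheetos are the best"),
    ],
)

# Direct comparison: a root-level topic is its own root.
simple = conn.execute(
    "SELECT topic_id FROM topics WHERE root_id = topic_id ORDER BY topic_id"
).fetchall()

# Self join: keep each topic that some row names as its root.
joined = conn.execute(
    """
    SELECT DISTINCT cur.topic_id
    FROM topics cur, topics child
    WHERE cur.topic_id = child.root_id
    ORDER BY cur.topic_id
    """
).fetchall()

print(simple)
print(joined)
```

Both queries return topics 1, 2, and 5. The distinct keyword matters in the join version: topic 1 is the root of three rows, so without it that topic would come back three times.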

Now that you’ve added a root_id field to the table you should know the beginning of a thread. But how can you get all the posts that came between the original post and the one you’re interested in? Initially you may think that it would be prudent to add a column that lists the ancestors. You could call the column ancestors and in it you’d have a listing of topic_ids. It might contain a string like “1, 6, 9, 12”. This would be a very, very bad idea. Why, you ask? Well, the most important reason is that you should never put multiple values in a single field; you’ll open yourself up to all kinds of hassles.
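One of those hassles is easy to demonstrate. The sketch below is an assumed illustration, not part of the application: it builds a hypothetical topics table with an ancestors string column (using Python's sqlite3 module in place of MySQL) and then tries the obvious way of finding every descendant of topic 6. The topic_ids 15 and 20 and the second ancestors string are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table using the ill-advised ancestors column.
conn.execute(
    "CREATE TABLE topics (topic_id INTEGER PRIMARY KEY, ancestors TEXT)"
)
conn.executemany(
    "INSERT INTO topics VALUES (?, ?)",
    [
        (15, "1, 6, 9, 12"),   # really is a descendant of topic 6
        (20, "2, 16, 19"),     # has nothing to do with topic 6
    ],
)

# Naive attempt to find all descendants of topic 6.
matches = conn.execute(
    "SELECT topic_id FROM topics WHERE ancestors LIKE '%6%' ORDER BY topic_id"
).fetchall()
print(matches)
```

The search returns topic 20 as well as topic 15, because the string “16” contains a “6”. You can patch around this with more elaborate patterns, but the database can no longer treat the ids as numbers, join on them, or index them usefully.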