PostgreSQL, Second Edition

  by Korry Douglas; Susan Douglas


  Publisher: Sams
  Pub Date: July 26, 2005
  Print ISBN-10: 0-672-32756-2
  Print ISBN-13: 978-0-672-32756-8
  Pages: 1032

  

The Real Value in Free Software

These days, it seems that most discussion of open-source software centers around the idea that you should not have to tie your future to the whim of some giant corporation. People say that open-source software is better than proprietary software because it is developed and maintained by the users instead of a faceless company out to lighten your wallet.

I think that the real value in free software is education. I have never learned anything by reading my own code [1]. On the other hand, it's a rare occasion when I've looked at code written by someone else and haven't come away with another tool in my toolkit. People don't think alike. I don't mean that people disagree with each other; I mean that people solve problems in different ways. Each person brings a unique set of experiences to the table. Each person has his own set of goals and biases. Each person has his own interests. All of these things will shape the way you think about a problem. Often, I'll find myself in a heated disagreement with a colleague only to realize that we are each correct in our approach. Just because I'm right doesn't mean that my colleague can't be right as well.

  [1] Maybe I should say that I have never learned anything new by reading my own code. I've certainly looked at code that I've written and wondered what I was thinking at the time, learning that I'm not nearly as clever as I had remembered. Oddly enough, those who have read my code have reached a similar conclusion.

  

Open-source software is a great way to learn. You can learn about programming. You can learn about design. You can learn about debugging. Sometimes, you'll learn how not to design, code, or debug; but that's a valuable lesson, too. You can learn small things, like how to cache file descriptors on systems where file descriptors are a scarce and expensive resource, or how to use the select() function to implement fine-grained timers. You can learn big things, like how a query optimizer works, how to write a parser, or how to develop a good memory-management strategy.

  

PostgreSQL is a great example. I've been using databases for the last two decades. I've used most of the major commercial databases: Oracle, Sybase, DB2, and MS SQL Server. With each commercial database, there is a wall of knowledge between my needs and the vendor's need to protect his intellectual property. Until I started exploring open-source databases, I had an incomplete understanding of how a database works. Why was this particular feature implemented that way? Why am I getting poor performance when I try this? That's a neat feature; I wonder how they did that? Every commercial database tries to expose a small piece of its inner workings. The EXPLAIN statement will show you why the database makes its optimization decisions. But you only get to see what the vendor wants you to see. The vendor isn't trying to hide things from you (in most cases), but without complete access to the source code, they have to pick and choose how to expose information in a meaningful way. With open-source software, you can dive deep into the source code and pull out all the information you need. While writing this book, I've spent a lot of time reading through the PostgreSQL source code. I've added a lot of my own code to reveal more information so that I could explain things more clearly. I can't do that with a commercial database.

  

There are gems of brilliance in most open-source projects. In a well-designed, well-factored project, you will find designs and code that you can use in your own projects. Many open-source projects are starting to split their code into reusable libraries. The Apache Portable Runtime is a good example. The Apache Web server runs on many diverse platforms. The Apache development team saw the need for a layer of abstraction that would provide a portable interface to system functions such as shared memory and network access. They decided to factor the portability layer into a library separate from their main project. The result is the Apache Portable Runtime, a library of code that can be used in other open-source projects (such as PostgreSQL).

  

Some developers hate to work on someone else's code. I love working on code written by another developer; I always learn something from the experience. I strongly encourage you to dive into the PostgreSQL source code. You will learn from it. You might even decide to contribute to the project.

  —Korry Douglas

Introduction

PostgreSQL is a relational database with a long history. In the late 1970s, the University of California at Berkeley began development of PostgreSQL's ancestor, a relational database known as Ingres. Relational Technologies turned Ingres into a commercial product. Relational Technologies became Ingres Corporation and was later acquired by Computer Associates. Around 1986, Michael Stonebraker from UC Berkeley led a team that added object-oriented features to the core of Ingres; the new version became known as Postgres. Postgres was again commercialized, this time by a company named Illustra, which became part of the Informix Corporation. Andrew Yu and Jolly Chen added SQL support to Postgres in the mid-'90s. Prior versions had used a different, Postgres-specific query language known as Postquel. In 1996, many new features were added, including the MVCC transaction model, more adherence to the SQL92 standard, and many performance improvements. Postgres once again took on a new name: PostgreSQL.

Today, PostgreSQL is developed by an international group of open-source software proponents known as the PostgreSQL Global Development Group. PostgreSQL is an open-source product; it is not proprietary in any way. Red Hat has recently commercialized PostgreSQL, creating the Red Hat Database, but PostgreSQL itself will remain free and open source.

PostgreSQL Features

PostgreSQL has benefited well from its long history. Today, PostgreSQL is one of the most advanced database servers available. Here are a few of the features found in a standard PostgreSQL distribution:

  • Object-relational: In PostgreSQL, every table defines a class. PostgreSQL implements inheritance between tables (or, if you like, between classes). Functions and operators are polymorphic.

  • Standards compliant: PostgreSQL syntax implements most of the SQL92 standard and many features of SQL99. Where differences in syntax occur, they are most often related to features unique to PostgreSQL.

  • Open source: An international team of developers maintains PostgreSQL. Team members come and go, but the core members have been enhancing PostgreSQL's performance and feature set since at least 1996. One advantage to PostgreSQL's open-source nature is that talent and knowledge can be recruited as needed. The fact that this team is international ensures that PostgreSQL is a product that can be used productively in any natural language, not just English.

  • Transaction processing: PostgreSQL protects data and coordinates multiple concurrent users through full transaction processing. The transaction model used by PostgreSQL is based on multi-version concurrency control (MVCC). MVCC provides much better performance than you would find with other products that coordinate multiple users through table-, page-, or row-level locking.

  • Referential integrity: PostgreSQL implements complete referential integrity by supporting foreign and primary key relationships as well as triggers. Business rules can be expressed within the database rather than relying on an external tool.

  • Multiple procedural languages: Triggers and other procedures can be written in any of several procedural languages. Server-side code is most commonly written in PL/pgSQL, a procedural language similar to Oracle's PL/SQL. You can also develop server-side code in Tcl, Perl, even bash (the open-source Linux/Unix shell).

  • Multiple-client APIs: PostgreSQL supports the development of client applications in many languages. This book describes how to interface to PostgreSQL from C, C++, ODBC, Perl, PHP, Tcl/Tk, and Python.

  • Unique data types: PostgreSQL provides a variety of data types. Besides the usual numeric, string, and date types, you will also find geometric types, a Boolean data type, and data types designed specifically to deal with network addresses.

  • Extensibility: One of the most important features of PostgreSQL is that it can be extended. If you don't find something that you need, you can usually add it yourself. For example, you can add new data types, new functions and operators, and even new procedural and client languages. There are many contributed packages available on the Internet. For example, Refractions Research, Inc. has developed a set of geographic data types that can be used to efficiently model spatial (GIS) data.

What Versions Does This Book Cover?

The first edition of this book covered versions 7.1 through 7.3. In this edition, we've updated the basics and added coverage for the new features introduced in versions 7.4 and 8.0. Throughout the book, I'll be sure to let you know which features work only in new releases, and, in a few cases, I'll explain features that have been deprecated (that is, features that are obsolete). You can use this book to install, configure, tune, program, and manage PostgreSQL versions 7.1 through 8.0.

Fortunately, the PostgreSQL developers try very hard to maintain forward compatibility: new features tend not to break existing applications. This means that all the features discussed in this book should still be available and substantially similar in later versions of PostgreSQL. I have tried to avoid talking about features that have not been released at the time of writing; where I have mentioned future developments, I will point them out.

Who Is This Book For?

If you are already using PostgreSQL, you should find this book a useful guide to some of the features that you might be less familiar with. The first part of the book provides an introduction to SQL and PostgreSQL for the new user. You'll also find information that shows how to obtain and install PostgreSQL on a Unix/Linux host, as well as on Microsoft Windows.

If you are developing an application that will store data in PostgreSQL, the second part of this book will provide you with a great deal of information relating to PostgreSQL programming. You'll find information on both server-side and client-side programming in a variety of languages.

Every database needs occasional administrative work. The final part of the book should be of help if you are a PostgreSQL administrator, or a developer or user who needs to do occasional administration. You will also find information on how to secure your data against inappropriate use.

Finally, if you are trying to decide which database to use for your current project (or for future projects), this book should provide all the information you need to evaluate whether PostgreSQL will fit your needs.

What Topics Does This Book Cover?

PostgreSQL is a huge product. It's not easy to find the right mix of topics when you are trying to fit everything into a single book. This book is divided into three parts.

The first part, "General PostgreSQL Use," is an introduction and user's guide for PostgreSQL. Chapter 1, "Introduction to PostgreSQL and SQL," covers the basics: how to obtain and install PostgreSQL (if you are running Linux, chances are you already have PostgreSQL and it may be installed). The first chapter also provides a gentle introduction to SQL and discusses the sample database we'll be using throughout the book. Chapter 2, "Working with Data in PostgreSQL," describes the many data types supported by a standard PostgreSQL distribution; you'll learn how to enter values (literals) for each data type, what kind of data you can store with each type, and how those data types are combined into expressions. Chapter 3, "PostgreSQL SQL Syntax and Use," fills in some of the details we glossed over in the first two chapters. You'll learn how to create new databases, new tables and indexes, and how PostgreSQL keeps your data safe through the use of transactions. Chapter 4, "Performance," describes the PostgreSQL optimizer. I'll show you how to get information about the decisions made by the optimizer, how to decipher that information, and how to influence those decisions.

Part II, "Programming with PostgreSQL," is all about PostgreSQL programming. In Chapter 5, "Introduction to PostgreSQL Programming," we start off by describing the options you have when developing a database application that works with PostgreSQL (and there are a lot of options). Chapter 6, "Extending PostgreSQL," briefly describes how to extend PostgreSQL by adding new functions, data types, and operators. Chapter 7, "PL/pgSQL," describes the PL/pgSQL language. PL/pgSQL is a server-based procedural language. Code that you write in PL/pgSQL executes within the PostgreSQL server and has very fast access to data.

Each chapter in the remainder of the programming section deals with a client-based API. You can connect to a PostgreSQL server using a number of languages. I show you how to interface to PostgreSQL using C, C++, ecpg, ODBC, JDBC, Perl, PHP, Tcl/Tk, Python, and Microsoft's .NET. Chapters 8 through 18 all follow the same pattern: you develop a series of client applications in a given language. The first client application shows you how to establish a connection to the database (and how that connection is represented by the language in question). The next client adds error checking so that you can intercept and react to unusual conditions. The third client in each chapter demonstrates how to process SQL commands from within the client. The final client wraps everything together and shows you how to build an interactive query processor using the language being discussed. Even if you program in only one or two languages, I would encourage you to study the other chapters in this section. I think you'll find that looking at the same application written in a variety of languages will help you understand the philosophy followed by the PostgreSQL development team, and it's a great way to start learning a new language. Chapter 19, "Other Useful Programming Tools," introduces you to a few programming tools (and interfaces) that you might find useful: PL/Java and PL/Perl. I'll also show you how to use PostgreSQL inside of bash shell scripts.

The final part of this book (Part III, "PostgreSQL Administration") deals with administrative issues. The final six chapters of this book show you how to perform the occasional duties required of a PostgreSQL administrator. In the first two chapters, Chapter 20, "Introduction to PostgreSQL Administration," and Chapter 21, "PostgreSQL Administration," you'll learn how to start up, shut down, back up, and restore a server. In Chapter 22, "Internationalization and Localization," you will learn how PostgreSQL supports internationalization and localization. PostgreSQL understands how to store and process a variety of single-byte and multi-byte character sets including Unicode, ASCII, and Japanese, Chinese, Korean, and Taiwan EUC. In Chapter 23, "Security," I'll show you how to secure your data against unauthorized uses (and unauthorized users). In Chapter 24, "Replicating PostgreSQL with Slony," you'll learn how to replicate data with PostgreSQL's Slony replication system. Chapter 25, "Contributed Modules," introduces a few open-source projects that work well with PostgreSQL. I'll show you how to query a PostgreSQL database using XML, how to configure and use TSEARCH2 (a full-text indexing and search system), and how to install and use PgAdmin III, a graphical user interface specifically designed for PostgreSQL.

What's New in the Second Edition?

The first edition of this book hit the shelves in February 2003; at that time, the PostgreSQL developers had just released version 7.3.2. Release 7.4 was unleashed in November 2003. In January 2005, the PostgreSQL developers released version 8.0, a major release full of new features. We timed the second edition of this book to coincide with the release of version 8.0 (the book will appear in bookstores a few months after 8.0 hits the streets). In this edition, we've added coverage for all of the (major) new features in 7.3, 7.4, and 8.0, including:

  • Installing, securing, and managing PostgreSQL on Windows hosts

  • Tablespaces

  • Schemas

  • New quoting mechanisms for string values

  • New data types

  • The standards-conforming

  • Nested transactions (SAVEPOINTs)

  • The new PostgreSQL buffer manager

  • Auto-vacuum

  • Prepared-statement execution (the PREPARE/EXECUTE model)

  • Set-returning functions

  • Exception handling in PL/pgSQL

  • libpqxx, the new PostgreSQL interface for C++ clients

  • New features in ecpg (the embedded SQL processor for C)

  • New features in the ODBC, JDBC (Java), Perl, Python, PHP, and Tcl/Tk client interfaces

  • npgsql, the PostgreSQL .NET data provider

  • Other useful programming tools (PL/Java, pgbash, pgcurl, etc.)

  • Point-in-time recovery

  • Replication

  • Using PostgreSQL with XML

  • Full-text search

  

We hope you enjoy this book and find it useful. The PostgreSQL developers have done an incredible job of enhancing what was already a world-class database product. Now dig in.

Part I: General PostgreSQL Use

  1 Introduction to PostgreSQL and SQL

  2 Working with Data in PostgreSQL

  3 PostgreSQL SQL Syntax and Use

  4 Performance

Chapter 1. Introduction to PostgreSQL and SQL

PostgreSQL is an open-source, client/server, relational database. PostgreSQL offers a unique mix of features that compare well to the major commercial databases such as Sybase, Oracle, and DB2. One of the major advantages to PostgreSQL is that it is open source: you can see the source code for PostgreSQL. PostgreSQL is not owned by any single company. It is developed, maintained, broken, and fixed by a group of volunteer developers around the world. You don't have to buy PostgreSQL; it's free. You won't have to pay any maintenance fees (although you can certainly find commercial sources for technical support).

PostgreSQL offers all the usual features of a relational database plus quite a few unique features. PostgreSQL offers inheritance (for you object-oriented readers). You can add your own data types to PostgreSQL. (I know, some of you are probably thinking that you can do that in your favorite database.) Most database systems allow you to give a new name to an existing type. Some systems allow you to define composite types. With PostgreSQL, you can add new fundamental data types. PostgreSQL includes support for geometric data types such as point, line segment, box, polygon, and circle. PostgreSQL uses indexing structures that make geometric data types fast. PostgreSQL can be extended: you can build new functions, new operators, and new data types in the language of your choice. PostgreSQL is built around client/server architecture. You can build client applications in a number of different languages, including C, C++, Java, Python, Perl, TCL/Tk, and others. On the server side, PostgreSQL sports a powerful procedural language, PL/pgSQL (okay, the language is sportier than the name). You can add procedural languages to the server. You will find procedural languages supporting Perl, TCL/Tk, and even the bash shell.

A Sample Database

Throughout this book, I'll use a simple example database to help explain some of the more complex concepts. The sample database represents some of the data storage and retrieval requirements that you might encounter when running a video rental store. I won't pretend that the sample database is useful for any real-world scenarios; instead, this database will help us explore how PostgreSQL works and should illustrate many PostgreSQL features.

To begin with, the sample database (which is called movies) contains three kinds of records: customers, tapes, and rentals.

Whenever a customer walks into our imaginary video store, you will consult your database to determine whether you already know this customer. If not, you'll add a new record. What items of information should you store for each customer? At the very least, you will want to record the customer's name. You will want to ensure that each customer has a unique identifier: you might have two customers named "Danny Johnson," and you'll want to keep them straight. A name is a poor choice for a unique identifier because names might not be unique, and they can often be spelled in different ways. ("Was that Danny, Dan, or Daniel?") You'll assign each customer a unique customer ID. You might also want to store the customer's birth date so that you know whether he should be allowed to rent certain movies. If you find that a customer has an overdue tape rental, you'll probably want to phone him, so you'd better store the customer's phone number. In a real-world business, you would probably want to know much more information about each customer (such as his home address), but for these purposes, you'll keep your storage requirements to a minimum.

Next, you will need to keep track of the videos that you stock. Each video has a title and a duration; you'll store those. You might own several copies of the same movie and you will certainly have many movies with the same duration, so you can't use either one for a unique identifier. Instead, you'll assign a unique ID to each video.

Finally, you will need to track rentals. When a customer rents a tape, you will store the customer ID, tape ID, and rental date. Notice that you won't store the customer name with each rental. As long as you store the customer ID, you can always retrieve the customer name. You won't store the movie title with each rental, either; you can find the movie title by its unique identifier.

At a few points in this book, we might make changes to the layout of the sample database, but the basic shape will remain the same.
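To make the shape of the sample database a little more concrete, here is a minimal sketch of the three kinds of records. It is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in so that it runs without a PostgreSQL server, and the column names, data types, and sample values are hypothetical rather than the layout developed later in the book.

```python
import sqlite3

# In-memory SQLite database standing in for the "movies" database.
# Column names, types, and sample rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE customers (
    customer_id   INTEGER PRIMARY KEY,   -- unique ID; names are not unique
    customer_name TEXT NOT NULL,
    phone         TEXT,
    birth_date    TEXT
);
CREATE TABLE tapes (
    tape_id  TEXT PRIMARY KEY,           -- unique ID for each video
    title    TEXT NOT NULL,
    duration TEXT
);
CREATE TABLE rentals (
    tape_id     TEXT    NOT NULL REFERENCES tapes(tape_id),
    customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
    rental_date TEXT    NOT NULL
);
""")

# Two customers share a name; the customer ID keeps them straight.
conn.execute("INSERT INTO customers VALUES (1, 'Danny Johnson', '555-1111', '1970-03-01')")
conn.execute("INSERT INTO customers VALUES (2, 'Danny Johnson', '555-2222', '1985-11-20')")
conn.execute("INSERT INTO tapes VALUES ('AB-12345', 'The Godfather', '02:55')")

# A rental stores only the two IDs and the date, not the name or the title.
conn.execute("INSERT INTO rentals VALUES ('AB-12345', 2, '2005-07-26')")

# The name, phone number, and title are recovered through the IDs.
rows = conn.execute("""
    SELECT c.customer_name, c.phone, t.title
    FROM rentals r
    JOIN customers c ON c.customer_id = r.customer_id
    JOIN tapes t ON t.tape_id = r.tape_id
""").fetchall()
print(rows)
```

The join at the end is the reason a rental does not need its own copy of the customer name or the movie title: both can be looked up from the IDs whenever they are needed.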

Basic Database Terminology

Before we get into the interesting stuff, it might be useful to get acquainted with a few of the terms that you will encounter in your PostgreSQL life. PostgreSQL has a long history: you can trace its history back to 1977 and a program known as Ingres. A lot has changed in the relational database world since 1977. When you are breaking ground with a new product (as the Ingres developers were), you don't have the luxury of using standard, well-understood, and well-accepted terminology; you have to make it up as you go along. Many of the terms used by PostgreSQL have synonyms (or at least close analogies) in today's relational marketplace. In this section, I'll show you a few of the terms that you'll encounter in this book and try to explain how they relate to similar concepts in other database products.

  • Schema

  A schema is a named collection of tables (see table). A schema can also contain views, indexes, sequences, data types, operators, and functions. Other relational database products use the term catalog.

  • Database

  A database is a named collection of schemas. When a client application connects to a PostgreSQL server, it specifies the name of the database that it wants to access. A client cannot interact with more than one database per connection, but it can open any number of connections in order to access multiple databases simultaneously.

  • Command

  A command is a string that you send to the server in hopes of having the server do something useful. Some people use the word statement to mean command. The two words are very similar in meaning and, in practice, are interchangeable.

  • Query

  A query is a type of command that retrieves data from the server.

  • Table (relation, file, class)

  A table is a collection of rows. A table usually has a name, although some tables are temporary and exist only to carry out a command. All the rows in a table have the same shape (in other words, every row in a table contains the same set of columns). In other database systems, you may see the terms relation, file, or even class; these are all equivalent to a table.

  • Column (field, attribute)

  A column is the smallest unit of storage in a relational database. A column represents one piece of information about an object. Every column has a name and a data type. Columns are grouped into rows, and rows are grouped into tables. In Figure 1.1, the shaded area depicts a single column.

  Figure 1.1. A column (highlighted).

  The terms field and attribute have similar meanings.

  • Row (record, tuple)

  A row is a collection of column values. Every row in a table has the same shape (in other words, every row is composed of the same set of columns). If you are trying to model a real-world application, a row represents a real-world object. For example, if you are running an auto dealership, you might have a vehicles table. Each row in the vehicles table represents a car (or truck, or motorcycle, and so on). The kinds of information that you store are the same for all vehicles (that is, every car has a color, a vehicle ID, an engine, and so on). In Figure 1.2, the shaded area depicts a row.

  Figure 1.2. A row (highlighted).

  You may also see the terms record or tuple; these are equivalent to a row.

  • Composite type

  Starting with PostgreSQL version 8, you can create new data types that are composed of multiple values. For example, you could create a composite type named address that holds a street address, city, state/province, and postal code. When you create a table that contains a column of type address, you can store all four components in a single field. We discuss composite types in more detail in Chapter 2, "Working with Data in PostgreSQL."

  • Domain

  A domain defines a named specialization of another data type. Domains are useful when you need to ensure that a single data type is used in several tables. For example, you might define a domain named accountNumber that contains a single letter followed by four digits. Then you can create columns of type accountNumber in a general ledger accounts table, an accounts receivable customer table, and so on.

  • View

  A view is an alternative way to present a table (or tables). You might think of a view as a "virtual" table. A view is (usually) defined in terms of one or more tables. When you create a view, you are not storing more data; you are instead creating a different way of looking at existing data. A view is a useful way to give a name to a complex query that you may have to use repeatedly.

  • Client/server

  PostgreSQL is built around a client/server architecture. In a client/server product, there are at least two programs involved. One is a client and the other is a server. These programs may exist on the same host or on different hosts that are connected by some sort of network. The server offers a service; in the case of PostgreSQL, the server offers to store, retrieve, and change data. The client asks a server to perform work; a PostgreSQL client asks a PostgreSQL server to serve up relational data.

  • Client

  A client is an application that makes requests of the PostgreSQL server. Before a client application can talk to a server, it must connect to a postmaster (see postmaster) and establish its identity. Client applications provide a user interface and can be written in many languages. Chapters 8 through 19 will show you how to write a client application.

  • Server

  The PostgreSQL server is a program that services commands coming from client applications. The PostgreSQL server has no user interface: you can't talk to the server directly; you must use a client application.

  • Postmaster

  Because PostgreSQL is a client/server database, something has to listen for connection requests coming from a client application. That's what the postmaster does. When a connection request arrives, the postmaster creates a new server process in the host operating system.

  • Transaction

  A transaction is a collection of database operations that are treated as a unit. PostgreSQL guarantees that all the operations within a transaction complete or that none of them complete. This is an important property: it ensures that if something goes wrong in the middle of a transaction, changes made before the point of failure will not be reflected in the database. A transaction usually starts with a BEGIN command and ends with a COMMIT or ROLLBACK (see the next entries).

  • Commit

  A commit marks the successful end of a transaction. When you perform a commit, you are telling PostgreSQL that you have completed a unit of operation and that all the changes that you made to the database should become permanent.

  • Rollback

  A rollback marks the unsuccessful end of a transaction. When you roll back a transaction, you are telling PostgreSQL to discard any changes that you have made to the database (since the beginning of the transaction).

  • Index

  An index is a data structure that a database uses to reduce the amount of time it takes to perform certain operations. An index can also be used to ensure that duplicate values don't appear where they aren't wanted. I'll talk about indexes in Chapter 4, "Performance."

  • Tablespace

  A tablespace defines an alternative storage location where you can create tables and indexes. When you create a table (or index), you can specify the name of a tablespace; if you don't specify a tablespace, PostgreSQL creates all objects in the same directory tree. You can use tablespaces to distribute the workload across multiple disk drives.

  • Result set

  When you issue a query to a database, you get back a result set. The result set contains all the rows that satisfy your query. A result set may be empty.
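  To make a few of these terms concrete, here is a minimal sketch of a domain, a view, and a pair of transactions. The script only writes the SQL to a file (glossary_demo.sql, a name chosen for illustration); running the SQL requires a working server, which the rest of this chapter sets up. The table, column, and view names (gl_accounts, balance, overdrawn_accounts) are assumptions made for the example, and the CHECK constraint is just one possible way to express "a single letter followed by four digits."

```shell
#!/bin/sh
# Write an illustrative SQL script. Nothing here contacts a database.
cat > glossary_demo.sql <<'SQL'
-- Domain: one letter followed by four digits (hypothetical constraint).
CREATE DOMAIN accountNumber AS CHAR(5)
  CHECK (VALUE ~ '^[A-Za-z][0-9]{4}$');

-- A table (hypothetical) with a column of the domain type.
CREATE TABLE gl_accounts (
  account accountNumber PRIMARY KEY,
  balance NUMERIC(12,2) NOT NULL DEFAULT 0
);

-- View: a named query; no extra data is stored.
CREATE VIEW overdrawn_accounts AS
  SELECT account, balance FROM gl_accounts WHERE balance < 0;

-- Transaction ended by COMMIT: both inserts become permanent.
BEGIN;
INSERT INTO gl_accounts VALUES ('A1001', 100.00);
INSERT INTO gl_accounts VALUES ('B2002', -25.00);
COMMIT;

-- Transaction ended by ROLLBACK: the update is discarded.
BEGIN;
UPDATE gl_accounts SET balance = 0;
ROLLBACK;

-- Result set: the rows of the view that satisfy the query.
SELECT * FROM overdrawn_accounts;
SQL

echo "wrote glossary_demo.sql"
# With a running server and an existing database you could then try:
#   psql -d movies -f glossary_demo.sql
```

  Because the second transaction is rolled back, the final query should still see the balances inserted by the first one.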

Prerequisites

  Before I go much further, let's talk about installing PostgreSQL. Chapters 21, "PostgreSQL Administration," and 23, "Security," discuss PostgreSQL installation in detail, but I'll show you a typical installation procedure here. When you install PostgreSQL, you can start with prebuilt binaries or you can compile PostgreSQL from source code. In this chapter, I'll show you how to install PostgreSQL on a Linux host starting from prebuilt binaries. If you decide to install PostgreSQL from source code, many of the steps are the same. I'll show you how to build PostgreSQL from source code in Chapter 21.

  In older versions of PostgreSQL, you could run the PostgreSQL server on a Windows host but you had to install a Unix-like infrastructure (Cygwin) first: PostgreSQL wasn't a native Windows application. Starting with PostgreSQL version 8.0, the PostgreSQL server has been ported to the Windows environment as a native Windows application. Installing PostgreSQL on a Windows server is very simple; simply download and run the installer program. You do have a few choices to make, and we cover the entire procedure in Chapter 21.

Installing PostgreSQL Using an RPM

  The easiest way to install PostgreSQL is to use a prebuilt RPM package. RPM is the Red Hat Package Manager. It's a software package designed to install (and manage) other software packages. If you choose to install using some method other than RPM, consult the documentation that comes with the distribution you are using.

  PostgreSQL is distributed as a collection of RPM packages; you don't have to install all the packages to use PostgreSQL. Table 1.1 lists the RPM packages available as of release 7.4.5.

  Table 1.1. PostgreSQL RPM Packages as of Release 7.4.5

  Package               Description
  postgresql            Clients, libraries, and documentation
  postgresql-server     Programs (and data files) required to run a server
  postgresql-devel      Files required to create new client applications
  postgresql-jdbc       JDBC driver for PostgreSQL
  postgresql-tcl        Tcl client and PL/Tcl
  postgresql-python     PostgreSQL's Python library
  postgresql-test       Regression test suite for PostgreSQL
  postgresql-libs       Shared libraries for client applications
  postgresql-docs       Extra documentation not included in the postgresql base package
  postgresql-contrib    Contributed software

  Don't worry if you don't know which of these you need; I'll explain most of the packages in later chapters. You can start working with PostgreSQL by downloading the postgresql, postgresql-libs, and postgresql-server packages. The actual files (at the www.postgresql.org website) have names that include a version number: postgresql-7.4.5-2PGDG.i686.rpm, for example.
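  If you want to check which version a downloaded file contains, the file name can be taken apart with ordinary shell parameter expansion. The sketch below assumes the usual RPM naming pattern of name-version-release.architecture.rpm; the function name parse_rpm_name is made up for this example.

```shell
#!/bin/sh
# Split an RPM file name into name, version, release, and architecture.
# Assumes the pattern name-version-release.arch.rpm.
parse_rpm_name() {
    base="${1%.rpm}"        # drop the .rpm suffix
    arch="${base##*.}"      # text after the last dot
    nvr="${base%.*}"        # name-version-release
    release="${nvr##*-}"    # text after the last hyphen
    nv="${nvr%-*}"          # name-version
    version="${nv##*-}"     # text after the last remaining hyphen
    name="${nv%-*}"         # whatever is left is the package name
    echo "$name $version $release $arch"
}

parse_rpm_name postgresql-7.4.5-2PGDG.i686.rpm
# prints: postgresql 7.4.5 2PGDG i686
```

  Working from the right-hand end matters because package names such as postgresql-server contain hyphens of their own.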

  I strongly recommend creating an empty directory, and then downloading the PostgreSQL packages into that directory. That way you can install all the PostgreSQL packages with a single command. After you have downloaded the desired packages, use the rpm command to perform the installation procedure. You must have superuser privileges to install PostgreSQL. To install the PostgreSQL packages, cd into the directory that contains the package files and issue the following command:

    # rpm -ihv *.rpm

  The rpm command installs all the packages in your current directory. You should see results similar to what is shown in Figure 1.3.

  

Figure 1.3. Using the rpm command to install PostgreSQL.
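  The "one directory, one command" advice can be wrapped in a small helper that refuses to run when the directory contains no packages. This is only a sketch: the function name install_pg_rpms and the DRY_RUN switch are inventions for this example, not part of rpm.

```shell
#!/bin/sh
# Install every RPM found in one directory with a single rpm command.
# Set DRY_RUN=1 to print the command instead of running it.
install_pg_rpms() {
    pkgdir="$1"
    # Expand the glob; if nothing matches, "$1" stays the literal pattern.
    set -- "$pkgdir"/*.rpm
    if [ ! -e "$1" ]; then
        echo "no .rpm files found in $pkgdir" >&2
        return 1
    fi
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "rpm -ihv $*"
        return 0
    fi
    rpm -ihv "$@"    # needs superuser privileges
}

# Example (the directory name is hypothetical):
#   DRY_RUN=1 install_pg_rpms "$HOME/pgsql-rpms"
```

  Trying the dry run first shows exactly which files would be handed to rpm before anything is installed.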

  

  The RPM installer should have created a new user (named postgres) for your system. This user ID exists so that all database files accessed by PostgreSQL can be owned by a single user. Each RPM package is composed of many files. You can view the list of files installed for a given package using the rpm -ql command:

    # rpm -ql postgresql-server
    /etc/rc.d/init.d/postgresql
    /usr/bin/initdb
    /usr/bin/initlocation
    ...
    /var/lib/pgsql/data

    # rpm -ql postgresql-libs
    /usr/lib/libecpg.so.3
    /usr/lib/libecpg.so.3.2.0
    /usr/lib/libpgeasy.so.2
    ...
    /usr/lib/libpq.so.2.1

  At this point (assuming that everything worked), you have installed PostgreSQL on your system. Now it's time to create a database to play, er, work in. While you have superuser privileges, issue the following commands:

    # su - postgres
    bash-2.04$ echo $PGDATA
    /var/lib/pgsql/data
    bash-2.04$ initdb

  The first command (su - postgres) changes your identity from the OS superuser (root) to the PostgreSQL superuser (postgres). The second command (echo $PGDATA) shows you where the PostgreSQL data files will be created. The final command creates the two prototype databases (template0 and template1). You should get output that looks like that shown in Figure 1.4.

Figure 1.4. Creating the prototype databases using initdb.
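  If you script this step, it is worth checking that $PGDATA is set and that the directory has not already been initialized before calling initdb. The following is a minimal sketch under one assumption worth stating: it treats the presence of a PG_VERSION file (which initdb writes into the data directory) as the sign of an existing cluster. The function names are hypothetical.

```shell
#!/bin/sh
# Succeeds if the directory already looks like an initialized cluster.
cluster_initialized() {
    [ -n "$1" ] && [ -f "$1/PG_VERSION" ]
}

# Run initdb only when it is needed. Intended to be run as the postgres user.
init_cluster_if_needed() {
    datadir="${1:-${PGDATA:-}}"
    if [ -z "$datadir" ]; then
        echo "PGDATA is not set" >&2
        return 2
    fi
    if cluster_initialized "$datadir"; then
        echo "already initialized: $datadir"
        return 0
    fi
    if ! command -v initdb >/dev/null 2>&1; then
        echo "initdb not found on PATH" >&2
        return 3
    fi
    initdb -D "$datadir"
}

# Example, after "su - postgres":
#   init_cluster_if_needed
```

  Passing the directory with -D makes the target explicit instead of relying on the environment alone.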

  You now have two empty databases named template0 and template1. You really should not create new tables in either of these databases: a template database contains all the data required to create other databases. In other words, template0 and template1 act as prototypes for creating other databases. Instead, let's create a database that you can play in. First, start the postmaster process. The postmaster is a program that listens for connection requests coming from client applications. When a connection request arrives, the postmaster starts a new server process. You can't do anything in PostgreSQL without a postmaster. Figure 1.5 shows you how to get the postmaster started.

Figure 1.5. Creating a new database with createdb.

  


  After starting the postmaster, use the createdb command to create the movies database (this is also shown in Figure 1.5). Most of the examples in this book take place in the movies database.
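  Because the figure is not reproduced here, the following sketch shows one plausible sequence for starting the postmaster and creating the movies database. The exact options are assumptions rather than a transcript of the figure: the log file name postmaster.log is hypothetical, and the run helper with its DRY_RUN switch exists only so that the commands can be previewed without a server.

```shell
#!/bin/sh
# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Start the postmaster, create the movies database, and connect once.
start_and_create() {
    if [ -z "${PGDATA:-}" ]; then
        echo "PGDATA is not set" >&2
        return 2
    fi
    # -l names a log file; -w waits until the server has started.
    run pg_ctl -D "$PGDATA" -l "$PGDATA/postmaster.log" -w start &&
    run createdb movies &&
    run psql -d movies -c 'SELECT current_database();'
}

# Example, run as the postgres user:
#   DRY_RUN=1 start_and_create    # preview only
#   start_and_create              # really start and create
```

  The steps are chained with && so that createdb is never attempted when the postmaster failed to start.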

  

  Notice that I used the pg_ctl command to start the postmaster [1].

  [1] You can also arrange for the postmaster to start whenever you boot your computer, but the exact instructions vary depending on which operating system you are using. See the section titled "Arranging for PostgreSQL Startup and Shutdown" in Chapter 21.

  The pg_ctl program makes it easy to start and stop the postmaster. To see a full description of the pg_ctl command, enter the command pg_ctl --help. You will get the output shown in Figure 1.6.

  

Figure 1.6. pg_ctl options.
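  As a small illustration of the stop side, the sketch below asks pg_ctl for the server's status and requests a shutdown only when a postmaster is running. The function names are hypothetical, and the choice of the fast shutdown mode is an assumption made for the example rather than a recommendation from the text.

```shell
#!/bin/sh
# Succeeds when pg_ctl reports a running postmaster for the directory.
pg_is_running() {
    pg_ctl -D "$1" status >/dev/null 2>&1
}

# Stop the postmaster for a data directory, if one is running.
stop_postmaster() {
    datadir="${1:-${PGDATA:-}}"
    if [ -z "$datadir" ]; then
        echo "PGDATA is not set" >&2
        return 2
    fi
    if pg_is_running "$datadir"; then
        # -m fast disconnects clients instead of waiting for them to finish.
        pg_ctl -D "$datadir" -m fast stop
    else
        echo "no postmaster running for $datadir"
    fi
}

# Example, run as the postgres user:
#   stop_postmaster
```

  If pg_ctl is not installed at all, the status check fails and the function simply reports that nothing is running.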

  


  If you use a recent RPM file to install PostgreSQL, the two previous steps (initdb and pg_ctl start) can be automated. If you find a file named postgresql in the /etc/rc.d/init.d directory, you can use that shell script to initialize the database and start the postmaster. The /etc/rc.d/init.d/postgresql script can be invoked with any of the command-line options shown in Table 1.2.

  Table 1.2. /etc/rc.d/init.d/postgresql Options

  Option    Description