16.2 Basic Web Page Generation

16.2.1 Problem

You want to produce a web page from a script rather than by writing it manually.

16.2.2 Solution

Write a program that generates the page when it executes. This gives you more control over what gets sent to the client than when you write a static page, although it may also require that you provide more parts of the response. For example, it may be necessary to write the headers that precede the page body.

16.2.3 Discussion

HTML is a markup language (that's what the "ML" stands for) that consists of a mix of plain text to be displayed and special markup indicators or constructs that control how the plain text is displayed. Here is a very simple HTML page that specifies a title in the page header, and a body with a white background containing a single paragraph:

    <html>
    <head>
    <title>Web Page Title</title>
    </head>
    <body bgcolor="white">
    <p>Web page body.</p>
    </body>
    </html>

It's possible to write a script that produces the same page, but doing so differs in some ways from writing a static page. For one thing, you're writing in two languages at once. The script is written in your programming language, and the script itself writes HTML. Another difference is that you may have to produce more of the response that is sent to the client. When a web server sends a static page to a client, it actually sends a set of one or more header lines first that provide additional information about the page. For example, an HTML document would be preceded by a Content-Type: header that lets the client know what kind of information to expect, and a blank line that separates any headers from the page body:

    Content-Type: text/html

    <html>
    <head>
    <title>Web Page Title</title>
    </head>
    <body bgcolor="white">
    <p>Web page body.</p>
    </body>
    </html>

The web server produces header information automatically for static HTML pages. When you write a web script, you may need to provide the header information yourself. Some APIs such as PHP may send a content-type header automatically, but allow you to override the default type. For example, if your script sends a JPEG image to the client instead of an HTML page, you would want to have the script change the content type from text/html to image/jpeg.

Writing web scripts also differs from writing command-line scripts, both for input and for output.
On the input side, the information given to a web script is provided by the web server rather than by command-line arguments or by input that you type in. This means your scripts do not obtain input using read statements. Instead, the web server puts information into the execution environment of the script, which then extracts that information from its environment and acts on it.

On the output side, command-line scripts typically produce plain text output, whereas web scripts produce HTML, images, or whatever other type of content you need to send to the client. Output produced in a web environment usually must be highly structured, to ensure that it can be understood by the receiving client program.

Any API allows you to generate output by means of print statements, but some also offer special assistance for producing web pages. This support can be either built into the API itself or provided by means of special modules:

• For Perl scripts, a popular module is CGI.pm. It provides features for generating HTML markup, form processing, and more.
• PHP scripts are written as a mix of HTML and embedded PHP code. That is, you write HTML literally into the script, then drop into program mode whenever you need to generate output by executing code. The code is replaced by its output in the resulting page that is sent to the client.
• Python includes cgi and urllib modules that help perform web programming tasks.
• For Java, we'll write scripts according to the JSP specification, which allows scripting directives and code to be embedded into web pages. This is similar to the way PHP works.

Other page-generating packages are available besides those used in this book, some of which can have a marked effect on the way you use a language.
For example, Mason, embPerl, ePerl, and AxKit allow you to treat Perl as an embedded language, somewhat like the way that PHP works. Similarly, the mod_snake Apache module allows Python code to be embedded into HTML templates.

Before you can run any scripts in a web environment, your web server must be set up properly. Information about doing this for Apache and Tomcat is provided in Recipe 16.3 and Recipe 16.4, but conceptually, a web server typically runs a script in one of two ways. First, the web server can use an external program to execute the script. For example, it can invoke an instance of the Python interpreter to run a Python script. Second, if the server has been enabled with the appropriate language processing ability, it can execute the script itself. Using an external program to run scripts requires no special capability on the part of the web server, but is slower because it involves starting up a separate process, as well as some additional overhead for writing request information to the script and reading the results from it. If you embed a language processor into the web server, it can execute scripts directly, resulting in much better performance.

Like most web servers, Apache can run external scripts. It also supports the concept of extensions (modules) that become part of the Apache process itself (either by being compiled in or dynamically loaded at runtime). One common use of this feature is to embed language processors into the server to accelerate script execution. Perl, PHP, and Python scripts can be executed either way. Like command-line scripts, externally executed web scripts are written as executable files that begin with a #! line specifying the pathname of the appropriate language interpreter. Apache uses the pathname to determine which interpreter runs the script.
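
As a rough illustration of an externally executed script, the following Python sketch begins with a #! line, reads its input from the environment that the web server sets up, and writes a Content-Type: header, a blank line, and an HTML page. It is a minimal sketch, not a definitive implementation: the interpreter path in the #! line is an assumption about your system, QUERY_STRING is the standard CGI environment variable for the part of the URL after "?", and build_response and the name parameter are hypothetical names introduced only for this example. The sketch uses urllib.parse and html rather than the cgi module, which was removed from the standard library in Python 3.13.

```python
#!/usr/bin/env python3
# Sketch of an externally executed (CGI-style) web script.
# The interpreter path above is an assumption; adjust it for your system.
import html
import os
import sys
from urllib.parse import parse_qs


def build_response(environ):
    """Return the complete response text: header, blank line, page body."""
    # The web server passes request information through the environment.
    # QUERY_STRING is the standard CGI variable holding the text after "?".
    params = parse_qs(environ.get("QUERY_STRING", ""))
    # "name" is a hypothetical parameter used only for this illustration.
    name = params.get("name", ["visitor"])[0]

    body = "\n".join([
        "<html>",
        "<head>",
        "<title>Web Page Title</title>",
        "</head>",
        '<body bgcolor="white">',
        # Escape request data before placing it in the HTML output.
        "<p>Web page body for %s.</p>" % html.escape(name),
        "</body>",
        "</html>",
    ])
    # Header line, then the blank line that separates headers from the body.
    return "Content-Type: text/html\n\n" + body + "\n"


if __name__ == "__main__":
    sys.stdout.write(build_response(os.environ))
```

Keeping the page construction in a function that takes the environment as an argument makes it possible to try the script from the command line or from a test, without a web server, by passing an ordinary dictionary in place of os.environ.
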
Alternatively, you can extend Apache using modules such as mod_perl for Perl, mod_php for PHP, and mod_python or mod_snake for Python. This gives Apache the ability to directly execute scripts written in those languages.

For Java JSP scripts, the scripts are compiled into Java servlets and run inside a process known as a servlet container. This is similar to the embedded-interpreter approach in the sense that the scripts are run by a server process that manages them, rather than by starting up an external process for each script. The first time a JSP page is requested by a client, the container compiles it into a servlet in the form of executable Java byte code, then loads it and runs it. The container caches the byte code, so subsequent requests for the script run directly with no compilation phase. If you modify the script, the container notices this when the next request arrives, recompiles the script into a new servlet, and reloads it. The JSP approach provides a significant advantage over writing servlets directly, because you don't have to compile code yourself or handle servlet loading and unloading. Tomcat can handle the responsibilities of both the servlet container and of the web server that communicates with the container.

If you run multiple servers on the same host, they must listen for requests on different port numbers. In a typical configuration, Apache listens on the default HTTP port (80) and Tomcat listens on another port such as 8080. The examples here use server hostnames of apache.snake.net and tomcat.snake.net to represent URLs for scripts processed using Apache and Tomcat. These may or may not map to the same physical machine, depending on your DNS settings, so the examples use a different port (8080) for Tomcat.
Typical forms for URLs that you'll see in this book are as follows:

    http://apache.snake.net/cgi-bin/my_perl_script.pl
    http://apache.snake.net/cgi-bin/my_python_script.py
    http://apache.snake.net/mcb/my_php_script.php
    http://tomcat.snake.net:8080/mcb/my_jsp_script.jsp

You'll need to change the hostname and port number appropriately for pages served by your own servers.
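
To make the host and port parts of these URLs concrete, the following Python sketch splits each URL form into hostname, port, and path, and shows one way to substitute your own server's hostname and port. It is only an illustration: describe and retarget are hypothetical helper names, and www.example.com is a placeholder hostname, not a server used in this book.

```python
from urllib.parse import urlsplit

# The four URL forms listed in the text.
URLS = [
    "http://apache.snake.net/cgi-bin/my_perl_script.pl",
    "http://apache.snake.net/cgi-bin/my_python_script.py",
    "http://apache.snake.net/mcb/my_php_script.php",
    "http://tomcat.snake.net:8080/mcb/my_jsp_script.jsp",
]

DEFAULT_HTTP_PORT = 80  # used when a URL names no explicit port


def describe(url):
    """Return (hostname, port, path), filling in port 80 when none is given."""
    parts = urlsplit(url)
    port = parts.port if parts.port is not None else DEFAULT_HTTP_PORT
    return parts.hostname, port, parts.path


def retarget(url, hostname, port=None):
    """Return url with its host (and optional port) replaced."""
    parts = urlsplit(url)
    netloc = hostname if port is None else "%s:%d" % (hostname, port)
    return parts._replace(netloc=netloc).geturl()


if __name__ == "__main__":
    for url in URLS:
        print(describe(url))
    # Placeholder hostname; substitute the name of your own server.
    print(retarget(URLS[3], "www.example.com", 8080))
```

The Apache URLs carry no port number, so describe reports the default HTTP port for them, whereas the Tomcat URL names port 8080 explicitly.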

16.3 Using Apache to Run Web Scripts