
This script should be installed in the mcb subdirectory of your Tomcat server's webapps directory, and you can invoke it as follows:

http://tomcat.snake.net:8080/mcb/show_tables.jsp

Like the PHP script shown in Recipe 16.3, the JSP script does not produce any Content-Type: header explicitly. The JSP engine produces a default header with a content type of text/html automatically.

16.5 Encoding Special Characters in Web Output

16.5.1 Problem

Certain characters are special in HTML pages and must be encoded if you want to display them literally. Because database content often contains these characters, scripts that include query results in web pages should encode those results to prevent browsers from misinterpreting the information.

16.5.2 Solution

Use the methods that are provided by your API for performing HTML-encoding and URL-encoding.

16.5.3 Discussion

HTML is a markup language: it uses certain characters as markers that have a special meaning. To include literal instances of these characters in a page, you must encode them so that they are not interpreted as having their special meanings. For example, < should be encoded as &lt; to keep a browser from interpreting it as the beginning of a tag. Furthermore, there are actually two kinds of encoding, depending on the context in which you use a character. One encoding is appropriate for general HTML text; another is used for text that is part of a URL in a hyperlink.

The MySQL show-tables scripts shown in Recipe 16.3 and Recipe 16.4 are simple demonstrations of how to produce web pages using programs. But with one exception, the scripts have a common failing: they take no care to properly encode special characters that occur in the information retrieved from the MySQL server. The exception is the JSP version of the script; the <c:out> tag used there handles encoding automatically, as we'll discuss shortly.

As it happens, I deliberately chose information to display that is unlikely to contain any special characters, so the scripts should work properly even in the absence of any encoding. However, in the general case, it's unsafe to assume that a query result will contain no special characters, and thus you must be prepared to encode it. Neglecting to do this often results in scripts that generate pages containing malformed HTML that displays incorrectly.

This section describes how to handle special characters, beginning with some general principles, and then discusses how each API implements encoding support. The API-specific examples show how to process information drawn from a database table, but they can be adapted to any content you include in a web page, no matter its source.
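To make the difference between the two kinds of encoding concrete, here is a minimal sketch in Python. It uses only the standard library (html.escape and urllib.parse.quote). The helper names html_encode, url_encode, and make_link, the /search path, and the sample values are assumptions chosen for illustration; they do not come from the recipe scripts.

```python
import html
from urllib.parse import quote


def html_encode(value):
    """Encode a value for use as general HTML text (hypothetical helper)."""
    # quote=True also converts " and ' so the result is safe in attributes
    return html.escape(str(value), quote=True)


def url_encode(value):
    """Encode a value for use inside a URL (hypothetical helper)."""
    # safe="" means even "/" is percent-encoded
    return quote(str(value), safe="")


def make_link(term):
    """Build a hyperlink: URL-encode the query value, HTML-encode the label."""
    # "/search" is an illustrative path, not one used by the recipes
    return '<a href="/search?q=%s">%s</a>' % (url_encode(term), html_encode(term))


if __name__ == "__main__":
    phrase = "Paragraphs begin and end with <p> & </p> tags."
    print(html_encode(phrase))
    print(url_encode("a&b c"))
    print(make_link("a&b c"))
```

The same value is encoded differently depending on where it goes: as &amp; in the visible text of the link, but as %26 inside the URL. Applying the wrong encoding for the context is just as much a mistake as applying none.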

16.5.4 General Encoding Principles