Do You Always Need to Encode Web Page Output?
If you know a value is legal in a particular context within a web page, you need not encode it. For example, if you obtain a value from an integer-valued column in a database table that cannot be NULL, it must necessarily be an integer. No HTML- or URL-encoding is needed to include the value in a web page, because digits are not special in HTML text or within URLs. On the other hand, suppose you solicit an integer value using a field in a web form. You might be expecting the user to provide an integer, but the user might be confused and enter an illegal value. You could handle this by displaying an error page that shows the value and explains that it's not an integer. But if the value contains special characters and you don't encode it, the page won't display the value properly, possibly confusing the user further.
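The form-field case can be sketched in a few lines of Python 3. This is only an illustration under stated assumptions: the function names parse_int_field and error_message are hypothetical, and html.escape from the standard library stands in for whatever HTML-encoding routine your API provides.

```python
import html

def error_message(raw_value):
    """Return an HTML paragraph reporting a non-integer form value."""
    # HTML-encode the user's input before echoing it back, so that
    # characters such as < and & are displayed rather than interpreted.
    safe = html.escape(raw_value, quote=True)
    return "<p>The value you entered (%s) is not an integer.</p>" % safe

def parse_int_field(raw_value):
    """Return (int_value, None) on success or (None, error_html) on failure."""
    try:
        return int(raw_value.strip()), None
    except ValueError:
        return None, error_message(raw_value)

print(parse_int_field("42"))
print(parse_int_field("<b>12</b>"))
```

A value that converts cleanly needs no encoding; only the rejected input, which may contain anything, is encoded before it is displayed.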
16.5.5 Encoding Special Characters Using Web APIs
The following encoding examples show how to pull values out of MySQL and perform both HTML-encoding and URL-encoding on them to generate hyperlinks. Each example reads a table named phrase that contains short phrases, using its contents to construct hyperlinks that point to a (hypothetical) script that searches for instances of the phrases in some other table. The table looks like this:
    mysql> SELECT phrase_val FROM phrase ORDER BY phrase_val;
    +--------------------------+
    | phrase_val               |
    +--------------------------+
    | are we "there" yet?      |
    | cats & dogs              |
    | rhinoceros               |
    | the whole > sum of parts |
    +--------------------------+
The goal here is to generate a list of hyperlinks using each phrase both as the hyperlink label (which requires HTML-encoding) and in the URL as a parameter to the search script (which requires URL-encoding). The resulting links look like this:

    <a href="/cgi-bin/mysearch.pl?phrase=are%20we%20%22there%22%20yet%3F">
    are we &quot;there&quot; yet?</a>
    <a href="/cgi-bin/mysearch.pl?phrase=cats%20%26%20dogs">
    cats &amp; dogs</a>
    <a href="/cgi-bin/mysearch.pl?phrase=rhinoceros">
    rhinoceros</a>
    <a href="/cgi-bin/mysearch.pl?phrase=the%20whole%20%3E%20sum%20of%20parts">
    the whole &gt; sum of parts</a>

The links produced by some APIs will look slightly different, because they encode spaces as + rather than as %20.
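The two space conventions can be seen side by side with Python 3's standard urllib.parse module. This is just a small illustration; the variable name phrase is chosen here for the example.

```python
from urllib.parse import quote, quote_plus

phrase = "cats & dogs"

# quote() encodes each space as %20
print(quote(phrase))

# quote_plus() encodes each space as +, the form-encoding convention
print(quote_plus(phrase))
```

Both forms are commonly accepted in a query string, so either style of link should work with the search script.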
16.5.5.1 Perl
The Perl CGI.pm module provides two methods, escapeHTML and escape, that handle HTML-encoding and URL-encoding. There are three ways to use these methods to encode a string $str:
• Invoke escapeHTML and escape as CGI class methods using a CGI:: prefix:

    use CGI;
    printf "%s\n%s\n", CGI::escape ($str), CGI::escapeHTML ($str);

• Create a CGI object and invoke escapeHTML and escape as object methods:

    use CGI;
    my $cgi = new CGI;
    printf "%s\n%s\n", $cgi->escape ($str), $cgi->escapeHTML ($str);

• Import the names explicitly into your script's namespace. In this case, neither a CGI object nor the CGI:: prefix is necessary and you can invoke the methods as standalone functions. The following example imports the two method names in addition to the set of standard names:

    use CGI qw(:standard escape escapeHTML);
    printf "%s\n%s\n", escape ($str), escapeHTML ($str);

I prefer the last alternative because it is consistent with the CGI.pm function call interface that you use for other imported method names. Just remember to include the encoding method names in the use CGI statement for any Perl script that requires them, or you'll get "undefined subroutine" errors when the script executes.
The following code reads the contents of the phrase table and produces hyperlinks from them using escapeHTML and escape:

    my $query = "SELECT phrase_val FROM phrase ORDER BY phrase_val";
    my $sth = $dbh->prepare ($query);
    $sth->execute ();
    while (my ($phrase) = $sth->fetchrow_array ())
    {
        # URL-encode the phrase value for use in the URL
        # HTML-encode the phrase value for use in the link label
        my $url = "/cgi-bin/mysearch.pl?phrase=" . escape ($phrase);
        my $label = escapeHTML ($phrase);
        print a ({-href => $url}, $label) . br () . "\n";
    }

16.5.5.2 PHP
In PHP, the htmlspecialchars and urlencode functions perform HTML-encoding and URL-encoding. They're used as follows:

    $query = "SELECT phrase_val FROM phrase ORDER BY phrase_val";
    $result_id = mysql_query ($query, $conn_id);
    if ($result_id)
    {
        while (list ($phrase) = mysql_fetch_row ($result_id))
        {
            # URL-encode the phrase value for use in the URL
            # HTML-encode the phrase value for use in the link label
            $url = "/mcb/mysearch.php?phrase=" . urlencode ($phrase);
            $label = htmlspecialchars ($phrase);
            printf ("<a href=\"%s\">%s</a><br />\n", $url, $label);
        }
        mysql_free_result ($result_id);
    }

16.5.5.3 Python
In Python, the cgi and urllib modules contain the relevant encoding methods. cgi.escape performs HTML-encoding and urllib.quote does URL-encoding:

    import cgi
    import urllib

    query = "SELECT phrase_val FROM phrase ORDER BY phrase_val"
    cursor = conn.cursor ()
    cursor.execute (query)
    for (phrase,) in cursor.fetchall ():
        # URL-encode the phrase value for use in the URL
        # HTML-encode the phrase value for use in the link label
        url = "/cgi-bin/mysearch.py?phrase=" + urllib.quote (phrase)
        label = cgi.escape (phrase, 1)
        print "<a href=\"%s\">%s</a><br />" % (url, label)
    cursor.close ()

The first argument to cgi.escape is the string to be HTML-encoded. By default, this function converts <, >, and & characters to their corresponding HTML entities. To tell cgi.escape also to convert double quotes to the &quot; entity, pass a second argument of 1, as shown in the example. This is especially important if you're encoding values to be placed into a double-quoted tag attribute.
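For readers on Python 3, where cgi.escape is no longer available (it was removed in Python 3.8) and urllib.quote has moved, the same encoding can be sketched with html.escape and urllib.parse.quote. This is a minimal illustration rather than the recipe's own code: the make_link helper and the in-memory PHRASES list are assumptions that stand in for the database query. Note that html.escape converts quotes by default, the opposite of cgi.escape.

```python
import html
from urllib.parse import quote

# Stand-in for the rows of the phrase table (assumption: no database here)
PHRASES = [
    'are we "there" yet?',
    "cats & dogs",
    "rhinoceros",
    "the whole > sum of parts",
]

def make_link(phrase, script="/cgi-bin/mysearch.py"):
    """Build one hyperlink, encoding the phrase for each context."""
    # URL-encode the phrase value for use in the URL
    url = script + "?phrase=" + quote(phrase)
    # HTML-encode the phrase value for use in the link label
    label = html.escape(phrase, quote=True)
    # The URL sits inside a double-quoted attribute, so HTML-encode it too
    return '<a href="%s">%s</a>' % (html.escape(url, quote=True), label)

for phrase in PHRASES:
    print(make_link(phrase) + "<br />")
```

The quote=True argument is spelled out only to make the attribute-safety point visible; it is already the default for html.escape.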
16.5.5.4 Java
The <c:out> JSTL tag automatically performs HTML-encoding for JSP pages. (Strictly speaking, it performs XML-encoding, but the set of characters affected is <, >, &, ", and ', which includes all those needed for HTML-encoding.) By using <c:out> to display text in a web page, you need not even think about converting special characters to HTML entities. If for some reason you want to suppress encoding, invoke <c:out> like this:

    <c:out value="value to display" escapeXml="false"/>

To URL-encode parameters for inclusion in a URL, use the <c:url> tag. Specify the URL string in the tag's value attribute, and include any parameter values and names in <c:param> tags in the body of the <c:url> tag. A parameter value can be given either in the value attribute of a <c:param> tag or in its body. Here's an example that shows both ways:

    <c:url var="urlStr" value="myscript.jsp">
        <c:param name="id" value="47"/>
        <c:param name="color">sky blue</c:param>
    </c:url>

This will URL-encode the values of the id and color parameters and add them to the end of the URL. The result is placed in an object named urlStr, which you can display as follows:

    <c:out value="${urlStr}"/>

The <c:url> tag does not encode special characters such as spaces in the string supplied in its value attribute. You must encode them yourself, so it's probably best just to avoid creating pages with spaces in their names, to avoid the likelihood that you'll need to refer to them.

The <c:out> and <c:url> tags can be used as follows to display entries from the phrase table:

    <sql:query var="rs" dataSource="${conn}">
        SELECT phrase_val FROM phrase ORDER BY phrase_val
    </sql:query>

    <c:forEach var="row" items="${rs.rows}">
        <%-- URL-encode the phrase value for use in the URL --%>
        <%-- HTML-encode the phrase value for use in the link label --%>
        <c:url var="urlStr" value="/mcb/mysearch.jsp">
            <c:param name="phrase" value="${row.phrase_val}"/>
        </c:url>
        <a href="<c:out value="${urlStr}"/>">
            <c:out value="${row.phrase_val}"/>
        </a>
        <br />
    </c:forEach>

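For comparison, the division of labor in <c:url> (parameter names and values are encoded, the base URL string is left alone) can be mimicked in Python 3 with urllib.parse.urlencode. This is only a rough analogue under stated assumptions: build_url is a hypothetical helper, and passing quote_via=quote selects %20 for spaces, which is not necessarily the form a JSTL implementation emits.

```python
from urllib.parse import quote, urlencode

def build_url(base, params):
    """Append URL-encoded parameters to base; base itself is NOT encoded."""
    if not params:
        return base
    # urlencode() encodes each name and value and joins the pairs with &
    return base + "?" + urlencode(params, quote_via=quote)

print(build_url("myscript.jsp", {"id": 47, "color": "sky blue"}))
```

As with <c:url>, any special characters in the base string remain the caller's responsibility.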
Chapter 17. Incorporating Query Results into Web Pages
Section 17.1. Introduction
Section 17.2. Displaying Query Results as Paragraph Text
Section 17.3. Displaying Query Results as Lists
Section 17.4. Displaying Query Results as Tables
Section 17.5. Displaying Query Results as Hyperlinks
Section 17.6. Creating a Navigation Index from Database Content
Section 17.7. Storing Images or Other Binary Data
Section 17.8. Retrieving Images or Other Binary Data
Section 17.9. Serving Banner Ads
Section 17.10. Serving Query Results for Download