Encoding Special Characters Using Web APIs

Do You Always Need to Encode Web Page Output?

If you know a value is legal in a particular context within a web page, you need not encode it. For example, if you obtain a value from an integer-valued column in a database table that cannot be NULL, it must necessarily be an integer. No HTML-encoding or URL-encoding is needed to include the value in a web page, because digits are not special in HTML text or within URLs.

On the other hand, suppose you solicit an integer value using a field in a web form. You might be expecting the user to provide an integer, but the user might be confused and enter an illegal value. You could handle this by displaying an error page that shows the value and explains that it's not an integer. But if the value contains special characters and you don't encode it, the page won't display the value properly, possibly confusing the user further.
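The form-input case can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from the recipe: the function name integer_error_message is hypothetical, and the sketch assumes Python 3, where html.escape performs HTML-encoding.

```python
import html


def integer_error_message(raw_value):
    """Return None if raw_value parses as an integer, else an HTML-safe message.

    Hypothetical helper, for illustration only.
    """
    try:
        int(raw_value)
    except ValueError:
        # The value is arbitrary user text, so HTML-encode it before
        # placing it in the error page.
        return "<p>The value %s is not an integer.</p>" % html.escape(raw_value)
    return None  # a legal integer contains nothing special in HTML


print(integer_error_message("42"))
print(integer_error_message('<b>"12"</b>'))
```

The second call shows why encoding matters: without html.escape, the markup in the illegal value would be interpreted by the browser rather than shown to the user.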

16.5.5 Encoding Special Characters Using Web APIs

The following encoding examples show how to pull values out of MySQL and perform both HTML-encoding and URL-encoding on them to generate hyperlinks. Each example reads a table named phrase that contains short phrases, using its contents to construct hyperlinks that point to a (hypothetical) script that searches for instances of the phrases in some other table. The table looks like this:

    mysql> SELECT phrase_val FROM phrase ORDER BY phrase_val;
    +--------------------------+
    | phrase_val               |
    +--------------------------+
    | are we "there" yet?      |
    | cats & dogs              |
    | rhinoceros               |
    | the whole > sum of parts |
    +--------------------------+

The goal here is to generate a list of hyperlinks using each phrase both as the hyperlink label (which requires HTML-encoding) and in the URL as a parameter to the search script (which requires URL-encoding). The resulting links look like this:

    <a href="/cgi-bin/mysearch.pl?phrase=are%20we%20%22there%22%20yet%3F">
    are we &quot;there&quot; yet?</a>
    <a href="/cgi-bin/mysearch.pl?phrase=cats%20%26%20dogs">
    cats &amp; dogs</a>
    <a href="/cgi-bin/mysearch.pl?phrase=rhinoceros">
    rhinoceros</a>
    <a href="/cgi-bin/mysearch.pl?phrase=the%20whole%20%3E%20sum%20of%20parts">
    the whole &gt; sum of parts</a>

The links produced by some APIs will look slightly different, because they encode spaces as + rather than as %20.
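The difference between the two space encodings can be seen with a short Python sketch. It assumes Python 3, where urllib.parse.quote encodes a space as %20 and urllib.parse.quote_plus encodes it as +. The phrase list is copied from the table above rather than read from MySQL.

```python
from urllib.parse import quote, quote_plus

# Phrases copied by hand from the phrase table (no database access here).
phrases = [
    'are we "there" yet?',
    "cats & dogs",
    "rhinoceros",
    "the whole > sum of parts",
]

for phrase in phrases:
    # Same phrase, two URL-encodings: %20 for spaces versus + for spaces.
    print(quote(phrase))
    print(quote_plus(phrase))
```

Both forms are decoded to the same parameter value by typical query-string parsers, so either style of link reaches the search script with the original phrase.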

16.5.5.1 Perl

The Perl CGI.pm module provides two methods, escapeHTML() and escape(), that handle HTML-encoding and URL-encoding. There are three ways to use these methods to encode a string $str:

• Invoke escapeHTML() and escape() as CGI class methods using a CGI:: prefix:

    use CGI;
    printf "%s\n%s\n", CGI::escape ($str), CGI::escapeHTML ($str);

• Create a CGI object and invoke escapeHTML() and escape() as object methods:

    use CGI;
    my $cgi = new CGI;
    printf "%s\n%s\n", $cgi->escape ($str), $cgi->escapeHTML ($str);

• Import the names explicitly into your script's namespace. In this case, neither a CGI object nor the CGI:: prefix is necessary and you can invoke the methods as standalone functions. The following example imports the two method names in addition to the set of standard names:

    use CGI qw(:standard escape escapeHTML);
    printf "%s\n%s\n", escape ($str), escapeHTML ($str);

I prefer the last alternative because it is consistent with the CGI.pm function call interface that you use for other imported method names. Just remember to include the encoding method names in the use CGI statement for any Perl script that requires them, or you'll get "undefined subroutine" errors when the script executes.

The following code reads the contents of the phrase table and produces hyperlinks from them using escapeHTML() and escape():

    my $query = "SELECT phrase_val FROM phrase ORDER BY phrase_val";
    my $sth = $dbh->prepare ($query);
    $sth->execute ();
    while (my ($phrase) = $sth->fetchrow_array ())
    {
        # URL-encode the phrase value for use in the URL
        # HTML-encode the phrase value for use in the link label
        my $url = "/cgi-bin/mysearch.pl?phrase=" . escape ($phrase);
        my $label = escapeHTML ($phrase);
        print a ({-href => $url}, $label) . br () . "\n";
    }

16.5.5.2 PHP

In PHP, the htmlspecialchars() and urlencode() functions perform HTML-encoding and URL-encoding. They're used as follows:

    $query = "SELECT phrase_val FROM phrase ORDER BY phrase_val";
    $result_id = mysql_query ($query, $conn_id);
    if ($result_id)
    {
        while (list ($phrase) = mysql_fetch_row ($result_id))
        {
            # URL-encode the phrase value for use in the URL
            # HTML-encode the phrase value for use in the link label
            $url = "/mcb/mysearch.php?phrase=" . urlencode ($phrase);
            $label = htmlspecialchars ($phrase);
            printf ("<a href=\"%s\">%s</a><br />\n", $url, $label);
        }
        mysql_free_result ($result_id);
    }

16.5.5.3 Python

In Python, the cgi and urllib modules contain the relevant encoding methods. cgi.escape() performs HTML-encoding and urllib.quote() does URL-encoding:

    import cgi
    import urllib

    query = "SELECT phrase_val FROM phrase ORDER BY phrase_val"
    cursor = conn.cursor ()
    cursor.execute (query)
    for (phrase,) in cursor.fetchall ():
        # URL-encode the phrase value for use in the URL
        # HTML-encode the phrase value for use in the link label
        url = "/cgi-bin/mysearch.py?phrase=" + urllib.quote (phrase)
        label = cgi.escape (phrase, 1)
        print "<a href=\"%s\">%s</a><br />" % (url, label)
    cursor.close ()

The first argument to cgi.escape() is the string to be HTML-encoded. By default, this function converts <, >, and & characters to their corresponding HTML entities. To tell cgi.escape() also to convert double quotes to the &quot; entity, pass a second argument of 1, as shown in the example. This is especially important if you're encoding values to be placed into a double-quoted tag attribute.
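The cgi.escape() and urllib.quote() calls above are Python 2 interfaces. In Python 3, cgi.escape() was removed (as of version 3.8) and urllib.quote() moved to urllib.parse.quote(). The following is a minimal sketch of the same encoding step using the Python 3 names. The make_link helper is hypothetical, and the phrase is supplied directly rather than fetched through a cursor. Note that html.escape() converts quotes by default, so the counterpart of omitting the second argument to cgi.escape() is passing quote=False.

```python
import html
from urllib.parse import quote


def make_link(phrase):
    """Build one hyperlink for a phrase (hypothetical helper)."""
    # URL-encode the phrase value for use in the URL
    url = "/cgi-bin/mysearch.py?phrase=" + quote(phrase)
    # HTML-encode the phrase value for use in the link label;
    # html.escape() converts <, >, &, and (by default) quote characters
    label = html.escape(phrase)
    return '<a href="%s">%s</a><br />' % (url, label)


sample = 'say "hi" & <go>'
print(html.escape(sample, quote=False))  # double quotes left alone
print(html.escape(sample))               # double quotes become &quot;
print(make_link('are we "there" yet?'))
```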

16.5.5.4 Java

The <c:out> JSTL tag automatically performs HTML-encoding for JSP pages. (Strictly speaking, it performs XML-encoding, but the set of characters affected is <, >, &, ", and ', which includes all those needed for HTML-encoding.) By using <c:out> to display text in a web page, you need not even think about converting special characters to HTML entities. If for some reason you want to suppress encoding, invoke <c:out> like this:

    <c:out value="value to display" escapeXml="false" />

To URL-encode parameters for inclusion in a URL, use the <c:url> tag. Specify the URL string in the tag's value attribute, and include any parameter values and names in <c:param> tags in the body of the <c:url> tag. A parameter value can be given either in the value attribute of a <c:param> tag or in its body. Here's an example that shows both ways:

    <c:url var="urlStr" value="myscript.jsp">
        <c:param name="id" value="47" />
        <c:param name="color">sky blue</c:param>
    </c:url>

This will URL-encode the values of the id and color parameters and add them to the end of the URL. The result is placed in an object named urlStr, which you can display as follows:

    <c:out value="${urlStr}" />

The <c:url> tag does not encode special characters such as spaces in the string supplied in its value attribute. You must encode them yourself, so it's probably best simply to avoid creating pages with spaces in their names, which removes the need to refer to them.

The <c:out> and <c:url> tags can be used as follows to display entries from the phrase table:

    <sql:query var="rs" dataSource="${conn}">
        SELECT phrase_val FROM phrase ORDER BY phrase_val
    </sql:query>

    <c:forEach var="row" items="${rs.rows}">
        <%-- URL-encode the phrase value for use in the URL --%>
        <%-- HTML-encode the phrase value for use in the link label --%>
        <c:url var="urlStr" value="/mcb/mysearch.jsp">
            <c:param name="phrase" value="${row.phrase_val}" />
        </c:url>
        <a href="<c:out value="${urlStr}" />">
            <c:out value="${row.phrase_val}" />
        </a>
        <br />
    </c:forEach>

Chapter 17. Incorporating Query Results into Web Pages

Section 17.1. Introduction
Section 17.2. Displaying Query Results as Paragraph Text
Section 17.3. Displaying Query Results as Lists
Section 17.4. Displaying Query Results as Tables
Section 17.5. Displaying Query Results as Hyperlinks
Section 17.6. Creating a Navigation Index from Database Content
Section 17.7. Storing Images or Other Binary Data
Section 17.8. Retrieving Images or Other Binary Data
Section 17.9. Serving Banner Ads
Section 17.10. Serving Query Results for Download