16.5.4 General Encoding Principles
One form of encoding applies to characters that are used in writing HTML constructs; another applies to text that is included in URLs. It's important to understand this distinction so that you don't encode text inappropriately. Note too that encoding text for inclusion in a web page is an entirely different issue from encoding special characters in data values for inclusion in a SQL statement. The latter issue is discussed in Recipe 2.8.

16.5.4.1 Encoding characters that are special in HTML
HTML markup uses < and > characters to begin and end tags, & to begin special entity names (such as &nbsp; to signify a non-breaking space), and " to quote attribute values in tags (such as <p align="left">). Consequently, to display literal instances of these characters, you must encode them as HTML entities so that browsers or other clients understand your intent. To do this, convert <, >, &, and " to the corresponding HTML entity designators &lt; (less than), &gt; (greater than), &amp; (ampersand), and &quot; (quote).

Suppose you want to display the following string literally in a web page:

    Paragraphs begin and end with <p> & </p> tags.

If you send this text to the client browser exactly as shown, the browser will misinterpret it. (The <p> and </p> tags will be taken as paragraph markers and the & may be taken as the beginning of an HTML entity designator.) To display the string the way you intend, the special characters should be encoded as the &lt;, &gt;, and &amp; entities:

    Paragraphs begin and end with &lt;p&gt; &amp; &lt;/p&gt; tags.

The principle of encoding text this way is also useful within tags. For example, HTML tag attribute values usually are enclosed within double quotes, so it's important to perform HTML-encoding on attribute values. Suppose you want to include a text input box in a form, and you want to provide an initial value of Rich "Goose" Gossage to be displayed in the box. You cannot write that value literally in the tag like this:

    <input type="text" name="player_name" value="Rich "Goose" Gossage">

The problem here is that the double-quoted value attribute includes internal double quotes, which makes the <input> tag malformed.
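The entity conversion described above can be sketched in a few lines of Python. This is only an illustration, not code from the recipe: html.escape() is part of Python's standard library, while the variable names are made up for the example. The same call also handles the attribute value that contains internal double quotes.

```python
import html

# The example string that should appear literally in the page.
text = "Paragraphs begin and end with <p> & </p> tags."

# html.escape() converts &, <, and > to entities. With quote=True (the
# default) it also converts " to &quot; and ' to &#x27;.
encoded_text = html.escape(text)
print(encoded_text)

# An attribute value with internal double quotes (hypothetical variable name).
player_name = 'Rich "Goose" Gossage'
input_tag = '<input type="text" name="player_name" value="{}">'.format(
    html.escape(player_name, quote=True)
)
print(input_tag)
```

The first line printed is the encoded paragraph text, and the second is a well-formed input tag whose value attribute contains &quot; entities instead of raw double quotes.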
The proper way to write the tag is to encode the internal double quotes:

    <input type="text" name="player_name" value="Rich &quot;Goose&quot; Gossage">

When a browser receives this text, it decodes the &quot; entities back to " characters and interprets the value attribute properly.

16.5.4.2 Encoding characters that are special in URLs
URLs for hyperlinks that occur within HTML pages have their own syntax, and their own encoding. This encoding applies to attributes within several tags:

    <a href="URL">
    <img src="URL">
    <form action="URL">
    <frame src="URL">

Many characters have special meaning within URLs, such as :, /, ?, =, &, and ;. The following URL contains some of these characters:

    http://apache.snake.net/myscript.php?id=428&name=Gandalf

Here the : and / characters segment the URL into components, the ? character indicates that parameters are present, and the & character separates the parameters, each of which is specified as a name=value pair. (The ; character is not present in the URL just shown, but commonly is used instead of & to separate parameters.) If you want to include any of these characters literally within a URL, you must encode them to prevent the browser from interpreting them with their usual special meaning.

Other characters such as spaces require special treatment as well. Spaces are not allowed within a URL, so if you want to reference a page named my home page.html on the site apache.snake.net, the URL in the following hyperlink won't work:

    <a href="http://apache.snake.net/my home page.html">My Home Page</a>

URL-encoding for special and reserved characters is performed by converting each such character to % followed by two hexadecimal digits representing the character's ASCII code. For example, the ASCII value of the space character is 32 decimal, or 20 hexadecimal, so you'd write the preceding hyperlink like this:

    <a href="http://apache.snake.net/my%20home%20page.html">My Home Page</a>

Sometimes you'll see spaces encoded as + in URLs. This too is legal in the parameter part of a URL, where it follows the convention used for submitted form data; elsewhere in a URL, + stands for a literal plus sign.

16.5.4.3 Encoding interactions
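Before looking at how the two encodings interact, the URL rule from the previous subsection can be checked with a short Python sketch. It assumes the standard library's urllib.parse module; the variable names are hypothetical and chosen only for this illustration.

```python
from urllib.parse import quote, quote_plus

page_name = "my home page.html"

# The space character is ASCII 32 decimal, which is 20 in hexadecimal.
space_hex = format(ord(" "), "02X")

# quote() replaces each character that may not appear literally with
# % followed by two hexadecimal digits; by default it leaves "/" alone.
url = "http://apache.snake.net/" + quote(page_name)
print(url)  # http://apache.snake.net/my%20home%20page.html

# Reserved characters are encoded the same way when they are meant
# literally; safe="" tells quote() to encode "/" as well.
reserved = {ch: quote(ch, safe="") for ch in ":/?=&;"}
print(reserved)

# quote_plus() follows the form-data convention and writes spaces as +.
plus_form = quote_plus(page_name)
print(plus_form)  # my+home+page.html
```

Note that quote() is applied only to the page name, not to the whole URL; encoding the complete string would also encode the : and / characters that are supposed to keep their special meaning.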
Be sure to encode information properly for the context in which you're using it. Suppose you want to create a hyperlink to trigger a search for items matching a search term, and you want the term itself to appear as the link label that is displayed in the page. In this case, the term appears as a parameter in the URL, and also as HTML text between the <a> and </a> tags. If the search term is "cats & dogs", the unencoded hyperlink construct looks like this:

    <a href="/cgi-bin/myscript?term=cats & dogs">cats & dogs</a>

That is incorrect because & is special in both contexts and the spaces are special in the URL. The link should be written like this instead:

    <a href="/cgi-bin/myscript?term=cats%20%26%20dogs">cats &amp; dogs</a>

Here, & is HTML-encoded as &amp; for the link label, and is URL-encoded as %26 for the URL, which also includes spaces encoded as %20.

Granted, it's a pain to encode text before writing it to a web page, and sometimes you know enough about a value that you can skip the encoding (see the sidebar "Do You Always Need to Encode Web Page Output?"). But encoding is the safe thing to do most of the time. Fortunately, most APIs provide functions to do the work for you. This means you need not know every character that is special in a given context. You just need to know which kind of encoding to perform, and call the appropriate function to produce the intended result.

Do You Always Need to Encode Web Page Output?

If you know a value is legal in a particular context within a web page, you need not encode it. For example, if you obtain a value from an integer-valued column in a database table that cannot be NULL, it must necessarily be an integer. No HTML- or URL-encoding is needed to include the value in a web page, because digits are not special in HTML text or within URLs.

On the other hand, suppose you solicit an integer value using a field in a web form. You might be expecting the user to provide an integer, but the user might be confused and enter an illegal value. You could handle this by displaying an error page that shows the value and explains that it's not an integer. But if the value contains special characters and you don't encode it, the page won't display the value properly, possibly confusing the user further.

16.5.5 Encoding Special Characters Using Web APIs
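As a rough illustration of letting library functions do the work, the "cats & dogs" hyperlink discussed earlier can be built with Python's standard library. html.escape() and urllib.parse.quote() are real standard-library functions, but the helper names search_link and error_paragraph are hypothetical, and the sketch is not taken from the recipe itself.

```python
import html
from urllib.parse import quote


def search_link(term, script="/cgi-bin/myscript"):
    """Build a search hyperlink whose label is the search term itself."""
    # URL context: percent-encode the term for use as a parameter value.
    # safe="" makes quote() encode "/" too, since it is data here.
    url = "{}?term={}".format(script, quote(term, safe=""))
    # HTML context: the URL becomes an attribute value and the term
    # becomes link text, so both are HTML-encoded before output.
    return '<a href="{}">{}</a>'.format(
        html.escape(url, quote=True), html.escape(term)
    )


def error_paragraph(value):
    """Show a rejected form value safely, as in the sidebar scenario."""
    return "<p>The value {} is not an integer.</p>".format(html.escape(value))


link = search_link("cats & dogs")
print(link)

message = error_paragraph("<5>")
print(message)
```

The term is URL-encoded once for the parameter and HTML-encoded once for the label, matching the two contexts in which it appears. The caller only chooses which kind of encoding applies and does not need to list the special characters.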