PHP Collecting Web Input

18.6.5 Perl

The Perl CGI.pm module makes input parameters available to scripts through the param() function. param() provides access to input submitted via either GET or POST, which simplifies your task as the script writer. If a form containing id and name parameters was submitted via POST, you can process it the same way as if the parameters were submitted at the end of the URL via GET. You don't need to perform any decoding, either; param() handles that as well.

To obtain a list of names of all available parameters, call param() with no arguments:

    @names = param ();

To obtain the value of a specific parameter, pass its name to param():

    $id = param ("id");
    @options = param ("options");

In scalar context, param() returns the parameter value if it is single-valued, the first value if it is multiple-valued, or undef if the parameter is not available. In array context, param() returns a list containing all the parameter's values, or an empty list if the parameter is not available.

A parameter with a given name might not be available if the form field with the same name was left blank, or if there isn't any field with that name. Note too that a parameter value may be defined but empty. For good measure, you may want to check both possibilities. For example, to check for an age parameter and assign a default value of unknown if the parameter is missing or empty, you can do this:

    $age = param ("age");
    $age = "unknown" if !defined ($age) || $age eq "";

CGI.pm understands both ; and & as URL parameter separator characters.

18.6.6 PHP

Input parameters can be available to PHP in several ways, depending on your version of PHP and on your configuration settings:

• If the register_globals setting is on, parameters are assigned to global variables of the same name. In this case, the value of a field named id will be available as the variable $id, regardless of whether the request was sent via GET or POST.

• If the track_vars configuration setting is on, parameters are available in the $HTTP_GET_VARS and $HTTP_POST_VARS arrays. For example, if a form contains a field named id, the value will be available as $HTTP_GET_VARS["id"] or $HTTP_POST_VARS["id"], depending on whether the form was submitted via GET or POST. $HTTP_GET_VARS and $HTTP_POST_VARS must be declared using the global keyword to make them accessible in a non-global scope, such as within a function.

• As of PHP 4.1, parameters are available in the $_GET and $_POST arrays. These are analogous to $HTTP_GET_VARS and $HTTP_POST_VARS except that they are "superglobal" arrays that are automatically available in any scope. For example, it is unnecessary to declare $_GET and $_POST with global inside functions. The $_GET and $_POST arrays are now the preferred means of getting at input parameters.

The track_vars and register_globals settings can be compiled in or configured in the PHP php.ini file. As of PHP 4.0.3, track_vars is always on, and I suspect that most installations of earlier versions enable this setting as well. For this reason, I'll assume your version of PHP has track_vars enabled.

register_globals makes it convenient to access input parameters through global variables, but the PHP developers recommend that it be disabled for security reasons. Why is that? Well, suppose you write a script that requires the user to supply a password, which is represented by the $password variable.
You might check the password in a script like this:

    if (check_password ($password))
      $password_is_ok = 1;

The intent here is that if the password matches, the script sets $password_is_ok to 1. Otherwise $password_is_ok is left unset (which compares false in Boolean expressions). But suppose register_globals is enabled and someone invokes your script as follows:

    http://your.host.com/chkpass.php?password_is_ok=1

In this case, PHP sees that the password_is_ok parameter is set to 1, and sets the $password_is_ok variable to 1. The result is that when your script executes, $password_is_ok is 1 no matter what password was given, or even if no password was given! The problem with register_globals is that it allows outside users to supply default values for global variables in your scripts. One solution is to disable register_globals, in which case you'll need to check the global arrays ($_GET, $_POST) for input parameter values. If you don't want to do that, you should take care not to assume that PHP variables have no value initially. Unless you're expecting a global variable to be set from an input parameter, it's best to initialize it explicitly to a known value. The password-checking code should be written like this to make sure that $password_is_ok is assigned a value whatever the result of the test:

    $password_is_ok = 0;
    if (check_password ($password))
      $password_is_ok = 1;

The PHP scripts in this book do not rely on register_globals. Instead, they obtain input through the global parameter arrays.

Another complicating factor when retrieving input parameters in PHP is that they may need some decoding, depending on the value of the magic_quotes_gpc configuration setting. If magic quotes are enabled, any quote, backslash, and NUL characters in input parameter values will be escaped with backslashes.
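To make the escaping concrete, the following sketch imitates it with PHP's addslashes() function, which adds a backslash before the same characters (quotes, backslashes, and NUL), and then undoes it with stripslashes(). The $raw value is an invented example, not something taken from a real form.

```php
<?php
// Illustration only: imitate the escaping that magic quotes apply to input.
// $raw is a made-up value standing in for what a user typed into a form.
$raw = "O'Reilly";

// addslashes() puts a backslash before quote, backslash, and NUL characters,
// which is the same transformation magic quotes perform on input values.
$escaped = addslashes($raw);
print $escaped . "\n";

// stripslashes() reverses the escaping and recovers the original value.
$restored = stripslashes($escaped);
print $restored . "\n";
```

The first line printed has a backslash in front of the quote; the second is the value as the user typed it, which is the form you want when validating input.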
I guess this is supposed to save you a step by allowing you to extract values and use them directly in query strings. However, that's only useful if you plan to use web input in a query with no preprocessing or validity checking, which is dangerous. You should check your input first, in which case it's necessary to strip out the slashes anyway. That means having magic quotes turned on isn't really very useful.

Given the various sources through which input parameters may be available, and the fact that they may or may not contain extra backslashes, extracting input in PHP scripts can be an interesting problem. If you have control of your server and can set the values of the various configuration settings, you can of course write your scripts based on those settings. But if you do not control your server or are writing scripts that need to run on several machines, you may not know in advance what the settings are. Fortunately, with a bit of effort it's possible to write reasonably general-purpose parameter extraction code that works correctly with very few assumptions about your PHP operating environment. The following utility function, get_param_val(), takes a parameter name as its argument and returns the corresponding parameter value. If the parameter is not available, the function returns an unset value.

    function get_param_val ($name)
    {
    global $HTTP_GET_VARS, $HTTP_POST_VARS;

      unset ($val);
      if (isset ($_GET[$name]))
        $val = $_GET[$name];
      else if (isset ($_POST[$name]))
        $val = $_POST[$name];
      else if (isset ($HTTP_GET_VARS[$name]))
        $val = $HTTP_GET_VARS[$name];
      else if (isset ($HTTP_POST_VARS[$name]))
        $val = $HTTP_POST_VARS[$name];
      if (isset ($val) && get_magic_quotes_gpc ())
        $val = strip_slash_helper ($val);
      return (@$val);
    }

To use this function to obtain the value of a single-valued parameter named id, call it like this:

    $id = get_param_val ("id");

You can test $id to determine whether the id parameter was present in the input:

    if (isset ($id))
      ... id parameter is present ...
    else
      ... id parameter is not present ...

For a form field that may have multiple values (such as a checkbox group or a multiple-pick scrolling list), you should represent it in the form using a name that ends in []. For example, a list element constructed from the SET column accessories in the cow_order table has one item for each allowable set value. To make sure PHP treats the element value as an array, don't name the field accessories, name it accessories[]. (See Recipe 18.4 for an example.) When the form is submitted, PHP places the array of values in a parameter named without the [], so to access it, do this:

    $accessories = get_param_val ("accessories");

The $accessories variable will be an array. (This will be true whether the parameter has multiple values, a single value, or even no values. The determining factor is not whether the parameter actually has multiple values, but whether you name the corresponding field in the form using [] notation.)

The get_param_val() function checks the $_GET, $_POST, $HTTP_GET_VARS, and $HTTP_POST_VARS arrays for parameter values. Thus, it works correctly for PHP 3 and PHP 4, whether the request was made by GET or POST, and whether or not register_globals is turned on.
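The effect of the [] naming convention can be checked without building a form. The sketch below uses PHP's parse_str() function, which parses a string as if it were the query string of a URL. The field values bell and horns are invented for illustration.

```php
<?php
// Illustration only: parse_str() decodes a query string using the same
// rules PHP applies to request parameters. The values are made up.

// Field named accessories[]: every value is collected into an array.
parse_str("accessories[]=bell&accessories[]=horns", $with_brackets);

// Field named accessories: each value overwrites the previous one,
// so only the last value survives.
parse_str("accessories=bell&accessories=horns", $without_brackets);

// A single value submitted under accessories[] still yields an array.
parse_str("accessories[]=bell", $single);

print_r($with_brackets);
print_r($without_brackets);
print_r($single);
```

Only the bracketed name preserves every selected value, which is why the field should be named accessories[] even though the script refers to the parameter as accessories.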
The only thing that the function assumes is that track_vars is enabled.

get_param_val() also works correctly regardless of whether magic quoting is enabled. It uses a helper function strip_slash_helper() that performs backslash stripping from parameter values if necessary:

    function strip_slash_helper ($val)
    {
      if (!is_array ($val))
        $val = stripslashes ($val);
      else
      {
        reset ($val);
        while (list ($k, $v) = each ($val))
          $val[$k] = strip_slash_helper ($v);
      }
      return ($val);
    }

strip_slash_helper() checks whether a value is a scalar or an array and processes it accordingly. The reason it uses a recursive algorithm for array values is that in PHP 4 it's possible to create nested arrays from input parameters.

To make it easy to obtain a list of all parameter names, write another utility function:

    function get_param_names ()
    {
    global $HTTP_GET_VARS, $HTTP_POST_VARS;

      # construct an associative array in which each element has a
      # parameter name as both key and value. (Using an associative
      # array eliminates duplicates.)
      $keys = array ();
      if (isset ($_GET))
      {
        reset ($_GET);
        while (list ($k, $v) = each ($_GET))
          $keys[$k] = $k;
      }
      else if (isset ($HTTP_GET_VARS))
      {
        reset ($HTTP_GET_VARS);
        while (list ($k, $v) = each ($HTTP_GET_VARS))
          $keys[$k] = $k;
      }
      if (isset ($_POST))
      {
        reset ($_POST);
        while (list ($k, $v) = each ($_POST))
          $keys[$k] = $k;
      }
      else if (isset ($HTTP_POST_VARS))
      {
        reset ($HTTP_POST_VARS);
        while (list ($k, $v) = each ($HTTP_POST_VARS))
          $keys[$k] = $k;
      }
      return ($keys);
    }

get_param_names() returns a list of parameter names present in the HTTP variable arrays, with duplicate names removed if there is overlap between the arrays. The return value is an associative array with both the keys and values set to the parameter names. This way you can use either the keys or the values as the list of names. The following example prints the names, using the values:

    $param_names = get_param_names ();
    while (list ($k, $v) = each ($param_names))
      print (htmlspecialchars ($v) . "<br />\n");

For PHP 3 scripts, the parameters in URLs should be separated by & characters.
That's also the default for PHP 4, although you can change it using the arg_separator configuration setting in the PHP initialization file.
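
If a script only has to run where the $_GET and $_POST superglobals are available and magic quotes are known to be off, the two utility functions reduce to a few lines. The following is a minimal sketch under those assumptions, written in the syntax of current PHP releases; it is not a replacement for the versions above. The names simple_param_val() and simple_param_names(), and the optional array arguments that let you try the functions with sample data, are hypothetical.

```php
<?php
// Sketch only: assumes $_GET/$_POST are available and magic quotes are off.
// Function names and the optional array arguments are hypothetical.

// Return the value of the named parameter, checking GET before POST as
// get_param_val does, or null if the parameter is not present.
function simple_param_val($name, $get = null, $post = null)
{
    $get  = is_array($get)  ? $get  : $_GET;
    $post = is_array($post) ? $post : $_POST;
    if (isset($get[$name])) {
        return $get[$name];
    }
    if (isset($post[$name])) {
        return $post[$name];
    }
    return null;
}

// Return the parameter names found in either array, without duplicates.
function simple_param_names($get = null, $post = null)
{
    $get  = is_array($get)  ? $get  : $_GET;
    $post = is_array($post) ? $post : $_POST;
    // The + operator keeps the first occurrence of each key, so a name
    // present in both arrays appears only once.
    return array_keys($get + $post);
}

// Try the functions with sample data in place of a real request.
$sample_get  = array("id" => "47");
$sample_post = array("id" => "12", "accessories" => array("bell", "horns"));

var_dump(simple_param_val("id", $sample_get, $sample_post));
var_dump(simple_param_val("color", $sample_get, $sample_post));
print implode(", ", simple_param_names($sample_get, $sample_post)) . "\n";
```

Because the sketch returns null for a missing parameter, a caller can still use isset() on the result to tell whether the parameter was present, just as with get_param_val().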

18.6.7 Python