contain slashes, so it's easier to use a different character around the pattern to avoid having to escape the slashes with backslashes:

    m#^(http://|ftp://|mailto:)#i

The alternatives in the pattern are grouped within parentheses because otherwise the ^ will anchor only the first of them to the beginning of the string. The i modifier follows the pattern because protocol specifiers in URLs are not case sensitive. The pattern is otherwise fairly unrestrictive, because it allows anything to follow the protocol specifier. I leave it to you to add further restrictions as necessary.
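As a quick check of how such a pattern behaves, the sketch below applies it to a few sample strings. It assumes the pattern is delimited with # characters; the helper name looks_like_url and the sample values are illustrative assumptions, not part of the recipe.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if $val begins with one of the
# protocol specifiers (compared case-insensitively), 0 otherwise.
sub looks_like_url
{
my $val = shift;

  return ($val =~ m#^(http://|ftp://|mailto:)#i) ? 1 : 0;
}

foreach my $val ("http://www.example.com/",
                 "MAILTO:someone\@example.com",
                 "see http://example.com")
{
  printf "%-30s %s\n", $val, (looks_like_url ($val) ? "ok" : "rejected");
}
```

The last sample is rejected because the ^ anchor applies to the whole parenthesized group, so the specifier must appear at the start of the string.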
10.28 Validation Using Table Metadata
10.28.1 Problem
You need to check input values against the legal members of an ENUM or SET column.
10.28.2 Solution
Get the column definition, extract the list of members from it, and check data values against the list.
10.28.3 Discussion
Some forms of validation involve checking input values against information stored in a database. This includes values to be stored in an ENUM or SET column, which can be checked against the valid members stored in the column definition. Database-backed validation also applies when you have values that must match those listed in a lookup table to be considered legal. For example, input records that contain customer IDs can be required to match a record in a customers table, or state abbreviations in addresses can be verified against a table that lists each state. This section describes ENUM- and SET-based validation, and Recipe 10.29 discusses how to use lookup tables.

One way to check input values that correspond to the legal values of ENUM or SET columns is to get the list of legal column values into an array using the information returned by SHOW COLUMNS, then perform an array membership test. For example, the favorite-color column color from the profile table is an ENUM that is defined as follows:

    mysql> SHOW COLUMNS FROM profile LIKE 'color'\G
    *************************** 1. row ***************************
      Field: color
       Type: enum('blue','red','green','brown','black','white')
       Null: YES
        Key:
    Default: NULL
      Extra:
If you extract the list of enumeration members from the Type value and store them in an array @members, you can perform the membership test like this:

    $valid = grep (/^$val$/i, @members);

The pattern constructor begins and ends with ^ and $ to require $val to match an entire enumeration member rather than just a substring. It also is followed by an i to specify a case-insensitive comparison, because ENUM columns are not case sensitive.

In Recipe 9.7, we wrote a function get_enumorset_info that returns ENUM or SET column metadata. This includes the list of members, so it's easy to use that function to write another utility routine, check_enum_value, that gets the legal enumeration values and performs the membership test. The routine takes four arguments: a database handle, the table name and column name for the ENUM column, and the value to check. It returns true or false to indicate whether or not the value is legal:
    sub check_enum_value
    {
    my ($dbh, $tbl_name, $col_name, $val) = @_;

      my $valid = 0;
      my $info = get_enumorset_info ($dbh, $tbl_name, $col_name);
      if ($info && $info->{type} eq "enum")
      {
        # use case-insensitive comparison; ENUM
        # columns are not case sensitive
        $valid = grep (/^$val$/i, @{$info->{values}});
      }
      return ($valid);
    }
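To make the "extract the list of members" step concrete, here is a minimal sketch that parses a Type value like the one shown earlier into an array and then performs the membership test. The helper name parse_enum_members is hypothetical, and the parsing assumes that no member contains an embedded comma or quote. The sketch also applies quotemeta to the input value, which the routine above does not do, so that an input containing pattern metacharacters is compared as literal text.

```perl
use strict;
use warnings;

# Hypothetical helper: split "enum('a','b',...)" or "set('a','b',...)"
# into its members. Assumes no member contains a comma or a quote.
sub parse_enum_members
{
my $type = shift;

  return () unless $type =~ /^(?:enum|set)\((.*)\)$/i;
  my @members = split (/,/, $1);
  s/^'(.*)'$/$1/ foreach @members;   # strip the surrounding quotes
  return (@members);
}

my @members = parse_enum_members (
  "enum('blue','red','green','brown','black','white')");

foreach my $val ("Red", "re", "purple")
{
  my $pat = quotemeta ($val);        # treat the input as literal text
  my $valid = grep (/^$pat$/i, @members);
  print "$val: ", ($valid ? "legal" : "illegal"), "\n";
}
```

Here "Red" is accepted despite the difference in lettercase, whereas "re" is rejected because the anchors require a match against an entire member.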
For single-value testing, such as to validate a value submitted in a web form, that kind of test works well. However, if you're going to be testing a lot of values (like an entire column in a datafile), it's better to read the enumeration values into memory once, then use them repeatedly to check each of the data values. Furthermore, it's a lot more efficient to perform hash lookups than array lookups (in Perl at least). To do so, retrieve the legal enumeration values and store them as keys of a hash. Then test each input value by checking whether or not it exists as a hash key. It's a little more work to construct the hash, which is why check_enum_value doesn't do so. But for bulk validation, the improved lookup speed more than makes up for the hash construction overhead. [4]

[4] If you want to check for yourself the relative efficiency of array membership tests versus hash lookups, try the lookup_time.pl script in the transfer directory of the recipes distribution.

Begin by getting the metadata for the column, then convert the list of legal enumeration members to a hash:
    my $ref = get_enumorset_info ($dbh, $tbl_name, $col_name);
    my %members;
    foreach my $member (@{$ref->{values}})
    {
      # convert hash key to consistent case; ENUM isn't case sensitive
      $members{lc ($member)} = 1;
    }
The loop makes each enumeration member exist as the key of a hash element. The hash key is what's important here; the value associated with it is irrelevant. The example shown sets the value to 1, but you could use undef or any other value. Note that the code converts the hash keys to lowercase before storing them. This is done because hash key lookups in Perl are case sensitive. That's fine if the values that you're checking also are case sensitive, but ENUM columns are not. By converting the enumeration values to a given lettercase before storing them in the hash, then converting the values you want to check similarly, you perform in effect a case-insensitive key existence test:

    $valid = exists ($members{lc ($val)});

The preceding example converts enumeration values and input values to lowercase. You could just as well use uppercase, as long as you do so for all values consistently.
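Putting the pieces together for bulk checking, the following sketch builds the hash once and then tests several values against it. To keep it self-contained, the member list is hardcoded rather than fetched with get_enumorset_info, and the input values are made up for illustration.

```perl
use strict;
use warnings;

# Stand-in for the list that get_enumorset_info would return in
# $ref->{values}; hardcoded so the example runs without a database.
my @values = ("blue", "red", "green", "brown", "black", "white");

# Build the lookup hash once, using lowercased keys.
my %members;
$members{lc ($_)} = 1 foreach @values;

# Check each input value with a case-insensitive key existence test.
foreach my $val ("Blue", "GREEN", "purple", "")
{
  my $valid = exists ($members{lc ($val)});
  printf "%-10s %s\n", "'$val'", ($valid ? "legal" : "illegal");
}
```

The empty string is reported as illegal here because it is not one of the hash keys.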
Note that the existence test may fail if the input value is the empty string. You'll have to decide how to handle that case on a column-by-column basis. For example, if the column allows NULL values, you might interpret the empty string as equivalent to NULL and thus as being a legal value.
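One way to express that per-column decision is to pass a flag along with the member hash. This is only a sketch: the function name enum_value_ok and the $empty_is_null flag are assumptions made for illustration, and the member list is invented.

```perl
use strict;
use warnings;

# Hypothetical helper. $members is a reference to a hash whose keys are
# lowercased ENUM members; $empty_is_null says whether an empty input
# should be accepted as standing for NULL (only sensible if the column
# allows NULL values).
sub enum_value_ok
{
my ($val, $members, $empty_is_null) = @_;

  return ($empty_is_null ? 1 : 0) if !defined ($val) || $val eq "";
  return (exists ($members->{lc ($val)}) ? 1 : 0);
}

my %members = (blue => 1, red => 1, green => 1);

print "RED, strict:    ", enum_value_ok ("RED", \%members, 0), "\n";
print "empty, strict:  ", enum_value_ok ("", \%members, 0), "\n";
print "empty, as NULL: ", enum_value_ok ("", \%members, 1), "\n";
```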
The validation procedure for SET values is similar to that for ENUM values, except that an input value might consist of any number of SET members, separated by commas. For the value to be legal, each element in it must be legal. In addition, because "any number of members" includes "none," the empty string is a legal value for any SET column.

For one-shot testing of individual input values, you can use a utility routine check_set_value that is similar to check_enum_value:

    sub check_set_value
    {
    my ($dbh, $tbl_name, $col_name, $val) = @_;

      my $valid = 0;
      my $info = get_enumorset_info ($dbh, $tbl_name, $col_name);
      if ($info && $info->{type} eq "set")
      {
        return 1 if $val eq "";   # empty string is legal element
        # use case-insensitive comparison; SET
        # columns are not case sensitive
        $valid = 1;   # assume valid until we find out otherwise
        foreach my $v (split (/,/, $val))
        {
          if (!grep (/^$v$/i, @{$info->{values}}))
          {
            $valid = 0;   # value contains an invalid element
            last;
          }
        }
      }
      return ($valid);
    }
For bulk testing, construct a hash from the legal SET members. The procedure is the same as for producing a hash from ENUM elements:

    my $ref = get_enumorset_info ($dbh, $tbl_name, $col_name);
    my %members;
    foreach my $member (@{$ref->{values}})
    {
      # convert hash key to consistent case; SET isn't case sensitive
      $members{lc ($member)} = 1;
    }
To validate a given input value against the SET member hash, convert it to the same lettercase as the hash keys, split it at commas to get a list of the individual elements of the value, then check each one. If any of the elements are invalid, the entire value is invalid:

    $valid = 1;   # assume valid until we find out otherwise
    foreach my $elt (split (/,/, lc ($val)))
    {
      if (!exists ($members{$elt}))
      {
        $valid = 0;   # value contains an invalid element
        last;
      }
    }
After the loop terminates, $valid is true if the value is legal for the SET column, and false otherwise. Empty strings are always legal SET values, but this code doesn't perform any special-case test for an empty string. No such test is necessary, because in that case the split operation returns an empty list, the loop never executes, and $valid remains true.
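The SET procedure can be wrapped up as a small routine for repeated use. The sketch below hardcodes a member list in place of a get_enumorset_info call and uses a hypothetical helper name, set_value_ok; the member names and input values are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if every comma-separated element of
# $val is a key of the lowercased member hash referenced by $members.
sub set_value_ok
{
my ($val, $members) = @_;

  foreach my $elt (split (/,/, lc ($val)))
  {
    return 0 if !exists ($members->{$elt});   # invalid element found
  }
  return 1;   # also reached for "", because split returns an empty list
}

# Stand-in for the member list from the column metadata.
my %members;
$members{lc ($_)} = 1 foreach ("Apple", "Banana", "Cherry");

foreach my $val ("apple,CHERRY", "apple,kiwi", "")
{
  print "'$val': ",
        (set_value_ok ($val, \%members) ? "legal" : "illegal"), "\n";
}
```

Returning from the loop as soon as an invalid element is found has the same effect as setting a flag and using last, as in the loop shown earlier.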
10.29 Validation Using a Lookup Table