
    {
      $csv->combine (@{$sth->{NAME}}) or die "cannot process column labels\n";
      print $csv->string ();
    }

    my $count = 0;
    while (my @val = $sth->fetchrow_array ())
    {
      ++$count;
      $csv->combine (@val) or die "cannot process column values, row $count\n";
      print $csv->string ();
    }

The sep_char and quote_char options in the new call set the column delimiter sequence and quoting character. The escape_char option is set to the same value as quote_char so that instances of the quote character occurring within data values will be doubled in the output. The eol option indicates the line-termination sequence. Normally, Text::CSV_XS leaves it to you to print the terminator for output lines. By passing a non-undef eol value to new, the module adds that value to every output line automatically. The binary option is useful for processing data values that contain binary characters.

The column labels are available in $sth->{NAME} after invoking execute. Each line of output is produced using combine and string. The combine method takes an array of values and converts them to a properly formatted string. string returns the string so we can print it.

10.19 Converting Datafiles from One Format to Another

10.19.1 Problem

You want to convert a file to a different format to make it easier to work with, or so that another program can understand it.

10.19.2 Solution

Use the cvt_file.pl converter script described here.

10.19.3 Discussion

The mysql_to_text.pl script discussed in Recipe 10.18 uses MySQL as a data source and produces output in the format you specify via the --delim, --quote, and --eol options. This section describes cvt_file.pl, a utility that provides similar formatting options, but for both input and output. It reads data from a file rather than from MySQL, and converts it from one format to another. For example, to read a tab-delimited file data.txt, convert it to colon-delimited format, and write the result to tmp, you would invoke cvt_file.pl like this:

    cvt_file.pl --idelim="\t" --odelim=":" data.txt > tmp

The cvt_file.pl script has separate options for input and output. Thus, whereas mysql_to_text.pl has just a --delim option for specifying the column delimiter, cvt_file.pl has separate --idelim and --odelim options to set the input and output column delimiters. As a shortcut, --delim is also supported; it sets the delimiter for both input and output. The full set of options that cvt_file.pl understands is as follows:

--idelim=str, --odelim=str, --delim=str
    Set the column delimiter sequence for input, output, or both. The option value may consist of one or more characters.

--iquote=c, --oquote=c, --quote=c
    Set the column quote character for input, output, or both.

--ieol=str, --oeol=str, --eol=str
    Set the end-of-line sequence for input, output, or both. The option value may consist of one or more characters.

--iformat=format, --oformat=format, --format=format
    Specify an input format, an output format, or both. This option is shorthand for setting the quote and delimiter values. For example, --iformat=csv sets the input quote and delimiter characters to double quote and comma. --iformat=tab sets them to no quotes and tab.

--ilabels, --olabels, --labels
    Expect an initial line of column labels for input, write an initial line of labels for output, or both. If you request labels for the output but do not read labels from the input, cvt_file.pl uses column labels of c1, c2, and so forth.

cvt_file.pl assumes the same default file format as LOAD DATA and SELECT ... INTO OUTFILE, that is, tab-delimited lines terminated by linefeeds.

cvt_file.pl can be found in the transfer directory of the recipes distribution. If you expect to use it regularly, you should install it in some directory that's listed in your search path so that you can invoke it from anywhere. Much of the source for the script is similar to mysql_to_text.pl, so rather than showing the code and discussing how it works, I'll just give some examples illustrating how to use it:

• Read a file in CSV format with CRLF line termination, write tab-delimited output with linefeed termination:

    cvt_file.pl --iformat=csv --ieol="\r\n" --oformat=tab --oeol="\n" \
        data.txt > tmp

• Read and write CSV format, converting CRLF line terminators to carriage returns:

    cvt_file.pl --format=csv --ieol="\r\n" --oeol="\r" data.txt > tmp

• Produce a tab-delimited file from the colon-delimited /etc/passwd file:

    cvt_file.pl --idelim=":" /etc/passwd > tmp

• Convert tab-delimited query output from mysql into CSV format:

    mysql -e "SELECT * FROM profile" cookbook \
        | cvt_file.pl --oformat=csv > profile.csv
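
To give a feel for the kind of work such a converter performs, here is a minimal sketch in Perl. It is not the cvt_file.pl source, which is not shown in this section. The sketch assumes unquoted input, so a simple split on the delimiter is enough; quoted input that contains embedded delimiters would need a real parser such as Text::CSV_XS. The helper names convert_line and default_labels are hypothetical and chosen only for illustration.

```perl
#!/usr/bin/perl
# Illustrative sketch only; this is NOT the cvt_file.pl source.
use strict;
use warnings;

# Hypothetical helper: rewrite one line of unquoted, delimited input
# using a different output delimiter and (optionally) a quote character.
sub convert_line
{
  my ($line, $idelim, $odelim, $oquote) = @_;

  # Split on the literal input delimiter; limit -1 keeps trailing empty columns
  my @val = split (/\Q$idelim\E/, $line, -1);
  if ($oquote ne "")
  {
    foreach my $v (@val)
    {
      # Double any quote characters inside the value, then enclose it
      $v =~ s/\Q$oquote\E/$oquote$oquote/g;
      $v = $oquote . $v . $oquote;
    }
  }
  return join ($odelim, @val);
}

# Hypothetical helper: generate labels c1, c2, ... for a given column count
sub default_labels
{
  my $ncols = shift;
  return map { "c$_" } 1 .. $ncols;
}

# Convert one tab-delimited line to CSV-style output, preceded by a
# line of generated labels (as if labels were requested for output only)
my $in = "1\tFred\tsaid \"hi\"";
my @cols = split (/\t/, $in, -1);
my $labels = join ("\t", default_labels (scalar (@cols)));

my $out = convert_line ($labels, "\t", ",", "\"");
print "$out\n";     # "c1","c2","c3"
$out = convert_line ($in, "\t", ",", "\"");
print "$out\n";     # "1","Fred","said ""hi"""
```

Doubling the quote character inside values follows the same convention described for escape_char in Recipe 10.18. Passing an empty string as the quote character produces unquoted output, which corresponds to the tab format described above.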

10.20 Extracting and Rearranging Datafile Columns