Command-Line Arguments and Arrays

This new variable is in effect only during the time the program is in the subroutine. Notice in the output from the print statement at the end of Example 6-2 that even though a variable called $dna is lengthened inside the subroutine, the original variable $dna outside the subroutine isn't changed.
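The scoping behavior just described can be illustrated with a minimal sketch. This is not Example 6-2 itself: the subroutine name lengthen and the sequences are made up for illustration.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical stand-in for the subroutine in Example 6-2:
# it appends to its own private copy of the argument.
sub lengthen {
    my($dna) = @_;      # a new $dna, visible only inside this subroutine
    $dna .= 'ACGT';     # lengthen the private copy
    return $dna;
}

my $dna    = 'CGACGTCTTCTAAGGCGA';
my $longer = lengthen($dna);

print "Returned by the subroutine: $longer\n";
print "Outside the subroutine:     $dna\n";
```

The second print statement shows the original sequence unchanged, because the $dna declared with my inside lengthen is a separate variable from the one outside it.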

6.3 Command-Line Arguments and Arrays

Example 6-3 is another program that uses subroutines. It takes the information it needs, such as filenames or strings of DNA, from the command line, so you don't have to answer the program's prompts interactively. This is useful if, for instance, you're scheduling a program to run at a time when you won't be there. Example 6-3 also shows a little more about using arrays: you'll see how to use subscripts to access a specific element of an array.

For command-line programs, you type the name of the program, followed by the arguments to the program, if any, and then hit the Enter (or Return) key to start the program running. In Example 6-3, when the user types the program name, she follows it with the argument, which in this case is just the string of DNA in which she'll count the G's. So the program is called with the argument AAGGGGTTTCCC and returns an answer like so:

    The DNA AAGGGGTTTCCC has 4 G's in it

Of course, many programs come with a graphical user interface (GUI). This gives the program some or all of the computer screen and usually includes such things as menus, buttons, and places to type in values to set parameters from the keyboard. However, many programs are run from a command line. Even the newer Mac OS X, which is built on top of Unix, now provides a command line. Although most Windows users don't use the MS-DOS command window much, it's still useful, e.g., for running Perl programs. As already mentioned, running a program noninteractively, passing parameters in as command-line arguments, allows you to run the program automatically, say in the middle of the night when no one is actually sitting at the computer.

Example 6-3 counts the number of G's in a string of DNA.

Example 6-3. Counting the G's in some DNA on the command line

    #!/usr/bin/perl -w
    # Counting the number of G's in some DNA on the command line

    use strict;

    # Collect the DNA from the arguments on the command line
    # when the user calls the program.
    # If no arguments are given, print a USAGE statement and exit.
    # $0 is a special variable that has the name of the program.
    my($USAGE) = "$0 DNA\n\n";

    # @ARGV is an array containing all command-line arguments.
    #
    # If it is empty, the test will fail and the print USAGE and exit
    # statements will be called.
    unless(@ARGV) {
        print $USAGE;
        exit;
    }

    # Read in the DNA from the argument on the command line.
    my($dna) = $ARGV[0];

    # Call the subroutine that does the real work, and collect the result.
    my($num_of_Gs) = countG( $dna );

    # Report the result and exit.
    print "\nThe DNA $dna has $num_of_Gs G\'s in it\n\n";

    exit;

    # Subroutines for Example 6-3

    sub countG {
        # return a count of the number of G's in the argument $dna

        # initialize arguments and variables
        my($dna) = @_;

        my($count) = 0;

        # Use the fourth method of counting nucleotides in DNA, as shown in
        # Chapter Four, "Motifs and Loops"
        $count = ( $dna =~ tr/Gg// );

        return $count;
    }

Now let's look at how this program works, examining and explaining the new features. For starters, notice the new line:

    use strict;

which I will use from now on to ensure all variables are declared with my, thus enforcing lexical scoping.

Perl has some special variables it sets so you can easily use the arguments from the command line. Every Perl program has an array variable @ARGV that contains any command-line arguments. Also, there's a special variable called $0 (a dollar sign followed by a zero) that has the name of the program as it was called from the command line.

Notice in Example 6-3 that an informative message is defined in the variable $USAGE and that it begins with the value of the variable $0, followed by an indication of the arguments the program needs. This is a common practice: if the user doesn't give the program what it needs, which is determined by some kind of test, the program prints information about how to use it properly and exits.

In fact, this program does check to see if any arguments were typed on the command line. It checks whether @ARGV has anything in it, in which case it evaluates to true, or whether it is completely empty, in which case it evaluates to false.
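The truth test on an array can be tried on its own. The following is a minimal sketch, assuming a made-up helper called has_arguments that stands in for the test on @ARGV; it is not part of Example 6-3.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical helper: reports whether a list of arguments is empty,
# using the same kind of truth test that unless(@ARGV) relies on.
sub has_arguments {
    my(@args) = @_;
    return @args ? 1 : 0;   # an array in a condition is true if it has elements
}

my @empty = ();
my @one   = ('AAGGGGTTTCCC');

print "empty list:    ", has_arguments(@empty), "\n";
print "one argument:  ", has_arguments(@one), "\n";
print "element count: ", scalar(@one), "\n";
```

In a condition, an array is evaluated in scalar context, where it yields its number of elements, so an empty array counts as false and any nonempty array counts as true.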
If you want the program to require that an argument be given, you can use the unless conditional to print the $USAGE statement and exit the program if @ARGV is empty:

    unless(@ARGV) {
        print $USAGE;
        exit;
    }

The next bit of code shows something new about arrays, namely, how to extract one element from an array, as referenced by a subscript. In other words, it shows how to get at the first, fourth, or whichever element. The code in Example 6-3 shows how to extract the first element, which, as you've seen, is numbered 0:

    my($dna) = $ARGV[0];

You already know there is a first element, since you've just tested to make sure the array isn't empty. You get the first element of the array @ARGV by changing the @ to a $ and appending square brackets containing the desired subscript: 0 for the first element, 1 for the second element, and so on. This syntax indicates that since you're now looking at just one element of the array, and it's a scalar variable, you use the dollar sign, as you would for any other scalar variable.

In Example 6-3, you copy this first and only element of the command-line array @ARGV into the variable $dna.

Finally comes the call to the subroutine, which contains nothing new but fulfills a dream from the final paragraph of Chapter 5:

    my($num_of_Gs) = countG( $dna );
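Subscripts can be tried out on any array, not just @ARGV. The sketch below assumes a made-up array @args with four sequences standing in for command-line arguments; the counting line uses the same tr operator as countG in Example 6-3.

```perl
#!/usr/bin/perl -w
use strict;

# A made-up array standing in for @ARGV with several arguments.
my @args = ('AAGGGGTTTCCC', 'ACGT', 'GGGG', 'TTAA');

my $first  = $args[0];        # the first element has subscript 0
my $fourth = $args[3];        # the fourth element has subscript 3
my $last   = $args[$#args];   # $#args is the subscript of the last element

# Count the G's in the first element; tr returns the number of matches.
my $count = ( $first =~ tr/Gg// );

print "first = $first, fourth = $fourth, last = $last\n";
print "G count in first = $count\n";
```

Note that each single element is written with a dollar sign, because one element of an array is a scalar value, while the array as a whole keeps its @ sign.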

6.4 Passing Data to Subroutines