IT-SC 110
This new variable is in effect only during the time the program is in the subroutine. Notice in the output from the print statement at the end of Example 6-2 that even though a variable called $dna is lengthened inside the subroutine, the original variable, $dna, outside the subroutine isn't changed.
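The point about the two variables can be seen in a few lines. The following is only a minimal sketch, not the code of Example 6-2: the subroutine name lengthen and the sample strings are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical subroutine (not from the book): it lengthens its own
# private copy of the DNA and returns that copy.
sub lengthen {
    my ($dna) = @_;     # this $dna exists only inside the subroutine
    $dna .= 'ACGT';     # lengthen the private copy
    return $dna;
}

my $dna    = 'AAAAA';   # the original variable, outside the subroutine
my $longer = lengthen($dna);

print "Copy returned by the subroutine: $longer\n";
print "Original outside the subroutine: $dna\n";
```

Because the subroutine declares its own $dna with my, the assignment inside it never touches the variable of the same name in the main program.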
6.3 Command-Line Arguments and Arrays
Example 6-3 is another program that uses subroutines. You use the command line to give the program the information it needs, such as filenames or strings of DNA, without having to interactively answer the program's prompts. This is useful if you're scheduling a program to run at a time when you won't be there, for instance.
Example 6-3 also shows a little more about using arrays. You'll see how to use subscripts to access a specific element of an array.
For command-line programs, you type the name of the program, followed by the arguments to the program, if any, and then hit the Enter or Return key to start the program running. In Example 6-3, when the user types the program name, she follows that with the argument, which, in this case, is just the string of DNA in which she'll count the G's. So, given the argument AAGGGGTTTCCC, the program returns an answer like so:
    The DNA AAGGGGTTTCCC has 4 G's in it
Of course, many programs come with a graphical user interface (GUI). This gives the program some or all of the computer screen and usually includes such things as menus, buttons, and places to type in values to set parameters from the keyboard.
However, many programs are run from a command line. Even the newer Mac OS X, which is built on top of Unix, now provides a command line. Although most Windows users don't use the MS-DOS command window much, it's still useful, e.g., for running Perl programs. As already mentioned, running a program noninteractively, passing parameters in as command-line arguments, allows you to run the program automatically, say in the middle of the night when no one is actually sitting at the computer.
Example 6-3 counts the number of G's in a string of DNA.
Example 6-3. Counting the G's in some DNA on the command line
    #!/usr/bin/perl -w
    # Counting the number of G's in some DNA on the command line

    use strict;

    # Collect the DNA from the arguments on the command line
    # when the user calls the program.
    # If no arguments are given, print a USAGE statement and exit.
    # $0 is a special variable that has the name of the program.
    my($USAGE) = "$0 DNA\n\n";

    # @ARGV is an array containing all command-line arguments.
    # If it is empty, the test will fail and the print USAGE and exit
    # statements will be called.
    unless(@ARGV) {
        print $USAGE;
        exit;
    }

    # Read in the DNA from the argument on the command line.
    my($dna) = $ARGV[0];

    # Call the subroutine that does the real work, and collect the result.
    my($num_of_Gs) = countG ( $dna );

    # Report the result and exit.
    print "\nThe DNA $dna has $num_of_Gs G\'s in it\n\n";

    exit;

    # Subroutines for Example 6-3

    sub countG {
        # return a count of the number of G's in the argument $dna

        # initialize arguments and variables
        my($dna) = @_;
        my($count) = 0;

        # Use the fourth method of counting nucleotides in DNA,
        # as shown in Chapter 5, "Motifs and Loops"
        $count = ( $dna =~ tr/Gg// );

        return $count;
    }
Now let's look at how this program works, while examining and explaining the new features. For starters, notice the new line:
    use strict;
which I will use from now on to ensure all variables are declared with my, thus enforcing lexical scoping.
Perl has some special variables it sets so you can easily use the arguments from the command line. Every Perl program has an array variable @ARGV that contains any command-line arguments. Also, there's a special variable called $0 (a zero) that has the name of the program as it was called from the command line.
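To see these two special variables at work, you might try something like the following. This is only a sketch; the subroutine describe_args is a made-up helper for illustration and isn't part of Example 6-3.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the book): summarize a list of arguments.
sub describe_args {
    my (@args) = @_;
    # scalar(@args) is the number of elements in the array
    return scalar(@args) . " argument(s): [" . join('] [', @args) . "]";
}

# $0 holds the program name as it was typed on the command line.
print "Program name as called: $0\n";

# @ARGV holds whatever followed the program name.
print "Command line had ", describe_args(@ARGV), "\n";
```

If you save this in a file and run it with a few words after the program name, each word shows up as its own element of @ARGV.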
Notice in Example 6-3 that an informative message is defined in the variable $USAGE and that it begins with the value of the variable $0, followed by an indication of the arguments the program needs. This is a common practice; if the user doesn't give the program what it needs, which is determined by some kind of test, the program prints information about how to properly use it and exits.
In fact, this program does check to see if any arguments were typed on the command line. It checks whether @ARGV has anything in it, in which case it evaluates to true, or whether it is completely empty, in which case it evaluates to false. If you want the program to require that an argument be given, you can use the unless conditional to print the $USAGE statement and exit the program if @ARGV is empty:
    unless(@ARGV) {
        print $USAGE;
        exit;
    }
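The same test works on any array, not just @ARGV. Here is a small sketch of the idea, written so you can try it without exiting the program; the subroutine usage_if_missing is a hypothetical helper, not something from Example 6-3.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the book): return a usage message if the
# argument list is empty, or nothing (undef) if arguments were supplied.
sub usage_if_missing {
    my ($program, @args) = @_;
    unless (@args) {          # an empty array evaluates to false
        return "$program DNA\n\n";
    }
    return;                   # arguments present: no message needed
}

my $message = usage_if_missing($0, @ARGV);
if (defined $message) {
    print $message;
} else {
    print "Arguments supplied; the program can continue.\n";
}
```

Returning the message instead of calling exit is just a design choice for this sketch; it makes the check easy to try with different argument lists.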
The next bit of code shows something new about arrays, namely, how to extract one element from an array, as referenced by a subscript. In other words, it shows how to get at the first, fourth, or whichever element. The code in Example 6-3 shows how to extract the first element, which, as you've seen, is numbered 0:
    my($dna) = $ARGV[0];
Now you already know there is a first element, since you've just tested to make sure the array isn't empty. You get the first element of the array @ARGV by changing the @ to a $ and appending square brackets containing the desired subscript: 0 for the first element, 1 for the second element, and so on. This syntax indicates that since you're now looking at just one element of the array, and it's a scalar variable, you use the dollar sign, as you would for any other scalar variable.
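Subscripts work the same way on any array. As a quick sketch, assuming a small made-up array called @bases that isn't in the book's example:

```perl
use strict;
use warnings;

# Illustrative data (an assumption, not from Example 6-3).
my @bases = ('A', 'C', 'G', 'T');

# One element is a scalar, so the @ becomes a $ and the subscript
# goes in square brackets. Subscripts start at 0.
my $first  = $bases[0];
my $fourth = $bases[3];

print "The first element is $first and the fourth is $fourth\n";
```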
In Example 6-3, you copy this first and only element of the command-line array @ARGV into the variable $dna.
Finally comes the call to the subroutine, which contains nothing new but fulfills a dream from the final paragraph of Chapter 5:
    my($num_of_Gs) = countG ( $dna );
6.4 Passing Data to Subroutines