How To Run and Debug a Perl Script
Context, and more on variables
Back to the introduction
Just like every other programming or scripting language you have been exposed to, Perl
allows you to create and use variables (we will call
Perl falls somewhere in between these two categories: it has several kinds of variables, and we can determine the kind of a variable by looking at the first character of its name. However, the exact type of the value associated with a given variable name may not be known until the program is run. Perl uses the following conventions for variable names:
Scalars are the most commonly used variables in Perl. A scalar variable (or, more
precisely, the use of a variable in a scalar context) always starts with a dollar sign ($).
If you attempt to use a scalar variable that has not been assigned a value, you will
get the value 0 or "", depending on whether it is used as a number or as a string. The
statement
Unless you know what you are doing, do not use numeric or special characters
immediately after the $, because many such names (for example $_ and $0) are predefined by Perl.
When looking at a Perl program, because of your C++ background, you will probably be asking yourself: are these scalars of type int, float, char or string? The answer is: all of the above. Think of scalars as magical strings. Their type will change according to the value that they contain. For example, the program
$string_or_int = "Hello";
print "$string_or_int World\n";
$string_or_int = 12;
print "Twelve * two equals: ";
print $string_or_int * 2;
print "\n";
would output
Hello World
Twelve * two equals: 24
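As a further minimal sketch of the same idea, the script below uses one scalar first as a number and then as a string, and shows what an unassigned scalar yields in each case. The variable names are made up for this illustration, and the `no warnings` line is only there to silence the warning Perl would otherwise print for the unassigned variable.

```perl
use strict;
use warnings;
no warnings 'uninitialized';   # silence the warning for $unset below

# Hypothetical names, chosen only for this illustration.
my $value = "12";              # assigned as a string
my $sum   = $value + 3;        # used as a number
my $text  = $value . "3";      # used as a string

my $unset;                     # never assigned a value
my $as_number = $unset + 0;    # numeric use gives 0
my $as_string = $unset . "";   # string use gives ""

print "sum: $sum\n";
print "text: $text\n";
print "unset as number: $as_number\n";
print "unset as string: '$as_string'\n";
```

The operator decides how the scalar is interpreted: `+` treats its operands as numbers, while `.` (concatenation) treats them as strings.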
Arrays are similar to their C++ counterparts, and contain a group of scalars that are
accessed by their position in the array. An array name must start with the @ character. For example, the statement
@letters = ('a', 'b', 'c', 'd');
creates an array with 4 elements. Similar to predefined scalars, there also exist
predefined arrays such as @ARGV, which holds the command-line arguments passed to the script.
As in most other programming languages, a specific element in a Perl array is
referenced using square brackets:
$arrayname[index]
Observe that because each array element is a scalar, we use $ rather than @ when referring to a single element.
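To assign more than one element of an array to individual variables, one can use the list assignment
(var1, var2, ..., vart) = @arrayname;
by which each variable on the left receives the array element at the corresponding position. If one of the variables is itself an array, it absorbs all of the remaining elements. For example,
($x, $y, @a, $z) = (1, 2, 3, 4, 5);
would assign 1 to $x, 2 to $y, and the list (3, 4, 5) to @a, leaving $z undefined.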
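Note that if we use an array name, such as @numbers, in a scalar context, we obtain the number of elements in the array. Here is a complete example: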
@numbers = (5, 6, 7, 8);
@letters = ('a', 'b', 'c', 'd');
@numbers_and_letters = (@numbers, @letters);
$size = @numbers."";    # concatenation forces scalar context, which yields the element count
print "Look mom, I can count: @numbers\n";
print "I know some characters too. See: @letters\n";
print "I'm on a roll now: @numbers_and_letters\n";
print "The first letter in the alphabet is $letters[0]\n";
print "The last element in my numbers array is $numbers[$#numbers]\n";
print "The number of elements stored in my numbers array is $size\n";
and the output produced is:
Look mom, I can count: 5 6 7 8
I know some characters too. See: a b c d
I'm on a roll now: 5 6 7 8 a b c d
The first letter in the alphabet is a
The last element in my numbers array is 8
The number of elements stored in my numbers array is 4
As you can see from this example, the use of the array data type in Perl is similar to the use of arrays in any other programming language.
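The array features used in this section can also be sketched in a few lines. This is only an illustration: the variable names are hypothetical, and `scalar(...)` is one way among several to ask for scalar context explicitly.

```perl
use strict;
use warnings;

my @numbers = (5, 6, 7, 8);

# List assignment: $first and $second take one element each,
# and @rest absorbs everything that is left.
my ($first, $second, @rest) = @numbers;

# An array used in scalar context yields its number of elements.
my $count = scalar(@numbers);

# $#numbers is the index of the last element.
my $last_index = $#numbers;

print "first: $first, second: $second, rest: @rest\n";
print "count: $count, last index: $last_index\n";
```

Because indices start at 0, the last index is always one less than the element count.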
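The third and final type of Perl variable is referred to as a hash (also called an associative array). A hash stores values that are looked up by a key rather than by a position.
The name of a hash begins with the % character, and an individual value is accessed as $hashname{key}. For example, the program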
%responsibilities = (
"Ian" => "instructor", "Moyra" => "systems administrator",
"George" => "instructor", "Bob" => "department head"
);
print "Ian's job: $responsibilities{'Ian'} \n";
print "Bob's job: $responsibilities{'Bob'} \n";
will produce the following output:
Ian's job: instructor
Bob's job: department head
Keys must be unique. However, values do not have to satisfy this condition and hence we can have many keys with the same value (e.g. "instructor"). You can see how the hash data type has the potential to be very useful. A hash data type is to Perl what the alist (association list) data structure is to Scheme.
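A few other common hash operations can be sketched as follows. The example reuses the names from the program above; "Alice" is a made-up key that is deliberately absent, and the choice to sort the keys is only there to make the output order predictable.

```perl
use strict;
use warnings;

my %responsibilities = (
    "Ian"    => "instructor",
    "George" => "instructor",
    "Bob"    => "department head",
);

# Assigning to a key adds a new pair (or overwrites the value of an existing key).
$responsibilities{"Moyra"} = "systems administrator";

# exists tests whether a key is present in the hash.
my $has_bob   = exists $responsibilities{"Bob"}   ? 1 : 0;
my $has_alice = exists $responsibilities{"Alice"} ? 1 : 0;   # hypothetical key

# keys returns the keys in no particular order, so sort them for stable output.
foreach my $name (sort keys %responsibilities) {
    print "$name: $responsibilities{$name}\n";
}
print "Bob present: $has_bob, Alice present: $has_alice\n";
```

Since keys are unique, assigning to `$responsibilities{"Ian"}` again would replace his value rather than create a second entry.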
Suppose we have a hash table of size 10 that we will use to store integer items. First we wish to insert the integer 1. Using the modulo division hash function, we obtain the address 1 (since 1 mod 10 = 1). Next we want to insert the integer 11. Using the same hash function, we again obtain the address 1 (since 11 mod 10 = 1). These two insertion operations have resulted in a collision: two keys have mapped to the same index in the hash table.
What can we do to resolve the collision? The simplest collision resolution technique is called linear probing, and it works as follows. If you encounter a collision at address A, look at address A+1 to see if it is occupied. If it is occupied, look at address A+2, and so on, wrapping around to address 0 after the last address in the table. When an unoccupied address is found, insert the item into it.
To retrieve an item from a hash table that uses linear probing, we apply the hash function to the requested item and obtain an address, A. We then look up address A in the hash table. If the address is not occupied, we conclude that the item is not in the hash table. If the address is occupied, we check whether the occupant is the item we are looking for. If it is not, we look at A+1, and so on, until we either find the item or reach an unoccupied address.
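The insertion and retrieval steps can be sketched in Perl as below. This is a minimal illustration rather than a production hash table: the names `hash_address`, `insert_item` and `find_item` are hypothetical, items are assumed to be integers, an unoccupied address is represented by `undef`, probing is assumed to wrap around at the end of the table, and deletion is not handled.

```perl
use strict;
use warnings;

my $TABLE_SIZE = 10;
my @table = (undef) x $TABLE_SIZE;   # undef marks an unoccupied address

# Modulo division hash function.
sub hash_address {
    my ($item) = @_;
    return $item % $TABLE_SIZE;
}

# Insert $item, probing A, A+1, A+2, ... (wrapping around) for a free address.
# Returns the address used, or -1 if the table is full.
sub insert_item {
    my ($item) = @_;
    my $start = hash_address($item);
    for my $step (0 .. $TABLE_SIZE - 1) {
        my $address = ($start + $step) % $TABLE_SIZE;
        if (!defined $table[$address]) {
            $table[$address] = $item;
            return $address;
        }
    }
    return -1;
}

# Look up $item; returns its address, or -1 if it is not in the table.
sub find_item {
    my ($item) = @_;
    my $start = hash_address($item);
    for my $step (0 .. $TABLE_SIZE - 1) {
        my $address = ($start + $step) % $TABLE_SIZE;
        return -1 if !defined $table[$address];         # unoccupied: item is absent
        return $address if $table[$address] == $item;   # occupant is the item we want
    }
    return -1;
}

my $addr_1   = insert_item(1);    # 1 mod 10 = 1, address 1 is free
my $addr_11  = insert_item(11);   # collides at address 1, placed at address 2
my $found_11 = find_item(11);
my $found_21 = find_item(21);     # never inserted

print "1 stored at address $addr_1\n";
print "11 stored at address $addr_11\n";
print "11 found at address $found_11\n";
print "21 found at address $found_21\n";
```

The lookup for 21 starts at address 1, passes the occupants 1 and 11, and stops at the unoccupied address 3, which is why it reports that the item is absent.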
The advantage of using hashing instead of linked lists for storing a large number of items is that, as long as collisions are rare, the search and insert operations take a roughly constant amount of time (the time needed to compute the hash function plus a few probes). With a linked list, searching for an item (or inserting one without creating a duplicate) requires a traversal of the list, which takes time proportional to the size of the list. Compared to standard arrays, hash tables make more efficient use of space for items taken from large data sets. Consider the case where I wish to store two numbers, 1 and 1000. In an array, I would store the number 1 at index 1 and the number 1000 at index 1000, so I must use an array of size at least 1001. With a hash table, I could use a table of size 5 and store the number 1 at index 1 and the number 1000 at index 0 (using the modulo division hash function, since 1 mod 5 = 1 and 1000 mod 5 = 0).
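The space argument can be observed directly in Perl. The sketch below stores the same two numbers once in an array indexed by value and once in a hash. Perl's built-in hash is used here only as a stand-in for a hash table; its internal collision handling need not be the linear probing described above, and the variable names are made up.

```perl
use strict;
use warnings;

# Direct indexing: each value is its own index, so the array must reach index 1000.
my @direct;
$direct[1]    = 1;
$direct[1000] = 1000;
my $array_slots = scalar(@direct);          # every index from 0 to 1000 is counted

# Hash: only the keys that are actually stored take up entries.
my %sparse = (1 => 1, 1000 => 1000);
my $hash_entries = scalar(keys %sparse);

print "array slots: $array_slots\n";
print "hash entries: $hash_entries\n";
```

The array reports 1001 slots even though only two of them hold a value, while the hash reports exactly 2 entries.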
Copyright (C) 2000 - 2002, The University of British Columbia