A reference is a scalar that is like an arrow pointing to another Perl variable. In the real world, names are one kind of reference that you are already familiar with. For instance, consider the person who developed the theory of relativity. He was a complicated living organism (a human being), but if you want to talk about him, you would refer to him by his name, Albert Einstein. Unlike the name of a person (e.g. Peter Smith), a reference is unambiguous because it can only point to one distinct variable at a time.

Creating References

There are two ways to create references to variables. The first method is to precede the variable name with a \ symbol, such as:

$arrayRef = \@myArray;         # $arrayRef holds a reference to @myArray
$hashRef = \%myHash;           # $hashRef holds a reference to %myHash

The second method uses anonymous references, or reference literals. This is analogous to using the number 55 or the string "Hi Bob\n" in a program without storing them in variables first. An expression of the form [ item, item, item, ... ] creates a new array and returns a reference to it. An expression of the form { item, item, item, ... } creates a new hash and returns a reference to it. For example:

$arrayRef = [2 , "hello", undef, 13 ]; # $arrayRef holds a reference to an array
$hashRef = { APR => 4, AUG => 8 };     # $hashRef holds a reference to a hash

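The references created by the two methods behave the same way. For example, the statement $arrayRef = [ 1, 2, 3 ]; has the same effect on $arrayRef as the following two statements, except that here the array also has a name of its own: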

@array = (1, 2, 3);
$arrayRef = \@array;
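
To make the difference between the two methods concrete, here is a minimal sketch. The names @colors, $namedRef and $anonRef are made up for this illustration, and the @{...} syntax used to look inside each array is the dereferencing syntax described under Using References. It suggests that a reference made with \ shares storage with the named variable, while a reference literal creates a separate array:

```perl
use strict;
use warnings;

my @colors   = ('red', 'green');
my $namedRef = \@colors;            # method 1: backslash on a named array
my $anonRef  = ['red', 'green'];    # method 2: anonymous array literal

push @colors, 'blue';               # change the named array directly

# The reference sees the change, because it points at @colors itself.
print scalar(@{$namedRef}), "\n";   # prints 3
# The anonymous array is a separate array and is unaffected.
print scalar(@{$anonRef}), "\n";    # prints 2
```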

Using References

Since a reference is a scalar variable, you can store and assign the reference like any other scalar variable. For example, if $arrayRef is a reference to an array and $hashRef is a reference to a hash, then you can write:

$x = $arrayRef;               # $x holds a reference to the array
$arr[3] = $hashRef;           # $arr[3] holds a reference to the hash
$y = $arr[3];                 # $y holds a reference to the hash

One way to access the data behind a reference is to wrap the reference in curly brackets and put the appropriate sigil in front. If a reference to an array is stored in the variable $arrayRef, then the expression @{$arrayRef} is the array being referenced. Compare the use of an array with the use of a reference to that array:

Statement using a normal array | Equivalent statement using a reference to the same array | Result of executing either statement
@myArray                       | @{$myArrayRef}                                           | An array
sort @myArray                  | sort @{$myArrayRef}                                      | Sort the array
$myArray[5]                    | ${$myArrayRef}[5]                                        | The 6th element of the array
$myArray[5] = 55               | ${$myArrayRef}[5] = 55                                   | Assign the value 55 to the 6th element of the array
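
The array forms above can be tried out with a short script. This is only a sketch: the sample numbers and the @sorted variable are assumptions chosen for the illustration.

```perl
use strict;
use warnings;

my @myArray    = (5, 3, 9, 1, 7, 2);
my $myArrayRef = \@myArray;

# @{...} gives the whole array, so it can be passed to sort.
my @sorted = sort { $a <=> $b } @{$myArrayRef};
print "@sorted\n";               # prints 1 2 3 5 7 9

# ${...}[5] is one element; writing to it changes @myArray itself.
${$myArrayRef}[5] = 55;
print "$myArray[5]\n";           # prints 55
```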

References to hashes can also use the {} operator:

Statement using a normal hash       | Equivalent statement using a reference to the same hash | Result of executing either statement
%myHash                             | %{$myHashRef}                                           | A hash
keys %myHash                        | keys %{$myHashRef}                                      | Obtain the keys in the hash
$myHash{'google'}                   | ${$myHashRef}{'google'}                                 | The element in the hash indexed by the string 'google'
$myHash{'google'} = "Search Engine" | ${$myHashRef}{'google'} = "Search Engine"               | Assign the string value "Search Engine" to the element indexed by the string 'google'
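
The hash forms work the same way. In the following sketch the extra keys 'perl' and 'cpan' and the @names variable are invented for the example:

```perl
use strict;
use warnings;

my %myHash    = (google => 'Search Engine', perl => 'Language');
my $myHashRef = \%myHash;

# keys on the dereferenced hash; sorted so the order is predictable.
my @names = sort keys %{$myHashRef};
print "@names\n";                      # prints google perl

# Adding an entry through the reference changes %myHash itself.
${$myHashRef}{'cpan'} = 'Archive';
print scalar(keys %myHash), "\n";      # prints 3
```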

You may find the syntax using the {} operator difficult to read in some cases. There is an alternative syntax using the -> operator. Here are a few examples:

This statement...        | ...is equivalent to this statement
${$myArrayRef}[2]        | $myArrayRef->[2]
${$myHashRef}{'google'}  | $myHashRef->{'google'}

Be sure not to confuse the statements $myArrayRef->[2] and $myArrayRef[2]; they are totally different. The first returns the 3rd element of an array referenced by the variable $myArrayRef. The second returns the 3rd element of an array variable @myArrayRef. The same thing applies to the arrow syntax and hash references. Make sure you understand the difference between the statements $myHashRef->{'google'} and $myHashRef{'google'}.
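
A small sketch can make the difference visible. It deliberately declares both an array @myArrayRef and a scalar $myArrayRef (a naming clash you would avoid in real code) so that both expressions are legal:

```perl
use strict;
use warnings;

my @myArrayRef = ('a', 'b', 'c');     # an ordinary array (poorly named!)
my $myArrayRef = ['x', 'y', 'z'];     # a scalar holding a reference

print $myArrayRef->[2], "\n";         # prints z: follows the reference
print $myArrayRef[2],   "\n";         # prints c: element of @myArrayRef
```

If @myArrayRef had not been declared, use strict would stop the script at compile time at the second print, which is one reason to enable it: forgetting the arrow then becomes an error instead of a silent bug.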

Examples

Now that you have been introduced to references and their syntax, let us look at some examples of what references can do. One use of references is to combine them with Perl's three built-in data types to create new data structures. Our two examples will demonstrate how to accomplish this.

In the first example, we will use references to create multi-dimensional arrays. Recall that the value of the expression ['foo', 'bar', 'baz'] is a reference to an anonymous array containing three elements (the strings 'foo', 'bar', and 'baz'). Now examine this:

@twoDArray = ( 
    [1],
    [2, 3],
    [4, 5, 6],
    [7, 8, 9, 10]
);

The array @twoDArray contains four elements, each of which is a reference to an array. For example, $twoDArray[2] is one of the references; it refers to the array (4, 5, 6). How would we access elements of the array (4, 5, 6)? We can do this using the arrow operator (->). For example:

$twoDArray[2]->[0]             # refers to the element 4
$twoDArray[2]->[2]             # refers to the element 6

You should now see that the variable @twoDArray behaves very similarly to a two dimensional array, with the additional benefit that not all the rows are required to have the same number of columns. Accessing elements in the data structure is as simple as the statement:

$twoDArray[$row]->[$column]

where $row and $column hold the indices of the desired element.

Perl lets you drop the arrow (->) operator when it falls between two subscripts. With an array of references-to-arrays such as the one above, the following two statements are therefore equivalent:

$twoDArray[0][0]
$twoDArray[0]->[0]

and both refer to the element in row 0, column 0.
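
Because rows may have different lengths, a loop over such a structure has to ask each row how long it is. The sketch below is one way to do that; it relies on $#{...}, a standard Perl construct not covered above that gives the last index of a referenced array, and the @lines variable is an assumption of this example:

```perl
use strict;
use warnings;

my @twoDArray = (
    [1],
    [2, 3],
    [4, 5, 6],
    [7, 8, 9, 10],
);

my @lines;
for my $row (0 .. $#twoDArray) {
    # Last valid column index of this particular row.
    my $lastCol = $#{ $twoDArray[$row] };
    # Collect the cells using the arrow-free subscript syntax.
    my @cells = map { $twoDArray[$row][$_] } 0 .. $lastCol;
    push @lines, "row $row: @cells";
}
print "$_\n" for @lines;    # the third line printed is "row 2: 4 5 6"
```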

This shorter syntax gives the illusion that Perl supports multidimensional arrays. Let us compare three different ways of accessing an element of a three dimensional array (an array of references to arrays of references to arrays). The array will look like:

@threeDArray = (
    [ [ 1, 2 ], [3, 4], [5, 6] ],
    [ [ 6, 5 ], [4, 3], [2, 1] ],
    [ [ 9, 1 ], [8, 2], [7, 3] ],
    [ [ 4, 6 ], [5, 5], [6, 4] ]
);

Any of the following three expressions will retrieve the 8 from the array:

print $threeDArray[2][1][0];       # this...
print $threeDArray[2]->[1]->[0];   # or this...
print ${${$threeDArray[2]}[1]}[0]; # or even this (yuck!)

In our second example, we are going to create a new data structure to store information about countries and their cities. Suppose that we are given a file that is a list of city-country comma separated pairs. Each pair resides on its own line in the file. For example:

Chicago, USA
Victoria, Canada
Frankfurt, Germany
St. Johns, Canada
Berlin, Germany
Vancouver, Canada
Washington, USA
Helsinki, Finland
New York, USA
Ottawa, Canada

would be a valid list. Our task is to list the countries in alphabetical order, each followed by an alphabetically sorted list of the cities it contains. The output should look like this:

Canada: Ottawa, St. Johns, Vancouver, Victoria.
Finland: Helsinki.
Germany: Berlin, Frankfurt.
USA: Chicago, New York, Washington.

Assuming that the list of cities is stored in the file city.txt, here is one possible solution. You can ignore the actual file operations; to understand the code, it suffices to know that by the time Perl gets to the chomp; statement, the variable $_ contains the next line in the file.

1     open(INFILE, '<', 'city.txt') or die "Cannot open city.txt: $!";
2     while (<INFILE>) {
3         chomp;   # strip the trailing newline from $_
4         my ($city, $country) = split /, /;
5         push @{$table{$country}}, $city;
6     }
7
8     foreach my $country (sort keys %table) {
9         print "$country: ";
10        my @cities = @{$table{$country}};
11        print join ', ', sort @cities;
12        print ".\n";
13    }

First, let us get a feel for the design of this script. We are going to use a hash of references-to-arrays as our main data structure. The keys in the hash will be the country names, while the values in the hash will be references-to-arrays. The referenced arrays will hold the city names.

The while loop in the program parses the city.txt file and constructs the hash of references-to-arrays named %table (the country/city table). The line that does all of the work is line 5. The push function adds the city name ($city) to the end of the anonymous array that is found by looking in the hash (%table) at the key $country and following the reference. To make things clearer, look at the line:

push @array, $city;

Now replace @array with the dereferenced expression @{$table{$country}} and you should begin to see this more clearly. Remember, we are indexing the hash %table with the key $country, which is written $table{$country}. The result is a reference to an (anonymous) array. Dereferencing this reference with @{} gives us the array itself, to which we add the value held in the variable $city.

The foreach loop visits the countries in alphabetical order by sorting the keys of %table. For each country, the @cities array is built by indexing the table with the country name and following the returned reference. The @cities array is then sorted and joined into a single string, with the elements separated by a comma and a space (', '), and printed to the screen.

Suppose that the program has just read the first line in its input. Execution is at line 5, $country is 'USA', and $city is 'Chicago'. Since this is the first city in the USA, there is no key named 'USA' in the table yet, and so the value $table{$country} is undefined. Perl sees that you are trying to push 'Chicago' onto an array that does not exist, and thus it creates a new array, adds the value 'Chicago' at the end of the array, and places a reference to the new array into $table{$country}. This automatic creation is known as autovivification.
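
The same logic can be tried without a file. The following is a sketch rather than a replacement for the script above: the @input list is a hypothetical in-memory stand-in for city.txt, and @report is a name introduced here so the result can be inspected before printing.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the contents of city.txt.
my @input = (
    'Chicago, USA',
    'Victoria, Canada',
    'Frankfurt, Germany',
    'Ottawa, Canada',
);

my %table;
for my $line (@input) {
    my ($city, $country) = split /, /, $line;
    # For the first city of a country, $table{$country} does not exist yet;
    # push creates the array and stores a reference to it in the hash.
    push @{ $table{$country} }, $city;
}

my @report;
for my $country (sort keys %table) {
    push @report, "$country: " . join(', ', sort @{ $table{$country} }) . '.';
}
print "$_\n" for @report;    # first line printed: Canada: Ottawa, Victoria.
```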

Loose Ends

Remember that you can create a reference to any data type in Perl. This includes scalars, arrays, hashes, functions, and even other references. You should also know that it is possible to omit the curly brackets when following a reference, for example:

@$ref        # This is the same as...
@{$ref}      # this.

$$ref[5]     # And this...
${$ref}[5]   # is the same as this.
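
One consequence of the short form is worth a sketch: $$ref[1] always means ${$ref}[1], never ${$ref[1]}. In the example below, the array @ref is an unrelated variable introduced only to show the contrast, and the result variables are likewise assumptions of this illustration:

```perl
use strict;
use warnings;

my $ref = [10, 20, 30];
my @ref = ([1, 2], [3, 4]);        # unrelated array that shares the name

my $short  = $$ref[1];             # 20: same as ${$ref}[1]
my $braced = ${ $ref[1] }[0];      # 3: braces needed to dereference $ref[1]
my $count  = scalar(@$ref);        # 3: number of elements behind $ref

print "$short $braced $count\n";   # prints 20 3 3
```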

You can check whether a scalar holds a reference using the ref function. For example:

$aRef = [1, 2, 3];
print "A reference!\n" if ref($aRef);

The ref function returns a true value if its argument is a reference and a false value (the empty string) otherwise. The true value is a string naming the kind of thing referenced: "HASH" for a reference to a hash, "ARRAY" for a reference to an array.
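
As a quick sketch of the values ref gives back, the script below collects them in a hash. The hash %kinds and its keys are invented for this example; "SCALAR" and "CODE" are the strings ref returns for references to scalars and to functions.

```perl
use strict;
use warnings;

my $number = 42;
my %kinds = (
    array  => ref([1, 2, 3]),     # 'ARRAY'
    hash   => ref({ a => 1 }),    # 'HASH'
    scalar => ref(\$number),      # 'SCALAR'
    code   => ref(sub { 1 }),     # 'CODE'
    plain  => ref($number),       # '' (empty string: not a reference)
);
print "$_: '$kinds{$_}'\n" for sort keys %kinds;
```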

What happens if you try to use a reference as a string, for instance by printing it? You will see strings such as "ARRAY(0x80f5dec)" or "HASH(0x826afc0)". If you ever see similar output from a script, you will know right away that a reference has been displayed by accident. It is possible to compare the string representations of references with the eq operator. However, if you wish to test whether two references point to the same thing, use the == operator on the references directly; it compares their addresses as numbers and is faster.
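
The sketch below illustrates both points. The variable names are assumptions of the example, and the address printed for a reference differs from run to run, so the script only checks its general shape.

```perl
use strict;
use warnings;

my $first  = [1, 2, 3];
my $alias  = $first;          # second reference to the same array
my $second = [1, 2, 3];       # different array with equal contents

my $sameArray  = ($first == $alias)  ? 1 : 0;   # 1: same underlying array
my $otherArray = ($first == $second) ? 1 : 0;   # 0: contents match, arrays differ
print "$sameArray $otherArray\n";               # prints 1 0

# In string context a reference looks like ARRAY(0x...).
my $looksLikeRef = ("$first" =~ /^ARRAY\(0x[0-9a-f]+\)$/) ? 1 : 0;
print "looks like a reference\n" if $looksLikeRef;
```

Note that == tests identity, not contents: two separate arrays holding the same values are still different references.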

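Lastly, there is the concept of symbolic (also called soft) references, whereby a string holding a variable's name can be used as though it were a reference. Symbolic references find only package (global) variables, not variables declared with my, and they are a run-time error under use strict 'refs'. For example, this: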

@myArray = ("one\n", "two\n", "three\n", "four\n");
$aRef = \@myArray;

print ${$aRef}[0];
print ${$aRef}[1];
print ${$aRef}[2];
print ${$aRef}[3];

is the same as this:

@myArray = ("one\n", "two\n", "three\n", "four\n");
$aRef = "myArray";

print ${$aRef}[0];
print ${$aRef}[1];
print ${$aRef}[2];
print ${$aRef}[3];
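
Scripts that begin with use strict do not allow symbolic references unless they are switched on for a specific block. The following sketch, which assumes the code runs in the default package main and uses the made-up variable names $name, $first and $ok, shows both the permitted lookup and the error that strict raises otherwise:

```perl
use strict;
use warnings;

our @myArray = ("one\n", "two\n");   # package variable, reachable by name
my $name = 'myArray';

my $first;
{
    no strict 'refs';                # allow symbolic references in this block only
    $first = ${$name}[0];            # looks up @myArray in the current package
}
print $first;                        # prints one

# With strict refs in force, the same lookup dies at run time.
my $ok = eval { my $x = ${$name}[0]; 1 };
print "strict refs blocked the symbolic reference\n" unless $ok;
```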