Perl

06. Subroutines

Introduction

Can create a subroutine in Perl to contain a block of code, like any other useful language.

Can be defined anywhere in the script file; the compile phase will find them.

A subroutine can share its name with a variable (but don't). If you define two subs with the same name, the later one overrides the earlier one (-w will warn you).
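A minimal sketch of both points, assuming nothing beyond core Perl. The names later_sub and greet are made up for illustration. Run with -w, perl should also report that greet was redefined.

```perl
use strict;

# Called here, defined further down: the compile phase has already seen it.
my $early = later_sub();

sub later_sub { return "defined below the call"; }

sub greet { return "first definition"; }
sub greet { return "second definition"; }    # replaces the one above

my $which = greet();

print "$early\n";     # defined below the call
print "$which\n";     # second definition
```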

Variables referred to in the subroutine are treated as globals, unless you use the special forms (my, local) described later.

Syntax

sub name {
   # body
}

sub hello {
   print "hello, world!\n";
}

# to invoke...
hello();

Use "&" in front of a subroutine name for:

creating a reference to a subroutine (\&name)
making sure perl knows you are referring to a subroutine call in ambiguous situations
making perl automatically pass the current @_ as the arguments, when the call is written without parentheses (&name;)
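The uses of "&" listed above can be sketched as follows. The subs show_args and outer are hypothetical, and the my declarations are the ones covered later in these notes.

```perl
use strict;
use warnings;

sub show_args {
   return "args: @_";
}

# 1. \& yields a code reference without calling the sub.
my $code_ref = \&show_args;
my $via_ref  = $code_ref->(1, 2);      # call through the reference

# 2. & with no parentheses hands the caller's current @_ to the sub.
sub outer {
   return &show_args;                  # show_args sees outer's @_
}
my $passed_on = outer('a', 'b');

# 3. & with an explicit list behaves like an ordinary call here.
my $explicit = &show_args('x');

print "$via_ref\n";       # args: 1 2
print "$passed_on\n";     # args: a b
print "$explicit\n";      # args: x
```

The second form is easy to trigger by accident, which is one reason to write plain name(...) calls unless you need one of these behaviours.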

Return Values

Return value of a sub is the value of the last expression evaluated.

sub add_two {
   $x + $y;
}

$x = 3; $y = 4;
$z = add_two();       # $z is now 7

Use return to more explicitly send a value (scalar or list) back to the caller.

sub return_seven {
   print "returning 7\n";

   return(7);
}

$seven = return_seven();

sub return_data {
   @data = (1, 2, 3);

   return(@data);       # or -- return(1, 2, 3);
}

sub addup {
   return($x + $y);     # result of expression, on global vars
}
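As a rough illustration of return sending back a list and also leaving a sub early, here is a sketch with a made-up helper called min_max. It uses my variables, which are described later in these notes.

```perl
use strict;
use warnings;

# Hypothetical helper: returns an empty list for no input, otherwise (min, max).
sub min_max {
   my @nums = @_;
   return () unless @nums;       # early exit, nothing to compare

   my ($min, $max) = ($nums[0], $nums[0]);
   foreach my $n (@nums) {
      $min = $n if $n < $min;
      $max = $n if $n > $max;
   }
   return ($min, $max);          # a two-element list
}

my ($low, $high) = min_max(4, 9, 2, 7);
my @nothing      = min_max();

print "low=$low high=$high\n";                                    # low=2 high=9
print "empty call returned ", scalar(@nothing), " values\n";      # 0 values
```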

Arguments

More useful when subs can take values from caller to work with.

The special @_ array variable is where you'll find the arguments to your subroutine.

sub add_two {
   return($_[0] + $_[1]);
}

sub add_three {
   @values = @_;

   return($values[0] + $values[1] + $values[2]);
}

It is up to you to remember how many args a sub should take; perl doesn't care if you don't use them all, and any you try to use beyond those actually passed are undef.
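A small sketch of what that looks like in practice, and of one possible way to check the count yourself. The sub names here are invented for the example, and the die-based check is only one convention among several.

```perl
use strict;
use warnings;

sub count_args {
   return scalar(@_);            # number of arguments actually passed
}

sub second_arg {
   return $_[1];                 # undef if fewer than two arguments
}

# Optional self-defence: refuse to run with the wrong number of arguments.
sub add_two_checked {
   die "add_two_checked needs exactly 2 arguments\n" unless @_ == 2;
   return $_[0] + $_[1];
}

my $n_none  = count_args();
my $n_three = count_args('a', 'b', 'c');
my $missing = second_arg('only one');
my $sum     = add_two_checked(3, 4);
my $ok      = eval { add_two_checked(1); 1 };    # undef, because the sub died

print "no args: $n_none, three args: $n_three\n";                       # 0 and 3
print "second arg is ", (defined $missing ? $missing : 'undef'), "\n";  # undef
print "sum: $sum\n";                                                    # 7
print "one-arg call ", ($ok ? "succeeded" : "was rejected"), "\n";      # was rejected
```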

Recommended that in any reasonably complex subroutine you get in the habit of renaming $_[0] and the like to meaningful things.

sub add_two {                          sub add_two {
   $one = $_[0];                          $one = shift(@_);
   $two = $_[1];                          $two = shift(@_);

   return($one + $two);                   return($one + $two);
}                                      }

@_ is private to your subroutine, in that it masks off any @_ from the calling scope.

But it is really an alias to the real values from the caller. Changing its values changes the actual variables that were used to send values in.

sub square {
   foreach (@_) {
      $_ **= 2;       # changes list in caller
                      #   "**" is exponentiation operator
                      #   using in same manner as +=
   }
}
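To see the aliasing from the caller's side, here is a sketch that calls square and then a copying variant. squared_copy is a made-up name, shown only as one way to avoid touching the caller's data.

```perl
use strict;
use warnings;

sub square {
   foreach (@_) {
      $_ **= 2;                  # writes through the alias into the caller's data
   }
}

# Hypothetical non-destructive variant: copy @_ first, then change the copy.
sub squared_copy {
   my @copy = @_;
   $_ **= 2 foreach @copy;
   return @copy;
}

my @nums = (1, 2, 3);
my $n    = 5;
square(@nums, $n);
print "@nums $n\n";              # 1 4 9 25

my @orig = (2, 3);
my @sq   = squared_copy(@orig);
print "@orig -> @sq\n";          # 2 3 -> 4 9
```

Note that passing a literal constant, as in square(3), would fail at run time because a constant cannot be modified.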

my Variables

If you increment the variable $sum in your subroutine, you could be affecting a $sum elsewhere. You need a way to create local variables for your subroutines.

Use my to create a variable local to the block it is declared in.

sub thing {
   my($a);

   $a = 7;                # changes local $a, not global

   my @b;                 # create local array b, can omit parens

   my($c) = 0;            # can initialize
   my($f) = shift(@_);    # can init from params

   my($d, @e) = @_;       # declare/init a bunch at once
}

my uses "lexical scoping": the variable is visible only in the block it is declared in and in blocks enclosed by that one.

$x = 7;
{
   $x = 9;
   print "inside: $x\n";     # 9
}
print "outside: $x\n";       # 9

$x = 7;
{
   my($x) = 9;               # x now local to this block
   print "inside: $x\n";     # 9
}
print "outside: $x\n";       # 7

Can hide a "my" variable in a block together with a sub, so that only that sub can see it; it acts as a static value that persists across all calls.

{
   my($count) = 0;
   sub countme {
      $count++;
      print "I've been called $count times\n";
   }
}

countme;     # 1
countme;     # 2
countme;     # 3
print "count is now $count\n";        # undef, warning

local Variables

Another type of private variable for subroutines is the local variable.

sub one {                                   sub three {
   local($x);                                  my($x);

   $x = 7;                                     $x = 3;

   two();                                      four();
}                                           }

sub two {                                   sub four {
   print "$x\n";   # prints 7                  print "$x\n";   # undef, warning
}                                           }

local vars use "dynamic scoping": they are visible to the routine that declares them and to all routines called from there. The previous value is restored when that routine exits.
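A minimal sketch of dynamic scoping, including the value going back to what it was once the routine returns. The names $level, show_level and with_local are invented for this example, and the global is declared with our only so the sketch runs under use strict.

```perl
use strict;
use warnings;

our $level = 'outer';            # package (global) variable

sub show_level {
   return $level;                # sees whatever value is current at call time
}

sub with_local {
   local $level = 'inner';       # old value saved, put back when this sub exits
   return show_level();
}

my $during = with_local();
my $after  = show_level();

print "during: $during, after: $after\n";     # during: inner, after: outer
```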

Comparison

Perl Gods generally recommend using my over local for many reasons. my variables are safer (smaller scope) and faster than local ones. They are also faster than regular global vars.

Many gurus also recommend putting "use strict;" at the top of programs. This is a compiler directive, or pragma, that tells perl not to accept any variables that haven't been declared (normally through my) or written with a full package name. Misspelled variable names are then flagged at compile time. Any speed gain comes from using my variables, not from the pragma itself.
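A sketch of the misspelling check in action. Normally the error would stop the whole script at compile time; here the typo is placed inside a string eval (which is compiled at run time under the same pragmas) purely so the example can catch the error and keep running. The variable names are made up.

```perl
use strict;
use warnings;

my $total = 10;

# The string below contains a typo: $totl instead of $total.
my $result = eval q{ $totl + 1 };
my $error  = $@;                 # compile error text from the eval

print "total is $total\n";
print "result is ", (defined $result ? $result : 'undef'), "\n";    # undef
print "perl said: $error";
```

Without use strict, the same expression would quietly create a new global $totl and evaluate to 1.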

Context of sub

You can find out what context you are in, and respond appropriately, with the wantarray function:

#!/usr/local/bin/perl -w

sub doublethem {
   my(@list) = @_;

   foreach $i (@list) {
      $i *= 2;
   }

   if (wantarray) {          # or: return wantarray ? @list : scalar(@list);
      return(@list);
   }
   else {
      return(scalar(@list));
   }
}

@list = (1, 2, 3);
@newlist = doublethem(@list);
print "list now: @newlist\n";

$changed = doublethem(@list);
print "changed: $changed\n";

output:

list now: 2 4 6
changed: 3