7. Pointers
7.1 Pointers as basic data types
Pointers are fundamental data types in C++. A pointer is the address of a storage unit in main memory.
             1028                 1032  1033
-------------+--------------------+-----+-----+----------------
             |        -35         | 'H' | 'e' |
-------------+--------------------+-----+-----+----------------
Pointers can be variables or constants.
type * name;
Variable name is defined as a pointer to data type type.
Example:
char *cp;
int *ip;
float *fp;
Pointers to different base types are considered to be different data types by C++. Hence the above are all of different data types. For the variables
int x = -35;
int *ip = &x; // &x is the address of x.
if x has been allocated the cell with address 1028 as shown above, then the relation between ip and x can be depicted as follows:
1028
+--------+ +--------------------+
ip | 1028 o-|--------> | -35 | x
+--------+ +--------------------+
Let x be a variable in a C++ program. When the program is executed, x will be created in main memory. The address of x in main memory is denoted by &x. Once a variable is created, its address remains constant for the lifetime of the variable.
An integer is not an address. However, if we want to treat an integer as an address of a main memory unit, we can use the type casting operator, e.g. (char *) 1701, to explicitly change it to an address. An address like this is called an absolute address.
Example:
int *iptr1, *iptr2, ival = 23;
iptr1 = &ival;
iptr2 = reinterpret_cast< int * >(1900);
Absolute addresses should not be used in application programs since, in a multi-user environment, we do not know which part of the memory is allocated to our program.
Note that the address operator can only be applied to something that is stored in a memory cell, such as a variable. In C, it cannot be applied to a register variable.
Example: For the declarations
int i = 3, j = 5, k, *p = &i,
*q = &j, *r;
double x = 1.01;
are the following expressions legal? If so, what are their values?
p == &i
**&p
r = &x
&(*p + 2)
&100
*p/ *q
*p/*q
*p**q
7.1.1 The typedef construct
It is possible to create synonyms of pointer types so that pointer variables can be declared like other variables without the need to place an asterisk in front of each pointer variable. The typedef construct is designed to do exactly this.
Example:
typedef int* IntPtr;
typedef double* RealPtr;
IntPtr ip1, ip2;
RealPtr rp1, rp2;
Example:
typedef int INTEGER;
typedef double REAL;
INTEGER i, j, k;
REAL x, y;
There are two advantages to using defined pointer type names. One is that they prevent the mistake of omitting an asterisk:
int *p1, p2; // Error if p2 is intended to be a pointer.
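The difference between the two declaration styles can be checked at compile time. The following is a minimal sketch, assuming a C++11 compiler (for static_assert and decltype); the names q1 and q2 are hypothetical and used only for illustration.

```cpp
#include <cassert>
#include <type_traits>

typedef int* IntPtr;

int *p1, p2;     // p1 is a pointer to int, but p2 is a plain int
IntPtr q1, q2;   // q1 and q2 are both pointers to int

// Each check is evaluated by the compiler; a false condition is a compile error.
static_assert(std::is_pointer<decltype(p1)>::value, "p1 is a pointer");
static_assert(!std::is_pointer<decltype(p2)>::value, "p2 is not a pointer");
static_assert(std::is_pointer<decltype(q1)>::value, "q1 is a pointer");
static_assert(std::is_pointer<decltype(q2)>::value, "q2 is a pointer");
```

If the asterisk in front of p1 is removed, the first check fails and the compiler reports the message given as its second argument.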
7.2 Accessing a storage cell via its address
Accessing a memory cell via a pointer to it is called dereferencing the pointer, or indirection.
+------+ +------+
ptr | o---|----> | *ptr |
+------+ +------+
where the * preceding a pointer is called the dereferencing or indirection operator.
Example:
char *cp, ch = 'X';
int *ip, i, j = -1;
cp = &ch;
ip = &j;
cout << *cp << ' ' << *ip << endl;
i = j;
j = *ip + 5;
cout << i << ' ' << j << endl;
Example: For the declarations
int *ip, i, j, k;
are the following statements legal?
(*ip)++;
k = i**j;
i = ip;
j = &ip;
ip = &i;
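Dereferencing works for writing as well as for reading. The sketch below, with the hypothetical function name swap_values, exchanges two int cells using only their addresses.

```cpp
#include <cassert>

// Exchange the values stored in the two cells that a and b point to.
void swap_values(int *a, int *b)
{
    assert(a != 0 && b != 0);  // both pointers must point to valid cells
    int temp = *a;   // read the cell a points to
    *a = *b;         // write into the cell a points to
    *b = temp;       // write into the cell b points to
}
```

For two int variables x and y, the call swap_values(&x, &y) exchanges their contents.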
7.2.1 Dangling pointers
An uninitialized pointer variable is called a dangling pointer. Trying to dereference a dangling pointer produces unpredictable, and usually disastrous, results. This is a very common source of errors among C and C++ programmers. Before applying the dereferencing operator to a pointer variable, one should be certain that the pointer variable points to some memory cell.
C++ has no built-in test to check for dangling pointers. One way to avoid them is to set any dangling pointer to NULL, which is a pointer pointing to no valid cell. Then a program can first check whether a pointer variable is equal to NULL before dereferencing it.
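The NULL convention can be sketched as follows. The function name value_or is hypothetical. Note that the test only helps if the pointer was actually set to NULL beforehand; it cannot detect a pointer that holds an arbitrary uninitialized value.

```cpp
#include <cassert>
#include <cstddef>   // defines NULL

// Return the value p points to, or fallback if p points to no cell.
int value_or(const int *p, int fallback)
{
    if (p == NULL)
        return fallback;   // never dereference a NULL pointer
    return *p;
}
```

A pointer declared as int *ip = NULL; can be passed safely to value_or both before and after it has been made to point to a variable.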
7.2.2 Multiple levels of indirection
The number of *'s attached to a pointer to reference a base type value is called the level of indirection. A pointer variable can have many levels of indirection.
Example:
int i = 5, *p = &i, **pp = &p, ***ppp = &pp;
Then i, *p, **pp, and ***ppp all refer to the same cell.
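Each level of indirection gives access to a different cell along the chain. The following sketch uses two hypothetical helper functions: one writes to the int cell at the end of the chain, and the other changes an intermediate pointer.

```cpp
#include <cassert>

// Store value in the int cell reached through three levels of indirection.
void set_through(int ***ppp, int value)
{
    assert(ppp != 0 && *ppp != 0 && **ppp != 0);
    ***ppp = value;
}

// Make the pointer that pp points to refer to a different int cell.
void retarget(int **pp, int *target)
{
    assert(pp != 0);
    *pp = target;
}
```

With the declarations of the example above, set_through(ppp, 6) changes i, while retarget(pp, &j) changes p so that ***ppp then refers to another int variable j.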
7.3 Operations on pointers
One can assign the value of one pointer to a pointer variable, as long as the two pointers are of the same type.
Two pointers can be compared using ==, <, >, etc. In these cases, the numerical values of the two pointers as integers are compared.
Integers can be added to or subtracted from pointers, except those pointing to void. A pointer to void is called a generic pointer; generic pointers are important with the C-based memory allocation functions.
The integer added or subtracted is scaled by the size of the base type of the pointer. Let v(p) be the value of p as an integer. We have
v(p+i) = v(p) + i*sizeof(type)
Hence p + i is the address of the i-th cell, of the size of the base type, beyond p.
The increment and decrement operators can also be applied to a pointer. They advance the pointer to the next cell, or move it back to the previous cell, of the same type.
The difference of two pointers of the same type is the number of cells of the base type between these two addresses.
Example:
int *ip = reinterpret_cast< int * >(1320);
double *dp = reinterpret_cast< double * >(1600);
ip++;
dp += 20;
// What are the values of ip and dp at this point?
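The scaling rule v(p+i) = v(p) + i*sizeof(type) can be observed without absolute addresses by measuring distances inside an ordinary array. The helper byte_distance below is a hypothetical name; it is a sketch that assumes both pointers point into the same array.

```cpp
#include <cassert>
#include <cstddef>   // std::ptrdiff_t

// Number of bytes between two pointers into the same array.
// Viewing the addresses as char pointers removes the scaling,
// because sizeof(char) is 1.
template <typename T>
std::ptrdiff_t byte_distance(const T *from, const T *to)
{
    return reinterpret_cast<const char *>(to) -
           reinterpret_cast<const char *>(from);
}
```

For an array double d[8], the difference (d + 5) - d is 5 cells, while byte_distance(d, d + 5) is 5 * sizeof(double) bytes.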
7.4 Pointers and call-by-reference parameters
The address operator & is the same symbol used in function prototypes to specify a call-by-reference parameter. This is not a coincidence. In fact, a call-by-reference argument is implemented by giving its address (a reference) to the called function. Thus the called function knows where to find the argument, and it refers to that argument by using the corresponding parameter's name. So these two uses of the symbol & are similar. However, they are slightly different, and we will treat them as two different usages of the symbol &.
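The similarity between the two uses of & can be seen by writing the same operation twice, once with a call-by-reference parameter and once with an explicit pointer. The function names below are hypothetical.

```cpp
#include <cassert>

// Call-by-reference: the parameter n is another name for the argument.
void add_one_ref(int &n)
{
    n = n + 1;
}

// The same effect with an explicit pointer: the caller passes an
// address and the function dereferences it.
void add_one_ptr(int *n)
{
    assert(n != 0);
    *n = *n + 1;
}
```

For an int variable k, the calls add_one_ref(k) and add_one_ptr(&k) both increase k by one; only the second makes the address visible at the point of call.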
7.5 Arrays and pointers
An array's name represents a pointer to element zero of that array. Hence for the definitions:
int a[10];
int *iptr;
The name a is the same as &a[0]. The assignment
iptr = a; // same as iptr = &a[0];
is well-defined. In this case, a[i], *(a+i), iptr[i], and *(iptr+i) are all equivalent. In fact, a[i] is changed to *(a+i) by the compiler immediately.
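The equivalence of the four notations can be checked directly. The following sketch, with the hypothetical function name notations_agree, fills a small array and compares the four ways of reading each entry.

```cpp
#include <cassert>

// Return true if all four notations read the same value for every index.
bool notations_agree()
{
    int a[10];
    for (int i = 0; i < 10; i++)
        a[i] = i * i;
    int *iptr = a;   // same as iptr = &a[0]
    for (int i = 0; i < 10; i++)
    {
        if (a[i] != *(a + i) || a[i] != iptr[i] || a[i] != *(iptr + i))
            return false;
    }
    return iptr == &a[0];
}
```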
Even though iptr and a can be used interchangeably in the above case, there are significant differences between them:
Example:
What is wrong?
char a[4];
char *p;
strcpy(a, "ABC");
strcpy(p, "abc");
7.6 Arrays as parameters in function call
If an array is a parameter in a function prototype, its name can only be passed by value. Since the name of an array is the address of its first element, when an array is passed as an argument in a function call, the address of its first element is copied. Hence the entries of the array are passed by reference.
In the call f(x, k), where x is an array, and the definition of f is
f (int a[], int size)
{
...
}
the pointer to x[0] is copied to a in function f.
The array entries are not copied.
An equivalent definition of f is
f (int *a, int size)
{
...
}
Within function f, the array entries can be accessed either by a[i] or by *(a+i). However, the array notation is recommended.
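Because only the address of the first element is copied, a function can change the caller's array. The sketch below uses the hypothetical function name double_all to show this.

```cpp
#include <cassert>

// Double every entry of the caller's array; a holds a copy of the
// address of its first element, so the entries themselves are shared.
void double_all(int a[], int size)
{
    assert(size >= 0);
    for (int i = 0; i < size; i++)
        a[i] = 2 * a[i];
}
```

After int x[3] = {1, 2, 3}; and the call double_all(x, 3), the array x in the caller holds 2, 4, and 6.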
Example:
The following are equivalent implementations of the same function. Once again, the first implementation is recommended.
int find_max (int a[], int n)
// find index of maximum element
// in array a of size n.
{
    int max = a[0], index = 0, i;
    for (i = 1; i < n; i++)
    {
        if (max < a[i])
        {
            max = a[i];
            index = i;
        }
    }
    return (index);
}
int find_max (int *a, int n)
// find index of maximum element
// in array a of size n.
{
    int max = *a, index = 0, i;
    for (i = 1; i < n; i++)
    {
        if (max < *(a+i))
        {
            max = *(a+i);
            index = i;
        }
    }
    return (index);
}
7.7 Dynamic variables
Since a pointer can be used to refer to a variable, a program can manipulate variables even if the variables have no identifiers to name them. This is a very important feature of C++ because in many applications, the actual number of variables cannot be determined until the program is executed. Some variables should be created as needed while the program is being run. These nameless variables are called dynamic variables. Dynamic variables can be created using the new operator.
Example:
int *p;
p = new int;
cin >> *p;
*p += 7;
cout << *p;
The operation new int above creates a nameless variable of type int and returns a pointer to it. Thus p is a pointer to this newly created variable. As a result, the variable can be referred to as *p. In standard C++, if new fails to create a new variable of the required type, it throws a std::bad_alloc exception; only the form new (std::nothrow) int, and new in some pre-standard compilers, returns the special pointer NULL instead. When that form is used, it is important to always test the returned value before using the variable that it creates.
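Testing the result of an allocation before use can be sketched as follows. The sketch uses the new (std::nothrow) form, which returns a null pointer on failure; plain new in standard C++ reports failure by throwing std::bad_alloc instead. The function name make_int is hypothetical.

```cpp
#include <cassert>
#include <cstddef>   // NULL
#include <new>       // std::nothrow

// Create a dynamic int holding value; return NULL if allocation fails.
int *make_int(int value)
{
    int *p = new (std::nothrow) int;
    if (p == NULL)
        return NULL;   // no variable was created
    *p = value;        // safe: p points to a valid cell
    return p;
}
```

The caller owns the variable that make_int creates and is responsible for releasing it with delete when it is no longer needed.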
Example:
What is printed?
// Program to demonstrate dynamic variables.
#include <iostream>
using std::cout;
using std::endl;
int main()
{
    int *p, *q;
    p = new int;
    *p = 42;
    q = p;   // p and q now point to the same dynamic variable
    cout << "*p == " << *p << endl;
    cout << "*q == " << *q << endl;
    *q = 53;
    cout << "*p == " << *p << endl;
    cout << "*q == " << *q << endl;
    p = new int;   // p points to a second dynamic variable
    *p = 88;
    cout << "*p == " << *p << endl;
    cout << "*q == " << *q << endl;
    cout << "Hope you got the point!.\n";
    return 0;
}
A dynamic variable, once created, will stay alive until the end of the program unless it is destroyed by the delete operation. For example, the operation
delete p;
eliminates the dynamic variable pointed to by p and returns the memory cell to the system. The memory can then be reused to create new dynamic variables. After the deletion, p becomes a dangling pointer, which should not be dereferenced.
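One common way to keep a deleted pointer from being dereferenced by accident is to reset it to NULL immediately after the deletion. The helper destroy below is a hypothetical name; it takes the pointer by reference so that the caller's pointer variable is the one that gets reset.

```cpp
#include <cassert>
#include <cstddef>   // NULL

// Destroy the dynamic int that p points to and reset p to NULL,
// so that later code can test p before dereferencing it.
void destroy(int *&p)
{
    delete p;    // applying delete to a NULL pointer does nothing
    p = NULL;
}
```

Other pointers that still hold the old address are not reset by this helper and remain dangling.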
7.8 Dynamic arrays
C++ requires that the sizes of all regular arrays be known at compile time. However, one may not know the size of the array when the program is written; the size is only determined at run time. With regular arrays, one has to estimate the largest possible size one may need. There are two problems with this:
Dynamic arrays can solve this problem very nicely.
Example:
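// Program to demonstrate that a pointer variable can be used like
// an array name; a dynamic array is accessed through a pointer in the same way.
#include <iostream>
using std::cout;
using std::endl;
typedef int* IntPtr;
int main()
{
    IntPtr p;
    int a[10];
    int index;
    for (index = 0; index < 10; index++)
        a[index] = index;
    p = a;   // p now points to a[0]
    for (index = 0; index < 10; index++)
        cout << p[index] << ' ';
    cout << endl;
    for (index = 0; index < 10; index++)
        p[index]++;   // changes the entries of a
    for (index = 0; index < 10; index++)
        cout << a[index] << ' ';
    cout << endl;
    return 0;
}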
To destroy a dynamic array, p say, one can use
delete [] p;
The square brackets tell C++ that a dynamic array variable is being eliminated, so the system checks the size of the array and removes that many cells.
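Creating a dynamic array whose size is known only at run time can be sketched as follows. The function name make_sequence is hypothetical, and the sketch does not handle allocation failure.

```cpp
#include <cassert>

// Create a dynamic array of the given size holding 0, 1, ..., size - 1.
// The caller must release it with delete [].
int *make_sequence(int size)
{
    assert(size > 0);
    int *p = new int[size];   // size need not be a compile-time constant
    for (int index = 0; index < size; index++)
        p[index] = index;     // a dynamic array is indexed like a regular one
    return p;
}
```

A caller would write int *p = make_sequence(n); for some run-time value n, use p[0] through p[n-1], and finally execute delete [] p;.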
Example:
// Program that sorts a dynamic array.
#include <iostream>
#include <cstdlib>
using std::cin;
using std::cout;
using std::endl;
typedef int* IntPtr;
void fill_array(int a[], int size);
void sort(int a[], int size);
int main()
{
    int array_size;
    IntPtr a;
    cout << "How many numbers will be sorted? ";
    cin >> array_size;
    a = new int[array_size];
    // The possibility that new fails is ignored for the time being.
    fill_array(a, array_size);
    sort(a, array_size);
    cout << "In sorted order the numbers are:\n";
    for (int i = 0; i < array_size; i++)
        cout << a[i] << ' ';
    cout << endl;
    delete [] a;
    return 0;
}
void fill_array(int a[], int size)
{
    cout << "Enter " << size << " integers.\n";
    for (int i = 0; i < size; i++)
        cin >> a[i];
}
void sort(int a[], int size)
{
    ...
}
7.9 Multidimensional Arrays
A two-dimensional array with m rows and n columns is defined by
int a[m][n];
The entries are referenced by
a[i][j], 0 <= i < m, 0 <= j < n.
What is the output of the following?
int a[5][6];
cout << sizeof(a) << endl;
cout << sizeof(a[0]) << endl;
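For an array defined as int a[m][n], the entry a[i][j] can also be written in any of the following equivalent forms:
*(&a[0][0] + i * n + j)
*(a[i] + j)
*(*(a + i) + j)
(*(a + i))[j]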
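A two-dimensional array whose dimensions are known only at run time can be stored in a one-dimensional dynamic array:
int *a;
a = new int[m * n];
The cells can be accessed by *(a + i*n + j), but not by the standard array notation a[i][j].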
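The index computation i*n + j can be wrapped in a small helper so that it is written only once. The function name entry is hypothetical; the sketch assumes the table is stored row by row and that n is the number of columns.

```cpp
#include <cassert>

// Return a reference to entry (i, j) of a table stored row by row
// in a one-dimensional array; n is the number of columns.
int &entry(int *a, int n, int i, int j)
{
    assert(a != 0 && i >= 0 && j >= 0 && j < n);
    return *(a + i * n + j);
}
```

Since the helper returns a reference, it can be used on either side of an assignment, for example entry(a, n, i, j) = 0;.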
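Alternatively, one can use a dynamic array of m pointers, each pointing to a dynamic array that holds one row of n entries:
int **a;
a = new int*[m];
for (i = 0; i < m; i++)
    a[i] = new int[n];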
The situation can be visualized as:
+------+ +---+---+---+---+---+---+---+---+
a --> | a[0] |---> | | | | | | | | |
+------+ +---+---+---+---+---+---+---+---+
| a[1] |---> | | | | | | | | |
+------+ +---+---+---+---+---+---+---+---+
| . | | . . . . . . . |
| . | | . . . . . . . |
| . | | . . . . . . . |
+------+ +---+---+---+---+---+---+---+---+
|a[m-1]|---> | | | | | | | | |
+------+ +---+---+---+---+---+---+---+---+