1. Arrays, Pointers, and Structures

Objective: Discuss these three C++ features: arrays, pointers, and structures. We'll show

how pointers and primitive arrays are used;
how the vector is used to implement arrays in C++;
how the string is used to implement strings in C++;
how dynamic memory allocation is used;
how arrays are passed as parameters to functions;
how structures are used.

Reference: Weiss, Chapter One.

1.1 Pointers

A pointer is an address in the main memory.
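The pointer constant &x is the address of variable x. The address operator & can only be applied to something that occupies memory, such as a variable or an array element; it cannot be applied to a constant or to the result of an arithmetic expression.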
Declaration of pointer variables:
Example:
```
  int x;
  int *ip;      // LEGAL: ip is uninitialized
  int *jp = &x; // LEGAL: jp now points to x
  int *kp = x;  // ILLEGAL: x is not an address
```
Pointers to different base types are considered to be different data types by C++ and they cannot be assigned to one another directly.
A pointer to void is called a generic pointer.

   int x = -35;
   int *ip = &x;

if x has been allocated the cell with address 1028 as shown above, then the relation between ip and x can be depicted as follows:

                             1028
        +--------+          +--------------------+
     ip | 1028 o-|--------> |                -35 | x
        +--------+          +--------------------+

Accessing a memory cell via a pointer to it is called dereferencing the pointer, or indirection.

          +------+      +------+
     ptr  |  o---|----> | *ptr |
          +------+      +------+

For the declarations:

  int *ip, i, j, k;

are the following statements legal?

  (*ip)++;
  *ip++;
  k = i**j;
  i = ip;
  j = &ip;
  ip = &i;

For the declarations

  int i, j, k, *p = &i,
      *q = &j, *r;
  double x;

are the following expressions legal?

  p == &i
  **&p
  r = &x
  &(*p + 2)
  &100
  *p/ *q
  *p/*q
  *p**q

A common source of error is dereferencing a pointer that does not point to a valid object: either an uninitialized pointer or a dangling pointer, whose target no longer exists.
The number of *'s attached to a pointer to reference a base type value is called the level of indirection.
```
  int i = 5, *p = &i, **pp = &p, ***ppp = &pp;
```
In this case i, *p, **pp, and ***ppp all refer to the same cell.
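As a minimal sketch of the levels of indirection just described, the helper below writes a value through ***ppp and checks that every level sees it. The name sameCell is hypothetical and is not part of the notes.

```cpp
#include <cassert>

// Hypothetical helper: stores v through three levels of indirection and
// reports whether i, *p and **pp all observe the new value.
bool sameCell(int v)
{
   int i = 5, *p = &i, **pp = &p, ***ppp = &pp;

   ***ppp = v;   // three dereferences reach the cell named i

   // Every level ultimately leads to the address of i.
   assert(p == &i && *pp == &i && **ppp == &i);

   return i == v && *p == v && **pp == v;
}
```

Calling sameCell(42) should return true, since the single assignment through ***ppp changes the one cell that all four expressions name.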

1.2 Operations on Pointers

Assignment: You can assign the value of one pointer to a pointer variable, as long as the two pointers are of the same type. Pointers of different types can be assigned only with an explicit cast. However, a generic pointer can be assigned a pointer of any type without explicit casting. The NULL pointer can be assigned as a pointer value.
Comparison: Pointers to entries of the same array can be compared using ==, <, >, etc. In these cases, the numerical values of the two pointers as integers are compared. Any pointer can be checked for equality with NULL.
p + i or p - i (i is an integer): Integers can be added to or subtracted from pointers except generic pointers.
The integer added or subtracted is scaled by the size of the base type of the pointer. That is, if p is a pointer to data type type, then (numerical value of p + i) is equal to (numerical value of p) + i * sizeof(type). Hence p + i is the address of the i-th cell, of size of the base type, beyond p.

The increment and decrement operators can also be applied to a pointer. They will advance the pointer to the next cell, or retreat the pointer to the previous cell of the same type.

Example:

  int    *ip = reinterpret_cast<int *>(1320);
  double *dp = reinterpret_cast<double *>(1600);
  ip++;
  dp += 20;
  // What are the values of ip and
  // dp at this point?
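The scaling rule can be checked on a real array rather than on made-up addresses. The sketch below measures, in bytes, how far an int pointer moves when i is added to it. The name scaledBySize and the eight-cell array are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: returns true if advancing an int pointer by i cells
// moves it exactly i * sizeof(int) bytes. Requires 0 <= i <= 8.
bool scaledBySize(int i)
{
   assert(0 <= i && i <= 8);

   int cells[8] = {0};
   int *p = cells;

   // View both addresses as byte pointers so the difference is in bytes.
   const char *from = reinterpret_cast<const char *>(p);
   const char *to   = reinterpret_cast<const char *>(p + i);

   return to - from == static_cast<std::ptrdiff_t>(i * sizeof(int));
}
```

The check holds whatever sizeof(int) happens to be on the platform, which is the point of the rule: the programmer counts cells and the compiler supplies the byte arithmetic.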

Example:

int strcmp(const char *r, const char *s)
{
  while (*r == *s)
  {
    if (*r == '\0')
      return 0;
    r++; s++;
  }
  return (*r < *s) ? -1 : 1;
}

Example:

char *strcpy(char *s, const char *cs)
{
   char *tmp = s;
   while (*cs != '\0')
   {
      *(tmp++) = *(cs++);
   }
   *tmp = '\0';
   return s;
}

Pointer subtraction: Pointers of the same type can be subtracted. In practice, only pointers to the entries of the same array are subtracted. In this case, the difference is the offset (number of elements) of the first entry from the second entry.
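A short sketch of pointer subtraction in use: the length of a C string is the offset of its terminating '\0' from its first character. The name lengthOf is hypothetical; the standard library already provides strlen for this purpose.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: length of a C string computed by pointer subtraction.
std::size_t lengthOf(const char *s)
{
   assert(s != NULL);

   const char *end = s;
   while (*end != '\0')
   {
      end++;              // advance to the terminating '\0'
   }

   // Both pointers refer to the same array, so the difference is the
   // number of characters between them.
   return static_cast<std::size_t>(end - s);
}
```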

1.3 Reference Variables

A reference type is an alias for another variable and may be viewed as a pointer constant that is always dereferenced implicitly.
```
  int x = 0;
  int& y = x;
  y += 3;
  cout << "x = " << x << endl;
```
Reference variables must be initialized when they are declared.
They cannot be changed to reference another variable.

Call by reference

#include <iostream>
using namespace std;

void swapWrong(int a, int b)
{
   int tmp = a;
   a = b;
   b = tmp;
}

void swapPtr(int *a, int *b)
{
   int tmp = *a;
   *a = *b;
   *b = tmp;
}

void swapRef(int& a, int& b)
{
   int tmp = a;
   a = b;
   b = tmp;
}
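To see what each version does to the caller's variables, the sketch below repeats the three functions in compact form and adds a driver. The driver name swapsCaller and its numeric version codes are assumptions made for illustration.

```cpp
#include <cassert>

void swapWrong(int a, int b)   { int tmp = a;  a = b;   b = tmp;  }
void swapPtr(int *a, int *b)   { int tmp = *a; *a = *b; *b = tmp; }
void swapRef(int& a, int& b)   { int tmp = a;  a = b;   b = tmp;  }

// Hypothetical driver: returns true if the caller's x and y were exchanged
// by the chosen version (0 = by value, 1 = by pointer, 2 = by reference).
bool swapsCaller(int version)
{
   int x = 1, y = 2;

   switch (version)
   {
      case 0:  swapWrong(x, y); break;  // only the copies a and b are swapped
      case 1:  swapPtr(&x, &y); break;  // caller passes addresses explicitly
      case 2:  swapRef(x, y);   break;  // a and b are aliases for x and y
      default: assert(false);
   }

   return x == 2 && y == 1;
}
```

Only versions 1 and 2 should report true. The call sites for swapWrong and swapRef look identical, so the difference is visible only in the parameter declarations.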

1.4 Arrays and Pointers

An array's name represents a pointer to element zero of that array. Hence for the definitions:

   int a[10];
   int *iptr;

The name a is the same as &a[0]. The assignment

   iptr = a;  // same as iptr = &a[0];

is well-defined. In this case, a[i], *(a+i), iptr[i], and *(iptr+i) are all equivalent. In fact, a[i] is changed to *(a+i) by the compiler immediately.

Even though iptr and a can be used interchangeably in the above case, there are significant differences between them:

iptr is a variable while a is a constant.
The definition of a[10] allocates ten cells for the array elements, while the definition of iptr only allocates space for itself. No space is allocated to any base type elements that it points to.
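The two differences above show up when sizeof is applied to each name. The following is a minimal sketch; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: number of cells in a ten-element array, from sizeof.
std::size_t cellsInArray()
{
   int a[10];
   return sizeof(a) / sizeof(a[0]);   // sizeof(a) covers all ten cells
}

// Hypothetical helper: sizeof a pointer measures the pointer itself,
// not the cells it happens to point to.
bool pointerSizeIgnoresTarget()
{
   int a[10];
   int *iptr = a;      // legal: iptr is a variable
   iptr = a + 1;       // legal: iptr can be made to point elsewhere
   // a = iptr;        // illegal: a is a constant and cannot be assigned

   return sizeof(iptr) == sizeof(int *) && sizeof(a) == 10 * sizeof(int);
}
```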

1.5 Arrays as Function Parameters

If an array is a parameter in a function prototype, its name can only be passed by value. Since the name of an array is the address of its first element, when an array is passed as an argument in a function call, the address of its first element is copied. Hence the entries of the array are passed by reference.

Example: The following are equivalent implementations of the same function:

   int find_max (int a[], int n)
   // find index of maximum element
   // in array a of size n.
   {
      int max = a[0], index = 0, i;

      for (i = 1; i < n; i++)
      {
         if (max < a[i])
         {
            max = a[i];
            index = i;
         }
      }
      return index;
   }

   int find_max (int *a, int n)
   // find index of maximum element
   // in array a of size n.
   {
      int max = *a, index = 0, i;

      for (i = 1; i < n; i++)
      {
         if (max < *(a+i))
         {
            max = *(a+i);
            index = i;
         }
      }
      return (index);
   }

1.6 Multidimensional Arrays

Declaration -- A two-dimensional array of integers of size m by n is declared by
```
   int a[m][n];
```
The entries are referenced by
a[i][j], where 0 <= i < m, and 0 <= j < n
Entries are stored row-by-row.
The name a[i], where 0 <= i < m, represents the array containing all elements of the i-th row. We call this a row pointer. Hence the two-dimensional array a is an array of arrays. What is the output of the following?
```
   int a[5][6];
   cout << sizeof(a) << endl;
   cout << sizeof(a[0]) << endl;
```

a[i][j] can be accessed as any one of the following:

   *(a[i] + j)
   *(*(a + i) + j)
   (*(a + i))[j]
   *(&a[0][0] + i * n + j)

To declare a multidimensional array as a function's parameter, we must specify the number of cells in all dimensions beyond the first.
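As a sketch of this rule, the function below accepts an array with any number of rows, but its column count must be fixed in the parameter declaration. The constant COLS and the names sum2d and sumOfOnes are assumptions made for illustration.

```cpp
#include <cassert>

const int COLS = 6;   // assumed column count, known at compile time

// Sums an array that has `rows` rows and exactly COLS columns.
// The first dimension may be omitted; the second may not.
int sum2d(int a[][COLS], int rows)
{
   assert(rows >= 0);

   int total = 0;
   for (int i = 0; i < rows; i++)
   {
      for (int j = 0; j < COLS; j++)
      {
         total += *(*(a + i) + j);   // same cell as a[i][j]
      }
   }
   return total;
}

// Hypothetical driver: fills a 5-by-6 array with 1s and sums it.
int sumOfOnes()
{
   int a[5][COLS];
   for (int i = 0; i < 5; i++)
   {
      for (int j = 0; j < COLS; j++)
      {
         a[i][j] = 1;
      }
   }
   return sum2d(a, 5);
}
```

The compiler needs COLS to compute the address of a[i][j], because row i begins i * COLS cells after the start of the array.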

1.7 Dynamic Variables

Since a pointer can be used to refer to a variable, a program can manipulate variables even if the variables have no identifiers to name them. This is a very important feature of C++ because in many applications, the actual number of variables cannot be determined until the execution time. Some variables should be created as needed while the program is being run. These variables are called dynamic variables. Nameless dynamic variables can be created using the new operator.

Example:

   int *p;
   p = new int;
   cin >> *p;
   *p += 7;
   cout << *p;

The operation new int above creates a nameless variable of type int and returns a pointer to it. Thus p is a pointer to this newly created variable. As a result, the variable can be referred to as *p. If new cannot obtain the memory for a new variable of the required type, standard C++ reports the failure by throwing a std::bad_alloc exception; if the exception is not caught, the program terminates.

Example: What is printed?

// Program to demonstrate dynamic variables.
#include <iostream>
using namespace std;

int main()
{
   int *p, *q;
   p = new int;
   *p = 42;
   q = p;
   cout << "*p == " << *p << endl;
   cout << "*q == " << *q << endl;
   *q = 53;
   cout << "*p == " << *p << endl;
   cout << "*q == " << *q << endl;
   p = new int;
   *p = 88;
   cout << "*p == " << *p << endl;
   cout << "*q == " << *q << endl;
   cout << "Hope you got the point!\n";
   return 0;
}

A dynamic variable, once created, will stay alive until the end of the program unless it is destroyed by the delete operation. For example, the operation
```
   delete p;
```
eliminates the dynamic variable pointed to by p and returns the memory cell to the system. The memory can then be reused to create new dynamic variables.
Memory leak refers to the situation where some unused dynamic variables are not deleted.
A deleted pointer becomes a stale pointer which no longer points to a valid object.
A double-delete occurs when one attempts to call delete on the same object more than once. It will very likely lead to run-time error.
When a function returns a pointer, make sure the pointer has something to point to after the function returns.
```
char *stupid()
{
  char s[] = "stupid";  // local array: destroyed when the function returns
  return s;
}

int main()
{
  cout << stupid() << endl;
  return 0;
}
```
Making s static in stupid() will fix the problem.
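The stale-pointer and double-delete hazards listed above can be reduced by resetting a pointer as soon as its target is deleted. The following is a minimal sketch of that habit; the names destroy and destroyTwiceIsSafe are hypothetical and not part of the notes.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: deletes the int that p points to and resets p,
// so the caller is not left holding a stale pointer.
void destroy(int *&p)
{
   delete p;    // applying delete to a null pointer does nothing
   p = NULL;
}

// Hypothetical driver: destroys the same variable twice through p.
bool destroyTwiceIsSafe()
{
   int *p = new int(42);

   destroy(p);   // frees the dynamic variable
   destroy(p);   // p is already NULL, so this is not a double-delete

   return p == NULL;
}
```

This protects only the pointer that is passed in. Any other pointer that was copied from it still refers to the deleted variable and must not be used.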

1.8 Dynamic Arrays

C++ demands that the sizes of all regular arrays be known at compile time. However, one may not know the size of the array when the program is written; the size is determined only at run time. With regular arrays, one has to estimate the largest possible size one may need. There are two problems with this:

The estimate may be too low such that it will not work in all situations.
The estimate may be too high for most situations so that the memory is wasted.

Dynamic arrays can solve this problem very nicely.

Example:

// Program to demonstrate dynamic array.
char *dstrcpy (const char *cs)
{
   char *s, *tmp;
   unsigned size = strlen(cs) + 1;

   tmp = s = new char[size];

   while (*cs != '\0')
   {
      *(tmp++) = *(cs++);
   }

   *tmp = '\0';
   return s;
}

To destroy a dynamic array, p say, one can use

   delete [] p;

The square brackets tell C++ that a dynamic array variable is being eliminated, so the system checks the size of the array and removes that many cells.

Example:

// Program that sorts a dynamic array.
#include <cstdlib>
#include <iostream>
using namespace std;

typedef int* IntPtr;
void fill_array(int a[], int size);
void sort(int a[], int size);

int main()
{
   int array_size;
   IntPtr a;
   cout << "How many numbers will be sorted? ";
   cin >> array_size;

   a = new int[array_size];

   fill_array(a, array_size);
   sort(a, array_size);
   cout << "In sorted order the numbers are:\n";

   for (int i = 0; i < array_size; i++)
   {
      cout << a[i] << ' ';
   }
   cout << endl;

   delete [] a;
   return 0;
}

void fill_array(int a[], int size)
{
   cout << "Enter " << size << " integers.\n";
   for (int i = 0; i < size; i++)
   {
      cin >> a[i];
   }
}

void sort(int a[], int size)
{
  ...
}

1.9 Dynamic Two-Dimensional Arrays

Syntax of dynamic allocation does not allow direct two-dimensional notations. Thus, code such as

double **arr = new double[m][n];  // syntax error

is not possible. However, the function dyn2d returns a double** that can be used as a two-dimensional array:

double **dyn2d(int m, int n)
{
   double *arr = new double[m*n];
   double **a = new double*[m];

   for (int i = 0; i < m; i++)
   {
      a[i] = arr + i*n;
   }

   return a;
}

void del2d(double **a)
{
   delete [] *a;  // free data cells
   delete [] a;   // free pointer cells
}

Example: Allocating and deallocating two-dimensional arrays can be done alternatively as follows:

double **dyn2d_1(int m, int n)
{
   double **a = new double*[m]; // pointer cells

   for (int i = 0; i < m; i++)
   {
      a[i] = new double[n];     // data cells
   }

   return a;
}

void del2d_1(double **a, int m)
{
   for (int i = 0; i < m; i++)
   {
      delete [] a[i];  // free data cells
   }

   delete [] a;        // free pointer cells
}
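The following sketch shows how the first pair of functions might be used. It repeats dyn2d and del2d so that it stands alone; the driver lastEntry is hypothetical.

```cpp
#include <cassert>

double **dyn2d(int m, int n)
{
   double *arr = new double[m * n];   // all data cells in one block
   double **a = new double*[m];       // one row pointer per row

   for (int i = 0; i < m; i++)
   {
      a[i] = arr + i * n;             // row i starts i * n cells into arr
   }
   return a;
}

void del2d(double **a)
{
   delete [] *a;   // free data cells
   delete [] a;    // free pointer cells
}

// Hypothetical driver: stores i * n + j in cell (i, j) of an m-by-n table
// and returns the entry in the last row and last column.
double lastEntry(int m, int n)
{
   assert(m > 0 && n > 0);

   double **t = dyn2d(m, n);
   for (int i = 0; i < m; i++)
   {
      for (int j = 0; j < n; j++)
      {
         t[i][j] = i * n + j;         // ordinary two-dimensional notation
      }
   }

   double last = t[m - 1][n - 1];
   del2d(t);
   return last;
}
```

With dyn2d the data cells are contiguous, as in a built-in two-dimensional array, and del2d needs no row count. With dyn2d_1 each row is a separate allocation, so del2d_1 must be told how many rows to free.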

1.10 The `vector` Class Template

The vector class template is part of the STL.
The vector class solves most problems of the built-in array type.
To use the standard vector, a program must contain this line
```
#include <vector>
```
with a using directive if one has not been provided.
To declare a vector
```
vector<int> a(3);
```
which defines a vector of three integers which can be referenced as a[0], a[1], and a[2]. An equivalent declaration would be
```
vector<int> a;  // 0 int object
a.resize(3);      // 3 int objects: a[0], a[1], a[2]
```
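A small sketch of how resize treats existing entries, under the assumption that the vector holds int values; the function name resizeKeepsOldEntries is hypothetical.

```cpp
#include <cassert>
#include <vector>
using namespace std;

// Hypothetical helper: grows and then shrinks a vector, checking that
// the surviving entries keep their values.
bool resizeKeepsOldEntries()
{
   vector<int> a(3);        // a[0], a[1], a[2], each initialized to 0
   a[0] = 7;

   a.resize(5);             // a[3] and a[4] are added and set to 0
   bool grown = a.size() == 5 && a[0] == 7 && a[4] == 0;

   a.resize(2);             // a[2], a[3], a[4] are discarded
   bool shrunk = a.size() == 2 && a[0] == 7;

   return grown && shrunk;
}
```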

Example:

#include <cstdlib>
#include <iostream>
#include <vector>
using namespace std;

// Generate numbers (from 1 - 100).
// Print number of occurrences of each.
int main()
{
   const int SIZE = 100;
   int i, totalNumbers;

   cout << "How many numbers to generate? ";
   cin >> totalNumbers;

   vector<int> numbers(SIZE + 1);

   for (i = 1; i <= SIZE; i++)
   {
      numbers[i] = 0;
   }

   for (i = 0; i < totalNumbers; i++)
   {
      numbers[rand() % SIZE + 1]++;
   }

   for (i = 1; i <= SIZE; i++)
   {
      cout << i << " occurs " << numbers[i]
           << " times\n";
   }

   return 0;
}

A more concrete example:

#include <iostream> 
#include <vector>
using namespace std;

// Read an unlimited number of ints with no attempts
// at error recovery.
void getInts(vector<int>& array)
{
   int itemsRead = 0;
   int inputVal;

   cout << "Enter any number of integers: ";

   while (cin >> inputVal)
   {
      if (itemsRead == array.size())
      {
         array.resize(array.size() * 2 + 1);
      }
      array[itemsRead++] = inputVal;
   }
   array.resize(itemsRead);
}

int main()
{
   vector<int> array;

   getInts(array);

   for (int i = 0; i < array.size(); i++)
   {
      cout << array[i] << endl;
   }

   return 0;
}

Using the push_back function:
```
void getInts(vector<int>& array)
{
   int inputVal;

   array.resize(0);
   cout << "Enter any number of integers: ";

   while (cin >> inputVal)
   {
      array.push_back(inputVal);
   }
}
```
The push_back function increases the size by 1, and adds a new item to the array at the appropriate position, expanding capacity if needed.
Why is it necessary to resize the array to 0 initially?
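One way to explore that question is to run the same appending loop with and without the initial resize. The sketch below replaces keyboard input with a small array so the result is repeatable; appendAll and sizeAfterAppend are hypothetical stand-ins for getInts and its caller.

```cpp
#include <cassert>
#include <vector>
using namespace std;

// Hypothetical stand-in for getInts: appends the n values in src to array,
// optionally emptying the vector first.
void appendAll(vector<int>& array, const int src[], int n, bool clearFirst)
{
   assert(n >= 0);

   if (clearFirst)
   {
      array.resize(0);
   }
   for (int i = 0; i < n; i++)
   {
      array.push_back(src[i]);   // always adds after the current last entry
   }
}

// Hypothetical driver: the caller's vector already holds two entries
// before three more values are appended.
int sizeAfterAppend(bool clearFirst)
{
   vector<int> array(2);
   const int src[] = {10, 20, 30};

   appendAll(array, src, 3, clearFirst);
   return static_cast<int>(array.size());
}
```

Comparing sizeAfterAppend(true) with sizeAfterAppend(false) shows whether entries that were already in the caller's vector survive the call.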

Call by constant reference:

// Return the index of the maximal entry.
int findMax(const vector<int>& a)
{
   int maxIdx = 0;

   for (int i = 1; i < a.size(); i++)
   {
      if (a[i] > a[maxIdx])
      {
         maxIdx = i;
      }
   }

   return maxIdx;
}

What if the header in the above example is changed to either of these?
```
int findMax(vector<int> a)
```
or
```
int findMax(vector<int>& a)
```

1.11 The `string` Class

The string class is part of the C++ standard library.
To use the standard string, a program must contain this line
```
#include <string>
```
with a using directive if one has not been provided.
The string class is a first-class type, i.e. the common operations like I/O, copying, and comparisons work as one would expect.

A simple example:

#include <cstring>   // for strcpy
#include <iostream>
#include <string>
using namespace std;

int main()
{
   string a = "hello";
   string b = "world";
   string c;      // should be ""
   c = a + ' ';   // "hello "
   c += b;        // "hello world"

   cout << "c is: " << c << endl;

   cout << "c is: ";
   for (int i = 0; i < c.length(); i++)
   {
      cout << c[i];
   }
   cout << endl;

   char d[20];
   strcpy(d, c.c_str());
   cout << "c.c_str is: " << d << endl;

   return 0;
}
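The claim that copying and comparison work as one would expect can be sketched as follows. The function names smaller and copyIsIndependent are hypothetical.

```cpp
#include <cassert>
#include <string>
using namespace std;

// Hypothetical helper: returns whichever string comes first in
// dictionary (lexicographic) order.
string smaller(const string& a, const string& b)
{
   return (a < b) ? a : b;   // operator< compares the characters
}

// Hypothetical helper: checks that assigning one string to another
// copies the characters rather than sharing them.
bool copyIsIndependent()
{
   string a = "hello";
   string b = a;       // b gets its own copy of the characters
   b += " world";      // changing b leaves a untouched

   return a == "hello" && b == "hello world";
}
```

With char arrays the same operations would need strcmp, strcpy, and strcat, and = or < applied to the array names would act on addresses rather than on the characters.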

The insertion sorting of strings:

void insertSort(vector<string>& strs)
{
   string tmpStr;

   for (int i = 1; i < strs.size(); i++)
   {
      tmpStr = strs[i];

      // insert tmpStr into the right position
      int j;
      for (j = i - 1; j >= 0 && tmpStr < strs[j]; j--)
      {
         strs[j + 1] = strs[j];
      }
      strs[j + 1] = tmpStr;
   }
}

1.12 Structures

A structure contains a number of fields which can be of different types.

The fields of a structure can be accessed by the dot operator.

Example:

struct Student
{
   string firstName;
   string lastName;
   int studentNum;
   double gradePointAvg;
};

void printInfo(const Student& s)
{
   cout << "ID is " << s.studentNum << endl;
   cout << "Name is " << s.firstName << ' '
                      << s.lastName << endl;
   cout << "GPA is " << s.gradePointAvg << endl;
}

int main()
{
   Student mary;

   mary.lastName = "Smith";
   mary.firstName = "Mary";
   mary.gradePointAvg = 4.0;
   mary.studentNum = 123456789;

   printInfo(mary);
   return 0;
}

Two variables of the same structure type can be assigned to each other.
Use the arrow operator (->) to access fields of a structure via a pointer.
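Both points can be sketched with the Student structure from the example above, repeated here so the block stands alone. The helpers setGpa and assignmentCopiesFields are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
using namespace std;

struct Student
{
   string firstName;
   string lastName;
   int studentNum;
   double gradePointAvg;
};

// Hypothetical helper: updates a field through a pointer to the structure.
void setGpa(Student *s, double gpa)
{
   assert(s != NULL);
   s->gradePointAvg = gpa;   // same as (*s).gradePointAvg = gpa
}

// Hypothetical helper: checks that structure assignment copies every field
// and that the copy can then be changed independently.
bool assignmentCopiesFields()
{
   Student mary;
   mary.firstName = "Mary";
   mary.lastName = "Smith";
   mary.studentNum = 123456789;
   mary.gradePointAvg = 4.0;

   Student twin;
   twin = mary;              // field-by-field copy
   setGpa(&twin, 3.5);       // changes twin only

   return twin.lastName == "Smith" && twin.studentNum == 123456789
       && twin.gradePointAvg == 3.5 && mary.gradePointAvg == 4.0;
}
```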

1.13 Exogenous Versus Indigenous Data

Indigenous data are completely contained in the structure, while exogenous data reside outside the structure and are accessed through a pointer.

Example:

struct DynArr
{
   int *store;
   size_t size;
};

Problem of shallow copying:

   DynArr a, b;
   a.store = new int[100];
   a.size = 100;
   for (int i = 0; i < a.size; i++)
   {
      a.store[i] = 0;
   }

   b = a;   // shallow copy: b.store and a.store point to the same array
   for (int j = 0; j < b.size; j++)
   {
      b.store[j] = 1;   // also changes a.store[j]
   }

Deep copying refers to the pointed-at values being copied, so that the two structures no longer share storage.

void deepCopy(DynArr& target, const DynArr& source)
{
   target.size = source.size;
   target.store = new int[target.size];
   for (int i = 0; i < target.size; i++)
   {
      target.store[i] = source.store[i];
   }
}
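The difference between the two kinds of copy can be observed by writing through the copy and then reading the original. The sketch below repeats DynArr and deepCopy so that it stands alone; the driver originalAfterCopyWrite is hypothetical.

```cpp
#include <cassert>
#include <cstddef>

struct DynArr
{
   int *store;
   std::size_t size;
};

void deepCopy(DynArr& target, const DynArr& source)
{
   target.size = source.size;
   target.store = new int[target.size];
   for (std::size_t i = 0; i < target.size; i++)
   {
      target.store[i] = source.store[i];
   }
}

// Hypothetical driver: copies a into b (deeply or shallowly), writes 1 into
// b.store[0], and returns what a.store[0] holds afterwards.
int originalAfterCopyWrite(bool deep)
{
   DynArr a, b;
   a.size = 4;
   a.store = new int[a.size];
   for (std::size_t i = 0; i < a.size; i++)
   {
      a.store[i] = 0;
   }

   if (deep)
   {
      deepCopy(b, a);   // b gets its own array
   }
   else
   {
      b = a;            // b.store is the same address as a.store
   }

   b.store[0] = 1;
   int seen = a.store[0];

   if (deep)
   {
      delete [] b.store;   // only a deep copy owns separate storage
   }
   delete [] a.store;
   return seen;
}
```

After a shallow copy there is still only one array, so it must be deleted exactly once; deleting through both a.store and b.store would be a double-delete.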

1.14 The Classic Linked List

  +------+---+    +-------+---+    +----+---+    +-----+---+
  | Dale | o-+--->| Katie | o-+--->| Ed | o-+--->| Rob | ^ |
  +------+---+    +-------+---+    +----+---+    +-----+---+

Assume the following structure of a node:

   +---------------------+---+
   | item (of type data) | o-+------>
   +---------------------+---+  next (of type link)

A node is represented by a link to it, and a linked list is represented by a link to its first node, the head node. The symbol ^ is used to denote the empty link, one that does not link to any node.

1.14.1 Basic operations on linked lists

getNode() --- Allocates a node and returns a link to that node.
freeNode(p) --- Frees node p.
getItem(p) --- Returns item of node p.
getNext(p) --- Returns next of node p.
setItem(p,x) --- Sets item of node p to x.
setNext(p,q) --- Sets next of node p to q.

1.15 Linked Storage of Nodes

struct Node {
   Object item;
   Node *next;
};

Node *getNode (void)
{
   return new Node;
}

void freeNode (Node *p)
{
   delete p;
}

Object getItem (Node *p)
{
   return (p -> item);
}

Node *getNext (Node *p)
{
   return (p -> next);
}

void setItem (Node *p, const Object& x)
{
   p -> item = x;
}

void setNext (Node *p, Node *q)
{
   p -> next = q;
}
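As a sketch of how these operations combine, the code below builds the four-node list pictured in Section 1.14. The notes leave Object unspecified, so it is assumed here to be a string; the helpers pushFront, listLength, freeList, and buildSample are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

typedef std::string Object;   // assumption: Object is not defined in the notes

struct Node {
   Object item;
   Node *next;
};

Node *getNode()                          { return new Node; }
void freeNode(Node *p)                   { delete p; }
Object getItem(Node *p)                  { return p->item; }
Node *getNext(Node *p)                   { return p->next; }
void setItem(Node *p, const Object& x)   { p->item = x; }
void setNext(Node *p, Node *q)           { p->next = q; }

// Hypothetical helper: puts x in a new node in front of head
// and returns a link to the new head node.
Node *pushFront(Node *head, const Object& x)
{
   Node *p = getNode();
   setItem(p, x);
   setNext(p, head);
   return p;
}

// Hypothetical helper: counts the nodes by following next links
// until the empty link is reached.
int listLength(Node *head)
{
   int n = 0;
   for (Node *p = head; p != NULL; p = getNext(p))
   {
      n++;
   }
   return n;
}

// Hypothetical helper: frees every node in the list.
void freeList(Node *head)
{
   while (head != NULL)
   {
      Node *rest = getNext(head);   // save the link before freeing the node
      freeNode(head);
      head = rest;
   }
}

// Hypothetical helper: builds Dale -> Katie -> Ed -> Rob.
Node *buildSample()
{
   Node *head = NULL;               // the empty link
   head = pushFront(head, "Rob");
   head = pushFront(head, "Ed");
   head = pushFront(head, "Katie");
   head = pushFront(head, "Dale");
   return head;
}
```

Because pushFront adds at the head, the names are inserted in reverse order. The caller owns the list returned by buildSample and should pass it to freeList when finished.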
