CSC393 Jan08

Contents

  1. Algorithms and Data Structures
  2. Example
  3. A Class of Utility Functions
  4. Filter Program
  5. Problem
  6. Review of C++: Inheritance
  7. Overriding: Replacement or Refinement
  8. Virtual versus non-virtual member functions
  9. Dynamic Dispatch
  10. Usefulness of Dynamic Dispatch
  11. Dynamic Dispatch (continued)
  12. Copying
  13. Templates
  14. Example template class
  15. Precursor to template functions: pointers and arrays
  16. Iterators
  17. A template version of the find function
  18. The STL vector class
  19. Sort Algorithm

Algorithms and Data Structures

Algorithms and data structures in this course are largely independent of the programming language used.

This course will use C++ to implement the data structures and algorithms studied.

Example

Program to filter "bad" items in an input stream based on a given list of good inputs. (E.g., items could be IP addresses.)

Initial Prototype Design:

A Class of Utility Functions

class MStd
{
public:
  /**
   * Opens file fname for input, reads all the integers, and returns
   * them in a vector.
   * @param fname - the input file name
   * @return - vector containing all integers in the input file. 
   *           If the file cannot be opened the returned vector
   *           has size 0.
   */
  static vector<int> readInts(const string& fname);

  /**
   * Linear search of the vector v for integer x.
   * @param x - integer to search for in v
   * @param v - vector of integers to search for x
   * @return  - the first index i such that v[i] == x
   *            or -1 if x is not in v.
   */
  static int lrank(int x, vector<int>& v);

  /**
   * Finds the substrings of s separated by 1 or more characters in
   * the delim string. These substrings are returned in a vector.
   * @param s - the string to split
   * @param delims - a string of characters treated as separators
   *                between the substrings to be returned.
   * @return - a vector consisting of the substrings found
   */
  static vector<string> split(const string& s, const string& delims);
};

Filter Program


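#include <cstdlib>
#include <fstream>
#include <iostream>
#include <string>
#include <vector>
using namespace std;
// The MStd class declared above must also be visible here.

int main(int argc, char *argv[])
{
  ifstream ifs;
  string good_file;
  string test_file;

  // Usage: filter good_file test_file
  // good_file will be argv[1]
  // test_file will be argv[2]
  if (argc != 3) {
    cerr << "Usage: filter good_file test_file" << endl;
    exit(1);
  }
  good_file = string(argv[1]);
  test_file = string(argv[2]);
    
  vector<int> v = MStd::readInts(good_file);

  ifs.open(test_file.c_str());

  if (!ifs) {
    cerr << "Cannot open test file: " << test_file << endl;
    exit(1);   // nonzero exit status signals the failure
  }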
  
  int x;
  int badcnt = 0;
  while( ifs >> x )
  {
    if (MStd::lrank(x, v) == -1 ) {
        badcnt++;
    }
  }
  cout << "There were " << badcnt << " items not in the good list!" << endl;

  return 0;
}

Problem

The program works correctly and it works quickly for small input files: tinyG.txt and tinyT.txt.

But for sample files: largeG.txt and largeT.txt (1000000 numbers in each), it takes an unacceptably long time.

Improvement

Review of C++: Inheritance

What is the syntax for defining a subclass B of a class A?

C++ has class types and also pointer to class types. I'll refer to the first as a value type and the second as a reference type.

Which of these types is polymorphic and which is monomorphic?

A variable declared with a monomorphic type may hold different values during execution, but they must all be of the one type specified by the declaration.

A variable declared with a polymorphic type may not only hold different values during execution, but these values may even be of different types.

A variable of type pointer to A, where A is a class, is potentially polymorphic. During execution such a variable can hold values of type pointer to A or of type pointer to any class descended from A.

Overriding: Replacement or Refinement

A function f declared to be virtual in a base class A can optionally be given a new implementation in a derived class B.

To give a new implementation, the function f should be declared again in B. In this case it should have the exact same signature.

Replacement means B::f is given a new implementation that neither repeats nor invokes A::f, the implementation it is "replacing".

Alternatively, refinement would mean that B::f does the same thing as A::f but also adds some additional action (presumably involving data members in B that are not in A).

Constructors for classes typically use refinement. Indeed, if no A constructor is invoked explicitly by a B constructor, the compiler will invoke an available no-argument A constructor.

What is a constructor initialization list, where does it appear syntactically, and when does it execute compared to the body of the constructor?

Virtual versus non-virtual member functions

If a function g in a class A is not virtual, can a function g of the same signature be defined in a subclass B?

class A
{
public:
    ...
    void g(int x);
    ...
};

class B : public A
{
public:
    ...
    void g(int x);
    ...
};

int main()
{
   A *pa = new B();
  
   pa->g(5);  // Does this use dynamic dispatch to determine which g?
   
}

Dynamic Dispatch

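Dynamic dispatch is used by the compiler in C++ for a call p->f(...) if all of the following are true:

  1. p is declared as a pointer to a class X (the same holds for a reference to X).
  2. f is declared virtual in X or in a base class of X.
  3. The call does not name the class explicitly, as in p->X::f(...).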

Otherwise, static dispatch is used by the compiler, which means that the implementation of f in the class X from the declaration of p is used.

Usefulness of Dynamic Dispatch

Promotes reuse of code.

Example.

class Shape {
public: 

   virtual bool contains(Point p) const = 0; // Pure virtual
   virtual void draw() = 0;                  // Pure virtual
   ...
};

class Circle: public Shape {
public:
   bool contains(Point p) const;
   void draw();
   void drawCircle();
   ...
};

class Rectangle: public Shape {
public:
   bool contains(Point p) const;
   void draw();
   void drawRectangle();
   ...
};

class Polygon: public Shape {
public:
   bool contains(Point p) const;
   void draw();
   void drawPolygon();
   ...
};

Dynamic Dispatch (continued)


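Shape * shapes[100];
int N;  // number of Shapes in shapes array

// Without dynamic dispatch: the handler must test for each concrete type.
// dynamic_cast is used here as one way to perform that test.
void handler(Point p)
{
   Shape *sp;
   for(int i = 0; i < N; i++) {
      sp = shapes[i];
      if ( Rectangle *rp = dynamic_cast<Rectangle *>(sp) ) {
         if ( rp->contains(p) ) {
            rp->drawRectangle();
         }
      } else if ( Circle *cp = dynamic_cast<Circle *>(sp) ) {
         if ( cp->contains(p) ) {
            cp->drawCircle();
         }
      } // ... etc. for every other Shape subclass
   }
}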



// With dynamic dispatch: one call works for every Shape subclass.
void handler(Point p)
{
   Shape *sp;
   for(int i = 0; i < N; i++) {
      sp = shapes[i];
      if ( sp->contains(p) ) {
         sp->draw();
      }
   }
}

Copying

Since C++ allows variables declared to hold either class values or pointers (references) to class values, one has to write classes that are prepared to handle the situations where C++ automatically makes copies of class values and automatically destroys class values:

You may need to define:

a. Copy constructor    Used for passing class objects by value
b. operator=           (assignment operator)
c. destructor          

When do you need to apply the "rule of 3" and define these members? Typically when the class has pointer members for which the class itself has allocated storage on the heap using new, so that the class is responsible for this extra storage. The extra storage on the heap is not automatically copied by the default copy constructor or the default assignment operator; they copy only the pointer. It is also not automatically deleted when the object containing the pointer member is deallocated.

Templates

Templates provide a different kind of reuse than that provided by inheritance with dynamic dispatch. Templates are also extensively used for the C++ Standard Template Library which implements a large number of data structures and also algorithms.

The algorithms in STL are implemented as template functions - typically non-member functions.

The data structures in STL are implemented as template classes.

Example template class

template <class T, class S>
class pair {
public:
  T first;
  S second;
  pair(T x, S y) : first(x), second(y) {}
};

pair<int, string> p1(5, "Hello");
pair<string, bool> p2("fred", false);

Precursor to template functions: pointers and arrays



#include <iostream>
using namespace std;

// Returns true if x occurs in the half-open range [start, limit).
bool find(int *start, int *limit, int x)
{
  for(int *p = start; p != limit; p++) {
     if ( *p == x ) {
        return true;
     }
  }
  return false;
}

int main()
{
  int a[] = {1,2,3,5,8,13,21,34};
  int *limita;
  limita = a + sizeof(a)/sizeof(int);  // one past the last element of a
  int x;

  cin >> x;
  if ( find(a, limita, x) ) {
     cout << x << " found!" << endl;
  } else {
     cout << x << " not found!" << endl;
  }
}

Iterators

In C++, iterators are modeled after pointers into arrays. That is, pointers into arrays satisfy the properties of iterators, and iterators generalize these properties to other data structures such as lists, dictionaries, etc.

    vector<string> v;                // empty vector of strings
    vector<bool> w(10, false);       // vector with 10 elements with 
                                     // each element set to false
    int a[] = {1,2,3,5,8,13,21,34};  // int array with 8 elements
    vector<int> z(a, a + 8);         // vector initialized with 
                                     // the 8 elements from a.
    vector<int>::iterator startz = z.begin();
    vector<int>::iterator limitz = z.end();
    vector<int>::iterator i;
   
    bool found = false;
    int x;
    cin >> x;
    for(i = startz; i != limitz; ++i) {
      if ( *i == x ) {
         found = true;
         break;
      }
    }

A template version of the find function

template <class Iterator, class T>
bool find(Iterator start, Iterator limit, const T& x)
{
  for(Iterator p = start; p != limit; p++) {
     if ( *p == x ) {
         return true;
     }
  }
  return false;
}

int main()
{
    int a[] = {1,2,3,5,8,13,21,34};  // int array with 8 elements
    vector<int> z(a, a + 8);         // vector initialized with 
                                     // the 8 elements from a.
    vector<int>::iterator startz = z.begin();
    vector<int>::iterator limitz = z.end();

    bool ba, bz;

    ba = find(a, a + 8, 21);         // ba will be set to true
    bz = find(startz, limitz, 21);   // bz will also be set to true
}

The STL vector class

The STL vector class is a template class.

It has a size() method (how much storage is in use) and a capacity() method (how much storage is available). The push_back method checks whether size is equal to capacity and, if so, moves the elements to larger storage (implementations typically double the capacity). So vectors grow automatically.

We will begin a systematic study of vector next time (chapter 5).

Sort Algorithm

STL also has many template functions which implement traditional and useful algorithms. These template functions are reusable because the type of the elements they operate on is a template parameter, which the compiler deduces from the arguments passed to the function.

For example, STL has a sort function:

#include <algorithm>
#include <string>
#include <vector>
using namespace std;

int main()
{
  vector<string> v; 
  // Store values in v
  ...
  // Now sort strings in v in usual order

  sort(v.begin(), v.end());
  ...
}

The same sort statement works with no change if v is a vector of some other type such as int.