What are abstract data types and what use are they?
- An ADT separates what its methods do from how
they are implemented.
- As a result, many different implementations of an ADT are possible
without affecting the client application that uses the ADT.
- The efficiency of a client program can be improved
by changing the implementation of the ADT without changing any
of the client application code that uses the ADT.
What are Java interfaces and what good are they?
Interfaces
- Instances of classes not related by inheritance may still be
used interchangeably if the classes all also implement the
same Java interface.
- A class can implement multiple Java interfaces and instances
can therefore be used in multiple ways.
- However, a class that implements an interface must provide
implementations of every method in the interface.
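As a minimal sketch of the first point, the classes Circle and Invoice below are unrelated by inheritance, yet one method can accept instances of either because both implement the same interface. The interface Describable and both classes are hypothetical names invented for this illustration.

```java
// Hypothetical interface and classes, used only for illustration.
interface Describable {
    String describe();
}

class Circle implements Describable {
    private final double radius;

    Circle(double radius) { this.radius = radius; }

    // Circle must implement every method declared in Describable.
    public String describe() { return "circle of radius " + radius; }
}

class Invoice implements Describable {
    private final int number;

    Invoice(int number) { this.number = number; }

    public String describe() { return "invoice #" + number; }
}

class InterfaceDemo {
    // Works for any object whose class implements Describable.
    static String label(Describable d) {
        return "[" + d.describe() + "]";
    }

    public static void main(String[] args) {
        System.out.println(label(new Circle(2.0)));  // [circle of radius 2.0]
        System.out.println(label(new Invoice(17)));  // [invoice #17]
    }
}
```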
Abstract Classes
A class must be declared abstract if it has at least one abstract
method. A method declared as abstract has no
implementation. Implementing the abstract methods is the
responsibility of subclasses.
Example
public abstract class AbstractList<E>
{
    ...
    public void add(int index, E element) { ... }  // Not abstract; implemented
    ...
    public abstract E get(int index);  // Abstract; no implementation in this class
    ...
}
- Instances of classes that are subclasses of the same abstract
class may be used interchangeably.
- A subclass of an abstract class only has to implement the
abstract methods; it can simply inherit the implementations of
the non-abstract methods (less work for the programmer).
- However, a class can inherit from only one class; in
particular, a class can't inherit from multiple abstract classes.
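The second point can be sketched with java.util.AbstractList, whose abstract methods are get(int) and size(). The subclass RangeList below is a hypothetical example: it implements only those two methods and inherits everything else.

```java
import java.util.AbstractList;
import java.util.List;

// Hypothetical read-only list of the integers lo, lo+1, ..., hi-1.
class RangeList extends AbstractList<Integer> {
    private final int lo;
    private final int hi;

    RangeList(int lo, int hi) {
        this.lo = lo;
        this.hi = hi;
    }

    // The two abstract methods that a subclass of AbstractList must supply.
    @Override
    public Integer get(int index) {
        if (index < 0 || index >= size()) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return lo + index;
    }

    @Override
    public int size() {
        return hi - lo;
    }
}

class AbstractListDemo {
    public static void main(String[] args) {
        List<Integer> r = new RangeList(3, 7);
        // contains, indexOf, and toString are all inherited, not written here.
        System.out.println(r.contains(5));  // true
        System.out.println(r.indexOf(6));   // 3
        System.out.println(r);              // [3, 4, 5, 6]
    }
}
```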
For classes implementing abstract data types such as stacks, queues, lists,
etc. that store collections of items, the Iterator interface
provides a common way to access the items one at a time
independent of the underlying implementation of the class.
public interface Iterator<E>
{
    public boolean hasNext();
    public E next();
    public void remove();
}
The remove() method is "optional". It should throw
UnsupportedOperationException if it isn't implemented to actually
remove an item.
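A minimal sketch of an iterator that declines the optional operation follows. The class ArrayIterator is a hypothetical name. Since Java 8, Iterator already supplies a default remove() that throws UnsupportedOperationException; it is written out here only to make the behavior visible.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical iterator over a fixed array; it does not support removal.
class ArrayIterator<E> implements Iterator<E> {
    private final E[] items;
    private int next = 0;  // index of the next item to return

    ArrayIterator(E[] items) {
        this.items = items;
    }

    public boolean hasNext() {
        return next < items.length;
    }

    public E next() {
        if (!hasNext()) throw new NoSuchElementException();
        return items[next++];
    }

    // Optional operation: this iterator chooses not to support it.
    public void remove() {
        throw new UnsupportedOperationException();
    }
}

class ArrayIteratorDemo {
    public static void main(String[] args) {
        Iterator<String> p = new ArrayIterator<>(new String[] {"a", "b"});
        while (p.hasNext()) {
            System.out.println(p.next());
        }
        try {
            p.remove();
        } catch (UnsupportedOperationException e) {
            System.out.println("remove() is not supported");
        }
    }
}
```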
But how do you get an Iterator for a Stack, Queue, or LinkedList?
A class that implements the Iterable interface must have
a method, iterator(), that returns an Iterator object for
that class instance.
public interface Iterable<E>
{
    public Iterator<E> iterator();
}
The Java types Stack<E>, Queue<E>, and
LinkedList<E> all implement Iterable<E>.
So code to print the elements in any one of these can be
exactly the same:
Stack<String> x; // or Queue<String> x; or LinkedList<String> x;
...
Iterator<String> p = x.iterator();
while (p.hasNext()) {
    System.out.println(p.next());
}
Another advantage for a class implementing Iterable is that the
'foreach' style loop can be used. It is implemented by the
compiler as the while loop above, but can be written more simply
as:
Stack<String> x; // or Queue<String> x; or LinkedList<String> x;
...
for (String s : x) {
    System.out.println(s);
}
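The same two loop styles also work for a user-defined class, as the sketch below suggests. Countdown is a hypothetical class invented for this illustration; its iterator relies on the default remove() available since Java 8.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical class: iterating over new Countdown(3) yields 3, 2, 1.
class Countdown implements Iterable<Integer> {
    private final int start;

    Countdown(int start) {
        this.start = start;
    }

    public Iterator<Integer> iterator() {
        return new Iterator<Integer>() {
            private int current = start;

            public boolean hasNext() {
                return current > 0;
            }

            public Integer next() {
                if (!hasNext()) throw new NoSuchElementException();
                return current--;
            }
        };
    }
}

class CountdownDemo {
    // Explicit iterator loop.
    static int sumWithWhile(Countdown c) {
        int sum = 0;
        Iterator<Integer> p = c.iterator();
        while (p.hasNext()) {
            sum += p.next();
        }
        return sum;
    }

    // Equivalent foreach loop; allowed because Countdown implements Iterable.
    static int sumWithForEach(Countdown c) {
        int sum = 0;
        for (int n : c) {
            sum += n;
        }
        return sum;
    }

    public static void main(String[] args) {
        Countdown c = new Countdown(3);
        System.out.println(sumWithWhile(c));    // 6
        System.out.println(sumWithForEach(c));  // 6
    }
}
```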
public class Mystery<E> implements Iterable<E>
{
// ??? (contents of Mystery class not shown)
}
What can you conclude, if anything, about the methods in Mystery?
An ADT that can be used in many different client applications
is the symbol table ADT.
A symbol table (abstractly) stores (key, value) pairs and
supports insertion, deletion, and lookup of the value corresponding
to a key.
Such a data type makes it easier to create correct
solutions to many programming problems.
| Name | Purpose | Key | Value |
|---|---|---|---|
| Dictionary | Look up a word's meaning | word | meaning |
| Book Index | Find page(s) in a book where a word occurs | word | List of page numbers |
| File Index | Find list of files that contain a given string | string | List of file names |
| Compiler | Look up name usage | program element names (variables, function names, class names, etc.) | Usage of the name and its attributes |
In Java an identifier can be declared to represent a variable
whose possible value is a type!
Such identifiers are called generic parameters.
An identifier must be declared to be a generic parameter.
Below, line 2 declares E to be a generic parameter. The
scope of E is from line 2 to the end of the class MyList
at line 11.
Line 6 is a use of E, not a declaration of E.
If one or more generic parameters are declared in a class header
(line 2), the class is said to be a generic class.
1
2 public class MyList<E> // declaration of E
3 {
4
5 ...
6 public void add(E x) // Use of E
7 {
8
9 }
10
11 }
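To see a generic parameter take on different types, the sketch below fills in MyList<E> and creates two instances. The body of MyList (a java.util.ArrayList field plus get and size methods) is an assumption made only so the example runs; the original class body is not shown.

```java
import java.util.ArrayList;

// Hypothetical completion of MyList<E>, backed by java.util.ArrayList.
class MyList<E> {
    private final ArrayList<E> items = new ArrayList<>();

    public void add(E x) {        // use of E as a parameter type
        items.add(x);
    }

    public E get(int i) {         // use of E as a return type
        return items.get(i);
    }

    public int size() {
        return items.size();
    }
}

class GenericDemo {
    public static void main(String[] args) {
        MyList<String> words = new MyList<>();   // here E is String
        words.add("hello");

        MyList<Integer> nums = new MyList<>();   // here E is Integer
        nums.add(42);

        // words.add(42);  // would not compile: 42 is not a String

        String w = words.get(0);  // no cast needed
        int n = nums.get(0);
        System.out.println(w + " " + n);  // hello 42
    }
}
```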
public class ST<Key, Value>
{
    public void put(Key k, Value v) {...}
    public Value get(Key k) {...}
    public void delete(Key k) {...}
    public Iterable<Key> keys() {...}
    public boolean contains(Key key) {...}
    public int size() {...}
    public boolean isEmpty() {...}
}
- put(k,v);
Inserts the pair (k,v) if k is not in the table. Otherwise
replaces k's value with v.
- v = get(k);
Returns the value associated with k if the pair (k,v) is in the
table. Otherwise get returns null.
- delete(k);
If (k,v) is in the table, that pair is removed. Otherwise, no change.
- p = keys();
Returns a reference to some object that implements
Iterable<Key>. (The only abstract method in Iterable<Key> is
Iterator<Key> iterator().)
So p.iterator() can be used to retrieve each of the keys in
the symbol table, one at a time.
Question: How can you print all the key value pairs?
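One possible answer is to iterate over keys() and call get on each key. In the sketch below, the ST class is only a stand-in backed by java.util.LinkedHashMap so that the example runs on its own; it is not the course's implementation, and the helper name pairs is hypothetical.

```java
import java.util.LinkedHashMap;

// Stand-in for the ST class above, backed by java.util.LinkedHashMap.
// This backing choice is an assumption made only for this example.
class ST<Key, Value> {
    private final LinkedHashMap<Key, Value> map = new LinkedHashMap<>();

    public void put(Key k, Value v) { map.put(k, v); }

    public Value get(Key k) { return map.get(k); }

    public Iterable<Key> keys() { return map.keySet(); }
}

class PrintPairs {
    // Builds one "key -> value" line per pair using only keys() and get().
    static <K, V> String pairs(ST<K, V> st) {
        StringBuilder sb = new StringBuilder();
        for (K k : st.keys()) {
            sb.append(k).append(" -> ").append(st.get(k)).append("\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        ST<String, Integer> st = new ST<>();
        st.put("b", 2);
        st.put("a", 1);
        System.out.print(pairs(st));
    }
}
```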
- Keys are unique
- Inserting a (key, value) pair when the key is already in the
symbol table replaces the old value associated with
that key with the new value.
- Keys are not allowed to be null
- Values are not allowed to be null.
- When creating an instance of a symbol table, an
immutable type should be chosen for the key
if possible.
Requiring Keys Not Be null
All implementations of the symbol table methods (put(k,v), get(k),
delete(k)) will need to compare k with at least
some of the keys stored in the symbol table.
How?
The obvious way to compare k with a key k1 stored in the
symbol table:
k.equals(k1)
But if k == null this will throw a
NullPointerException.
Requiring that keys in the symbol table not be null means
put, get, and delete require k not be null and so they can
always use the equals method for comparison.
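The failure described above can be observed directly. The helper equalsThrows below is a hypothetical name used only for this small sketch.

```java
class NullKeyDemo {
    // Returns true if evaluating k.equals(k1) throws a NullPointerException.
    static boolean equalsThrows(String k, String k1) {
        try {
            k.equals(k1);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(equalsThrows("a", "b"));   // false
        System.out.println(equalsThrows(null, "b"));  // true: k is null
        System.out.println(equalsThrows("a", null));  // false: only the receiver matters
    }
}
```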
Requiring Values Not Be null
The get(k) method returns null if k is not in the symbol
table. If k is in the symbol table, get(k) should return the value
associated with k in the symbol table.
If null values were allowed and get(k) returned null, it would
mean either
- k is not in the symbol table, or
- k is in the symbol table and its associated value is null.
It would be necessary to use the contains(k) method to
distinguish these two cases.
In many cases, the get method would have to repeat all the same
work in searching for the key k that the contains method does.
Requiring that values not be null means that if get(k) returns
null, there is only one possibility: k is NOT in the symbol table.
Several Java API classes that implement symbol table methods
allow null keys and null values while others do not.
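As an illustration with two java.util classes: HashMap permits null values, so get cannot distinguish a missing key from a key whose value is null, while Hashtable rejects null values. The helper rejectsNullValue is a hypothetical name for this sketch.

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

class NullValueDemo {
    // Returns true if the table refuses to store a null value.
    static boolean rejectsNullValue(Map<String, Integer> table) {
        try {
            table.put("present", null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> hashMap = new HashMap<>();
        System.out.println(rejectsNullValue(hashMap));           // false

        // Both lookups return null, so get alone is ambiguous.
        System.out.println(hashMap.get("present"));              // null
        System.out.println(hashMap.get("absent"));               // null

        // containsKey is needed to tell the two cases apart.
        System.out.println(hashMap.containsKey("present"));      // true
        System.out.println(hashMap.containsKey("absent"));       // false

        System.out.println(rejectsNullValue(new Hashtable<String, Integer>()));  // true
    }
}
```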
Problem: For a text file, find the word that occurs most
frequently.
Use a symbol table whose keys are words (String type) and whose
value is the frequency of occurrence of each word (Integer
type).
- Read the file and extract one word k at a time.
- Use the get(k) method to determine whether or not k is in
the symbol table.
- If the word k is not in the symbol table, insert k with the
value 1.
- If the word k was already in the symbol table, get its
value, increment it, and put (k, updated value) back in the
symbol table.
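import java.util.Iterator;
import java.util.Scanner;

public class MaxFreq
{
    public static void main(String[] args)
    {
        // MyIO is a helper class that is not shown here.
        Scanner in = MyIO.openInput("text.txt");
        ST<String, Integer> st = new ST<String, Integer>();

        // Count the occurrences of each word.
        while (in.hasNext()) {
            String w = in.next();
            Integer n = st.get(w);
            if (n == null) {
                st.put(w, 1);      // first occurrence of w
            } else {
                st.put(w, n + 1);  // w was already in the table
            }
        }

        // Find a word with the largest count.
        int max = 0;
        String maxWord = "";
        Iterator<String> p = st.keys().iterator();
        while (p.hasNext()) {
            String k = p.next();
            int cnt = st.get(k);
            if (cnt > max) {
                maxWord = k;
                max = cnt;
            }
        }
        // printf, not println, is needed for a format string.
        System.out.printf("Maximum frequency word: %s, frequency = %d\n", maxWord, max);
    }
}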
Best Practices
- Make sure the equals method for the Key type tests
for equality as you expect.
- If possible, the Key type should be immutable.
equals
Since searching for a key in a
symbol table uses equality, problems occur if the
equals method for the Key type is too strict.
Not every Java class overrides the equals method
inherited from Object, so some classes use Object's
equals method, which is almost always too strict:
x.equals(y) is true for
Object's equals only if x and y reference the same object;
that is, only if x == y.
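The difference can be sketched with two hypothetical key classes, PlainName (which inherits Object's equals) and Name (which overrides it). The class names and the field text are invented for this illustration.

```java
// Hypothetical key type that inherits equals from Object.
class PlainName {
    final String text;

    PlainName(String text) { this.text = text; }
}

// Hypothetical key type that overrides equals, and hashCode to stay consistent with it.
class Name {
    final String text;

    Name(String text) { this.text = text; }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof Name)) return false;
        return text.equals(((Name) other).text);
    }

    @Override
    public int hashCode() {
        return text.hashCode();
    }
}

class EqualsDemo {
    public static void main(String[] args) {
        // Same contents, different objects: Object's equals says "not equal".
        System.out.println(new PlainName("ann").equals(new PlainName("ann")));  // false

        // The overriding equals compares contents.
        System.out.println(new Name("ann").equals(new Name("ann")));            // true
    }
}
```

With PlainName as the Key type, a lookup using a freshly created key would not find an entry that was inserted with an equal-looking key.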
Immutable Keys
If the Key type has methods that can change a key's state
(i.e., the Key type is NOT immutable), the key in some (key, value)
pair in the symbol table can be changed so that it equals another key
already in the symbol table. This would violate the rule that
keys can't be duplicated.
If the Key type is immutable, this can't occur.
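A small sketch of how a mutable key can produce a duplicate. Here java.util.HashMap stands in for the symbol table, and MutableKey is a hypothetical class; both are assumptions for illustration only.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical mutable key type: its state can change after insertion.
class MutableKey {
    String text;

    MutableKey(String text) { this.text = text; }

    @Override
    public boolean equals(Object other) {
        return other instanceof MutableKey && text.equals(((MutableKey) other).text);
    }

    @Override
    public int hashCode() {
        return text.hashCode();
    }
}

class MutableKeyDemo {
    public static void main(String[] args) {
        Map<MutableKey, Integer> table = new HashMap<>();
        MutableKey a = new MutableKey("a");
        MutableKey b = new MutableKey("b");
        table.put(a, 1);
        table.put(b, 2);

        a.text = "b";  // mutate a key that is already stored in the table

        System.out.println(a.equals(b));   // true: the two stored keys are now equal
        System.out.println(table.size());  // 2: the table holds a duplicated key
    }
}
```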
See the code examples
SequentialSearchST
SequentialSearchST from the text is a class that implements the symbol table
methods by storing (key, value) pairs in an unordered linked
list.
The Node class used for building the linked list contains
members for the key and for the value in addition to links to the
next (and possibly the previous) Node.
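The idea can be sketched as follows. This is a minimal illustration under assumptions, not the text's SequentialSearchST code: the class name LinkedListST is hypothetical, the list is singly linked, and only put, get, and size are shown.

```java
// Hypothetical sketch of a symbol table stored in an unordered linked list.
class LinkedListST<Key, Value> {
    private Node first;  // first node of the list, or null if the table is empty
    private int n;       // number of (key, value) pairs

    private class Node {
        Key key;
        Value val;
        Node next;

        Node(Key key, Value val, Node next) {
            this.key = key;
            this.val = val;
            this.next = next;
        }
    }

    // Scans the list, comparing with equals; returns null if key is absent.
    public Value get(Key key) {
        for (Node x = first; x != null; x = x.next) {
            if (key.equals(x.key)) return x.val;
        }
        return null;
    }

    // Replaces the value if key is found; otherwise adds a node at the front.
    public void put(Key key, Value val) {
        for (Node x = first; x != null; x = x.next) {
            if (key.equals(x.key)) {
                x.val = val;
                return;
            }
        }
        first = new Node(key, val, first);
        n++;
    }

    public int size() {
        return n;
    }
}

class LinkedListSTDemo {
    public static void main(String[] args) {
        LinkedListST<String, Integer> st = new LinkedListST<>();
        st.put("to", 1);
        st.put("be", 1);
        st.put("to", 2);                   // replaces the value for "to"
        System.out.println(st.get("to"));  // 2
        System.out.println(st.get("or"));  // null
        System.out.println(st.size());     // 2
    }
}
```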
BinarySearchST
BinarySearchST is another class in the text that also
implements the symbol table methods by storing the (key, value)
pairs in two arrays - one for the keys and one for the values.
Key[] keys;
Value[] values;
These two arrays are logically related through the array indices: the key at
keys[i] has corresponding value values[i].
The keys array is kept in sorted order!
This means the Key type must implement Comparable.
Instead of using the equals method to search for keys, the
compareTo method is used.
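A minimal sketch of the two-array approach follows. It is not the text's BinarySearchST code: the class name SortedArrayST and the helper rank are hypothetical, and the arrays have a fixed capacity (no resizing) to keep the example short.

```java
// Hypothetical sketch of a symbol table kept in two parallel, sorted arrays.
class SortedArrayST<Key extends Comparable<Key>, Value> {
    private final Key[] keys;
    private final Value[] values;
    private int n = 0;  // number of (key, value) pairs stored

    @SuppressWarnings("unchecked")
    SortedArrayST(int capacity) {
        keys = (Key[]) new Comparable[capacity];
        values = (Value[]) new Object[capacity];
    }

    // Binary search: the index of key if present, otherwise the index
    // where key would have to be inserted to keep the array sorted.
    private int rank(Key key) {
        int lo = 0;
        int hi = n - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;
            int cmp = key.compareTo(keys[mid]);
            if (cmp < 0) hi = mid - 1;
            else if (cmp > 0) lo = mid + 1;
            else return mid;
        }
        return lo;
    }

    public Value get(Key key) {
        int i = rank(key);
        if (i < n && keys[i].compareTo(key) == 0) return values[i];
        return null;
    }

    public void put(Key key, Value val) {
        int i = rank(key);
        if (i < n && keys[i].compareTo(key) == 0) {
            values[i] = val;  // key already present: replace its value
            return;
        }
        if (n == keys.length) throw new IllegalStateException("table is full");
        // Shift larger keys (and their values) one index to the right.
        for (int j = n; j > i; j--) {
            keys[j] = keys[j - 1];
            values[j] = values[j - 1];
        }
        keys[i] = key;
        values[i] = val;
        n++;
    }
}

class SortedArraySTDemo {
    public static void main(String[] args) {
        SortedArrayST<String, Integer> st = new SortedArrayST<>(10);
        st.put("c", 3);
        st.put("a", 1);
        st.put("b", 2);
        st.put("a", 10);                  // replaces the value for "a"
        System.out.println(st.get("a"));  // 10
        System.out.println(st.get("z"));  // null
    }
}
```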
For an application that uses a symbol table, the two
implementations are interchangeable as far as correctness goes
provided the Key type implements Comparable.
But the methods have different performance times for the two
implementations.
For each method which class has the faster method?
How does each method do its task?
| Method | SequentialSearchST | BinarySearchST |
|---|---|---|
| put(k, v) | | |
| v = get(k) | | |
| delete(k) | | |
SequentialSearchST:
- put(k,v): Examine each key in the list until k is found,
then update its value; if the entire list has been searched
without finding k, insert a new Node with k and v at the
beginning of the list. Time: O(N) in the worst case.
- v = get(k): In the worst case, must search through all N Nodes
when the symbol table contains N keys. Time: O(N).
- delete(k): Same as get in the worst case to find a Node with key equal to
k. Deleting the Node then only requires changing a few links. Time: O(N).

BinarySearchST:
- put(k,v): Binary search for a key equal to k (time
O(log(N))). If k is not found, then to insert the new key all larger
keys (and values) must be copied to the next index. In the worst case N keys
and values must be copied. Time: O(N).
- v = get(k): Binary search. Time: O(log(N)).
- delete(k): Similar to put. Binary search for k (time
O(log(N))), but keys greater than k (and their values) must be copied
to the previous index. Time: O(N).
For Worst Case Performance
| Method | SequentialSearchST | BinarySearchST |
|---|---|---|
| put(k, v) | O(N) | O(N) |
| v = get(k) | O(N) | O(log(N)) |
| delete(k) | O(N) | O(N) |
A cost of O(log(N)) or O(N) is usually acceptable, but that is the
cost of a single call to each method.
An application that prints the frequency of occurrence of each
word in a document with N total words and M distinct words must do N get operations
plus N put operations to insert the words into a symbol table.
Then it must iterate through the M distinct keys and do M more
get operations.
Comparing the two classes just for building the symbol table:
| Class | Cost for insertion | Total |
|---|---|---|
| SequentialSearchST | N gets: N * O(N) = O(N^2); N puts: N * O(N) = O(N^2); total: O(N^2) + O(N^2) | O(N^2) |
| BinarySearchST | N gets: N * O(log(N)) = O(N log(N)); N puts: N * O(N) = O(N^2); total: O(N log(N)) + O(N^2) | O(N^2) |
See code examples