Oct 16: Traverals, Expression Trees, Exceptions


Code discussed in lecture

Traversal orders

A traversal is an order for visiting nodes in a tree. There are three standard ones:

These are basic approaches to designing tree algorithms. You are not limited to them. The book talks about an Eulerian tour of a binary tree. It even implements a template that visits each node three times (before its children, between its children, and after its children). The code supplies three methods for you to fill in called left, below, and right. I would have called them before, between, and after. But the idea is that you can be very flexible in how you traverse a tree. (I sometimes find it easier to write the whole traversal rather than fill in the template methods, but it is a matter of taste.) Eulerian traversals definitely abstract the idea of a traversal!

There is a somewhat interesting fact about traversals. If you have preorder and inorder traversals of a tree you can reconstruct the tree! (You can also reconstruct the tree if you have postorder and inorder traversals.) However, the tree elements must be unique.

How does this work? The first thing in preorder is the root. The root is somewhere in the middle of an inorder traversal, with everything in its left subtree before it and everything in its right subtree after it. In the preorder traversal all of the things in the left subtree are visited before anything in the right subtree. So we can build the tree by making a node for the root, splitting the two traversals into two pieces each corresponding to the left and right subtrees. How do we know how big each is? The inorder traversal tells us. We then recursively call ourselves on the sublists to reconstruct the left and right subtrees.There is code in BinaryTree.java to do this. Use the two lists that are created in the main program and show how to do the process.

Go through reconstructTree.

Expression Trees

The book also gives an example of an expression tree. They build it on top of their binary tree code, but there are advantages to writing it separately. In particular their code does a lot of testing to see what sort of node that they are dealing with. It is easier to take advantage of OO programming by making each node its own type and letting it deal with how to do things. It also allows for unary operators or operators that take more than 2 values. This is hard to achieve if you are based on a binary tree.

The code for an expression tree is in this zip file. (I will show you individual classes, but this is the easy way to download the whole thing.) The first thing to look at is Expression.java. It is an interface that says that an Expression must provide two methods: eval and deriv. The first evaluates the expression and the second takes its derivative symbolically. An Expression should also override toString.

Look at Constant.java. It is the simplest. To evaluate it you return its value and its derivative is always the constant 0.0. The strange thing is that the constructor is private, and there is a static method called make. This is an variant of a design pattern called the factory pattern. Instead of constructing objects directly users have to go through a method to get an object. One use is to allow the factory to decide to create a subclass object when asked to create a superclass object.

Another is to give the user options on what parameters to pass to create the object. For example, complex numbers can be represented using polar or rectangular coordinates. Either way you present two double values, so you can't overload the constructor to accept either representation. However, you can create two factories, one of which expects polar coordinates and the other of which expects rectangular coordinates. An example of this taken from Wikipedia' article on the "Factory method pattern":

  class Complex { 
     public static Complex fromCartesian(double real, double imaginary) {
         return new Complex(real, imaginary);
     }
 
     public static Complex fromPolar(double modulus, double angle) {
         return new Complex(modulus * cos(angle), modulus * sin(angle));
     }
 
     private Complex(double a, double b) {
         //...
     }
  }
 
 Complex c = Complex.fromPolar(1, pi);

It is less clear why we would want to use this pattern in Constant, but it will become clearer later.

The Variable class is similar, but it has a symbol table to save variable values. These are saved in a Map that is implemented using a hash table. We will see more about most of these later, but for now it is enough to know that a map stores (key, value) pairs. When you want the value you give it the associated key. You call symbolTable.put(varName, value) to add a variable name with a given value to the map and you call symbolTable.get(varName) to retrieve the value if the key is there. (You get back null if it is not.) Instead of printing an error message if the variable name is not in the symbol table we throw an exception. We will look at that later. The rest is fairly straightforward. The derivative of a variable is 1 if the derivative is taken with respect to that variable and 0 otherwise. Note the call to Constant.make to create the 0 or 1 constant.

The classes Sum, Difference, Product, and Quotient all perform eval by evaluating their operands and performing an operation on them. The only difference is the operator. Therefore I have created an abstract class BinaryOp that has the template for evaluating a binary expression and another template for toString. These templates call abstract functions doOperation and getOperation, repectively. (The latter returns a string representation of the operator). It also has accessor methods to get the first or second expression.

This simplifies the writing of the Sum, Difference, Product, and Quotient classes. We will look at Sum. It provides the necessary two methods, which are fairly trivial. It supplies its deriv method, which adds the derivatives of its operands. But the interesting thing is Sum.make. Here we see the power of a factory method. It tries to simplify the expression. If the two things it is adding are constants, it adds them to get a new constant. If either operand is 0 it returns the other. So in three of the four cases it does not even create a Sum object! The other three operations are similar

There is also a ExpressionDriver class to test the expressions. (Note we could not add a main method to Expression to do this because Expression is an interface.)

The Composite Pattern

This is an example of another design pattern, the composite pattern. Here we have individual things (constants and variables) and compositions of things (sums, differences, etc.), but they all are expressions and can be treated in the same manner. We perform the same operations on all of them, so we need not distinguish between individual things and groups of things. Any of the expression types can be used wherever an expression is expected, and everything works. For example, we take the derivative of the product of two huge, complicated expressions the same way that we take the derivative of 2*x.

The same pattern arises in some graphical drawing programs, where there are individual shapes (circles, rectangles, etc.) and groups of shapes (shapes are laid out and then combined into a single group). But all are shapes, and groups of shapes can be used wherever individual shapes can be used and vise versa, and all are subject to the same scaling, translation, and other operations. So you compose a bunch of shapes and get another shape.

Exceptions

So far we have handled errors by printing error messages to System.err. This is not a great solution to the problem of handling errors. Java has a much better one - exceptions. Exceptions are objects that are thrown when an error occurs. When an exception is thrown the program stops what it was doing and goes into error-handling mode. Exceptions are either caught and handled in the method that throws them or passed on the the method that called it. If that method does not catch the exception it is passed to the method that calls it, etc. If the main program does not catch the exception it kills the program and prints an error message. You probably have seen these messages for null pointers or subscript out of bounds.

Exceptions are an excellent way of passing an error on to a method that knows a reasonable way to handle them. Exceptions usually occur in a low-level method. Consider trying to open a file and failing. The file-opening method has no idea what the programmer wants to do if the file can't be opened. Maybe the method that calls it has no idea, either. If you don't know a good way to handle the exception it is best to pass it on. Eventually you will get to a high-level method that does know what to do about the situation. If the file name was just entered by the user it might be good to give her a second chance. Or it may be that the programmer has a backup file to use if the first cannot be opened. Or it could be that the right thing to do is to print an error message and quit the program. Exceptions give a way to let the low-level method that discovers the problem pass it on to a method that knows what to do about it.

There is a whole heirarchy of exceptions. The top level class is Exception. The heirarchy looks like:

I should also note that there are two types of exceptions, checked and unchecked. When you call a method that could throw a checked exception the compiler checks to make sure that you handle it. You can handle it in either of two ways. The first one is to catch it. The second is to pass the buck and add a throws clause at the end of the method header. For example:

public String readString(String fileName) throws IOException, CloneNotSupportedException {

Which are the checked and unchecked exceptions? RunTimeException and all of its subclasses are unchecked. The rest are checked. The checked exceptions tend to be things that are outside of the programmers's control. The IOException exceptions happen when the file you are reading is corrupted or there is a hardware glitch or somebody moved the file from where it was supposed to be or gave it a different name. Since you can't control whether these things happen you should make provisions to handle them when they do.

RunTimeException and its subclasses are all thing that happened because you made a programming error or failed to check something that you could have checked. The thought was that checking for all of these things is a waste of time, and if one of them goes wrong you probably want to kill the program anyway. (NumberFormatException is something of an exception to the rule - it often occurs when you are trying to convert strings typed in by the user into numbers. But being unchecked doesn't mean that you can't check for them - the compiler just doesn't check to make sure that you do.)

So how do we catch an exception? I have modified Fahrenheit.java to demonstrate catching a NumberFormatException. The relevant part is in actionPerformed:

    // Get the text entered in the text field.
    String text = fahrenheit.getText();

    try {
      fahrenheitTemp = Integer.parseInt(text); // convert the text to an int
      celsiusTemp = (fahrenheitTemp - 32) * 5 / 9; // do the actual conversion

      // Convert the int celsius temperature to a String, and display it in
      // the result label.
      resultLabel.setText(Integer.toString(celsiusTemp));
    }
    catch(NumberFormatException ex) {
      resultLabel.setText("*****");
    }

The user supplies text via a fahrenheit.getText() call. However, what the user entered may not be a valid int. Therefore we put the conversion and the statements that deal with the result of the conversion into a try block. If the conversion works successfully the result is displayed and the catch block is ignored. If a NumberFormatException is thrown then catch is called with an argument of type NumberFormatException. There can be several catch blocks, one after another, and the first one whose parameter is of the type of the exception (including a supertype) gets executed. Here what happens is that the output field is filled with asteriks to show that an error occurred.

There is also another possibility - a finally block. Code in the final block is run whether the code in the try block succeeds or fails. It is used to clean up. A typical example is an open file that should be closed. To extend our earlier example:

	public String readString(String fileName) throws IOException {
	  BufferedReader input =  new BufferedReader(new FileReader(pathName));
	  String line = input.readLine();
	  input.close();
	  return line;
	}

The first line opens a file. (We will see more about file reading and writing soon.) If an IOException is thrown by the readLine call the file will never be closed. By rewriting it:

	public String readString(String fileName) throws IOException {
	  BufferedReader input =  new BufferedReader(new FileReader(pathName));
	  
	  try {
	    String line = input.readLine();
	  }
	  finally {
	    input.close();
	  }
	  
	  return line;
	}

the file will get closed whether an exception is thrown or not. It is possible to include both catch and finally blocks in the same try block, but it is not recommended because of confusion over what is happening in what order. Instead have a throws clause (as above) or nest two try blocks with the finally on the inside and catch on the outside.)

Finally, you can write your own exceptions. An example is UnassignedVariableException thrown by Variable. No exception really described the problem, so I extended RuntimeException to create one whose name was more descriptive.