Oct 9: Iterators, with List examples


Code discussed in lecture

Iterators

In recent lectures we saw linked lists in which the notion of the current element was part of the SentinelDLL or SLL object. While in some ways this is convenient, in others it is not. For example, if some method is going through the list and passes the linked list to another method, that method can change the current element. It would be nice if there were some way that each method could have its own independent concept of the element in the list that it is currently dealing with.

In fact, we don't have to incorporate current as an instance variable of SentinelDLL or SLL. We'll focus on modifying the SentinelDLL class today, and we'll see how to make a separate object that knows how to traverse and modify a given list. By making it a separate object, we can have any number of them active at any time. In other words, we could have 0, 1, 2, 3, or any other number of such objects around, and each could have its own notion of the current element of the list. Our modification of the SentinelDLL class will not have the instance variable current, nor will it have anything like current. Therefore it will not have get, remove, next, hasNext, add, or set methods.

This style of going through a data structure is so common that there's a name for it: an iterator. In fact, it's the basis of one of the standard interfaces in Java: the Iterator interface.

The Iterator interface

The Iterator interface consists of three methods:

Iterators apply to lots of different data structures, not just linked lists. There is a general style of using the Iterator interface. To demonstrate it, we need a class that allows us to get an Iterator for the contents of the collection. The ArrayList class is one such class. We can do the following:

ArrayList<T> myList = new ArrayList<T>(); // Code to put things into myList Iterator<T> iter = myList.iterator(); while (iter.hasNext()) { T item = iter.next(); // Code that does something with item. iter.remove(); // but only if we want to remove this element } Note that we ask the particular list to supply an iterator that can be used to walk through the list. As we do so, we always have to verify, via a call to hasNext, that the next call to next will succeed.

We will sometimes see iterators used in for loops rather than while loops, but that's OK. After all, a for loop is just a while loop in disguise.

You might see similarities between the "foreach" loops that we used to run through arrays and iterators. In fact, a foreach loop for a collection of objects translates into code that uses iterators! Foreach loops work for arrays, arraylists, and anything else that is "Iteratable". However, an iterator gives us one power that the foreach for loops did not. It allows us to remove items.

To see this in an application, I have re-implemented Swarm.java from the CS 5 Bugs homework to use an ArrayList and iterators. The swarm keeps track of bugs on the screen, and removes them if they crawl off of the screen or are clicked on for elimination. The Bug.java is included for completeness. (If you have not taken CS 5 don't worry; I am just going to show three short methods that show how interators are used. I thought that it would be better to take them from a real program than to just make up something.) Note that to use the Iterator interface, I needed to include the line

import java.util.Iterator;

The relevant methods for our purposes are draw, move, and hitbugs. The draw method simply goes through the ArrayList insects in order, with no deletions. Therefore we can simply use a foreach loop:

public void draw(Graphics page) { for (Bug aBug: insects) aBug.draw(page); }

The code for move also illustrates the use of remove. The move method returns true if the bug is still visible and false if it has crawled off of the canvas. Because we want to remove, we have to use an iterator.

public void move() { Iterator<Bug> iter = insects.iterator(); while (iter.hasNext()) { Bug thisBug = iter.next(); if (!thisBug.move()) iter.remove(); // moved fully off the top } } I get an iterator for insects. Note that I include Bug in the iterator's type, to say that it's next will always return a Bug. I have a while loop to continue as long as there is a next. Within the loop I get the next Bug and move it. If move returns false I use iter.remove() to remove it from the ArrayList. The call to remove removes the object most recently returned by next. Note that such a remove does not change the element returned by the next call to next.

The code for hitBugs again illustrates the use of remove. This time we avoid using a temporary variable by calling isHit directly on the Bug returned by next.

public void hitBugs(Point where) { Iterator<Bug> iter = insects.iterator(); while (iter.hasNext()) { if (iter.next().isHit(where)) // Was current bug hit? iter.remove(); // If so, remove it. } }

Implementing an iterator for an arraylist

How might you implement such an iterator for an arraylist? This is one way to do it.

Create an object that has three instance variables:

The operations can then be implemented as follows.

An iterator interface for a linked list

When we use an iterator in a linked list, we often want more functionality than the standard Iterator interface provides. In fact, Java supplies a standard ListIterator class. Its concept of "current" is different than the one we have seen. It has a "cursor position" between two elements in the list. (Draw picture.) A call to next returns the item after the cursor and moves the cursor forward. A call to previous returns the item before the cursor and moves the cursor backwards. Because of the way this works alternating calls to next and previous will keep returning the same element. In addition to the methods in Iterator it requires methods: Calls to the remove and set methods are invalid if there has never been a call to next or previous or if remove or add has been called since the most recent call to next or previous.

The ArrayList class has a method that returns a ListIterator, also. There is a separate class LinkedList which behaves like our circular doubly-linked list. Both implement the interface List, which requires a number of methods, including all that we saw for ArrayList plus iterator, and listIterator. They differ in the amount of time operations take. For instance, a get, set, or add on a LinkedList requires time proportional to the distance that the index is from the nearest end of the list. That means that an add to either the front or end of a LinkedList takes constant time, unlike an ArrayList. If a ListIterator is used, the time required for any method in the interface is constant. For an ArrayList the time for an add or remove is proportional to the number of items after the item added or removed, even if using a ListIterator.

Because the conventions and operations are different than what we have implemented in SentinelDLL we will show how to implement a ListIterator using this new concept of current. We extend the Iterator interface by declaring the CS10ListIterator interface in CS10ListIterator.java.

Because we have removed some of the methods from the SentinelDLL class, we need a new interface for the list class to implement. This new interface, CS10IteratedList in CS10IteratedList.java, is similar to the LinkedList interface in CS10LinkedList.java. The methods add, remove, get, next, and hasNext --all of which require access to the current instance variable--are gone.

There one new method: listIterator. This method will return an object that can iterate through the object whose class implements CS10ListIterator.

The SentinelDLLIterator class

SentinelDLLIterator.java is a modified version of the circular, doubly linked list with a sentinel that includes an iterator. The first thing to notice is that the SentinelDLLIterator class implements the CS10IteratedList interface, and so the methods that were in LinkedList but not in CS10IteratedList are missing from SentinelDLLIterator.

The second thing to notice is that the SentinelDLLIterator class has just sentinel as an instance variable; there is no current instance variable, as there was in SentinelDLL.

But the most salient feature of our SentinelDLLIterator class implementation is the inner class DLLIterator, which implements the ListIterator interface. The DLLIterator class is private. Users of the SentinelDLLIterator can still get a DLLIterator by calling the listIterator method. Moreover, because DLLIterator implements the public CS10ListIterator interface, once any part of any program has a reference to a DLLIterator, it can call the public methods in CS10ListIterator on it. However, the constructor is private, so that the only way to create a DLLIterator object is to call the method listIterator on a SentinelDLLIterator object.

And, perhaps most importantly, by making DLLIterator an inner class of SentinelDLLIterator, the methods of DLLIterator can access anything that the methods of SentinelDLLIterator can access. That would include the instance variable sentinel, as well as anything that is public in the Element class (like data, next, and previous).

The DLLIterator class has an two instance variables:

From how we've defined current, it needs to be advanced in next before we return an object when moving forward and after determining the object to return when moving backwards. That also means that initially, current references the sentinel (rather than, say, sentinel.next).

I have included an equals method in DLLIterator, and it is set so that two DLLIterator objects are considered equal if they are currently referencing the same Element.

Returning to the SentinelDLLIterator class, there is a new method listIterator. It creates a new DLLIterator for the SentinelDLLIterator object and returns a reference to it. This listIterator method is made to be called from outside the SentinelDLLIterator class, and because it returns a reference to a DLLIterator, its return value may be assigned to CS10ListIterator or even Iterator (since ListIterator extends Iterator).

Using the iterator

In the SentinelDLL class, the toString method now uses the iterator. Notice how toString uses the iteration paradigm from before, with a while loop whose test includes the call iter.hasNext() and whose body includes the call iter.next().

The DLLIterator created in toString is independent of any other DLLIterator in existence. Where one DLLIterator's current is has no effect at all on where another DLLIterator's current is.

We can really see this independence in ListTestIterator.java. Here, our test driver creates a DLLIterator by the line

CS10ListIterator<String> iter = theList.listIterator(); The current instance variable of this DLLIterator is moved by next and previous and used by add. But when we call theList.toString, the DLLIterator created and used by toString does not affect the DLLIterator in main.

Similarly, the DLLIterators created and used in the call theList.addFirst(name) is independent of all others. Therefore adding to the front of a list does not change the current item in iter. (Why did I not use an iterator in addLast?)

I have also added a "clear" option that iterates through the list, removing all objects. (I could have used the clear method, but chose not to). I have added a "print reversed" option that runs through the list backwards, after advancing to the end.

The "nested print" option really shows the power of separate iterators. Here, we have two DLLIterators, outer and inner. For each list object traversed by outer, we perform a full traversal of the list with inner. This task would be impossible if we were limited only to the methods we had in our original linked list implementations.

A Problem with Multiple Iterators

Having multiple iterators on the same object can be very useful, as we just saw. As long as none of them modifies the list everything is fine. However, list modifications can be a problem. In particular, if one iterator removes an element that is the current element of another iterator things can get very messy. Even changing the list by using addFirst and addLast can change how things work, and calling clear is definitely a problem!

Multiple threads (streams of control) can really cause problems. Suppose that you are on the second to last item in the list, you call hasNext and true is returned, and then call next. Should be safe, right? Well, not if somebody else in another thread removed the last item between the two calls. (Maybe somebody clicked on a button or a Timer went off between the calls, and the method registered with the listener changed the list.)

Because of this, an interator should throw an exception if the list has been modified in any way except via the iterator's own operations.