Sep 27: Linked Lists

Linked Lists
Implementing Linked Lists
Circular, doubly linked lists with a sentinel
Singly Linked Lists

Code discussed in lecture

Linked Lists

Now we will examine a data structure known as a linked list. A linked list is in some ways like an array. In particular, it defines a linear ordering of elements. Unlike an array, we can insert an element into any position in the linear order in Θ(1) time, and we can also delete an element from any position in the linear order in Θ(1) time, with all other elements maintaining their relative positions for both insertion and deletion. With an array, we have to shift elements when inserting or deleting if we want them to maintain their positions relative to each other, and so the worst-case time to insert into or delete from an n-element array is Θ(n).

Another advantage of linked lists is that they grow and shrink dynamically. That is, the space occupied by an n-element linked list, in which each element occupies b bytes, is about bn. Although you could implement an ArrayList to grow and shrink dynamically, you cannot do so while maintaining Θ(1) time per insertion or deletion from the middle of the ArrayList.

So what's the downside of linked lists? They're a little more complicated to work with than arrays. They also take a bit more space than arrays (need space for the links). But the biggest disadvantage is that we cannot index into them in Θ(1) time. In other words, we can find the ith element of an array in Θ(1) time, but finding the ith element of a linked list takes Θ(i) time. For applications where we need to index quickly, linked lists are not a great choice.

In a linked list, the elements are arranged in a linear order that is determined by following a chain of links. For example, here is a picture of a linked list with three elements, each of which contains a reference to the name of a state:

As you can see, we think of the list as having a head and a tail element, and we can think of references head and tail to these elements.

The link part of each element tells us the next or previous element on the list. These will be implemented as references to entire elements.

Implementing Linked Lists

We will see a couple of ways to implement linked lists. Both ways will support a basic set of operations. When we have a set of operations to be supported but without a specific implementation for them, we make an interface. We will use the CS10LinkedList interface in CS10LinkedList.java.

Like the ArrayList we saw last time, this interface uses a generic type T.

The specifications of the methods of the CS10LinkedList interface assume that the linked list maintains a notion of a "current" element of the list. The interface has the following methods:

add inserts a new element after the current element, making the current element be this new element. If there is no current element, it inserts at the front of the list.
remove removes the current element, designating the successor of the removed element as the new current element.
contains determines whether an object is in the linked list. It returns a boolean that says whether the object was found. If the object was found, then the current element is set to be this object. (You can then insert after it, set it, or remove it.)
isEmpty returns a boolean indicating whether the linked list has no elements.
hasCurrent returns a boolean indicating whether the linked list has a current element.
hasNext returns true if there is a next item.
getFirst makes the current element be the first one in the linked list and returns it (printing an error message if there is no first item).
getLast makes the current element be the last one in the linked list and returns it (printing an error message if there is no last item).
addFirst inserts an element at the head of the linked list and makes it the current item.
addLast inserts an element at the tail of the linked list and makes it the current item.
clear removes all elements from the list.
next advances the notion of the current item to the next item following the current item and returns it (printing an error message if there is no current item).
get returns the data of the current item (printing an error message if there is no current item).
set sets the data field of the current item (printing an error message if there is no current item).

Circular, doubly linked lists with a sentinel

The simplest, cleanest way to implement a linked list is with a circular, doubly linked list with a sentinel. Quite a mouthful, indeed. The implementation is in SentinelDLL.java. The class is generic for a type T, declared with the line public class SentinelDLL<T> implements CS10LinkedList<T>

To start, each list element is an object of the class Element and has three instance variables:

data is a reference to the object being stored in that list element. This object must be of the type T. For the above example with state names, when we create a SentinelDLL object, T will be a String, so that data is a reference to a String.
next is a reference to the Element after this one in the list.
previous is a reference to the Element before this one in the list.

The Element class is a private inner class. It has the following methods:

A constructor that takes a reference to an object of type T. It stores this reference in the instance variable data.
toString returns the String representation of this element's data object.

Because each Element stores a reference to an object, strange things can happen if we store a reference to an object and then the object is changed. Therefore, we require that once a reference to an object is stored in an Element, the object itself should not change.

The class SentinelDLL implements the linked list. In fact, it implements the CS10LinkedList interface. The methods of SentinelDLL will need to access the data, next, and previous instance variables of each Element object. Because Element is a private inner class, the methods of SentinelDLL can access its instance variables, even though they are declared as private. No methods outside of SentinelDLL can access the instance variables of Element.

Next we examine the declaration for the class SentinelDLL. There are several methods, but first let's look at the instance variables.

current references the "current" list element, which we will need for several of the linked-list operations.
sentinel references a special list element, which we call the sentinel.

The scheme is that a linked list has exactly one sentinel, along with zero or more "real" elements. For example, the list above, with the names of three states, would contain four Element objects: the sentinel, and objects for Maine, Idaho, and Utah. The picture looks like the following:

Here, I omitted showing which Element object is pointed to by current. Despite how I had to draw the figure, each of these references points not to individual instance data, but rather to an entire Element object. The sentinel's data is a null reference.

Notice how the list is circular, in that you can start at the sentinel and follow either forward (next) or backward (previous) references and eventually get back to the sentinel.

In this scheme, every linked list, even an empty one, has a sentinel. In an empty list, both references in the sentinel point to the only Element available, namely the sentinel:

It may seem strange to have an "empty" list actually have an Element object in it, but it turns out to really simplify some of the code. You may appreciate this simplicity later on when we examine other ways to implement linked lists.

Methods

Having seen how we intend circular, doubly linked lists with a sentinel to be represented, now we examine the methods of the Element and SentinelDLL classes, which are in SentinelDLL.java. The methods for Element are straightforward, so we won't go over them here.

Making an empty list and making a list empty

The SentinelDLL constructor makes an empty list with only the sentinel, as the diagram above shows. It also sets the instance variable current to point to the only Element in town, namely the sentinel. Setting the next and previous fields of the sentinel and setting current are done by a call to clear, which makes any list empty (leaving any contents for garbage collection).

Converting to a `String`

The toString method for a SentinelDLL is fairly straightforward. It uses a common style of traversing a linked list by a clever for loop header: for (Element x = sentinel.next; x != sentinel; x = x.next) The for-loop iterates through the list, starting from the first non-sentinel on the list (sentinel.next), following next references, and stopping when it gets back to the sentinel. It concatenates the string representation of each item in the list onto a String named result, returning result at the end.

Inserting into a list

The add method for a list takes an object reference obj, and it inserts it after the Element object referenced by the instance variable current. Notice that we restrict obj to be of type T. The code manipulates references to "splice in" the new element. For example, if we start from an empty list, where

current =
sentinel

, and insert an element with the string Maine, we have the following situation:

The add method makes current reference the new Element object.

The splicing works the same when inserting into any position of the list. For example, starting from the 3-element list from before, we insert Ohio after Idaho as follows:

As you can easily see from the add code, it takes constant time, or Θ(1) time, to insert an element into a circular, doubly linked list with a sentinel.

Let's take a careful look at how add works. First, it makes a new Element that references the given object, and x references this new Element. It is this new Element that we will add to the list. We need to do four things:

Make x's next reference the element following the one that current references. The assignment x.next = current.next does so.
Make x's previous reference current. The assignment x.previous = current does so.
The element following the one referenced by current will have a new predecessor, namely the element that x references, so we need to set the previous instance variable of this element to reference x's element. The assignment current.next.previous = x does so. The expression current.next.previous can be a bit confusing, so let's examine it carefully. current references the current element. current.next references the element following the one that current references. This element has an instance variable previous that references its predecessor (which is current at the time that the add method is called, but it's about to be updated). Since we want to assign to the previous instance variable of the Element object referenced by current.next, we put current.next.previous on the left-hand side of the assignment statement.
The element referenced by current will have a new successor, namely the element that x references, so we set the next instance variable of current's element to reference x's element. The assignment current.next = x does so.

As you can easily see from the add code, it takes constant time, or Θ(1) time, to insert an element into a circular, doubly linked list with a sentinel. You can also see, by the absence of if-statements, that there are no special cases.

Removing from a linked list

The remove method for a list removes the Element object that current references. You never ever remove the sentinel, so the first thing we do is check whether current references the sentinel. If it does, then we print an error message to System.err, rather than to System.out. On some systems, you can suppress regular output printed to System.out, but you have to go to extra lengths to suppress error messages printed to System.err. In Eclipse, when you print to System.err, the message appears in red in the console. We want to make error messages likely to be seen.

Normally, the remove method is not trying to remove the sentinel. We splice the current element out of the list and make current reference its successor in the list.

For example, to remove Idaho from the previous list:

and to remove the only element from a list:

The time to remove an element is constant, or Θ(1). As we will see when we examine "simpler" lists, this running time is quite good; with simpler lists, removal takes O(n) time, and in the worst case, it's in fact Θ(n).

Finding a particular list element

The contains method for a SentinelDLL takes a reference obj to an object of the generic type T and looks for an element that equals obj, according to the equals method on the data field of each Element. We traverse the list, calling the equals on each element's data, until a match is found. If the contains method finds such an element, it sets current to reference it, so that we can next either add a new element after it or remove it.

We could check to make sure that we haven't returned to the sentinel, along with checking whether we have a match, but we've already seen a cool way to use sentinels for searching: put the value you're looking for in the sentinel. That way, you're guaranteed of finding it. If where you found it was the sentinel, it wasn't there in the first place. If where you found it was not the sentinel, then it really was there. We set sentinel.data to be the same reference as obj before traversing the list, and we make sure to put a null back into sentinel.next after the traversal is done.

When we try to pull the sentinel trick when linearly searching an array, we are in trouble if the array does not have room for the sentinel. In a circular, doubly linked list with a sentinel, however, we are assured that because we have already created a sentinel object, there's a place to put the object we're looking for.

When we use the sentinel trick, the for-loop is simply

for (x = sentinel.next; !x.data.equals(obj); x = x.next) ;

As you can see, this process is really linear search. The time to perform it depends on the time to compare two elements. If we denote this comparison time by t, and we say that the list has n elements, then the time to find a list element is O(tn), or because t is a constant that can be ignored, O(n). In the worst case (which is when it is not really there so that it was found in the sentinel), it's Θ(n).

Easy list functions

The remaining list functions are really easy. Note that the later functions use the isEmpty, hasCurrent, and hasNext predicates rather than just doing the tests directly. This makes changing the representation easier. You will appreciate this when doing your homework!

isEmpty returns true if and only if the only list element is the sentinel. That is the case precisely when the sentinel references itself.
hasCurrent returns true if and only if the there is a current item. That is the case precisely when current does not reference the sentinel.
hasNext returns true if there is both a current item and another item after the current item.
getFirst sets the current reference to the first item in the list and returns it. Error if the list is empty.
getLast sets the current reference to the last element on the list and returns it, so that anything added into the list goes after the last element, thus becoming the new last element. Error if the list is empty.
addFirst adds a new element at the head of the list, updating current.
addLast adds a new element at the end of the list, updating current.
next moves current to current.next and returns that next element. Error if there is no current item.
get returns the current item. Error if the there is no current item.
set assigns to the current item. Error if there is no current item.

Testing the `SentinelDLL` class

We can use the ListTest.java program to test the SentinelDLL class. You can use the debugger to examine the linked list if you like.

Notice that to declare and create the linked list, we specify the type that will be stored in the list. Here, it's a String:

CS10LinkedList<String> theList = new SentinelDLL<String>();

Singly linked lists

Although doubly linked circular linked lists with sentinels are the easiest linked lists to implement, they can take a lot of space. There are two references (next and previous) in each element, plus the sentinel node. There are applications where there are huge numbers of very short linked lists. (One is hashing, which you can learn about in later computer science courses.) In such situations, the extra reference in each node and the extra node for the sentinel can take substantial space.

The code for singly linked lists has more special cases than the code for circular, doubly linked lists with a sentinel, and the time to remove an element in a singly linked list is Θ(n) in the worst case rather than the Θ(1) time it takes in a circular, doubly linked list with a sentinel.

The SLL class in SLL.java implements the CS10LinkedList interface with a generic type T, just as the SentinelDLL class does.

A singly linked list, as implemented in the SLL class, has two structural differences from a circular, doubly linked list with a sentinel:

Each Element object in a singly linked list has no backward (previous) reference; the only navigational aid is a forward (next) reference.
There is no sentinel, nor does the list have a circular structure. Instead, the SLL class maintains references head to the first element on the list and tail to the last element on the list.

A singly linked list with Maine, Idaho, and Utah would look like

(Recall that the slash is how we indicate a NULL reference.) A singly linked list with only one element would look like

And an empty singly linked list looks like

The file SLL.java contains the class definitions for Element and SLL for a singly linked list. These declarations are similar to those for circular, doubly linked lists with a sentinel. As before, Element class is a private inner class, and all method declarations are the same. The only difference is in the instance data. We can use the same ListTest.java driver to test the singly linked list class, as long as we change the line creating the list to read

CSLinkedList<String> theList = new SLL<String>();

Let's examine the List methods in SLL.java for singly linked lists. We will highlight those that differ from those for circular, doubly linked lists with a sentinel.

Making an empty list

The clear method, which is called by the SLL constructor as well as being publicly available, makes an empty list by setting all instance variables (head, tail, and current) to null.

Adding an element into a list

As before, the add method places a new Element object after the one that current references. Without a special case, however, there would be no way to add an element as the new head of the list, since there is no sentinel to put a new element after. Therefore, if current is null, then we add the new element as the new list head.

The code, therefore, has two cases, depending on whether current is null. If it is, we have to make the new element reference what head was referencing and then make head reference the new element. Otherwise, we make the new element reference what the current element is referencing and then make current reference the new element.

If the new element is added after the last element on the list, we also have to update tail to reference the new element.

Compare this code to the add code for a circular, doubly linked list with a sentinel. Although there is only one directional link to maintain for a singly linked list, the code has more cases and is more complex. For either implementation, however, adding an element takes only Θ(1) time.

Removing from a linked list

As mentioned, removing an element from a singly linked list takes Θ(n) time in the worst case, which is worse than the Θ(1) time required for a circular, doubly linked list with a sentinel. Why does it take linear time, rather than constant time? The reason is that the previous reference in a doubly linked list really helps. In order to splice out the current element, we need to know its predecessor in the list, because we have to set the next instance variable of the predecessor to the value of current.next. With the previous reference, we can easily find the predecessor in constant time. With only next references available, the only way we have to determine an element's predecessor is to traverse the list from the beginning until we find an element whose next value references the element we want to splice out. And that traversal takes linear time in the worst case, which is when the element to be removed is at or near the end of the list.

The remove method first checks that current, which references the Element object to be removed, is non-null. If current is null, we print an error message and return. Normally, current is non-null, and the remove method finds the predecessor pred of the element that current references. Even this search for the predecessor has two cases, depending on whether the element to be removed is the first one in the list. If we are removing the first element, then we set pred to null and update head. Otherwise, we have to perform a linear search, stopping when pred.next references the same element as current; once this happens, we know that pred is indeed the predecessor of the current element. (There is also some "defensive coding," just in case we simply do not find an element pred such that pred.next references the same element as current. We do not expect this to ever happen, but if it does, we have found a grave error and so we print an error message and return.) Assuming that we find a correct predecessor, we splice out the current element. We also have to update tail if we are removing the last element of the list.

The bottom line is that, compared to the remove code for a circular, doubly linked list with a sentinel, the remove code for a singly linked list is more complex, has more possibilities for error, and can take longer.

toString for a list

The toString for a singly linked list is similar to how we print a circular, doubly linked list with a sentinel, except that now we start from head rather than sentinel.next and that the termination condition is not whether we come back to the sentinel but rather whether the reference we have is null. The for-loop header, therefore, is for (x = head; x != null; x = x.next)

Finding a particular list element

The contains method for a singly linked list is perhaps a little shorter than for a circular, doubly linked list with a sentinel, because now we do not replace the object reference in the sentinel. The for-loop header, therefore, becomes a little more complicated. We have to check whether we have run off the end of the list (which we did not have to do when we stored a reference to the object being searched for in the sentinel) and then, once we know we have not run off the end, whether the element we are looking at equals the object we want. The for-loop is for (x = head; x != null && !x.data.equals(obj); x = x.next) ;

Although the code surrounding the for-loop simplifies with a singly linked list, the loop itself is cleaner for the circular, doubly linked list with a sentinel. Either way, it takes Θ(n) time in the worst case.

Easy list methods

isEmpty is easy, but slightly different from the version for a circular, doubly linked list with a sentinel. We simply return whether head is null.
hasCurrent returns true if and only if the there is a current item. We simply return whether current is null.
hasNext checks to see whether there is a current item and whether the next field of the current item is null rather than seeing if it is the sentinel.
getFirst is different, as it sets current to head.
getLast changes, too, setting current to tail.
addFirst and addLast are similar to a circular, doubly linked list with a sentinel. However, addLast has to deal with an empty list separately/.
get is unchanged.
next is identical to the version in the doubly linked list. (This is an advantage of calling hasNext rather than doing the test directly in this method.)

Other options

It is also possible to have a dummy list head, even if the list is not circular. If we do so, we can eliminate some special cases, because adding at the head becomes more similar to adding anywhere else. (Instead of changing the head you update a next field.) It is also possible to have current reference the item before the item that it actually indicates, so that removal can be done in Θ(1) time. It takes a while to get used to having current reference the element before the one that is actually "current."

It is also possible to have a circular singly linked list, either with or without a sentinel.

Sep 27: Linked Lists

Code discussed in lecture

Linked Lists

Implementing Linked Lists

Circular, doubly linked lists with a sentinel

Methods

Making an empty list and making a list empty

Converting to a String

Inserting into a list

Removing from a linked list

Finding a particular list element

Easy list functions

Testing the SentinelDLL class

Singly linked lists

Making an empty list

Adding an element into a list

Removing from a linked list

toString for a list

Finding a particular list element

Easy list methods

Other options

Converting to a `String`

Testing the `SentinelDLL` class