Sep 27: Linked Lists
- Linked Lists
- Implementing Linked Lists
- Circular, doubly linked lists with a sentinel
- Singly Linked Lists
Code discussed in lecture
Linked Lists
Now we will examine a data structure known as a linked list. A linked list is in some ways like an array. In particular, it defines a linear ordering of elements. Unlike an array, we can insert an element into any position in the linear order in Θ(1) time, and we can also delete an element from any position in the linear order in Θ(1) time, with all other elements maintaining their relative positions for both insertion and deletion. With an array, we have to shift elements when inserting or deleting if we want them to maintain their positions relative to each other, and so the worst-case time to insert into or delete from an n-element array is Θ(n).
Another advantage of linked lists is that they grow and shrink
dynamically. That is, the space occupied by an n-element
linked list, in which each element occupies b bytes, is
about bn. Although you could implement an
ArrayList
to grow and shrink dynamically, you cannot do
so while maintaining Θ(1) time per insertion or deletion from
the middle of the ArrayList
.
So what's the downside of linked lists? They're a little more complicated to work with than arrays. They also take a bit more space than arrays (need space for the links). But the biggest disadvantage is that we cannot index into them in Θ(1) time. In other words, we can find the ith element of an array in Θ(1) time, but finding the ith element of a linked list takes Θ(i) time. For applications where we need to index quickly, linked lists are not a great choice.
In a linked list, the elements are arranged in a linear order that is determined by following a chain of links. For example, here is a picture of a linked list with three elements, each of which contains a reference to the name of a state:
As you can see, we think of the list as having a head
and a tail element, and we can think of references
head
and tail
to these elements.
The link part of each element tells us the next or previous element on the list. These will be implemented as references to entire elements.
Implementing Linked Lists
We will see a couple of ways to implement linked lists. Both ways will support a basic set of operations. When we have a set of operations to be supported but without a specific implementation for them, we make an interface. We will use theCS10LinkedList
interface in CS10LinkedList.java.
Like the ArrayList
we saw last time, this interface uses a generic type
T
.
The specifications of the methods of the CS10LinkedList
interface assume that the linked list maintains a notion of a
"current" element of the list. The interface has the following
methods:
-
add
inserts a new element after the current element, making the current element be this new element. If there is no current element, it inserts at the front of the list. -
remove
removes the current element, designating the successor of the removed element as the new current element. -
contains
determines whether an object is in the linked list. It returns a boolean that says whether the object was found. If the object was found, then the current element is set to be this object. (You can then insert after it, set it, or remove it.) -
isEmpty
returns a boolean indicating whether the linked list has no elements. -
hasCurrent
returns a boolean indicating whether the linked list has a current element. -
hasNext
returns true if there is a next item. -
getFirst
makes the current element be the first one in the linked list and returns it (printing an error message if there is no first item). -
getLast
makes the current element be the last one in the linked list and returns it (printing an error message if there is no last item). -
addFirst
inserts an element at the head of the linked list and makes it the current item. -
addLast
inserts an element at the tail of the linked list and makes it the current item. -
clear
removes all elements from the list. -
next
advances the notion of the current item to the next item following the current item and returns it (printing an error message if there is no current item). -
get
returns the data of the current item (printing an error message if there is no current item). -
set
sets the data field of the current item (printing an error message if there is no current item).
Circular, doubly linked lists with a sentinel
The simplest, cleanest way to implement a linked list is with a circular, doubly linked list with a sentinel. Quite a mouthful, indeed. The implementation is in SentinelDLL.java. The class is generic for a typeT
, declared with the line
To start, each list element is an object of the class
Element
and has three instance variables:
-
data
is a reference to the object being stored in that list element. This object must be of the typeT
. For the above example with state names, when we create aSentinelDLL
object,T
will be aString
, so thatdata
is a reference to aString
. -
next
is a reference to theElement
after this one in the list. -
previous
is a reference to theElement
before this one in the list.
The Element
class is a private inner class. It has the
following methods:
- A constructor that takes a reference to an object of type
T
. It stores this reference in the instance variabledata
. -
toString
returns theString
representation of this element's data object.
Element
stores a reference to an object,
strange things can happen if we store a reference to an object and
then the object is changed. Therefore, we require that once a
reference to an object is stored in an Element
, the
object itself should not change.
The class SentinelDLL
implements the linked list. In
fact, it implements the CS10LinkedList
interface. The
methods of SentinelDLL
will need to access the
data
, next
, and previous
instance variables of each Element
object. Because
Element
is a private inner class, the methods of
SentinelDLL
can access its instance variables, even
though they are declared as private. No methods outside of
SentinelDLL
can access the instance variables of
Element
.
Next we examine the declaration for the class SentinelDLL
.
There are several methods, but first let's look at the instance
variables.
-
current
references the "current" list element, which we will need for several of the linked-list operations. -
sentinel
references a special list element, which we call the sentinel.
The scheme is that a linked list has exactly one sentinel, along with
zero or more "real" elements. For example, the list above, with the
names of three states, would contain four Element
objects: the sentinel, and objects for Maine
,
Idaho
, and Utah
. The picture looks like the
following:
Here, I omitted showing which Element
object is pointed
to by current
. Despite how I had to draw the figure,
each of these references points not to individual instance
data, but rather to an entire Element
object.
The sentinel's data
is a null
reference.
Notice how the list is circular, in that you can start at the sentinel
and follow either forward (next
) or backward
(previous
) references and eventually get back to the
sentinel.
In this scheme, every linked list, even an empty one, has a
sentinel. In an empty list, both references in the sentinel point to
the only Element
available, namely the sentinel:
It may seem strange to have an "empty" list actually have an
Element
object in it, but it turns out to really simplify
some of the code. You may appreciate this simplicity later on when we
examine other ways to implement linked lists.
Methods
Having seen how we intend circular, doubly linked lists with a sentinel to be represented, now we examine the methods of theElement
and SentinelDLL
classes, which are in SentinelDLL.java. The methods for
Element
are straightforward, so we won't go over them
here.
Making an empty list and making a list empty
TheSentinelDLL
constructor makes an empty list with only
the sentinel, as the diagram above shows. It also sets the instance
variable current
to point to the only
Element
in town, namely the sentinel. Setting the
next
and previous
fields of the sentinel and
setting current
are done by a call to clear
,
which makes any list empty (leaving any contents for garbage
collection).
Converting to a String
The toString
method for a SentinelDLL
is
fairly straightforward. It uses a common style of traversing a
linked list by a clever for loop header:
sentinel.next
), following
next
references, and stopping when it gets back to the
sentinel. It concatenates the string representation of each item in
the list onto a String
named result
,
returning result
at the end.
Inserting into a list
Theadd
method for a list takes an object reference
obj
, and it inserts it after the Element
object referenced by the instance variable current
.
Notice that we restrict obj
to be of type T
.
The code manipulates references to "splice in" the new element. For
example, if we start from an empty list, where current =
sentinel
, and insert an element with the string
Maine
, we have the following situation:
The add
method makes current
reference
the new Element
object.
The splicing works the same when inserting into any position of the
list. For example, starting from the 3-element list from before, we
insert Ohio
after Idaho
as follows:
As you can easily see from the add
code, it takes
constant time, or Θ(1) time, to insert an element into a
circular, doubly linked list with a sentinel.
Let's take a careful look at how add
works. First, it
makes a new Element
that references the given object, and
x
references this new Element
. It is this
new Element
that we will add to the list. We need to do
four things:
- Make
x
'snext
reference the element following the one thatcurrent
references. The assignmentx.next = current.next
does so. - Make
x
'sprevious
referencecurrent
. The assignmentx.previous = current
does so. - The element following the one referenced by
current
will have a new predecessor, namely the element thatx
references, so we need to set theprevious
instance variable of this element to referencex
's element. The assignmentcurrent.next.previous = x
does so. The expressioncurrent.next.previous
can be a bit confusing, so let's examine it carefully.current
references the current element.current.next
references the element following the one thatcurrent
references. This element has an instance variableprevious
that references its predecessor (which iscurrent
at the time that theadd
method is called, but it's about to be updated). Since we want to assign to theprevious
instance variable of theElement
object referenced bycurrent.next
, we putcurrent.next.previous
on the left-hand side of the assignment statement. - The element referenced by
current
will have a new successor, namely the element thatx
references, so we set thenext
instance variable ofcurrent
's element to referencex
's element. The assignmentcurrent.next = x
does so.
As you can easily see from the add
code, it takes
constant time, or Θ(1) time, to insert an element into a
circular, doubly linked list with a sentinel. You can also see, by
the absence of if-statements, that there are no special cases.
Removing from a linked list
Theremove
method for a list removes the
Element
object that current
references. You
never ever remove the sentinel, so the first thing we
do is check whether current
references the sentinel. If
it does, then we print an error message to System.err
,
rather than to System.out
. On some systems, you can
suppress regular output printed to System.out
, but you
have to go to extra lengths to suppress error messages printed to
System.err
. In Eclipse, when you print to
System.err
, the message appears in red in the console.
We want to make error messages likely to be seen.
Normally, the remove
method is not trying to remove the
sentinel. We splice the current element out of the list and make
current
reference its successor in the list.
For example, to remove Idaho
from the previous list:
and to remove the only element from a list:
The time to remove an element is constant, or Θ(1). As we will see when we examine "simpler" lists, this running time is quite good; with simpler lists, removal takes O(n) time, and in the worst case, it's in fact Θ(n).
Finding a particular list element
Thecontains
method for a SentinelDLL
takes
a reference obj
to an object of the generic type
T
and looks for an element that equals obj
,
according to the equals
method on the data
field of each Element
. We traverse the list, calling the
equals
on each element's data, until a match is found. If the
contains
method finds such an element, it sets
current
to reference it, so that we can next either
add a new element after it or remove it.
We could check to make sure that we haven't returned to the sentinel,
along with checking whether we have a match, but we've already seen a
cool way to use sentinels for searching: put the value you're looking
for in the sentinel. That way, you're guaranteed of finding it. If
where you found it was the sentinel, it wasn't there in the first
place. If where you found it was not the sentinel, then it really was
there. We set sentinel.data
to be the same reference as
obj
before traversing the list, and we make sure to put a
null
back into sentinel.next
after the
traversal is done.
When we try to pull the sentinel trick when linearly searching an array, we are in trouble if the array does not have room for the sentinel. In a circular, doubly linked list with a sentinel, however, we are assured that because we have already created a sentinel object, there's a place to put the object we're looking for.
When we use the sentinel trick, the for-loop is simply
As you can see, this process is really linear search. The time to perform it depends on the time to compare two elements. If we denote this comparison time by t, and we say that the list has n elements, then the time to find a list element is O(tn), or because t is a constant that can be ignored, O(n). In the worst case (which is when it is not really there so that it was found in the sentinel), it's Θ(n).
Easy list functions
The remaining list functions are really easy. Note that the later functions use theisEmpty
, hasCurrent
, and hasNext
predicates rather than just doing the tests directly. This makes changing the
representation easier. You will appreciate this when doing your homework!
-
isEmpty
returnstrue
if and only if the only list element is the sentinel. That is the case precisely when the sentinel references itself. -
hasCurrent
returnstrue
if and only if the there is a current item. That is the case precisely whencurrent
does not reference the sentinel. -
hasNext
returns true if there is both a current item and another item after the current item. -
getFirst
sets thecurrent
reference to the first item in the list and returns it. Error if the list is empty. -
getLast
sets thecurrent
reference to the last element on the list and returns it, so that anything added into the list goes after the last element, thus becoming the new last element. Error if the list is empty. -
addFirst
adds a new element at the head of the list, updatingcurrent
. -
addLast
adds a new element at the end of the list, updatingcurrent
. -
next
movescurrent
tocurrent.next
and returns that next element. Error if there is no current item. -
get
returns the current item. Error if the there is no current item. -
set
assigns to the current item. Error if there is no current item.
Testing the SentinelDLL
class
We can use the ListTest.java program to
test the SentinelDLL
class. You can use the debugger to
examine the linked list if you like.
Notice that to declare and create the linked list, we specify the type
that will be stored in the list. Here, it's a String
:
Singly linked lists
Although doubly linked circular linked lists with sentinels are the easiest linked lists to implement, they can take a lot of space. There are two references (next
and previous
)
in each element, plus the sentinel node. There are applications where
there are huge numbers of very short linked lists. (One is hashing,
which you can learn about in later computer science courses.) In such
situations, the extra reference in each node and the extra node for
the sentinel can take substantial space.
The code for singly linked lists has more special cases than the code for circular, doubly linked lists with a sentinel, and the time to remove an element in a singly linked list is Θ(n) in the worst case rather than the Θ(1) time it takes in a circular, doubly linked list with a sentinel.
The SLL
class in SLL.java
implements the CS10LinkedList
interface with a generic
type T
, just as the SentinelDLL
class does.
A singly linked list, as implemented in the
SLL
class, has two structural differences from a
circular, doubly linked list with a sentinel:
- Each
Element
object in a singly linked list has no backward (previous
) reference; the only navigational aid is a forward (next
) reference. - There is no sentinel, nor does the list have a circular structure.
Instead, the
SLL
class maintains referenceshead
to the first element on the list andtail
to the last element on the list.
A singly linked list with Maine
, Idaho
, and
Utah
would look like
(Recall that the slash is how we indicate a NULL
reference.) A singly linked list with only one element would look like
And an empty singly linked list looks like
The file SLL.java contains the class
definitions for Element
and SLL
for a singly
linked list. These declarations are similar to those for circular,
doubly linked lists with a sentinel. As before, Element
class is a private inner class, and all method declarations are the
same. The only difference is in the instance data. We can use the
same ListTest.java driver to test the
singly linked list class, as long as we change the line creating the
list to read
The file SLL.java contains the class
definitions for Element
and SLL
for a singly
linked list. These declarations are similar to those for circular,
doubly linked lists with a sentinel. As before, Element
class is a private inner class, and all method declarations are the
same. The only difference is in the instance data. We can use the
same ListTest.java driver to test the
singly linked list class, as long as we change the line creating the
list to read
Let's examine the List
methods in SLL.java for singly linked lists. We will
highlight those that differ from those for circular, doubly linked
lists with a sentinel.
Making an empty list
Theclear
method, which is called by the
SLL
constructor as well as being publicly available,
makes an empty list by setting all
instance variables (head
, tail
, and
current
) to null
.
Adding an element into a list
As before, theadd
method places a new
Element
object after the one that current
references. Without a special case, however, there would be no way to
add an element as the new head of the list, since there is no
sentinel to put a new element after. Therefore, if
current
is null
, then we add the new
element as the new list head.
The code, therefore, has two cases, depending on whether
current
is null
. If it is, we have to make
the new element reference what head
was referencing and
then make head
reference the new element. Otherwise, we
make the new element reference what the current element is referencing
and then make current
reference the new element.
If the new element is added after the last element on the list, we
also have to update tail
to reference the new element.
Compare this code to the add code for a circular, doubly linked list with a sentinel. Although there is only one directional link to maintain for a singly linked list, the code has more cases and is more complex. For either implementation, however, adding an element takes only Θ(1) time.
Removing from a linked list
As mentioned, removing an element from a singly linked list takes Θ(n) time in the worst case, which is worse than the Θ(1) time required for a circular, doubly linked list with a sentinel. Why does it take linear time, rather than constant time? The reason is that theprevious
reference in a doubly
linked list really helps. In order to splice out the current element,
we need to know its predecessor in the list, because we have to set
the next
instance variable of the predecessor to the
value of current.next
. With the previous
reference, we can easily find the predecessor in constant time. With
only next
references available, the only way we have to
determine an element's predecessor is to traverse the list from
the beginning until we find an element whose next
value references the element we want to splice out. And that
traversal takes linear time in the worst case, which is when the
element to be removed is at or near the end of the list.
The remove
method first checks that current
,
which references the Element
object to be removed, is
non-null
. If current
is null
,
we print an error message and return. Normally, current
is non-null
, and the remove
method finds the
predecessor pred
of the element that current
references. Even this search for the predecessor has two cases,
depending on whether the element to be removed is the first one in the
list. If we are removing the first element, then we set
pred
to null
and update head
.
Otherwise, we have to perform a linear search, stopping when
pred.next
references the same element as
current
; once this happens, we know that
pred
is indeed the predecessor of the current element.
(There is also some "defensive coding," just in case we simply do not
find an element pred
such that pred.next
references the same element as current
. We do not expect
this to ever happen, but if it does, we have found a grave error and
so we print an error message and return.) Assuming that we find a
correct predecessor, we splice out the current element. We also have
to update tail
if we are removing the last element of the
list.
The bottom line is that, compared to the remove code for a circular, doubly linked list with a sentinel, the remove code for a singly linked list is more complex, has more possibilities for error, and can take longer.
toString for a list
The toString for a singly linked list is similar to how we print a circular, doubly linked list with a sentinel, except that now we start fromhead
rather than sentinel.next
and that
the termination condition is not whether we come back to the sentinel
but rather whether the reference we have is null
. The
for-loop header, therefore, is
Finding a particular list element
Thecontains
method for a singly linked list is perhaps a
little shorter than for a circular, doubly linked list with a
sentinel, because now we do not replace the object reference in the
sentinel. The for-loop header, therefore, becomes a little more
complicated. We have to check whether we have run off the end of the
list (which we did not have to do when we stored a reference to the
object being searched for in the sentinel) and then, once we know we
have not run off the end, whether the element we are looking at equals
the object we want. The for-loop is
Although the code surrounding the for-loop simplifies with a singly linked list, the loop itself is cleaner for the circular, doubly linked list with a sentinel. Either way, it takes Θ(n) time in the worst case.
Easy list methods
-
isEmpty
is easy, but slightly different from the version for a circular, doubly linked list with a sentinel. We simply return whetherhead
isnull
. -
hasCurrent
returnstrue
if and only if the there is a current item. We simply return whethercurrent
isnull
. -
hasNext
checks to see whether there is a current item and whether the next field of the current item is null rather than seeing if it is the sentinel. -
getFirst
is different, as it setscurrent
tohead
. -
getLast
changes, too, settingcurrent
totail
. -
addFirst
andaddLast
are similar to a circular, doubly linked list with a sentinel. However,addLast
has to deal with an empty list separately/. -
get
is unchanged. -
next
is identical to the version in the doubly linked list. (This is an advantage of callinghasNext
rather than doing the test directly in this method.)
Other options
It is also possible to have a dummy list head, even if the list is not circular. If we do so, we can eliminate some special cases, because adding at the head becomes more similar to adding anywhere else. (Instead of changing thehead
you update a next
field.)
It is also possible to have current
reference the item
before the item that it actually indicates, so that removal
can be done in Θ(1) time. It takes a while to get used to
having current
reference the element before the one that
is actually "current."
It is also possible to have a circular singly linked list, either with or without a sentinel.