Oct 21: Maps, Sets, and Instant Runoff Elections
Code discussed in lecture
- SetDemo.java
- PetSounds.java
- Election.java
- VoteTally.java
- InstantRunoffOO.java
- InstantRunoffProc.java
Collections
Java has a group of interfaces for holding collections of objects and classes that implement them. We have seen and used List, which is an interface with two Java-provided implementations:ArrayList
and LinkedList
.
Today will look at
two other interfaces for holding collections of objects: Set
and Map
. Each has two Java-provided implementations. Set
is implemented by HashSet
and TreeSet
.
Map is implemented by HashMap
and TreeMap
.
We will be looking at their underlying data structures, hash tables and
binary search trees, in the next
few lectures.
The List
Interface
As a review, a number of important methods in the List<E>
interface are:
-
boolean add(E o)
Adds the specified element to the end of the list. Returns true if succeeds. -
void add(int index, E o)
Inserts the specified element at position index of this list. -
void clear()
Removes all of the elements from this list. -
boolean contains(Object o)
Returns true if this list contains the specified element. -
E get(int index)
Returns the element at position index. -
boolean isEmpty()
Returns true if this list contains no elements. -
int indexOf(Object o)
Returns the index of the first occurrence of the specified element in the list, or -1 if the list does not contain the element. -
Iterator<E> iterator()
Returns an iterator that goes through the elements in this list in the order that they appear in the list. -
ListIterator<E> listIterator()
Returns a list iterator that goes through the elements in this list in the order that they appear in the list. -
E remove(int index)
Removes the element at the specified position in this list and returns it. -
boolean remove(Object o)
Removes the first occurrence of the specified element from this list if it is present. (True if succeeds). -
E set(int index, E element)
Replaces the element at the specified position with the given element. Returns the old element. -
int size()
Returns the number of elements in this list.
ArrayList
and LinkedList
implement this set of operations,
why have both? Efficiencies differ. Access operations (set
and get
)
are constant time in an ArrayList
, but require time proportional to the
distance to the nearest end for a LinkedList
. (The LinkedList
is a doubly-linked circular linked list, and is smart enough to start at the closest end.)
On the other hand, modification operations (add
and remove
at a
given index) require time proportional to the number of items after the index in an
ArrayList
, because all of these items must be copied. But for a
LinkedList
they are constant after the time to access the index (distance
from nearest end). This means that all Iterator
or ListIterator
operations are constant time for a LinkedList
, but add
and remove
operations take time proportional to the number of remaining items
for an ArrayList
.
The Set
Interface
A Set
differs from a List
in that a List
has a linear order, while a set does not. Furthermore, the an item can
appear multiple times in a List
but only once in a Set
.
The primary operations on a Set<E>
are:
-
boolean add(E o)
Adds the specified element to this set if it is not already present. Returns true if o was not in the set. -
void clear()
Removes all of the elements from this set. -
boolean contains(Object o)
Returns true if this set contains the specified element. -
boolean isEmpty()
Returns true if this set contains no elements. -
Iterator<E> iterator()
Returns an iterator over the elements in this set. -
boolean remove(Object o)
Removes the specified element from this set if it is present. -
int size()
Returns the number of elements in this set (its cardinality).
List
interface.
So why have a separate interface?
The main reason is implementation efficiency. The contains
operation on
either a ArrayList
or a LinkedList
takes
O(n) time, and for an ArrayList
the remove
operation can take O(n) time. For applications like a
dictionary for a spell checker this is too slow.
There are two implementations of Set
in the Java class library.
Both are more efficient for the contains
operation than the List
implementations.
The first is TreeSet
,
which uses a data structure called a balanced binary tree to store the data.
Think of it as a linked list on which you can do binary search. We will talk
about this data structure soon. The important point is that the add, remove, and contains
operations are all O(log n) time. It only works on Comparable
objects.
The iterator is guaranteed
to return the items in increasing order by compareTo
, and takes
O(n) time to iterate through the entire set. Getting the first
item from the iterator takes O(log n) time.
The second is HashSet
, which uses a data structure called a
hash table. We will talk about these next class.
If the hash table is used properly the
add
, remove
, and contains
operations are all O(1) time on average (although it is possible that
they could take O(n) time if you were extremely unlucky).
The iterator returns the items in a somewhat arbitrary order.
As an example of the use of sets, consider the program
SetDemo.java. It creates a set consisting of
all of the keywords in Java. It then uses an iterator to go through the
set and print each of the words. (Note that an iterator on a Set
is identical to an iterator on a List
.) Finally, it lets the user
type words and
determines if they are keywords by using contains
to see if
they are in the set.
The Map
Interface
The Map
interface describes a data structure that can be thought
of as a set where each element has associated data. Each data item is associated
with a key. By looking up the key one can get the associated data. A key
is typically something like your student ID number, and the associated data
might be your student record. Maps are implemented using balance binary
trees and hash tables, just like sets.
The primary operations in a Map<K,V>
are:
-
void clear()
Removes all mappings from this map. -
boolean containsKey(Object key)
Returns true if this map contains a mapping for the specified key. -
boolean containsValue(Object value)
Returns true if this map maps one or more keys to the specified value. -
V get(Object key)
Returns the value to which this map maps the specified key. -
boolean isEmpty()
Returns true if this map contains no key-value mappings. -
Set<K> keySet()
Returns a set view of the keys contained in this map. -
V put(K key, V value)
Associates the specified value with the specified key in this map. Returns the previous value associated with key, or null if key was not in the map. -
V remove(Object key)
Removes the mapping for this key from this map if it is present. Returns the value associated with key (or null if key is not in the map). -
int size()
Returns the number of key-value mappings in this map.
Note the way the the print operation works. The code for this is:
if (animalMap.isEmpty())
System.out.println("The map is empty");
else
{
System.out.println("The animals and their sounds are:");
Set<String> animalNames = animalMap.keySet();
Iterator<String> iter = animalNames.iterator();
while(iter.hasNext())
{
animal = iter.next();
System.out.println(animal + " says " + animalMap.get(animal));
}
}
Note that the first step is to call keySet
to get all of the
keys in the map. Then an iterator is created for the set, and used to
interate through the set, printing each key and the value returned by
get
for that key.
A More Complex Example - Voting
The method of voting where the candidate with the most votes wins the election has some drawbacks. If two conservatives get in a race against a liberal in a conservative district they may split the conservative vote and the liberal gets elected, even though he is the third choice of the majority of the voters in the election. Also, third parties have a hard time getting established, because voting for a third-party candidate can be "throwing away your vote." If about a third of the 22,000 New Hampshire voters who voted for Nader in the Bush-Gore election had voted for Gore instead he would have won the state and the presidency. Florida would not have mattered.
Some states solve these problems by having a runoff election between the top two candidates if nobody gets a majority of the votes. Unfortunately a runoff election costs time and money. A popular alternative suggestion is the instant runoff election.
In an instant runoff election the voters fill out a ballot with an ordered list of candidates, from most favorable to least favorable. The election takes place in rounds. In the first round each ballot awards a vote to the first candidate on the ballot. If nobody has a majority the candidate with the fewest votes is dropped from the election. (In case of ties we will chose one at random.) Then another round is run. This time the ballot's vote is awarded to the first candidate in its list who has not been eliminated. The bottom candidate is dropped, and the process repeats until someone has a majority. (In fact, it can repeat until there is just one candidate left and get the same result. Once someone has a majority they will never be eliminated.)
How could we write a program to determine the winner of an instant runoff election? The first step is to determine what objects appear in the problem and how they interact with one another. One obvious choice is a ballot. We could say, "Oh, that is just a list" and not create an object for it. But let's take an OO approach and say that there should be a Ballot class.
Another object would be the set of all the ballots in the election. We could just say, "Make a set of lists", but let's make Election a class, also.
A final object that may be less obvious is an object that represents the results of the voting. Let's create a VoteTally class. The alternative is to use a map from candidate names to the number of votes that they received.
We could have a class to keep track of the current set of candidates, but the Set
class seems to do everything that we are likely to need. Unless we discover an action that
we need to do that the Set
class doesn't handle we will just use a Set
.
What actions do we need to perform? We first need to get our set of candidates. Note that we can limit this set to candidates who get at least one first-round vote. Others will have zero votes and will be dropped before any of the candidates who got first-round votes. It sounds like the Election is the class that has access to the data to perform this, with help from the Ballot to get the first item on each ballot.
Next we have to run a round of the election. This requires going through all of the ballots, determining who the vote should go to, and increasing that candidate's tally by 1. The Ballot class has the data to determine who should get the vote. The Election class has the ballots. The VoteTally class should update itself by adding a vote for the candidate.
After running a round we have to find the candidate with the fewest votes. The VoteTally class has the information to do this. But what if there is a tie? Well, maybe we find a list of candidates who share the lowest vote total. We pick one at random and eliminate that candidate from the current candidate set.
We have to repeat running a round of the election and eliminating the candidate with fewest votes until we have only one candidate left. This does not seem to be appropriate for any class. A method in the InstantRunoff class can do this.
So what sorts of things do we want to be able to do with a Ballot object?
- Initialize it with the ordered list of candidates. One way to do that is
to start with an empty ballot and use an
addCandidate
method to add candidates to the ballot. - Get the first candidate on the ballot (in order to be able to create the set of initial candidates).
- Given the current set of candidates return the first candidate on the ballot who is in the current set of candidates.
There are many other possible things we could do with a ballot. Get all of the candidates in order is one possibility. So we could supply an iterator. A toString could be useful. A way of getting the number of candidates on the ballot could be useful. But for now we will do the minimum. We can always come back later to add new methods.
What should we do with an Election object? We need to create it plus perform the jobs mentioned above.
- Get the ballots into the Election. A constructor to create an empty Election and an
addBallot
method could take care of this. - Compute the initial set of candidates.
- Count the votes.
What about the VoteTally object?
- Record the votes for candidates when asked to do so.
- get the list of losers (lowest vote getters)
The code in Ballot.java, Election.java, and VoteTally.java do these operations. The class InstantRunoffOO.java supplies the method to loop through the rounds and the main method for testing. Look at code briefly.
An alternate approach is InstantRunoffProc.java. This does the same
thing, but through fixed data structures and static methods. It is less code, which is a plus. There are longer
lists of parameters as all of the data must be passed around "bare". We see data declarations like:
List<List<String>> ballots
. These are not easy to read and take getting used to. In short, there is no data
encapsulation, which is a minus.
On a program this short encapsulation and data hiding aren't that important. On the other hand,
I originally had a Set
of ballots instead of a List
. The Set
of Ballot
objects in InstantRunoffOO.java wasn't a problem, because Ballot
did not override
equals
. But in InstantRunoffProc.java it was a problem, because two ballots with choices, "Romney Huntsman"
were entered into an election but Romney only got one vote. The two ArrayList
objects ended up being
equal, so only one was kept in the set. Changing from Set<List<String>>
to
List<List<String>>
required five changes spread out over four methods. Finding all of the
appropriate changes in a much bigger program (and avoiding changes where the Set<List<String>>
wasn't dealing with ballots and may have been correct as it was) would have been tedious and error-prone. In contrast,
making the change in InstantRunoffOO.java required two changes in Ballot.java
: the declaration of the
instance variable and its initialization in the constructor. I would have only needed to make those two changes
even if the program had millions of lines.
Scanner class
Note the use of the Scanner class in the programs examined today. You can open it on an input stream
usually System.in
) or even on a String
! Then you can read any type of data.
next
reads the next "token" as a String
. (Tokens are like words, things separated
by white space. But you can also change the separator. The class is very flexible!) You can also call
nextLine
, nextInt
, nextLong
, nextDouble
, nextFloat
,
nextBoolean
, nextBigInteger
, nextBigDecimal
, nextShort
,
and nextByte
. It will read characters from the input and covert them to the
corresponding type. There is also a "has" version of each of these that returns true if the next thing in
the input can be converted to the corresponding type (hasNext
, hasNextLine
,
hasNextInt
, etc.).