Tuesday, May 15, 2012

Collections frame work


Collections in Java: A Brief Introduction
Why use Collections?
If you want to store many values of the same type, why not just use an array? Well, an array may work just fine for your purposes--if you know exactly how large the array needs to be before you put anything in and this size will NEVER EVER change; and if you will never need to search for an element in the array without knowing its specific index.
But suppose you are retrieving records from a database that satisfy certain conditions--and you do not always know how many records you will get. Do you declare the size to be some really large number and just hope it will always be enough? That could mean a lot of memory waste. Do you really want to have to go through and count the number of results every time you run the query just to declare the array? Perhaps you could write the program so that it automatically reallocates the array with a larger size when it fills up. But that is also a lot of extra work.
Here are a few of the more common interfaces and classes within the Java Collections Framework that can make your programming tasks MUCH easier. There are MANY others; but these will help get you started.
. +---- Queue ---- Deque --+
. | |
. | |
. | +---- LinkedList
. | |
. Collection ----+---- List ----+
. | |
. | +---- ArrayList
. |
. |
. | +---- HashSet
. | |
. +---- Set -----+
. |
. +---- SortedSet ---- TreeSet
The Collection Interface
This is the most general superinterface for Java’s predefined classes that store a general group of objects. By “object” we mean an instance of any subclass of the Object class, including Object itself. Unlike arrays, however, none of the Collections Framework classes can store primitives directly. To insert primitive values, you must use the wrapper classes (Boolean, Integer, Character, etc.). More on this later.
The Collection interface allows you to access methods you would use on any Collection, be it an ArrayList, TreeSet, etc. Some of the most common examples would be:
Iterator iterator() Returns an instance of a class that implements the Iterator interface. This allows you to move through and process all the elements of the collection one at a time.
boolean add(Object e) Attempts to add the given Object instance to the collection. The return value tells us whether or not the add was successful. The only way the add can fail is if the Collection happens to be a Set and we try to insert an Object that is already present.
boolean remove(Object o) Removes ONE occurrence of the given Object from the collection, if any are present. The return value tells us whether or not a removal actually occurred.
void clear() Empties the collection.
boolean contains(Object o) Tells us whether or not the collection has the given object as at least one of its elements.
int size() Returns the number of elements in the collection.
Object toArray() Returns an array which contains all the elements of the collection.
Be sure to understand the difference between the Collection interface and the Java Collections Framework. While the Collection interface and its predefined subtypes all belong to the Collections Framework, so do other interfaces and classes which are not subtypes of Collection, such as Map, Iterator, and Comparable.
The List Interface
This interface is the supertype for a group of objects stored as a sequential series. It inherits all the Collection methods and adds some new ones, most notably
Object get(int index)
Since we are dealing with a List and not a generalized Collection, we know that all the elements have been stored sequentially. So we can access the element at any given position in the List, provided it exists. (An IndexOutOfBoundsException will be thrown if it does not.) Remember that the List classes are indexed starting at zero, just like arrays. Because you can insert any Object at all into a List, this method must return the generalized type of Object. If you know that the List only contains objects of a more specific type and you want to use the element you are retrieving in a more type-specific manner, you must use a cast. A List also allows you to insert a new element at a position you specify (by calling add with an additional index parameter). If you do not wish to specify a certain position, the element will be appended to the end of the list by default. When you call the iterator() method on a List, you are guaranteed to receive an Iterator that moves through the List elements in increasing sequential order from beginning to end.
1) ArrayList
Chances are, of all the concrete subclasses of Collection, this is the one you will use most frequently. The ArrayList class implements the List interface using an array of references to the elements inserted. You can specify an initial size if you like, but it is not necessary. The ArrayList class will automatically expand the internal array dynamically (by copying the references to all the elements into a new larger array and then deleting the old array) whenever you try to insert a new element once the array is full. But if you know that the space will be used, it is advisable to go ahead and specify a large initial size rather than let the ArrayList class keep expanding the internal array over and over as you add a large number of elements. One very common use for an ArrayList is in the retrieval of multiple records from a database. The records are returned in an ArrayList and then processed similarly one at a time (displayed on the screen in a certain way, for example) using an Iterator.
2) LinkedList
The LinkedList class implements the List interface as well, but with a linked list instead of an array. LinkedList also implements the Deque interface (see Queue, discussed later).
By linked list we mean that each reference to an element is stored in a node, which contains the reference to the element and a reference to the next node. This means that whenever you want to retrieve an element according to its index, the LinkedList class must reference each node in turn starting from zero until it reaches the reference to the element at the desired position. But the advantage to this approach becomes evident when you consider what happens when you try to remove an element from the list or insert one at a specific position. ArrayList takes the bigger performance hit in this case, since we must shuffle elements either forward or backward in order to close the gap where one was deleted or to open a gap for an insertion. In practice, however, you may not encounter many situations where you need to remove an element from a List or insert one in the middle. Therefore you are likely to use the ArrayList class far more often in practice.
The Set Interface
This interface is the supertype for a group of objects stored as a set in the more mathematical sense. A set may be either sorted or unsorted, but it will not retain the order in which the elements are inserted and it will not allow the insertion of duplicates. Whether or not a prospective element is considered a “duplicate” of one that is already present will be decided using the element’s equals() function. Also, while an individual element of a List is accessed by the index at which its reference is located, an element of a Set is accessed using a reference to the element itself.
Calling the iterator() method on a Set guarantees that the Iterator you receive will process each element of the Set exactly once. There is no guarantee as to the order in which the elements will be processed. In fact, another call to iterator() on the same Set may return an Iterator that processes the elements in an entirely different order from before.
3) HashSet
The HashSet class implements the Set interface using a chained hash table. This means that the element references are stored in an array of slots (called buckets), where the bucket used to store each element is determined using a hash function. By a process called hashing, the hash function takes the object to be stored and converts it into an integer which is used for indexing the array. One could write an entire thesis on the subject of good hashing algorithms. Just remember that a good hash function (called hashCode() in Java) will always return the same value for the same object and any object that is considered equivalent.
For an explanation of what chained hash table means, think for a moment about the number of possible states that an object may take on. Because this number can be potentially huge, the number of buckets in the hash table will probably be much, much smaller. Therefore, it is natural to expect that the objects you insert into a HashSet will sometimes hash to a bucket that is already in use. This is known as a collision. There are other ways of implementing a hash table in which you would compute a different location to try in the event of a collision. But because we are dealing with a chained hash table, we work around collisions by essentially placing multiple elements in the same bucket. The new element is simply attached as part of a new node in linked list fashion to the last element that was added to that same bucket. So a chained hash table is really an array of linked lists.
The main advantage to using a HashSet is speed. Perhaps surprisingly, whether you are retrieving, inserting, or removing, the speed of a HashSet does not depend on its size! Rather, it depends on what is called the load factor, or the ratio of the number of elements currently inserted to the number of buckets allocated. The HashSet class will generally allow you to insert new elements until the load factor reaches about 75%, whereupon the next insertion will prompt the creation of a new larger storage table and the rehashing of the elements into this new table.
One disadvantage to using a HashSet is that it will never enforce any type of order. Whenever you iterate through the elements of a HashSet, always assume that they may be processed in any possible order--regardless of whatever order they may have been processed in before. Another thing the HashSet sacrifices in the name of speed is efficient use of memory. Because the HashSet class prevents the load factor from getting high enough to have a noticeably negative effect on performance, there will always be some unused space (and sometimes lots of it) in the storage table.
But if you want to work with the elements of your collection on a more individual basis instead of just dumping them all in a list all at once and then processing them all in the same way; if your main concern is speed; and if you do not need duplicates or any certain ordering, a HashSet or HashMap is the way to go. As a simple example, consider an extensive list of names and phone numbers. As with the ArrayList, a HashSet lets you specify an initial size if you desire--and you should if you know your collection will be large and you have some idea what the size will be.
The SortedSet Interface
This is a subinterface of the Set interface which, as you might expect, enforces a definite ordering of the elements inserted. If you are creating a SortedSet of Integer objects, of Double objects, or of String objects, for example, you can just insert the elements and they will be automatically sorted in ascending order. In fact, the elements you are inserting into a SortedSet must all be of a type that implements the Comparable interface--as many predefined classes already do. Otherwise the SortedSet will not be able to “look” at any given pair of elements and “know” which one comes “before” the other. However, there is an exception to this Comparable-type-element rule. You can define your own order in which the elements are to be sorted by means of a Comparator. In this case, your Comparator must be passed to the SortedSet on creation. (The Comparable and Comparator interfaces are discussed in more detail later.) When you apply the iterator() method to a SortedSet, you will receive an Iterator that will process the elements in their proper sorted order.
4) TreeSet
The TreeSet class implements the SortedSet interface using a red and black tree. That is, TreeSet is implemented internally as a binary tree in which each node is labelled as either “red” or “black”. A binary tree is like a linked list in that the references to the elements being inserted are again stored in nodes; but in this case each node can reference up to two additional nodes (called children) instead of just one. The purpose of the “red” and “black” tagging is to enforce a sense of “balance” (i.e. to maintain efficiency by ensuring that the tree and indeed each of its subtrees always has about the same depth on either side).
As with the LinkedList, no dynamic expansion and reinsertion is ever needed--additional memory is always allocated one new node at a time as the need arises. It is therefore never necessary or even helpful to specify an initial size. The power behind the TreeSet class comes from its structure--which allows retrievals, insertions, and deletions of its elements to all occur in logarithmic time while always maintaining some inherent ordering.
The Queue Interface
New in Java 1.5 (Oops, I mean Java 5!) a Queue enforces restricted access to its elements as well as a certain ordering of the elements. This ordering is typically FIFO (first-in-first-out, like a line at a checkout counter). But the ordering could also be LIFO (last-in-first-out, i.e. a stack) or based on the natural ascending order of the elements or a given Comparator (i.e. priority queue). You also have the option of either setting a maximum size for the Queue or letting it grow unrestricted.
Here are the new methods introduced in the Queue interface (which add to the methods inherited from Collection):
Object element() Retrieves the leading element of the Queue without removing it. Throws a NoSuchElementException if the Queue is empty.
Object peek() Same as element() but returns NULL for an empty Queue rather than throwing an exception.
boolean add(Object e) Inserts the given Object into the Queue according to the enforced ordering of the Queue’s elements. Throws an IllegalStateException if it happens that the Queue has a restricted capacity and is full. Note that this method overrides the add(…) method from Collection.
boolean offer(Object e) Same as add(…) but returns NULL if the Queue was full instead of throwing an exception.
Object remove() Same as element() but deletes the leading element from the Queue in the process.
Object poll() Same as remove() but simply returns NULL if the Queue is empty.
The Deque interface (short for double-ended queue) extends Queue and represents a more generalized queue--one that supports easy read, insert, and remove access to both the front and the back of the queue.
One popular application for a traditional queue (FIFO) is a call center. Calls enter the queue at the back in the order in which they arrive and exit the queue from the front each time an associate becomes available to take a new call.
For an example in which you might use a stack (or LIFO queue), consider a calculator that performs operations using postfix (or Reverse Polish) notation. If you were to simulate this using a Java application, you could push the numbers onto a stack as they are received from user input. But when the user enters an operator instead of a number (let’s say, +, -, *, /, %, or ^), you could then pop the top two numbers (i.e. the last two submitted) off the stack, apply the operator to these two numbers, and push the result back on the stack. If there were not enough operands available to perform the operation, you would probably wish to show the user an error message.
As for a priority queue, consider a series of documents sent by different users to the same printer. Rather than simply executing all the jobs on a first-come-first-serve basis, you may also wish to take into account the size and the level of criticality of each job. Naturally, you would need to specify the means by which to compute whether one job should take precedence over another. In Java, you would do this using the Comparable and/or Comparator interface. Once the priority queue knows how to “sort” its elements (print jobs), it will always pick up the “minimum element” (i.e. highest priority print job in this case) for the next execution once the last job is complete.
The Comparable and Comparator Interfaces
Neither of these is a subtype of the Collection interface. But both still belong to the Java Collections Framework and are used quite frequently in dealing with collections. The Comparable interface contains one method--int compareTo (Object o). This method returns a negative integer if the Object used to call the method is considered “less” than the Object passed in; a positive integer if the calling Object is considered “greater,” and zero if the two Objects are considered “equal.” Many predefined classes already have an inherent implementation of the compareTo method, such as Boolean, Integer, String, Calendar, etc., which works as you would intuitively expect. The compareTo method of a class is what is used to determine the so-called natural ordering for objects of the class.
The Comparator interface defines two methods, one of which is int compare(Object lhs, Object rhs). Much like the compareTo method, this one returns a negative, zero, or positive integer depending on whether the left Object is “less than,” “equal to,” or “greater than” the right Object.
Why use Comparator when we can define a method of ordering using Comparable? Suppose you wish to sort a List of objects of a class (call it class A) that someone else wrote. If A does not implement Comparable then there is no compareTo method available to tell the sort method how to compute the natural ordering. Using a Comparator allows you to define your own method of ordering objects of type A without the hassle of creating a “dummy” class that just extends A and implements Comparable. Not only that--what if A does implement Comparable but the compareTo method does not give the order you are looking for? You can define any number of Comparator objects, each designed to order objects of type A in its own unique way.
The Collections Interface
This class actually extends Object directly; but it contains many different methods that are useful when manipulating collections (including maps). Just like the Math class, all of the methods are static. But the Collections class is not final. Here are a few of the included methods:
int binarySearch(List list, Object key) Uses a binary search algorithm to determine whether the given List contains the given key. It is assumed that the given List is already sorted according to the natural order of its elements. Otherwise, a binary search may not work correctly and the method may return an erroneous result. The performance time is logarithmic provided that the List is “random access” (one that accesses any specified position in near-constant time). The index of the search key is returned if the key is found. If not, the return value is -p - 1, where p is the point at which the key would go if inserted according to the natural order.
Object max(Collection coll) Returns the maximal element of the given List, as determined by the natural ordering of the elements. As with the sort method, the elements must all be of a type that implements the Comparable interface.
Object min(Collection coll) Returns the minimal element of the given List, as determined by the natural ordering of the elements. As with the sort method, the elements must all be of a type that implements the Comparable interface.
boolean replaceAll(List list, Object oldValue, Object newValue) Replaces all occurrences of oldValue in the given List with newValue. Returns true or false based on whether or not oldValue was encountered.
void reverse(List list) Reverses the order of the elements in the given List.
void shuffle(List list) Rearranges the elements of the given List into some random (as defined by the default source of “randomness,” which is not completely unbiased) permutation.
void sort(List list) Sorts the elements of the List provided in ascending order according to their natural ordering. Actually, the elements must all be instances of a type that implements the Comparable interface. This is because the “natural ordering” will be determined using the compareTo methods of the elements. This method uses a Merge Sort algorithm that gives N log N performance. All the elements are actually dumped into an array and sorted from there--to avoid all the extra overhead of sorting a LinkedList in place.
The shuffle method above is overloaded to allow you to specify an alternate source of “randomness” when jumbling. Of the others shown above, most are overloaded to allow you to specify a Comparator as well. This means that the elements of the given List/Collection do not need to be of a subtype of Comparable; and the method will work using the ordering imposed by the given Comparator‘s compare method rather than the elements‘ “natural ordering.” Other than that, the methods are the same:
int binarySearch(List list, Object key, Comparator comp)
Object max(Collection coll, Comparator comp)
Object min(Collection coll, Comparator comp)
void sort(List list, Comparator comp)
+--- HashMap
Map ---+
+--- SortedMap --- TreeMap
The Map Interface
For our purposes, a map is a way of relating objects together in pairs. Instead of inserting, retrieving, or removing a single value as with a Set or List, you perform an insert by providing a key and its associated value; and you perform a retrieval by providing a key and getting back its associated value. While duplicate values are permitted, duplicate keys are not--so in this way a Map bears a bit of a resemblance to a Set. Uniqueness of keys assures us that a retrieval will never be ambiguous--every key gives us back ONE value (or no value at all if the key is not actually present). If you are mathematically inclined, you can think of a map as a function from the set of keys to the set of values. Each (key, value) pair is called an entry. Remember that although the Map interface does not extend the Collection interface in any way, shape, or form, it is still considered to be part of the Java Collections Framework. Many of the methods provided by Collection also appear and are similar in Map. For example:
Object put(Object key, Object value) Associates the given key with the given value in the Map, regardless of whether or not the key was previously associated with anything. The method will return either this previous value or NULL if there was no previous association.
Object remove(Object key) Removes the key and its associated value from the map, if present. The return value behaves like much as it does for put--it will be either the removed value or NULL if there was nothing to remove.
void clear() Empties the collection.
boolean containsKey(Object key) Tells us whether or not the map has the given object as one of its keys.
boolean containsValue(Object value) Tells us whether or not the map has the given object as at least one of its values. This method will generally be executed in linear time.
int size() Returns the number of entries in the map.
Set keySet() Returns a Set which contains all the keys of the map. Changes made to this Set will be reflected in the Map. For example, removing an element of this Set will cause the corresponding entry to be removed from the Map.
Collection values() Returns a Collection which contains all the values in the map. Changes made to this Collection will be reflected in the Map. For example, removing an element of this Collection will cause the corresponding entry to be removed from the Map.
5) HashMap
As you would probably guess after becoming familiar with HashSet, this class implements the Map interface using a chained hash table. The keys are converted into integers using the hashCode() method. The resulting integers are used to index the buckets in the hash table. The buckets store the corresponding values. One common application of this is a phone directory (list of names and phone numbers). Assuming each person has exactly one phone number, you would insert (name, phone) entries and retrieve using name keys. Were you inclined to allow multiple phone numbers for the same person, you might use a List of phone numbers for each value instead of a single phone number.
The SortedMap Interface
Naturally, this is Map’s counterpart to SortedSet. It is a subinterface of Map that enforces a definite ordering (by key) of the entries inserted. Of course, the SortedMap must also have a method by which to enforce the ordering. So you again have two choices: make sure all your elements are subtypes of Comparable OR provide your SortedMap with a Comparator on creation.
6) TreeMap
This, predictably, is a red and black tree implementation of the SortedMap interface. Again, there is no need to ever specify an initial size (in fact, you cannot). And any retrieve, insert, or remove operation generally occurs within logarithmic time. If you expect to continually update and print new copies of your phone directory from the previous example, you might consider implementing it using a TreeMap instead of a HashMap. That way you will maintain the sorted order with every insertion and avoid the overhead of repeatedly sorting the whole directory.
Think about it…
1. The sort and shuffle methods from Collections make sense only when applied to a Collection that happens to be a List. Why is this?
2. If you are familiar with postfix operations, you might try to trace through mentally or on paper what the internal stack described previously in the Queue application would look like (starting out empty) as the calculator processes the numbers and operations it receives. When you pop the top two numbers off the stack to apply an operator, use the first number off the top as the right operand. For example:
A) “3 4 5 + + 6 - * 2”
B) “22 23 + 3 3 / / 2 2 2 ^ ^ *”
3. Note that, unlike Collection, there is no iterator() method defined by the Map interface. How might you iterate through and process the entries of a Map despite this limitation?
4. A word of caution regarding the put and remove methods from the Map interface: a return value of NULL does not necessarily mean that the key that was provided did not exist previously in the map. It might be possible that the key was actually associated with the value NULL. How could one distinguish between these two cases?
5. Although a Map guarantees that the keys will all be unique, the values may not be. What do you think would happen if you were to call the values() method on an instance of Map having duplicate values--and then call the remove() method on the resulting Collection, passing in one of the duplicated values from the Map?
 

123passportphoto is a very easy to use passport photo website that provides six enhanced photos. I have never had an issue while using this ...