Choosing a Suitable Data Structure(Java)

When we do programming we may be working on different domains. So the processing of data differs from domain to domain. Some domains may need efficiency in speed.Some may need efficiency in memory usage. Some may need overall performance to be good.
So depending on the situation we may need to have different ways of accessing and processing data. So the data may be needed to be sorted in a particular order. Or data accessing may be critical, so that the accessing of data would need to be thread safe. So like that different problem domains may require different approaches of data storage facilities provided at the run time.
To achieve this purpose, in java language, there are many data structures provided. When we develop applications, we may need to choose our suitable data structure among other data structures provided by the language as default.
So this article lists down the main features and comparison between java Collection data structures. Its up to you to choose your suitable structure.


The Collection interface and class hierarchy is shown in the picture.

Now lets look at the features of each and every Collection data type.

ArrayList

1.Quick random access.

The reason for this is that the ArrayList is a data structure consists of subsequent memory locations in the memory. They are accesses based on the indexes. So if we want to access a memory location in the middle of the ArrayList, we can access it in O(1) time using the array indices.

2.Fast Iteration
This structure is fast when we are iterating through the List. Since the indexed memory locations exist, the computer can access subsequent memory locations faster than a scattered set  of memory locations.(Memory locations are one next to the other)

3.Slow in insertion and deletion
If an insertion occurred, then the whole array of memory locations from that particular element till the end would have to be re-indexed unless inserted to the end of the list.(increment the index by 1). Takes O(n).

Vector
All the features listed above apply for Vector as well. So as you might wonder, the difference is that all the methods in Vector are synchronized. So it is a thread safe data structure.So in a multi-threaded environment if the data access should be critical, then go for a Vector data structure. But it degrades the overall performance as getting the object lock takes time.

LinkedList
1.Iteration is slow
The reason is that if we iterate through the list, the computer would need to move into different memory locations scattered(not subsequent locations like an array) across the memory and access data.Normally in a linked list the next memory location is identified using a pointer stored with the current data element. So, the memory locations of each elements can be spread all over the memory heap. So the iterations would be slow.
2.Quick deletions and insertions
Unlike in the ArrayList, since the memory locations are connected sequentially using pointers from one location to another, if we want to insert an element or delete an element, all that is required is just swapping a few pointers.(Changing addresses that those pointers point to). So it doe not have to do any kind of re-indexing task. So, the insertions and deletions are relatively fast.

PriorityQueue
1.This is not a First Come First Served queue.
2.Sorted (Not the insertion order but the natural order)-(natural order 'a' precedes 'b' precedes 'c' like that)
If you want to sort this in a customized order, then you will have to create an object of customized Compactor to have a particular order of your wish.

java.​util.​PriorityQueue public PriorityQueue(int initialCapacity, Comparator<? super E> comparator)
Creates a PriorityQueue with the specified initial capacity that orders its elements according to the specified comparator.
Parameters:
initialCapacity - the initial capacity for this priority queue
comparator - the comparator that will be used to order this priority queue. If null, the natural ordering of the elements will be used.
Throws:
IllegalArgumentException - if initialCapacity is less than 1

This is the documentation given for PriorityQueue alternative constructor.

Lets see how to use that comparable object
 public class Element implements Comparable<Element>{  
   private static final int INIT_CAPACITY=100;  
   public static void main(String[] args) {  
     Element comp=new Element();  
     Queue<Element> lista = new PriorityQueue<Element>(INIT_CAPACITY,(Comparator<? super Element>) comp);     
   }  
   @Override  
   public int compareTo(Element o) {  
     return 0;//return -1,o or 1 depending on the compared results.   
   }   
 }  

Set
Usually we need a Set data structure when we want to avoid having duplicate elements in the data structure and we can check the existence of an element using contains(<element>) method easily.
HashSet has no order in it but LinkedHashSet has the insertion order. TreeSet is ordered(natural order)

Other than above mentioned Collection types, there is another type called Map, but Map does not implement the Collection interface.


Maps are used to store key value pairs and the advantage is that we can access the map using the key and get the required element right away.(in O(1) time). So it is efficient to use when you would want to access elements by a key. It is not used for iteration.
The HashTable class is a legacy class. No null key value pairs are allowed. If added NullPointerException is thrown. The methods are synchronized in this class. So as mentioned before in Vector, a performance degradation is possible in a multi-threaded environment due to its thread safety.
HashMap is not ordered but null key value pairs are allowed.LinkedHashMap is the only class that provides the last access order of elements.
SortedMap is sorted. TreeMap is sorted in the natural order.

Apart from that there are two utility classes called 'Collections' and 'Arrays'.
These two classes provide static methods for various actions like sorting a List.

Comments