Skip to main content
Logo image

Section C.20 A3: Java Collections Framework and Hashing

Synopsis.

  1. Short Overview over the Java Collection Framework
    • The Java Collections Framework is a library of data structures. Some we have discussed in Chapter 5
    • Discuss the basic interfaces List, Set, Map
    • Java has special syntax for Iterators, briefly discuss the Iterator interface and explain for loop.
  2. Hashing.
    • Discussed binary search trees as an implementation for maps/sets.
    • Recap that implementing maps subsumes sets
    • New idea that is popular in practice: Use function to map keys to array indexes.
    • Works well if function is injective: Hardly the case in practice because key set is too large and would be dependent on array size.
    • Non-injective function means that we have collisions.
    • First idea: Use collision chains/lists to list all collisions per bucket.
    • Discuss simple code to search/add in a hash table with separate chaining.
    • Discuss the hashCode/equals problem. Refer to Java Collections Framework.
    • Introduce load factor. If load factor less than one and hash function scatters elements evenly, time for search constant.
    • Needs however resizing of hash table upon insertion based on a threshold.
    • Resizing of hash table requires re-hashing: keys can end up in different bins.
    • Mention problem of mutability: Mutating objects such that their hash value changes after they have been inserted may make them “vanish”.
    • Introduce probing. Chaining conceptionally nice but linked lists require more memory and provoke not so local accesses; bad for the cache.
    • New hash function that takes hash value and index into collision chain: determines array index.
    • Linear probing: simple, contiguous accesses but easy cluster formation.
    • Quadratic probing: more complex, not so contiguous but maybe less cluster formation. Interesting question if the it fully utilizes table, i.e. if probing hash function is surjective on array. Yes, if \(c=d=1/2\) and table size is power of two.
    • Final example with all three presented techniques.

Sections Covered.