Skip to main content

Section C.20 A3: Java Collections Framework and Hashing

Synopsis.

  1. Short Overview over the Java Collection Framework

    • The Java Collections Framework is a library of of data structures. Some we have discussed in Chapter 5

    • Discuss the basic interfaces List, Set, Map

    • Java has special syntax for Iterators, briefly discuss the Iterator interface and explain for loop.

  2. Hashing.

    • Discussed binary search trees as an implementation for maps/sets.

    • Recap that implementing maps subsumes sets

    • New idea that is popular in practice: Use function to map keys to array indexes.

    • Works well if function is injective: Hardly the case in practice because key set is too large and would be dependent on array size.

    • Non-injective function means that we have collisions.

    • First idea: Use collision chains/lists to list all collisions per bucket.

    • Discuss simple code to search/add in a hash table with separate chaining.

    • Discuss the hashCode/equals problem. Refer to Java Collections Framework.

    • Introduce load factor. If load factor less than one and hash function scatters elements evenly, time for search constant.

    • Needs however resizing of hash table upon insertion based on a threshold.

    • Resizing of hash table requires re-hashing: keys can end up in different bins.

    • Mention problem of mutability: Mutating objects such that their hash value changes after they have been inserted may make them “vanish”.

    • Introduce probing. Chaining conceptionally nice but linked lists require more memory and provoke not so local accesses; bad for the cache.

    • New hash function that takes hash value and index into collision chain: determines array index.

    • Linear probing: simple, contiguous accesses but easy cluster formation.

    • Quadratic probing: more complex, not so contiguous but maybe less cluster formation. Interesting question if the it fully utilizes table, i.e. if probing hash function is surjective on array. Yes, if \(c=d=1/2\) and table size is power of two.

    • Final example with all three presented techniques.

Sections Covered.

Section 8.9, Section 5.3