Lower / upper load factor in hash tables

Question

I am to write a chained hash set class in java.

I understand the load factor is M/capacity where M is the number of elements currently in the table and capacity is the size of the table.

But how does the load factor help me determine if I should resize the table and rehash or not? Also I couldn't find anywhere how to calculate the lower / upper load factors. Are they even needed?

I hope this is enough information, Thank you!!

Solution

A single loadFactor used to configure standard Java hashes (and in a number of hash APIs in other languages) is a simplification.

Conceptually, it is reasonable to distingush

  • Target load, indicate a default memory footprint -- performance tradeoff configuration. when you build a hash of the known size, you choose capacity so that size/capacity is as close to target load, as possible.

  • Max load, you want a hash to never exceed this load. You trigger a resize, if a hash reach this load.

  • Grow factor, a default configuration of how much to enlarge a hash on resize. If capacity is a power of 2, grow factor could be only either 2 or 4.

  • Min load, you want a hash load to never be lower than the min load, maybe unless you remove elements or clear a hash. If capacity is a power of 2, min load couldn't be greater than 0.5. Additionally, max load / min load ratio should be greater or equal to grow factor.

All the above concerns chained hash, for open addressing hash with tombstones things are getting even more complicated.

In java.util.HashMap loadFactor plays the roles of target and max load simultaneously. Grow factor is 2, min load is 0.0.

For chained hash non-power-of-2 capacity is overkill unless you need extremely precise control over memory usage (you don't, trust me) or a capacity between 2^30 and 2^31-1 (because you can't create an array of size 2^31 in Java, it is Integer.MAX_VALUE 1).