?

Log in

No account? Create an account
penrose orange

stephenw32768


/var/log/stephen

cat /var/log/stephen >/dev/eyes


A different data structure today
penrose orange
stephenw32768
I've been talking about stacks for a few days, so let's have something different today. Maps.

When implementing maps (also known as associative arrays), there are really three choices: linked lists of key/value pairs (constant time insertion, linear retrieval, insertion order preserved), binary search trees (logarithmic insertion and retrieval, keys reordered into natural sorting order), or hash tables (constant time insertion and retrieval, chaotic ordering, some memory overhead). The "normal" choice is a hash table, unless ordering of keys is required, or if the memory penalty is too much to pay.

We covered maps in the data structures class at university, years and years ago. I've implemented more linked lists than I care to remember, and I seem to recall implementing a tree in the second year functional programming class, but I've never implemented a hash table.

Upon reading the Wikipedia article about hashing, the little bits about the subject that I recalled form university came bubbling to the surface of my mind, so I decided to have a go at implementing one. Here's my chaining hash table in Java.

(Disclaimer: don't use it for anything important. It's probably full of bugs...)

It benchmarks reasonably well against Java's standard HashMap class.

Perfectionism
penrose orange
stephenw32768
Made some minor tweaks to the hash table implementation:
  • the constructor validates its arguments and throws an exception if they're out of range;
  • getPrime() behaves more reasonably if the requested number is larger than the largest prime it knows about. In practice, I doubt it'd ever get as far as requesting such a large prime; the out-of-memory killer would've disposed of the process long before it got to allocating such a huge hash table...
  • rehash()'s loop is now a little less confusing, hopefully.

A patch against the previous version, showing exactly what has changed, is available.