Academic
Publications
Minimal perfect hashing: A competitive method for indexing internal memory

Minimal perfect hashing: A competitive method for indexing internal memory,10.1016/j.ins.2009.12.003,Information Sciences,Fabiano C. Botelho,Anísio La

Minimal perfect hashing: A competitive method for indexing internal memory   (Citations: 1)
BibTex | RIS | RefWorks Download
A perfect hash function (PHF) is an injective function that maps keys from a set S to unique values. Since no collisions occur, each key can be retrieved from a hash table with a single probe. A minimal perfect hash function (MPHF) is a PHF with the smallest possible range, that is, the hash table size is exactly the number of keys in S. MPHFs are widely used for memory efficient storage and fast retrieval of items from static sets. Differently from other hashing schemes, MPHFs completely avoid the problem of wasted space and wasted time to deal with collisions. Until recently, the amount of space to store an MPHF description for practical implementations found in the literature was O(logn) bits per key and therefore similar to the overhead of space of other hashing schemes. Recent results on MPHFs presented in the literature changed this scenario: an MPHF can now be described by approximately 2.6 bits per key.The objective of this paper is to show that MPHFs are, after the new recent results, a good option to index internal memory when static key sets are involved and both successful and unsuccessful searches are allowed. We have shown that MPHFs provide the best tradeoff between space usage and lookup time when compared with other open addressing and chaining hash schemes such as linear hashing, quadratic hashing, double hashing, dense hashing, cuckoo hashing, sparse hashing, hopscotch hashing, chaining with move to front heuristic and exact fit. We considered lookup time for successful and unsuccessful searches in two scenarios: (i) the MPHF description fits in the CPU cache and (ii) the MPHF description does not fit entirely in the CPU cache. Considering lookup time, the minimal perfect hashing outperforms the other hashing schemes in the two scenarios and, in the first scenario, the performance is better even when the compared methods leave more than 80% of the hash table entries free. Considering space overhead (the amount of used space other than the key-value pairs), the minimal perfect hashing is within a factor of O(logn) bits lower than the other hashing schemes for both scenarios.
Journal: Information Sciences - ISCI , vol. 181, no. 13, pp. 2608-2625, 2011
Cumulative Annual
View Publication
The following links allow you to view full publications. These links are maintained by other sources not affiliated with Microsoft Academic Search.
    • ...In [Botelho et al. 2008b, Botelho et al. 2009a] we show that MPHFs provide the best tradeoff between space usage and lookup time when compared to other hashing schemes for indexing internal memory when static key sets are involved...
    • ...In [Botelho et al. 2009a] we extended our prior study in two aspects...

    Fabiano C. Botelhoet al. Near-Optimal Space Perfect Hashing Algorithms

Sort by: