A hash function is any function that can be used to map data of arbitrary size to fixed-size values. The values returned by a hash function are called hash values, hash codes, digests, or simply hashes. The values are usually used to index a fixed-size table called a hash table. Use of a hash function to index a hash table is called hashing or scatter storage addressing.

Figure: A hash function that maps names to integers from 0 to 15. There is a collision between keys "John Smith" and "Sandra Dee".

Hash functions and their associated hash tables are used in data storage and retrieval applications to access data in a small and nearly constant time per retrieval. They require an amount of storage space only fractionally greater than the total space required for the data or records themselves. Hashing is a computationally and storage space-efficient form of data access that avoids the non-constant access time of ordered and unordered lists and structured trees, and the often exponential storage requirements of direct access of state spaces of large or variable-length keys.

Use of hash functions relies on statistical properties of key and function interaction: worst-case behaviour is intolerably bad with a vanishingly small probability, and average-case behaviour can be nearly optimal (minimal collision).

Hash functions are related to (and often confused with) checksums, check digits, fingerprints, lossy compression, randomization functions, error-correcting codes, and ciphers. Although the concepts overlap to some extent, each one has its own uses and requirements and is designed and optimized differently. The hash function differs from these concepts mainly in terms of data integrity.

A hash function takes a key as an input, which is associated with a datum or record and used to identify it to the data storage and retrieval application. The keys may be fixed length, like an integer, or variable length, like a name. In some cases, the key is the datum itself. The output is a hash code used to index a hash table holding the data or records, or pointers to them.

A hash function may be considered to perform three functions:

- Convert variable-length keys into fixed length (usually machine word length or less) values, by folding them by words or other units using a parity-preserving operator like ADD or XOR.
- Scramble the bits of the key so that the resulting values are uniformly distributed over the keyspace.
- Map the key values into ones less than or equal to the size of the table.

A good hash function satisfies two basic properties: 1) it should be very fast to compute; 2) it should minimize duplication of output values (collisions). Hash functions rely on generating favourable probability distributions for their effectiveness, reducing access time to nearly constant. High table loading factors, pathological key sets and poorly designed hash functions can result in access times approaching linear in the number of items in the table. Hash functions can be designed to give the best worst-case performance, good performance under high table loading factors, and in special cases, perfect (collisionless) mapping of keys into hash codes. Implementation is based on parity-preserving bit operations (XOR and ADD), multiply, or divide.
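As a rough illustration of the three steps a hash function performs (folding a variable-length key into a fixed-length value, scrambling the bits, and mapping the result into the table range), here is a minimal Python sketch that maps names to integers from 0 to 15. The function names (fold_xor, scramble, to_index, hash_name), the 4-byte word size, the 32-bit width, and the multiplier constant are all assumptions chosen for illustration; they are not taken from the text, and this sketch is not expected to reproduce the specific collision mentioned in the figure.

```python
WORD_BYTES = 4                 # assumed word size used for folding
MASK32 = 0xFFFFFFFF            # keep intermediate values within 32 bits
MULTIPLIER = 2654435761        # an odd constant often used for multiplicative hashing (assumption)


def fold_xor(key: bytes) -> int:
    """Step 1: fold a variable-length key into one 32-bit value using XOR."""
    folded = 0
    for i in range(0, len(key), WORD_BYTES):
        # The last chunk may be shorter than WORD_BYTES; from_bytes handles that.
        folded ^= int.from_bytes(key[i:i + WORD_BYTES], "little")
    return folded


def scramble(value: int) -> int:
    """Step 2: mix the bits by multiplying and keeping the low 32 bits."""
    return (value * MULTIPLIER) & MASK32


def to_index(value: int, table_size: int = 16) -> int:
    """Step 3: map a 32-bit value into the range 0 .. table_size - 1.

    Scaling by the table size and shifting uses the high-order bits,
    which are the better-mixed bits after a multiplication.
    """
    return (value * table_size) >> 32


def hash_name(name: str, table_size: int = 16) -> int:
    """Combine the three steps for a text key."""
    return to_index(scramble(fold_xor(name.encode("utf-8"))), table_size)


if __name__ == "__main__":
    for name in ["John Smith", "Lisa Smith", "Sam Doe", "Sandra Dee"]:
        print(name, "->", hash_name(name))
```

Because there are only 16 possible outputs, any set of 17 or more distinct names must contain at least one collision, whichever hash function is used; a well-designed function only makes collisions less frequent for typical key sets.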