DragonFly On-Line Manual Pages

HASH(3)               DragonFly Library Functions Manual               HASH(3)

NAME
     hash_initialise, hash_insert, hash_retrieve, hash_delete,
     hash_iterator_initialise, hash_fetch_next, hash_iterator_deinitialise,
     hash_deinitialise - hash table manipulation functions

LIBRARY
     Hash Library (libhash, -lhash)

SYNOPSIS
     #include <sys/types.h>
     #include <hash.h>

     int
     hash_initialise(hash *h, u_int32_t number_of_buckets,
         u_int32_t (*hash_function)(void *key, u_int32_t number_of_buckets),
         int (*compare_keys)(void *key1, void *key2),
         int (*duplicate_key)(void **destination, void *source),
         void (*free_key)(void *key), void (*free_value)(void *value));

     int
     hash_insert(hash *h, void *key, void *value);

     int
     hash_retrieve(hash *h, void *key, void **value);

     int
     hash_delete(hash *h, void *key);

     int
     hash_iterator_initialise(hash *h, struct hash_iterator *it);

     int
     hash_fetch_next(hash *h, struct hash_iterator *it, void **key,
         void **value);

     int
     hash_iterator_deinitialise(hash *h, struct hash_iterator *it);

     void
     hash_deinitialise(hash *h);

DESCRIPTION
     These functions manipulate a hash table, sometimes known as an
     associative array. A hash table is a data structure that allows you to
     store "values" indexed by "key". A key can be any kind of data structure
     but is commonly a string. Indexing by integer is a reasonable way to
     simulate a sparse array. The data can also be any kind of data structure,
     user-defined or built-in.

     To allow any kind of data structure to be used as a key or a value a void
     * is used to represent each. While giving flexibility, this use of void *
     does force the user to supply functions to perform all manipulation and
     examination of the keys - it is not possible for libhash to check for
     equality, copy or delete keys on its own. Pointers to these user
     nominated functions are provided to the hash_initialise() function,
     called before any other operations on the hash table are allowed.

     The hash_initialise() function is called to initialise the hash table
     pointed to by h.  It must be called before any other hash function on the
     hash table h.

     The parameter number_of_buckets sets the "size" of the hash table - the
     number of buckets that the data is distributed over. For the best
     performance the general rule is that this number should be prime and
     approximately twice the size of the expected data set. Chaining is used
     in case of collisions so there is no actual limit to the amount of data
     that can be stored, subject to system resource limits.

     The parameter hash_function is a pointer to the hash function to be used.
     The hash function's job is to map keys to buckets so it must be aware of
     the data structure being used for keys and must return a number between 0
     and number_of_buckets - 1 inclusive. A good hash function maps the data
     set evenly over all buckets.  See hash_hash_int(3) for some provided hash
     functions. This argument may be NULL to use a provided function that
     works with strings.

     The parameter compare_keys is a pointer to a function that can provide
     ordering information about 2 keys.  This function should return a value
     less than 0 if key1 is less than key2, greater than 0 if key1 is greater
     than key2 and 0 if key1 is equal to key2.  This argument may be NULL to
     use a provided function that works with strings.

     The parameter duplicate_key is a pointer to a function that will allocate
     memory and make a copy of a key, the original located at the address
     source and the address of the new copy to be stored at destination.  See
     hash_copy_int(3) for some provided duplicate functions. This argument may
     be NULL to use a provided function that works with strings.

     The parameter free_key is a pointer to a function that will free the
     memory used by a key allocated with duplicate_key.  This argument may be
     NULL if keys don't need "freeing".

     The parameter free_value is a pointer to a function that will free the
     memory used by a value stored in the hash table. This argument may be
     NULL if values don't need "freeing". The ability to pass NULL is useful
     when data added to the hash table is already part of another data
     structure or memory management scheme - the hash table being used as an
     index.

     The hash_insert() function is used to add key/value pairs to the hash
     table.  h is a pointer to the hash table to which the pair is to be
     added.  key is the key under which the value is to be indexed under.
     value is the value to be stored. The hash_insert() function will make a
     copy of key but only store a pointer to value so it is the caller's
     responsibility to ensure the memory used by value is not lost
     prematurely.

     The hash_retrieve() function is used to retieve data previously stored in
     the hash table by the function hash_insert().  The parameter h is a
     pointer to the hash table to retrieve from. The parameter key is a
     pointer to a key considered equivalent (by the function compare_keys
     provided to hash_initialise()) to the key used to store the value. The
     parameter value is a pointer to a location at which to store the address
     of the data.

     The hash_delete() function is used to remove a key/value pair from a hash
     table. The parameter h is a pointer to the hash table from which to
     remove the pair. The parameter key is a pointer to a key considered
     equivalent (by the function compare_keys provided to hash_initialise())
     to the key used to store the pair. The functions pointed to by the
     parameters free_key and free_value provided to the hash_initialise()
     function are used (if not NULL) to free the memory used by the key/value
     pair. It is important to note that if you are using data retrieved by
     hash_retrieve() and then call hash_delete() on the same pair, the pointer
     returned by hash_retrieve() is invalid.

     The function hash_iterator_initialise() is used to initialise a struct
     hash_iterator.  A struct hash_iterator, in combination with
     hash_fetch_next(), is used to allow a caller to iterate over all
     key/value pairs stored in a hash table. The parameter h is a pointer to
     the hash table to be iterated over. The it parameter is a pointer to the
     struct hash_iterator.  hash_iterator_initialise() must be called before
     any other functions that use the struct hash_iterator.

     The hash_fetch_next() function fetches the "next" key/value pair from a
     hash table. By repeatedly calling it, this function can be used to
     iterate through all key/value pairs stored in a hash table. Note that the
     pairs are returned in an order which is an artifact of the internal
     workings of the library and probably meaningless to a user. The parameter
     h is a pointer to the hash table to iterate over (this should be the same
     as the h argument provided to hash_iterator_initialise()).  The parameter
     it is a pointer to the struct hash_iterator previously initialised with
     hash_iterator_initialise().  The parameters key and value are pointers to
     locations to store the addresses of the key and value fetched.  If
     hash_fetch_next() is called after the last pair in the hash table has
     been fetched then 0 is returned, nothing is stored at key or value, and
     the struct hash_iterator is reset so the next call to hash_fetch_next()
     will return the first pair in the hash table. If a pair is added to the
     table whilst a struct hash_iterator is active it may or may not be
     returned by hash_fetch_next() before the struct hash_iterator is reset.
     The calling of hash_delete() while a struct hash_iterator is active is
     safe for any pair that has already been returned by hash_fetch_next().

     The function hash_iterator_deinitialise() should be called when finished
     with a struct hash_iterator.  The parameter h is a pointer to the hash
     table originally passed to hash_iterator_initialise().  The parameter it
     is the struct hash_iterator to be deinitialised. This function should be
     called before the memory used by a struct hash_iterator is disposed of.

     The function hash_deinitialise() is used to dispose of a hash table. The
     parameter h is the hash table to dispose of. After calling
     hash_deinitialise() all pointers returned by hash_retrieve() or
     hash_fetch_next() are invalid.

IMPLEMENTATION NOTES
     Upon collision (where 2 keys are mapped by the hash function to the same
     bucket) ordered linked lists are used. On retrieval a sequential search
     is used. On a hash table that is overly full (has a lot of collisions)
     retrieval time is therefore O(n/(2 * number_of_buckets)) as opposed to
     O(1) for a sparsely populated table. Therefore it is wise to set
     number_of_buckets to be large enough and to use a hash function well
     suited for your data.

RETURN VALUES
     The hash_initialise(), hash_insert(), hash_retrieve(), hash_delete(),
     hash_iterator_initialise(), hash_fetch_next() and
     hash_iterator_deinitialise() functions return 0 on failure and set the
     global variable errno to indicate the error.

     The hash_deinitialise() function returns no value.

EXAMPLES
     See the tests directory included in the distribution.

ERRORS
     [ENOENT]
             There was no value stored for the provided key or in the case of
             hash_fetch_next() the end of the hash table has been reached.

     [ENOMEM]
             A call to malloc failed.

SEE ALSO
     dbopen(3), emmao(8), libhash_convenience(3), queue(3)

HISTORY
     The libhash library was written in January 2002 for emmao (a program to
     kill off rogue processes under Solaris).  libhash was written under
     FreeBSD 4.4.

AUTHORS
     Andrew Stevenson <andrew@ugh.net.au>

BUGS
     Please report them to <andrew@ugh.net.au>.

UgH!                           January 13, 2002                           UgH!