DragonFly On-Line Manual Pages

JudyHS(3)             DragonFly Library Functions Manual             JudyHS(3)

NAME
       JudyHS macros - C library for creating and accessing a dynamic array,
       using an array-of-bytes of Length as an Index and a word as a Value.

SYNOPSIS
       cc [flags] sourcefiles -lJudy

       #include <Judy.h>

       Word_t  * PValue;                           // JudyHS array element
       int       Rc_int;                           // return flag
       Word_t    Rc_word;                          // full word return value
       Pvoid_t   PJHSArray = (Pvoid_t) NULL;       // initialize JudyHS array
       uint8_t * Index;                            // array-of-bytes pointer
       Word_t    Length;                           // number of bytes in Index

       JHSI( PValue,  PJHSArray, Index, Length);   // JudyHSIns()
       JHSD( Rc_int,  PJHSArray, Index, Length);   // JudyHSDel()
       JHSG( PValue,  PJHSArray, Index, Length);   // JudyHSGet()
       JHSFA(Rc_word, PJHSArray);                  // JudyHSFreeArray()

DESCRIPTION
       A JudyHS array is the equivalent of an array of word-sized
       value/pointers.  An Index is a pointer to an array-of-bytes of
       specified length:  Length.  Rather than using a null terminated string,
       this difference from JudySL(3) allows strings to contain all bits
       (specifically the null character).  This new addition (May 2004) to
       Judy arrays is a hybird using the best capabilities of hashing and Judy
       methods.  JudyHS does not have a poor performance case where knowledge
       of the hash algorithm can be used to degrade the performance.

       Since JudyHS is based on a hash method, Indexes are not stored in any
       particular order.  Therefore the JudyHSFirst(), JudyHSNext(),
       JudyHSPrev() and JudyHSLast() neighbor search functions are not
       practical.  The Length of each array-of-bytes can be from 0 to the
       limits of malloc() (about 2GB).

       The hallmark of JudyHS is speed with scalability, but memory efficiency
       is excellent.  The speed is very competitive with the best hashing
       methods.  The memory efficiency is similar to a linked list of the same
       Indexes and Values.  JudyHS is designed to scale from 0 to billions of
       Indexes.

       A JudyHS array is allocated with a NULL pointer

       Pvoid_t PJHSArray = (Pvoid_t) NULL;

       Because the macro forms of the API have a simpler error handling
       interface than the equivalent functions, they are the preferred way to
       use JudyHS.

JHSI(PValue, PJHSArray, Index, Length) // JudyHSIns()
       Given a pointer to a JudyHS array (PJHSArray), insert an Index string
       of length: Length and a Value into the JudyHS array:  PJHSArray.  If
       the Index is successfully inserted, the Value is initialized to 0.  If
       the Index was already present, the Value is not modified.

       Return PValue pointing to Value.  Your program should use this pointer
       to read or modify the Value, for example:

       Value = *PValue;
       *PValue = 1234;

       Note: JHSI() and JHSD can reorganize the JudyHS array.  Therefore,
       pointers returned from previous JudyHS calls become invalid and must be
       re-acquired (using JHSG()).

JHSD(Rc_int, PJHSArray, Index, Length) // JudyHSDel()
       Given a pointer to a JudyHS array (PJHSArray), delete the specified
       Index along with the Value from the JudyHS array.

       Return Rc_int set to 1 if successfully removed from the array.  Return
       Rc_int set to 0 if Index was not present.

JHSG(PValue, PJHSArray, Index, Length) // JudyHSGet()
       Given a pointer to a JudyHS array (PJHSArray), find Value associated
       with Index.

       Return PValue pointing to Index's Value.  Return PValue set to NULL if
       the Index was not present.

JHSFA(Rc_word, PJHSArray) // JudyHSFreeArray()
       Given a pointer to a JudyHS array (PJHSArray), free the entire array.

       Return Rc_word set to the number of bytes freed and PJHSArray set to
       NULL.

ERRORS: See: Judy_3.htm#ERRORS
EXAMPLES
       Show how to program with the JudyHS macros.  This program will print
       duplicate lines and their line number from stdin.

       #include <unistd.h>
       #include <stdio.h>
       #include <string.h>
       #include <Judy.h>

       //  Compiled:
       //  cc -O PrintDupLines.c -lJudy -o PrintDupLines

       #define MAXLINE 1000000                 /* max fgets length of line */
       uint8_t   Index[MAXLINE];               // string to check

       int     // Usage:  PrintDupLines < file
       main()
       {
           Pvoid_t   PJArray = (PWord_t)NULL;  // Judy array.
           PWord_t   PValue;                   // Judy array element pointer.
           Word_t    Bytes;                    // size of JudyHS array.
           Word_t    LineNumb = 0;             // current line number
           Word_t    Dups = 0;                 // number of duplicate lines

           while (fgets(Index, MAXLINE, stdin) != (char *)NULL)
           {
               LineNumb++;                     // line number

               // store string into array
               JHSI(PValue, PJArray, Index, strlen(Index));
               if (PValue == PJERR)            // See ERRORS section
               {
                   fprintf(stderr, "Out of memory -- exit\n");
                   exit(1);
               }
               if (*PValue == 0)               // check if duplicate
               {
                   Dups++;
                   printf("Duplicate lines %lu:%lu:%s", *PValue, LineNumb, Index);
               }
               else
               {
                   *PValue = LineNumb;         // store Line number
               }
           }
           printf("%lu Duplicates, free JudyHS array of %lu Lines\n",
                           Dups, LineNumb - Dups);
           JHSFA(Bytes, PJArray);              // free JudyHS array
           printf("JudyHSFreeArray() free'ed %lu bytes of memory\n", Bytes);
           return (0);
       }

AUTHOR
       JudyHS was invented and implemented by Doug Baskins after retiring from
       Hewlett-Packard.

SEE ALSO
       Judy(3), Judy1(3), JudyL(3), JudySL(3),
       malloc(),
       the Judy website, http://judy.sourceforge.net, for further information
       and Application Notes.

                                                                     JudyHS(3)