ASORTS

   General-purpose routines for sorting, searching and moving arrays of arbitrary elements.

                 Copyright 1991, by J. W. Rider
 

ASORTS provides the Turbo Pascal programmer with type definitions, functions and procedures to handle a variety of array sorting and searching tasks.  Several of these are analogous to functions of the same name in the C programming language:

    qsort   --  sort an array

    bsearch --  do a binary search of a sorted array for an element that satisfies some key

    lfind   --  do a sequential ("linear") search of an unsorted (or sorted) array for an element that satisfies some key

    lsearch --  do a sequential search of an unsorted (or sorted) array for an element that satisfies some key.  If the element is not found, then it is appended to the end of the array.

    (Non-C programmers: please note that "bsearch" and "lfind" -- not "lsearch"! -- do the equivalent task for sorted and unsorted arrays, respectively.  It would make more sense to have "bfind" to search through a sorted array, and make "bsearch" insert a missing element into the array at the right location. However, these routines are provided for compatibility in converting C programs.)

    swab     -- move one portion of memory to another, swapping high and low order bytes in the process

There are some routines that are provided for utility above and beyond what is available in standard C libraries:

    bfind    -- do a binary search of a sorted array for an element that satisfies some key. (This differs from "bsearch" in that is "bfind" reports the "insertion index" if the key element is not found.)

    binsert  -- do a binary search of a sorted array for an element that satisfies some key.  If the element is not found, then the key element is inserted in the proper location in the array to maintain the sorted order.

    fibsearch-- searches an ordered array using a "fibonacci" search algorithm.

    fill     -- moves a single item into all of the elements of an array

    heapsort -- order an array by the "heapsort" algorithm.

    longaddr -- returns the address of a variable expressed as a longint value

    naivesort-- a particularly inefficient sorting algorithm that the author has been known to use when "qsort" was not available. (The use of "naivesort" in applications is NOT recommended!)

    scan     -- returns the index of the first element in an array that meets a specific criterion.

    selsort  -- "selection" sorting, another sorting routine.

    shellsort-- yet another sorting routine, different from the rest.

    shuffle  -- randomly permutes ("unsorts") the elements of an
                array

    subfill  -- moves a single item into all of the elements of a subarray of the base array

    submove  -- moves the elements of the source array into a subarray of the destination array (or, the elements of a subarray of the source to the destination array)

    swap     -- Exchanges two array elements or variables.

                (NOTE!  The "Asorts.Swap" procedure replaces the default "System.Swap" which simply exchanges the high and low-order bytes of a two byte expression.)

    vqsort   -- a quicksort algorithm for "virtual" arrays.  "VQSort" does not presume any kind of structure inherent within the "thing" being sorted. Instead, "VQSort" provides the sorting logic to other routines that will perform the actual comparisons and exchanges.

    vselsort -- a "virtual" selection sort.

    vshuffle -- a "virtual" shuffler.

    xsubmove -- moves the elements of a subarray of the source array into the elements of a subarray of the destination.
 

While these routines are provided to be of assistance to the programmer, the number of different searching and sorting algorithms does raise the issue of how to select any one algorithm for the programmer to employ. Unfortunately, that is very application dependent.  For general purpose array sorting, you would do well to compare the "qsort" and "shellsort" routines on your actual data.  For sorting of typed files, I'd suggest a
variant of "VSelSort".
 

CONCEPTS

The ARRAYs to be manipulated are passed as "untyped vars" to these routines.  (In the "Interface" section, these arrays are called "base", "source" or "destination".)  These routines will treat the ARRAYs as if they were declared to be of type:

         ArrayType = ARRAY [1..MaxArray] OF ElementType

   WARNING!  It is *very* important to avoid defining an array so that the last byte is in a memory segment different from the first byte.  As long as you never declare an array larger than 65519, or $10000-15, bytes, it should not be a problem.

Each ELEMENT of the ARRAY is presumed to be fixed size, and this size must be passed to the routines.  (In the "Interface" section, if an ELEMENT needs to passed directly to a routine as an argument, it is passed as an untyped var called "key".) Also, the number of elements in the ARRAY must also be passed.  For instance, to fill an array of real numbers with 0:

var
  RealArray : array [1..10] of Real;
  x         : real;

    x:=0;
    fill(RealArray,10,sizeof(real),x);

A SUBARRAY is a "byte-slice" of an array.  For instance, if "ElementType" is an "array [1..8] of byte", then a "subelement" would be any contiguous collection of bytes within the element, like 3,4 and 5.  The SUBARRAY would be the collection of all of the subelements stored in an ARRAY.  If "ElementType" is a record of fields, then a "subelement" would be any contigous group of fields.

For sorting and searching, a COMPARISON FUNCTION must be passed to the routines.  COMPARISON FUNCTIONs take two untyped vars, return a longint value, and must be declared "far" at compilation. (DIFFERENT! In C, only an integer-sized value is returned.)  For instance, to sort the array of real numbers declared earlier:

    function RealCompare(var a,b):longint; far;
    begin
       if real(a)>real(b) then realcompare:=1
       else if real(a)<real(b) then realcompare:=-1
       else realcompare:=0;
    end;

    qsort(realarray,10,sizeof(real),realcompare);
 

"Virtual" arrays are data structures whose elements can be accessed indirectly by an index.  For instance, information that is physically stored in multiple arrays might be sorted by a key in just one of the arrays.  ASORTS provides a few routines for handling such "virtual" arrays.  "VQSort" and "VSelSort" will provide the sorting logic for ordering the arrays.  "VShuffle" will similarly "unorder" the array. To sort an array of "DBRec" with respect to an array of integer priorities, declare:

   var Array1 : array [1..MaxDBRec] of DBRec;
       Priority: array [1..MaxDBRec] of integer;

   function ComparePriority(a,b:longint):longint; far;
   begin ComparePriority:=longint(Priority[a])-Priority[b] end;

   procedure SwapPriDBRec(a,b:longint); far;
   begin asorts.swap(Priority[a],Priority[b],sizeof(integer));
         asorts.swap(Array1[a],Array1[b],sizeof(DBRec)); end;

and sort the two arrays with:

   vqsort(MaxDBRec,ComparePriority,SwapPriDBRec);
 

INTERFACE

function bfind(var key{:element}, base{:array};
                   length_base, sizeof_element:word;
                   f:comparefunc):longint;

    Searches a sorted array for a "key" element. Return the index of its location, or the negative of the index of the largest element in the array that is smaller than the key (i.e., the element that you want to insert the new element after).
 

function binsert(var key{:element}, base{:array};
                     length_base, sizeof_element:word;
                     f:comparefunc):longint;

   WARNING: This routine overwrites memory if used incorrectly.

    Inserts the key element into the correct position of an ordered array. Unlike "lsearch", which only adds the key if it's not already present, "binsert" ALWAYS inserts a new element into the array. "Binsert" returns the index where the element is inserted.
 

function bsearch(var key{:element}, base{:array};
                     length_base, sizeof_element:word;
                     f:comparefunc):word;

    "Bsearch" attempts to locate the "key" element within the previously SORTED array "base".  If successful, the index of the found element is returned; otherwise, 0 is returned to indicate that the element is not present.
 

type comparefunc = function (var a,b{:element}):longint; {far;}

    Declares the type of the comparison function to be passed to sorting and searching routines. CompareFunc's are user-defined functions that takes two arguments a and b. A and B must be typeless in the declaration, but otherwise are of the same type as the elements of the "base" array. For "qsort" and "bsearch", the function needs to return a negative integer if "A<B"; a positive integer if "A>B"; and 0 if "A=B". For "lfind" and "lsearch", the function needs to return 0 if "A=B", and some non-zero integer if "A<>B".
 

function fibsearch(var key{:element}, base{:array};
                       length_base, sizeof_element:word;
                       f:comparefunc):word;

    "Fibsearch" attempts to locate the "key" element within the previously SORTED array "base".  If successful, the index of the found element is returned; otherwise, 0 is returned to indicate that the element is not present.  (This procedure is included because a user asked about the algorithm on one of Borland's CompuServe Forums.  It looks like an interesting alternative to "bsearch", but I have not run extensive comparisons.)
 

procedure fill(var key{:element}, destination{:array};
                   count, sizeof_element:word);

   WARNING: This routine overwrites memory if used incorrectly.

    Moves the "key" element to the first "count" indices in the "destination" array.
 

procedure heapsort(var base {:array};
                   length_base, sizeof_element:word;
                   f:comparefunc);

   WARNING: This routine overwrites memory if used incorrectly.

    "HeapSort" reorders the elements of the "base" array using a heapsort algorithm (like, you expected something else?).  The function "f" is used to compare two elements of the array, and must return a negative number if the first argument is "less than" the second, a postive number if the first argument is "greater than" the second, and zero if the two arguments are "equal".
 

type icomparefunc = function (a,b:longint):longint;

    Declares the type of the comparison function to be passed as an argument for sorting "virtual" arrays.  Instead of passing the elements to be compared as is done for "comparefunc", ICompareFunc's use the location of the elements.
 

function lfind(var key{:element}, base{:array};
                   length_base, sizeof_element:word;
                   f:comparefunc):word;

    "Lfind" attempts to locate the "key" element within the array "base".  If successful, the index of the found element is returned; otherwise, 0 is returnd to indicate the element is not present.
 

function longaddr (var x): longint;

    Returns the address of a variable expressed as a longint value so that address can be used for comparisons, etc.
 

function lsearch(var key{:element}, base{:array};
                     length_base, sizeof_element:word;
                     f:comparefunc):word;

   WARNING: This routine overwrites memory if used incorrectly.
 

    Does a linear search of the "base" array, looking for the "key" element.  If the key element is found, "lsearch" returns the index of the array.  *** NOTE! ***  Otherwise, the key element is appended to the end of the array.  It is the programmer's responsibility to ensure that "sizeof_element" bytes are available at the end of the array and that the count of contained elements is adjusted.  To avoid the append, use "lfind" instead.
 

var monitor : procedure;
procedure nullmonitor;

    "Monitor" and "NullMonitor" were debugging devices developed in the process of putting together the ASORTS unit. They can be optionally declared by defining a compilation variable "MONITOR". Every time that the "Qsort" algorithm swaps a pair of array elements, the "monitor" procedure is called. This will allow the user to watch the progress of the sort. This is of marginal practical value, and by default, these two identifiers are not defined. Calling "NullMonitor" will turn off monitoring even if the "monitor" procedure variable is defined.
 

procedure naivesort(var base {:array};
                    length_base, sizeof_element:word;
                    f:comparefunc);

  WARNING: This routine slowly overwrites memory if used incorrectly.

    "NaiveSort" also reorders the elements or an array.  However, it does it more slowly than "QSort".  The use of the procedure is not recommended.  It is provided for comparison only.  (i.e., don't waste your time trying to improve upon it.  It's only here for comic relief.)
 

procedure qsort(var base {:array};
                    length_base, sizeof_element:word;
                    f:comparefunc);

   WARNING: This routine overwrites memory if used incorrectly.

    "Qsort" reorders the elements of the "base" array using a quicksort algorithm.  The function "f" is used to compare two elements of the array, and must return a negative number if the first argument is "less than" the second, a postive number if the first argument is "greater than" the second, and zero if the two arguments are "equal".
 

function scan(var source; count, sizeof_element:word;
                  f:testfunc):word;

    "Scan" does a linear search of the "source" array and returns the index of the first element that satisfies the
    test function "f".  If no element is found, "scan" returns 0. procedure

selsort(var base{:array}; length_base, sizeof_element:word; f:comparefunc);

   WARNING: This routine overwrites memory if used incorrectly.

    "SelSort" reorders the elements of the "base" array using a selection sort algorithm.  This algorithm minimizes the number of times that each element is moved, but will maximize the of comparisons.  For most applications, the comparisons take more time than the exchanges, so you are not likely to want to use this algorithm for ordinary array sorting.  See also, "VSelSort".
 

procedure shellsort(var base{:array};
                    length_base, sizeof_element:word;
                    f:comparefunc);

   WARNING: This routine overwrites memory if used incorrectly.

    "ShellSort" reorders the elements of the "base" array using a sort routine first defined by an individual whose last name was Shell.  In concept, the algorithm is an insertion-type sort (as opposed to exchange-type sort (of which "qsort" and "selsort" are examples).  This turns out to be a very efficient sorting routine for in-memory sorting.
 

procedure shuffle(var base{:array};
                      length_base, sizeof_element:word);

   WARNING: This routine overwrites memory if used incorrectly.

    Randomly permutes ("unsorts") the elements of an array. The SYSTEM "Randomize" procedure should be called at least once in a program that shuffles an array.
 

procedure subfill(var key{:subelement}, destination{:subarray};
                  count, sizeof_key, sizeof_element:word);

   WARNING: This routine overwrites memory if used incorrectly.

    Partially fills an array with the "key". The address of "Destination" should be the address of the first byte in the subarray. The portion of the array outside of the subarray is left unchanged.
 

procedure submove(var source{:[sub]array},
                      destination{:[sub]array};
                      count,
                      sizeof_source_element,
                      sizeof_destination_element:word);

   WARNING: This routine overwrites memory if used incorrectly.

    If the size of the source elements are smaller than that of the destination elements, moves the first "count" elements of the source into a subarray of the same size in destination. If larger, only moves that portion of the source that will fill the first "count" elements of the destination. If equal in size, does a simple "move" of the bytes.
 

procedure swab(var source, destination; numwords:word);

   WARNING: This routine overwrites memory if used incorrectly.

    Moves the contents of source to the destination, swapping high and low order bytes in the process.  (Note:  while this is provided for C compatibility, the third argument is used differently.  In C, the third argument is the number of bytes to move and is expected to be even.  Here, the third argument is the number of byte-pairs to move.)
 

procedure swap(var var1,var2; sizeof_element:word);

   WARNING: This routine overwrites memory if used incorrectly.

    "Swap" exchanges the contents of the variable "var1" with the contents of the variable "var2".  (Note: this is a redefinition of the "swap" function defined in the System unit.)
 

type swapproc = procedure (a,b:longint); {far;}

    Declares the type of the exchange procedure that is passed as an argument to "selsort" and other virtual exchange sort algorithms.
 

type testfunc = function (var a):boolean; {far;}

    Declares the type of the criterion function to be passed to the "scan" function.  The actual TestFunc should expect an array element to be passed through "a" and return true if the element satisfies the criteria.  Otherwise, it should return false.
 

procedure vqsort(length_base:longint; f:icomparefunc; s:swapproc);

    "VQSort" provides the logic to do a quicksort of an indexed entity (a "virtual" array), but depends upon the user-defined routines "f" and "s" to do the actual work of accessing specific elements in the array, comparing and exchanging them.  This makes VQSort useful for sorting elements when they are stored in something other than a single contiguous array.
 

procedure vselsort(length_base:longint; f:icomparefunc; s:swapproc);

    "VSelSort" provides the logic to do a selection sort of an indexed entity (a "virtual" array), but depends upon the user-defined routines "f" and "s" to do the actual work of accessing specific elements in the array, comparing and exchanging them.  As mentioned in the description for "SelSort", the algorithm minimizes the number of exchanges of elements required to put something into sorted order, but makes a large number of comparisons to do so.  This would make "VSelSort" useful for something like sorting an external file of larger records based upon integer keys stored in an array in memory.
 

procedure vshuffle(length_base:longint; s:swapproc);

    Of course, what fun is there in being able to order virtual arrays if we can't mix all the elements up again?

procedure xsubmove(var source{:subarray}, destination{:subarray};
                       count, sizeof_source_element,
                       sizeof_destination_element,
                       sizeof_subelement:word);

   WARNING: This routine overwrites memory if used incorrectly.

    Moves a subarray from the source to the destination. The size of the subelements is presumed to be the same in both subarrays. "XsubMove" does not check to make sure that the sizes are consistent.
 

REFERENCES

Knuth, The Art of Computer Programming, Sorting and Searching.

Press, et al., Numerical Recipes.

Sedgewick, Algorithms.