UTTIK

       INTEGER FUNCTION  UTTIK (TOKEN,LOCTOK,LENTOK,KEYSTR,LENSTR,PKEY,
      X                        LKEY,NKEY,MINMAT)
 C$    (Index of Key)
 C$    Given a  token  of  LENTOK characters  packed  in  TOKEN(*)
 C$    beginning at position LOCTOK,  and an alphabetical list  of
 C$    keywords packed in the first LENSTR characters of KEYSTR(*)
 C$    in the form
 C$
 C$    *KEY1*KEY2*KEY3*...*KEYN
 C$
 C$    where the first character  is the key separator  character,
 C$    return the  number of  the key  which matches,  or if  none
 C$    match, return the negative of  the number of the key  which
 C$    would PRECEDE the  token if  it were inserted  in the  list
 C$    (allowing the result to be used for installing a new key in
 C$    alphabetical  order).   Letter  case  is  ignored  in   the
 C$    comparisons.  Trailing  white  space in  TOKEN(*)  will  be
 C$    ignored, implicitly  reducing  LENTOK.  The  key  separator
 C$    character may  be  any character  which  is not  a  keyword
 C$    character.
 C$
 C$    MINMAT is the minimum number  of characters required for  a
 C$    match; it may be set to enforce the use of a certain number
 C$    of leading characters in keywords.  Because a test is
 C$    always made for ambiguous matches, setting MINMAT to 1
 C$    accepts the shortest abbreviation that is not ambiguous.
 C$    If a keyword contains fewer than MINMAT characters, then
 C$    the keyword length is used for the minimum match test.
 C$    If MINMAT .LE. 0, then an exact keyword match is
 C$    required.
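 C$
 C$    The MINMAT rule for a single keyword can be sketched as
 C$    follows.  This is a minimal model in free-form Fortran,
 C$    not the PLOT79 source: ABBREV_OK and UPCASE are
 C$    hypothetical names, plain CHARACTER strings stand in for
 C$    the packed arrays, only trailing blanks are trimmed, and
 C$    the ambiguity test across the whole table is left out.

```fortran
program minmat_demo
  implicit none
  ! Hypothetical model of the MINMAT rule for ONE keyword;
  ! ABBREV_OK is an illustrative name, not a PLOT79 routine.
  print *, abbrev_ok('LINE', 'LINESTYLE', 1)   ! T: prefix, 4 >= 1
  print *, abbrev_ok('LI', 'LINESTYLE', 3)     ! F: fewer than MINMAT chars
  print *, abbrev_ok('UP', 'UP', 3)            ! T: key shorter than MINMAT
  print *, abbrev_ok('LINE', 'LINESTYLE', 0)   ! F: MINMAT <= 0 needs exact
contains
  logical function abbrev_ok(token, key, minmat)
    character(len=*), intent(in) :: token, key
    integer, intent(in) :: minmat
    integer :: n, m, need
    n = len_trim(token)          ! trailing blanks ignored (blanks only)
    m = len_trim(key)
    abbrev_ok = .false.
    if (minmat <= 0) then
      need = m                   ! exact match: the whole key is required
    else
      need = min(minmat, m)      ! short keys lower the requirement
    end if
    if (n < need .or. n > m .or. n == 0) return
    abbrev_ok = (upcase(token(1:n)) == upcase(key(1:n)))
  end function abbrev_ok

  pure function upcase(s) result(u)
    ! Fold ASCII lower-case letters to upper case.
    character(len=*), intent(in) :: s
    character(len=len(s)) :: u
    integer :: i, c
    do i = 1, len(s)
      c = iachar(s(i:i))
      if (c >= iachar('a') .and. c <= iachar('z')) c = c - 32
      u(i:i) = achar(c)
    end do
  end function upcase
end program minmat_demo
```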
 C$
 C$    The  table  is  required   to  be  alphabetically   ordered
 C$    according to  the  ASCII  collating  sequence  to  preserve
 C$    machine-independence.  White  space is  defined to  be  any
 C$    ASCII control character  (0..31, 127) or  blank (32).   Key
 C$    number k (k in 1..NKEY) begins at PKEY(k) and has length
 C$    LKEY(k).  The alphabetical ordering, together with these
 C$    index arrays, allows the table to be binary searched
 C$    rather than linearly searched.  Timing tests with a table of
 C$    about 40 keywords showed that linear searching was from  20
 C$    to 40 times slower than binary searching, and in the SLIDES
 C$    program about 35 percent of the execution time was spent in
 C$    the keyword matching  code sections with  a linear  search.
 C$    Binary search is therefore very worthwhile.  It should also
 C$    be noted that  because keyword  abbreviations are  allowed,
 C$    hash coding techniques  (which would  otherwise be  faster,
 C$    requiring seldom more than 1 comparison per lookup)  cannot
 C$    be used.
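 C$
 C$    A binary search over PKEY(*) and LKEY(*) with the signed
 C$    result described earlier might look like the sketch below.
 C$    It is an illustration only: FIND_KEY is a hypothetical
 C$    name, the table is a plain CHARACTER string with
 C$    upper-case keys, only exact matches are handled (the
 C$    MINMAT .LE. 0 case), and returning 0 for a token that
 C$    precedes every key is an assumption of the sketch.

```fortran
program bsearch_demo
  implicit none
  ! Hypothetical table: plain CHARACTER data stand in for the packed
  ! KEYSTR(*) array; PKEY/LKEY are written out by hand for this demo.
  integer, parameter :: nkey = 4
  character(len=19) :: keystr = '*LINE*MOVE*PEN*TEXT'
  integer :: pkey(nkey) = [2, 7, 12, 16]
  integer :: lkey(nkey) = [4, 4, 3, 4]
  print *, find_key('pen')      !  3: matches key 3
  print *, find_key('ORIGIN')   ! -2: would follow key 2 (MOVE)
  print *, find_key('AXIS')     !  0: would precede every key (assumed)
contains
  integer function find_key(token)
    character(len=*), intent(in) :: token
    character(len=len(token)) :: t
    integer :: lo, hi, mid, first, last
    t = upcase(token)
    lo = 1
    hi = nkey
    do while (lo <= hi)
      mid = (lo + hi) / 2
      first = pkey(mid)
      last = first + lkey(mid) - 1
      ! LLT/LGT compare in ASCII order and blank-pad the shorter string.
      if (llt(t, keystr(first:last))) then
        hi = mid - 1
      else if (lgt(t, keystr(first:last))) then
        lo = mid + 1
      else
        find_key = mid
        return
      end if
    end do
    find_key = -hi               ! hi is now the last key below the token
  end function find_key

  pure function upcase(s) result(u)
    ! Fold ASCII lower-case letters to upper case.
    character(len=*), intent(in) :: s
    character(len=len(s)) :: u
    integer :: i, c
    do i = 1, len(s)
      c = iachar(s(i:i))
      if (c >= iachar('a') .and. c <= iachar('z')) c = c - 32
      u(i:i) = achar(c)
    end do
  end function upcase
end program bsearch_demo
```

 C$    With 4 keys the loop makes at most 3 probes, against up
 C$    to 4 comparisons for a linear scan; the gap widens as the
 C$    table grows.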
 C$
 C$    FUNCTION UTTCK  should  be used  to  compute NKEY, PKEY(*)
 C$    and LKEY(*) automatically from the contents of KEYSTR(*),
 C$    so that this tedious and error-prone task need never be
 C$    done by hand.
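 C$
 C$    To show what such an index contains, here is a minimal
 C$    sketch that derives the three quantities from a key
 C$    string.  BUILD_INDEX is a hypothetical stand-in and does
 C$    not reproduce the argument list or packed-character
 C$    handling of UTTCK; it assumes a plain CHARACTER string
 C$    whose first character is the separator.

```fortran
program index_demo
  implicit none
  integer, parameter :: maxkey = 16
  character(len=19) :: keystr = '*LINE*MOVE*PEN*TEXT'
  integer :: pkey(maxkey), lkey(maxkey), nkey, k
  call build_index(keystr, pkey, lkey, nkey)
  do k = 1, nkey
    ! Prints 2 4 LINE, 7 4 MOVE, 12 3 PEN, 16 4 TEXT after each k.
    print *, k, pkey(k), lkey(k), ' ', keystr(pkey(k):pkey(k)+lkey(k)-1)
  end do
contains
  subroutine build_index(str, p, l, n)
    ! Hypothetical helper: record start and length of every key in STR.
    character(len=*), intent(in) :: str
    integer, intent(out) :: p(:), l(:), n
    character(len=1) :: sep
    integer :: i, start
    logical :: boundary
    n = 0
    if (len(str) < 2) return
    sep = str(1:1)               ! first character is the separator
    start = 2
    do i = 2, len(str) + 1
      ! A key ends at a separator or just past the end of the string.
      boundary = (i > len(str))
      if (.not. boundary) boundary = (str(i:i) == sep)
      if (boundary) then
        if (i > start .and. n < size(p)) then
          n = n + 1
          p(n) = start
          l(n) = i - start
        end if
        start = i + 1
      end if
    end do
  end subroutine build_index
end program index_demo
```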
 C$
 C$    This function  provides a  convenient way  of  implementing
 C$    symbol table  lookup  for input  languages  and  optionally
 C$    permitting  abbreviated  command  keywords.   The  function
 C$    value returned can  be used  as a CASE  statement index  to
 C$    select processing  according to  the matched  keyword.   If
 C$    abbreviations are  to be  disallowed, then  simply  setting
 C$    MINMAT = 0 will prevent their recognition.
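 C$
 C$    A dispatch on the returned index could be sketched as
 C$    below.  The keyword numbering (1 = LINE, 2 = MOVE,
 C$    3 = PEN) and the DISPATCH routine are hypothetical, and
 C$    the argument K merely stands in for a value returned by
 C$    UTTIK.

```fortran
program dispatch_demo
  implicit none
  ! K stands in for a value returned by UTTIK; the table is assumed
  ! to hold 1 = LINE, 2 = MOVE, 3 = PEN (hypothetical keywords).
  call dispatch(2)
  call dispatch(-1)
contains
  subroutine dispatch(k)
    integer, intent(in) :: k
    select case (k)
    case (1)
      print *, 'LINE: draw a line'
    case (2)
      print *, 'MOVE: move the pen'
    case (3)
      print *, 'PEN: select a pen'
    case (:0)
      ! Non-positive result: no key matched the token.
      print *, 'no match; token would follow key number', -k
    case default
      print *, 'key number outside the table:', k
    end select
  end subroutine dispatch
end program dispatch_demo
```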
 C$
 C$    Occasionally  one  wishes  to  have  keywords  with  common
 C$    prefixes, but allow an  abbreviation, which otherwise  does
 C$    not distinguish between them,  to select a particular  one.
 C$    For example,  a graphics  program might  have the  keywords
 C$    LINE, LINECOLOR, LINEINTENSITY,  LINESTYLE, and  LINEWIDTH,
 C$    where one would like  to accept the token  LIN to mean  the
 C$    first of them.   This situation  can be  easily handled  by
 C$    putting the preferred keyword in one keyword string and the
 C$    others in  a second  keyword string  which is  not  scanned
 C$    until the first has been examined.
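 C$
 C$    The two-table arrangement can be modelled as in the sketch
 C$    below.  It is not the PLOT79 code: UNIQUE_PREFIX, RESOLVE
 C$    and UPCASE are hypothetical names, the tables are ordinary
 C$    CHARACTER arrays, and a linear scan is used for brevity.

```fortran
program two_table_demo
  implicit none
  ! PRIMARY holds the preferred keyword; OTHERS holds its longer
  ! relatives and is consulted only when PRIMARY gives no match.
  character(len=13) :: primary(1) = [character(len=13) :: 'LINE']
  character(len=13) :: others(4) = [character(len=13) :: 'LINECOLOR', &
       'LINEINTENSITY', 'LINESTYLE', 'LINEWIDTH']
  call resolve('LIN')      ! LIN -> LINE
  call resolve('linew')    ! linew -> LINEWIDTH
  call resolve('LINES')    ! LINES -> LINESTYLE
contains
  integer function unique_prefix(token, keys)
    ! Number of the only key that TOKEN abbreviates; 0 if none or several.
    character(len=*), intent(in) :: token, keys(:)
    integer :: k, n, hits
    n = len_trim(token)
    hits = 0
    unique_prefix = 0
    do k = 1, size(keys)
      if (n >= 1 .and. n <= len_trim(keys(k))) then
        if (upcase(keys(k)(1:n)) == upcase(token(1:n))) then
          hits = hits + 1
          unique_prefix = k
        end if
      end if
    end do
    if (hits /= 1) unique_prefix = 0
  end function unique_prefix

  subroutine resolve(token)
    character(len=*), intent(in) :: token
    integer :: k
    k = unique_prefix(token, primary)
    if (k > 0) then
      print *, trim(token), ' -> ', trim(primary(k))
      return
    end if
    k = unique_prefix(token, others)
    if (k > 0) then
      print *, trim(token), ' -> ', trim(others(k))
    else
      print *, trim(token), ' -> no unique match'
    end if
  end subroutine resolve

  pure function upcase(s) result(u)
    ! Fold ASCII lower-case letters to upper case.
    character(len=*), intent(in) :: s
    character(len=len(s)) :: u
    integer :: i, c
    do i = 1, len(s)
      c = iachar(s(i:i))
      if (c >= iachar('a') .and. c <= iachar('z')) c = c - 32
      u(i:i) = achar(c)
    end do
  end function upcase
end program two_table_demo
```

 C$    Had all five keywords shared one table, LIN would match
 C$    every one of them and be rejected as ambiguous.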
 C$
 C$    For the benefit of users  on non-ASCII machines, here is  a
 C$    table of the ASCII character set:
 C$
 C$    ============================    ============================
 C$    HEX  DEC  OCT  CHR  Remark      HEX  DEC  OCT  CHR  Remark
 C$    ============================    ============================
 C$    00    0  000  NUL                20   32  040       Space
 C$    01    1  001  SOH                21   33  041   !   Exclamation
 C$    02    2  002  STX                22   34  042   "   Double Quote
 C$    03    3  003  ETX                23   35  043   #   Number
 C$    04    4  004  EOT                24   36  044   $   Dollar
 C$    05    5  005  ENQ                25   37  045   %   Percent
 C$    06    6  006  ACK                26   38  046   &   Ampersand
 C$    07    7  007  BEL                27   39  047   '   Apostrophe
 C$    08    8  010  BS                 28   40  050   (   Left Paren
 C$    09    9  011  HT                 29   41  051   )   Right Paren
 C$    0A   10  012  LF                 2A   42  052   *   Asterisk
 C$    0B   11  013  VT                 2B   43  053   +   Plus
 C$    0C   12  014  FF                 2C   44  054   ,   Comma
 C$    0D   13  015  CR                 2D   45  055   -   Minus
 C$    0E   14  016  SO                 2E   46  056   .   Period
 C$    0F   15  017  SI                 2F   47  057   /   Forward Slash
 C$    10   16  020  DLE                30   48  060   0   Zero
 C$    11   17  021  DC1                31   49  061   1   One
 C$    12   18  022  DC2                32   50  062   2   Two
 C$    13   19  023  DC3                33   51  063   3   Three
 C$    14   20  024  DC4                34   52  064   4   Four
 C$    15   21  025  NAK                35   53  065   5   Five
 C$    16   22  026  SYN                36   54  066   6   Six
 C$    17   23  027  ETB                37   55  067   7   Seven
 C$    18   24  030  CAN                38   56  070   8   Eight
 C$    19   25  031  EM                 39   57  071   9   Nine
 C$    1A   26  032  SUB                3A   58  072   :   Colon
 C$    1B   27  033  ESC                3B   59  073   ;   Semicolon
 C$    1C   28  034  FS                 3C   60  074   <   Left angle
 C$    1D   29  035  GS                 3D   61  075   =   Equals
 C$    1E   30  036  RS                 3E   62  076   >   Right angle
 C$    1F   31  037  US                 3F   63  077   ?   Query
 C$    40   64  100   @   At sign       60   96  140   `   Accent grave
 C$    41   65  101   A   Upper-case    61   97  141   a   Lower-case
 C$    42   66  102   B                 62   98  142   b
 C$    43   67  103   C                 63   99  143   c
 C$    44   68  104   D                 64  100  144   d
 C$    45   69  105   E                 65  101  145   e
 C$    46   70  106   F                 66  102  146   f
 C$    47   71  107   G                 67  103  147   g
 C$    48   72  110   H                 68  104  150   h
 C$    49   73  111   I                 69  105  151   i
 C$    4A   74  112   J                 6A  106  152   j
 C$    4B   75  113   K                 6B  107  153   k
 C$    4C   76  114   L                 6C  108  154   l
 C$    4D   77  115   M                 6D  109  155   m
 C$    4E   78  116   N                 6E  110  156   n
 C$    4F   79  117   O                 6F  111  157   o
 C$    50   80  120   P                 70  112  160   p
 C$    51   81  121   Q                 71  113  161   q
 C$    52   82  122   R                 72  114  162   r
 C$    53   83  123   S                 73  115  163   s
 C$    54   84  124   T                 74  116  164   t
 C$    55   85  125   U                 75  117  165   u
 C$    56   86  126   V                 76  118  166   v
 C$    57   87  127   W                 77  119  167   w
 C$    58   88  130   X                 78  120  170   x
 C$    59   89  131   Y                 79  121  171   y
 C$    5A   90  132   Z                 7A  122  172   z
 C$    5B   91  133   [   Left bracket  7B  123  173   {   Left brace
 C$    5C   92  134   \   Back slash    7C  124  174   |   Vertical bar
 C$    5D   93  135   ]   Right bracket 7D  125  175   }   Right brace
 C$    5E   94  136   ^   Circumflex    7E  126  176   ~   Tilde
 C$    5F   95  137   _   Underscore    7F  127  177   DEL Delete
 C$    ============================    ============================
 C$
 C$    Characters 0..31 and 127 are control characters and have no
 C$    assigned printer graphic.  The important thing to note  for
 C$    most applications  of  this  routine is  the  order:  space
 C$    before digits before letters.   This is different from  the
 C$    order  in  Honeywell  BCD,  CDC,  IBM  EBCDIC,  and  UNIVAC
 C$    FIELDATA character sets.
 C$    (11-SEP-85)
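 C$
 C$    As a small check of the ordering just described, the
 C$    sketch below uses the standard intrinsics IACHAR and LLT,
 C$    which work in the ASCII collating sequence whatever the
 C$    native character set of the machine.  The sample strings
 C$    are made up for the demonstration.

```fortran
program collate_demo
  implicit none
  ! ASCII codes: space, digit zero, upper-case A.
  print *, iachar(' '), iachar('0'), iachar('A')           ! 32 48 65
  ! Space sorts before digits, and digits before letters.
  print *, llt('LINE 2', 'LINE2'), llt('LINE2', 'LINEA')   ! T T
end program collate_demo
```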