back to Axiomatic Language Home Page

Phonecode in 8 Lines



1. Introduction

This webpage gives an axiomatic language specification for the "phonecode" problem of Lutz Prechelt:

      [Prechelt 2000]   L. Prechelt, "An empirical comparison of seven programming languages", IEEE Computer, vol. 33, no. 10, pp. 23-29, Oct. 2000, doi: 10.1109/2.876288, link.

      [Prechelt 2000TR]   L. Prechelt, "An empirical comparison of C, C++, Java, Perl, Python, Rexx, and Tcl for a search/string-processing program", Dept. of Informatics, Univ. of Karlsruhe, Technical Report 2000-5, Mar. 2000.

      [Prechelt 1999]   L. Prechelt, "Technical opinion: Comparing Java vs. C/C++ Efficiency Differences to Interpersonal Differences", Comm. ACM, Vol 42, No.10, pp. 109-112, Oct. 1999.
  -- probably not as relevant

This webpage will be gradually refined in preparation for the July 8, 2024 short paper deadline for ICLP 2024. A definition of axiomatic language can be found at this webpage.

The software engineering assumption for this webpage is that programming time is roughly the same per line of code, regardless of language level. So, if a C language program for a problem is tyically 1/3 the size of an assembly language program for the same problem, then we would expect the programmer productivity with C to be roughly 3 times greater.

Another assumption for this webpage is that the automatic transformation of axiomatic language specifications to efficient programs for problems like this can eventually be achieved. Thus, this currently-unexecutable specification can be considered an actual solution to this programming problem.

Section 2 surveys the claims and evidence for the idea that a programming language that enables shorter source programs leads to correspondingly shorter programming time and thus greater programmer productivity. Section 3 defines the phonecode problem. Section 4 gives an axiomatic language specification for the phonecode problem. Section 5 gives an expanded, tutorial version of the specification with additional comments inserted to help explain the details. Finally, section 6 gives some concluding comments and dicusses future work.


2. Programming Language vs. Program Size vs. Programmer Productivity

... tbd ...


3. The Phonecode Problem

... tbd ...


4. Axiomatic Language Phonecode Solution

Our axiomatic language specification below for the phonecode problem has just 8 non-comment lines of code. (Numbers in column 1 count these lines.) Since axiomatic language is an extremely minimal language, it must be built-up with assorted utility functions, such as numbers, higher-order forms, Boolean functions and relations, etc. These utilities would be provided in a standard library, which would be shared by all applications, and thus this reusable code would not be counted as part of the source code for a specific application. Axiomatic language supports this type of generalization and abstraction of "general purpose functionality" from a specific application, thus making the application shorter.

Another characteristic of our phonecode solution is a preference for short names. We also adopt a "dense" style where composed functions are often written on the same line for conciseness (and to keep the line count down).

Some abbreviations used in source code and comments are as follows:
  > - marks a function/map from an expression domain to a range
    _domain>_range - (usage)
  >=0 - zero-or-more
    >=n - greater-than-or-equal-to n
  arg - argument for function or predicate
    iarg - input argument
    oarg - output argument (result of a function)
  ax - axiom(s)
  bin, b - binary (function or relation) - 2 input arguments
  bool, b - Boolean (function that yields true/false)
  cat - [con]catenate char strings
    cat* - concatenate a sequence of char strings into a single string
    chcat* - concatenate nested char expression into single, flat char string
  ch, char - character
  conc, conclu - conclusion (of an axiom)
  cond - condition (of an axiom)
  consec - consecutive (elements in a sequence)
  def - define (a name for a function or predicate expression)
  dic - dictionary (of phonecode code words)
  dig, d - digit
  dup - duplicate (an input arg)
  elem, el - "element" (expression) in a sequence
      (sometimes string vars in a seq may be called elems of that seq)
  elim - eliminate
  expr, ex - expression
  f, `f - false (Boolean value)
  filt - filter out (remove) seq elems which fail a Boolean predicate
  fmt - format
  func, fn, f - function
  ix - index
  ins - insert (an expression into a sequence)
  ins_btw - insert an expression between elements of a sequence
  iota - generate sequence of indices given natural number argument
    iota0n- - generate natural number sequence 0..n-1 given n
  lc, lcltr - lowercase (letter)
  len - natural number length of a sequence
  ltr, l - letter
  num, nat, natnum - natural number
  ph#, phonenum - phone number
  phcode, phc - phonecode problem
  pred - predicate (name)
  rel, r - relation
  repl - replace (seq expr with result of fn applied to that expr)
  rev - reverse (a sequence)
  seq - sequence
  str - string of some "type" (often chars) within a seq
  t, `t - true (Boolean value)
  uc, ucltr - uppercase (letter)
  un - unary (function/relation)
    bun - binary func/rel w its 2 iargs in one 2-elem seq (= unary fn)
  w - with
A "type" or expression syntactic category may be denoted by a name beginning with an underscore: _num, _ltr. A string of >=0 of the same category within a sequence may be denoted ..dig.. .

We specify a program that reads multiple input text files and writes a single output text file by a set of "Program" valid expressions of the following form:
  (Program ..infiles.. _outfile)  - >=0 input files, 1 output file
A text file is a sequence of lines, each represented by a sequence of character strings.

A functional style is used for many predicates:
  (_fn ..iargs.. _oarg)   - func on >=0 input args yields a single output arg
     - input arguments are numbered from 0: iarg0, iarg1, ...
Note that in addition to single-line comments,
   ...code...  ! comment...
we use multi-line comment blocks framed by '!/' and '!\' markers that start in column 1:
  !/ comment...
  comment...
  comment...
  !\ comment...
Comment blocks can be nested.

Recall that symbols can start with any printable character other than blank ` ! # $ % ( ) ' " and symbol characters after the first can be printable characters other than blank ( ). We make use of special characters for, say, higher-order forms.

Our axiomatic language phonecode solution is given below. Indices in column 1 count the non-comment source lines.
 
!---+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8
! --- Phonecode top-level program axiom ---

1 (Program %dic %ph#s %outfile)<     ! Program phonecode
!/ - encode phone numbers with dictionary words, given letter>digit map
  input arg text files:
    %dic - >=0 words of dictionary, one-per-line
      _word is >=1 upper/lower-case ltrs and - " special chars w no spaces
    %ph#s - phone numbers to encode, one-per-line
      phone numbers are >=0 digits and - / special chars w no spaces
  output arg text file:  %outfile - phonenum/encoding pairs, one per line:
    _ph#: _wordd .. _wordd  - encoding "wordd"s separated by blanks
    - _wordd is dic word or single ph# dig if no possible word at that point
      (consecutive digit-wordd's cannot occur together)
!\  - A ph# might not have an encoding and thus wouldn't be in the output.    

! get map of ph# digits to dictionary words
!/ Given a sequence of lc-letter-group symbols, we assign a digit index '0'..
   to each letter based on its group's sequence index.  This letter-to-digit
   mapping, applied to %dic, gives us a word-to-digit-string mapping,
!\ from which we get a digit-string-to-dictionary-word map.
2  ((ltr>dig digs>word) (e jnq rwx dsy ft am civ bku lop ghz) %dic %digs>word),
                       ! 0  1   2   3  4  5   6   7   8   9
    ! Note that we "hard-code" the letter-to-digit map within the Program ax.
!\  ! ltr->dig map + dictionary  -->  digits->word map

! compute all stream solutions of digs>word map applied to phone number digits
3  ((dup (* (filt dig?)) (* (eat (bite1 %digs>word)))
4    (* (filt (~consec? dig?))) swap *pair) %ph#s %ph#codes),
!/ We remove special chars from phone numbers ph#s to get digit-only ph#ds.
   We then incrementally "eat" prefixes from each ph#d while outputting
   a dictionary word whose digit string matches this prefix.  If no word can
   be matched at some point, a single digit of ph#d is copied to the output.
   The resulting "wordd" sequence (wordd = dic word or phonenum digit)
   is a candidate encoding except that we then eliminate encodings where
!\ consecutive wordds are both digits.

! format phone numbers and their codes for output file
5  (((* (.* pair)) cat* *fmt) %ph#codes %outfile).
   ! - %outfile - output is text file of ph#-code pairs, 1 per line
!/ Given a sequence of original phone numbers and their encodings,
   we get phonenum/encoding pairs and then format each of these
!\ pairs for output.

! --- Phonecode app supporting functions ---:

! ltr>dig - define letter-to-digit mapping from seq of lc letter-group symbols
6 (def ltr>dig (*1 el>ix1 dup lc>uc cat)).
  !  -- Could this be a std lib fn?!

! digs>word - get map of digit strings to dictionary words
7 (def digs>word (swap dup (skip 1 ((* (filt ltr?)) **map)) *pair *rev)).
!/ Given the ltr>dig map and a dictionary of words, this function applies
   the map to the dictionary words to get a word>digit_string map, which
!\ is then reversed to give a digits>word map.

! fmt - format phonenum-code pair for output line
8 (def fmt ((ins 1 ": ") (repl 2 (ins_btw ' ')) chcat*)).
!/ Given a ph#-code pair, this function first inserts ": " between the
   phonenum and its code, then inserts a blank between each "wordd" of
   the code, and then flattens the nested characters into a single
!\ character string, which will be a line of the output file.
Links to standard library source files and some of the predicates they define are as follows:

form.ax - utilities involving the form of expressions
  (pair _a _b (_a _b)) - pair two argument exprs into 2-elem seq
  (cat (..str1..) (..str2..) (..str1.. ..str2..)) - concatenate two seqs into one
  (cat* ((..strA..) .. (..strZ..)) (..strA.. .. ..strZ..))
    - concatenate sequence of sequences into one
  (rev _seq _rev) - reverse a sequence

ho.ax - higher-order predicates and functions
  (..f..) - function composition on input arg stack, applied left-to-right
  dup - duplicate the first input argument in a function composition
  swap - swap the first two input arguments for a function composition
  (* _fn/pred), *fn/pred - apply function/predicate to sequences of arguments
  (.* _binfn/rel) - apply binary fn/rel to arg0 and all elems of arg1
  (*. _binfn/rel) - apply binary fn/rel to all elems of arg0 and arg1
  (map _expr _map _expr') - apply _map, consisting of a seq of pairs, to an expr

natnum.ax - natural number predicates and functions
  len - length of a sequence
  iota0n- - sequence of natnums 0..n-1 given n

bit.ax - predicates and functions that use bit representation
  -- no utilities directly used

char.ax - predicates and functions involving characters
  chcat* - flatten nested char expr into a char string
  _dig - select elem [_dig] from a sequence [0..]
  (el>ix1 _elgroups _el>dig) - given seq of elem group seqs, get elem-to-1-dig-index map
  (skip _dig _f) - skip _dig iargs, then apply fn
  (repl _dig _uf) - replace elem [_dig] of iarg0 seq w unary fn _uf applied to it

bool.ax - Boolean predicates and functions (- ending ? is a convention)
  _type? - returns true/false if expr is of given type
  _dig?, _ltr? - returns t/f if char expr is digit, letter
  ~_bfn? - returns complement of Boolean function _bfn?
  (consec? _type?) - t/f if seq has consec elems of given type

stream.ax - incremental stream processing functions
  (bite1 _map) - if possible, bite stream prefix that matches map, outputting expr;
      - else, copy 1 stream elem to output
  (eat _bite) - "eat" stream using all possible bite sequences, outputting expr seqs


5. Expanded, Tutorial Axiomatic Language Solution Presentation

To help the user understand the above specification, we have re-written it over multiple lines with extra comments inserted. Some comments document the actions of individual functions; others show the input argument stack before and after the function invocation. (Our assumption is that a future axiomatic language programmer would have comprehensive familiarity with the standard utility library, so an expanded presentation like this would not be necessary.)
!---+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8
*** expanded program axioms with inserted comments ***
! --- Phonecode top-level program axiom ---


1 (Program %dic %ph#s %outfile)<     ! Program phonecode

!/ copied from above ...
  - encode phone numbers with dictionary words, given letter>digit map
  input arg text files:
    %dic - >=0 words of dictionary, one-per-line
      _word is >=1 upper/lower-case ltrs and - " special chars w no spaces
    %ph#s - phone numbers to encode, one-per-line
      phone numbers are >=0 digits and - / special chars w no spaces
  output arg text file:  %outfile - phonenum/encoding pairs, one per line:
    _ph#: _wordd .. _wordd  - encoding "wordd"s separated by blanks
    - _wordd is dic word or single ph# dig if no possible word at that point
      (consecutive digit-wordd's cannot occur together)
!\  - A ph# might not have an encoding and thus wouldn't be in the output.    


!/ copied ...
! get map of ph# digits to dictionary words
!/ Given a sequence of lc-letter-group symbols, we assign a digit index '0'..
   to each letter based on its group's sequence index.  This letter-to-digit
   mapping, applied to %dic, gives us a word-to-digit-string mapping,
!\ from which we get a digit-string-to-dictionary-word map.
2  ((ltr>dig digs>word) (e jnq rwx dsy ft am civ bku lop ghz) %dic %digs>word),
                       ! 0  1   2   3  4  5   6   7   8   9
    ! Note that we "hard-code" the letter-to-digit map within the Program ax.
!\  ! ltr->dig map + dictionary  -->  digits->word map

! get map of ph# digits to dictionary words
 ((     ! (e jnq rwx ... lop ghz) _dic   - input args for this cond fn
   ltr>dig     ! get map of lower/upper-case letters to digits
        ! _ltr>dig _dic
   digs>word   ! apply ltr>dig map to dictionary
        ! _digs>word - output map of ph# digit prefixes to dictionary words
  ) (e jnq rwx dsy ft am civ bku lop ghz) %dic %digs>word),


!/ copied ...
! compute all stream solutions of digs>word map applied to phone number digits
3  ((dup (* (filt dig?)) (* (eat (bite1 %digs>word)))
4    (* (filt (~consec? dig?))) swap *pair) %ph#s %ph#codes),
!/ We remove special chars from phone numbers ph#s to get digit-only ph#ds.
   We then incrementally "eat" prefixes from each ph#d while outputting
   a dictionary word whose digit string matches this prefix.  If no word can
   be matched at some point, a single digit of ph#d is copied to the output.
   The resulting "wordd" sequence (wordd = dic word or phonenum digit)
   is a candidate encoding except that we then eliminate encodings where
!\ consecutive wordds are both digits.
!\
  
! compute all stream solutions of digs>word map applied to phone number digits
 ((   ! _ph#s
   dup        ! duplicate the _ph#s input arg
      ! _ph#s _ph#s
   (* (filt dig?))  ! filter out non-digits from ph#s - get digit-only ph#ds
      ! _ph#ds _ph#s
   (* (eat (bite1 %digs>word)))  !incrementally "eat" each ph#d to get "codeX"s
      ! - each "bite" matches digit prefix to digs>word map to get output word
      !   If no match, the bite1 fn copies 1 ph#d digit to the output.
      ! - Thus, an output "codeX" encoding is a sequence of wordds, each a
      !   dictionary word or a single digit.  All possible codeXs are
      !   generated by this process, where a codeX encoding may have an
      !   arbitrary number of ph#d digits:      
      ! (.. _codeXs-for-ph#i ..) _ph#s - >=1 codes for each ph#
      !   - codeX "wordd" is digit where dictionary word not possible
      !   - Note that each ph# will have a codeX since ph#d is possible.
      ! - But consecutive digits are not allowed for ph# encodings!!
   (* (filt (~consec? dig?)))  ! elim codes w consecutive digits
      ! (.._{valid-}codes-for-ph#i..) _ph#s  - no consec digits in codes
      ! - Note that it's possible a ph# may not have any valid encodings.
   swap    ! move original ph#s in front of their codes
      ! (.. _ph#i ..) (.. _codes-for-ph#i ..)
   *pair   ! - pair each original ph# with its sequence of valid codes
      ! (..(_ph#i _codes_for_ph#i)..)
  )   ! end of fn composition puts ph#-codes pairs into %ph#codes output arg
  %ph#s %ph#codes),


!/ copied ...
! format phone numbers and their codes for output file
5  (((* (.* pair)) cat* *fmt) %ph#codes %outfile).
   ! - %outfile - output is text file of ph#-code pairs, 1 per line
!/ Given a sequence of original phone numbers and their encodings,
   we get phonenum/encoding pairs and then format each of these
!\ pairs for output.
!\

! format phone numbers and their codes for output file
 ((   ! (..(_ph# _codes_for_ph#)..)
      !  - note that a phone num may have no codes
   (* (.* pair))  ! pair each ph# w each of its codes
      ! (.. (.. (_ph# _code-for-ph#) ..) ..)
      !  - note that ph#-code pairs are grouped by ph#
   cat*   ! make a flat list of ph#-code pairs
      ! (.. (_ph# _code) ..)
          ! - if a phone num has no codes, it will not appear in any pair
   *fmt   ! format each ph#-code pair into output line
      ! (.. "ph#: wordd .. wordd" ..)   - this is the output text file!
      !  - returned in %outfile
  ) %ph#codes %outfile).

! --- Phonecode app supporting functions ---:


!/ copied ...
! ltr>dig - define letter-to-digit mapping from seq of lc letter-group symbols
6 (def ltr>dig (*1 el>id1 dup lc>uc cat)).
  !  -- Could this be a std lib fn?!
!\

! ltr>dig - define letter-to-digit mapping from seq of lc letter-group symbols
 (def ltr>dig     ! define ltr>dig function
  (    ! (e jnq rwx ... lop ghz)   - lc letter groups - a partition
   *1    ! sel elem [1] of each letter group symbol (- gets sym char string)
       ! ("e" "jnq" ... "lop" "ghz")
   el>ix1  ! get map of lc letter group elements to single digits
           ! (we index seq elems from [0])
       ! (('e' '0') .. ('g' '9') ('h' '9') ('z' '9'))  - lcltr-to-dig map
   dup   ! duplicate the first arg
       ! (('e' '0') .. ('z' '9')) (('e' '0') .. ('z' '9'))
   lc>uc  ! map lc letters to uppercase
       ! (('E' '0') .. ('Z' '9')) (('e' '0') .. ('z' '9'))
   cat   ! concatenate lc- and uc-maps into letter>digit map
       ! (('E' '0') .. ('Z' '9') ('e' '0') .. ('z' '9'))
  )).    ! end of fn composition and fn definition
         ! returns ltr>dig map
           

!/ copied ...
! digs>word - get map of digit strings to dictionary words
7 (def digs>word (swap dup (skip 1 ((* (filt ltr?)) **map)) *pair *rev)).
!/ Given the ltr>dig map and a dictionary of words, this function applies
   the map to the dictionary words to get a word>digit_string map, which
!\ is then reversed to give a digits>word map.
!\

! digs>word - get map of digit strings to dictionary words
 (def digs>word
  (    ! _ltr>dig_map _dic   - the two input args 
   swap  ! swap ltr>dig and dic input args
       ! _dic (('E' '0') .. ('Z' '9') ('e' '0') .. ('z' '9'))
   dup   ! duplicate the dictionary arg
       ! _dic _dic (('E' '0') .. ('Z' '9') ('e' '0') .. ('z' '9')) ..
   (skip 1  ! skip 1 arg, then apply the next function to remaining args
    ((* (filt ltr?))  ! remove non-letters from dictionary words
         ! _dic _dicl (('E' '0') .. ('z' '9')) ..
     **map  ! apply ltr>dig map to letter-only dictionary words
            ! - result is a unique string of digits for each word
         ! _dic _{word_}digits
    ))
   *pair  ! pair each original dictionary word with its digit string
       ! (.. (_word _digits) ..)   - map of original words to digit strings 
   *rev   ! reverse to get map of digit strings to original words
       ! (.. (_digits _word) ..)   - map of digit strings to original words
 )).      ! end of fn composition and fn definition
          ! digs>word map is returned!


!/ copied ...
! fmt - format phonenum-code pair for output line
8 (def fmt ((ins 1 ": ") (repl 2 (ins_btw ' ')) chcat*)).
!/ Given a ph#-code pair, this function first inserts ": " between the
   phonenum and its code, then inserts a blank between each "wordd" of
   the code, and then flattens the nested characters into a single
!\ character string, which will be a line of the output file.
!\

! fmt - format phonenum/code pair for output line
 (def fmt
  (   ! (_ph# _code)  - a ph#-code pair
   (ins 1 ": ")  ! insert punctuation between ph# and code
      ! (_ph# ": " _code)
   (repl 2 (ins_btw ' '))  ! insert blanks between code words/digits
      ! (_ph# ": " (_wordd ' ' .. _wordd))
      ! - We replace arg 2 w result of unary fn that inserts blanks between
      !   its sequence elements.
   chcat*   ! concat (flatten) nested char seqs into single string
      ! "_ph#: _wordd .. _wordd"  - syntax elems are concatenated
      !  - this is an output text file line
  )).


6. Final Comments

... tbd ...

back to Axiomatic Language Home Page