next up previous contents index
Next: Variants of match Up: A quick tour Previous: Pattern matching versus

More examples

 

As another example, we can specify the term structure of well-formed formulas in proposition calculus as follows. Notice that the constructors F and T are nullary.

   datatype Wff = F
                | T
                | Var     (const char *)
                | And     (Wff, Wff)
                | Or      (Wff, Wff)
                | Not     (Wff)
                | Implies (Wff, Wff)
                ;

Datatypes that are parametrically polymorphic, such as lists and trees, can be defined by parameterizing them with respect to one of more types. For example, both lists and tree below are parametric on one type argument T.

   datatype List<T> = nil
                    | cons(T, List<T>);
   datatype Tree<T> = empty
                    | leaf(T)
                    | node(Tree<T>, T, Tree<T>);

   List<int> primes = cons(2,cons(3,cons(5,cons(7,nil))));
   List<int> more_primes = cons(11,cons(13,primes));
   Tree<char *> names = node(leaf("Church"),"Godel",empty);

As a programming convenience, Prop has a set of built-in list-like syntactic forms. Unlike languages such as ML, however, these forms are not predefined to be any specific list types. Instead, it is possible for the user to use these forms on any datatypes with a natural binary cons operator and a nullary nil constructor. For instance, the previous list datatype can be redefined as follows:

   datatype List<T> = #[] | #[ T ... List<T> ];
   List<int> primes = #[ 2, 3, 5, 7 ];
   List<int> more_primes = #[ 11, 13 ... primes ];
   List<char *> names = #[ "Church", "Godel", "Turing", "Curry" ];

   template <class T>
      List<T> append (List<T> a, List<T> b)
      {  match (a)
         {  case #[]:          return b;
            case #[hd ... tl]: return #[hd ... append(tl,b)];
         }
      }
The empty list is written as #[], while cons(a,b) is written as #[ a ... b ]. An expression of the special form #[a, b, c], for instance, is simple syntactic sugar for repeated application of the cons operator, i.e.

    #[a, b, c] == #[a ... #[ b ... #[ c ... #[] ] ] ].
    #[a, b, c ... d ] == #[a ... #[ b ... #[ c ... d ] ] ].

List-like special forms are not limited to datatypes with only two variants. For example, we can define a datatype similar in structure to S-expressions in Lisp or scheme. Here's how such as datatype may be defined3:

   datatype Sexpr = INT    (int)
                  | REAL   (double)
                  | STRING (char *)
                  | ATOM   (const char *)
                  | #()
                  | #( Sexpr ... Sexpr )
   where type Atom = Sexpr  // synonym for Sexpr
   ;

With this datatype specification in place, we can construct values of type Sexpr in a syntax close to that of Lisp. For example, we can define lambda expressions corresponding to the combinators I, K and S as follows:

   Atom LAMBDA = ATOM("LAMBDA");
   Atom f      = ATOM("f");
   Atom x      = ATOM("x");
   Atom y      = ATOM("y");
   Atom NIL    = #();

   Sexpr I = #(LAMBDA, #(x), x);
   Sepxr K = #(LAMBDA, #(x), #(LAMBDA, #(y), x));
   Sepxr S = #(LAMBDA, #(f),
                #(LAMBDA, #(x),
                   #(LAMBDA, #(y), #(#(f,x), #(g,x)))));

Similar to list-like forms, vector-like forms are also available. This addresses one of the flaws of the C++ language, which lacks first class array constructors. Vectors are simply homogeneous arrays whose sizes are fixed and are determined at creation time4. Random access within vectors can be done in constant time. Unlike lists, however, the prepending operation is not supported. Vectors literals are delimited with the composite brackets (| ... |), [| ... |], or {| ... |}. In the following example the datatype Exp uses vectors to represent the cooefficients of the polynomials:

   datatype Vector<T> = (| T |);
   datatype Exp = Polynomial (Var, Vector<int>)
                | Functor (Exp, Vector<Exp>)
                | Atom (Var)
                | ...
   where type Var = const char *;
   Exp formula = Polynomial("X", (| 1, 2, 3 |));

Commonly used patterns can be given synonyms so that they can be readily reused without undue repetition. This can be accomplished by defining pseudo datatype constructors to stand for common patterns using datatype law definitions. For example, the following set of laws define some commonly used special forms for a Lisp-like language using the previously defined Sexpr datatype.

   datatype law inline Lambda(x,e)  = #(ATOM "LAMBDA", #(x), e)
              | inline Quote(x)     = #(ATOM "QUOTE", x)
              | inline If(a,b,c)    = #(ATOM "IF", a, b, c)
              | inline Nil          = #()
              | inline ProgN(exprs) = #(ATOM "PROGN" ... exprs)
              | SpecialForm  = #(ATOM ("LAMBDA" || "IF" ||
                                       "PROGN" || "QUOTE") ... _)
              | Call(f,args) = ! SpecialForm && #(f ... args)
              ;

Note that the pattern SpecialForm is meant to match all special forms in our toy language: i.e. lambdas, ifs, progn's and quotes. The pattern disjunction connective || is used to link these forms together. Since we'd like the Call pattern to match only if the S-expression is not a special form, we use the pattern negation and conjunction operators, ! and && are used to screen out special forms.

With these definitions in place, an interpreter for our language can be written thus:

   Sexpr eval (Sexpr e)
   {  match (e)
      {  Call(?f,?args):     { /* perform function call */ }
      |  Lambda(?x,?e):      { /* construct closure */ }
      |  If(?e,?then,?else): { /* branching */ }
      |  Quote(?x):          { return ?x; }
      |  ...:                { /* others */ }
      }
   }

As an interesting note, the special form pattern can also be rewritten using regular expression string matching, as in the following:

 datatype law SpecialForm = #(ATOM /LAMBDA|IF|PROGN|QUOTE/ ... _)

In addition, since the law constructors Lambda, Quote, If, Nil and ProgN are defined with the inline keyword, these constructors can be used to within expressions as abbreviates for their rhs definitions. For example, writing Lambda(x,x) is the same writing #(ATOM("Lambda"),#(x),x).


next up previous contents index
Next: Variants of match Up: A quick tour Previous: Pattern matching versus

Allen Leung
Mon Apr 7 14:33:55 EDT 1997