The Grammars structure


Synopsis

signature GRAMMARS
structure Grammars : GRAMMARS

This structure provides functions that determine properties of regular recursion equation systems, and their least solution in the language interpretation. (The structure Grammars will probably be modified to subsume the structure RegEqns.)

It also provides functions to transform such an equation system (or: extendend Backus-Naur-Form grammar) to grammars having the same least solution, except that the empty word is removed, etc.


Interface

type 'a t = 'a list Set.t
type 'a env = int -> 'a
type 'a eqns = (int * 'a) list
datatype re = datatype RegExp.re
val Unit : 'a re
val Zero : 'a re
val maxVar : 'a re eqns -> int
val recVars : 'a re eqns -> int list
val freeVars : 'a re eqns -> int list
val unit : unit -> ''a t
val zero : unit -> ''a t
val atom : ''a -> ''a t
val initial : int -> ''a t -> ''a t
val geUnit : bool env -> 'a re eqns -> 'a re -> bool
val nonEmpty : bool env -> 'a re eqns -> 'a re -> bool
val subtractUnits : bool env -> 'a re eqns -> 'a re eqns
val nfEmpty : bool env -> ''a re eqns -> ''a re eqns
val first : int
              -> ''a t env
                -> ''a re eqns -> ''a re -> ''a t
val leftcontexts : ''a re eqns -> ''a re -> ''a re eqns
val follows : int
                -> ''a t env
                  -> ''a re eqns
                    -> ''a re -> ''a re -> ''a t

Description

type 'a t = 'a list Set.t

type 'a env = int -> 'a

type 'a eqns = (int * 'a) list

datatype re = datatype RegExp.re

Unit
returns the regular expression Times[].

Zero
returns the regular expression Plus[].

maxVar eqs
returns RegEqns.maxVar eqs.

recVars eqs
returns a list of all the recursion variables in eqs.

freeVars eqs
returns a list of the variables in eqs that are not recursion variables.

unit ()
returns a language containing just the empty word, i.e. the empty list of objects of type 'a.

zero ()
returns the empty language of objects of type ''a.

atom a
returns the language containing just the single word consisting of the list of lenght 1 containing a.

initial k t
returns the language resulting from t by cutting off words after the first k letters.

geUnit env eqs r
returns true if the value of r contains the empty word, in a language environment Env such that: (a) for every non-recursion variable i of eqns, Env(i) containd the empty word iff env(i) = true, and (b) for every recursion variable i of env, Env(i) is the i-component of the least solution of eqs (relative to the languages given by Env to the non-recursion variables).

nonEmpty env eqs r
returns true if the value of r is non-empty in a language environment Env such that: (a) for every non-recursion variable i of eqns, Env(i) is non-empty iff env(i) = true, and (b) for every recursion variable i of env, Env(i) is the i-component of the least solution of eqs (relative to the languages given by Env to the non-recursion variables).

subtractUnits env eqs
returns an equation system whose least solution is the same as that of eqs, except that its components do not contain the empty word. Assumes that env(i) = false for all non-recursive variables i of eqs, meaninig that these are interpreted by languages not containing the empty word.

nfEmpty env eqs
simplifies eqs as follows: determines the empty components of its least solution, assuming that those non-recursion variables for which env is true denote the empty language; then replaces all variables denoting the empty language by 0 and simplifies the right-hand sides of eqs using RegExp.nf.

first k env eqs r
returns the language obtained from the language defined by r in the environment env' by cutting off words at their k-th letter, where env' is env updated by the least solution of eqs relative to env.

leftcontexts eqs r
returns a system env' of equations that defines the regular sets of reduced left contexts of expression r in the languages defined by eqs. The recursion variables of eqs' are new, since the variables of eqs may occur as non-recursion variables in eqs'. (This is used to define LR(k) parsers, very slow even for k=1.)

follows k env eqs s r
returns the language obtained as follows: consider the languages L(s),L(r) defined by s and r in the environment env' (which is defined as in follows above), and for any uvw in L(s) with subword v in L(r), take the word obtained from w by cutting off at the k-th letter. (This is used to define LL(k)-parsers, but very slow even for k=1.)


See Also

RegEqns

Discussion

Since we allow free non-recursive variables, most functions need an environment giving properties of the languages interpreting these variables. Normally, a grammar does not have such free variables, and the choice of the environment is then irrelevant.

For the following examples, consider an equation system with two non-recursion variables; for readability, we use the printing of equations.

- structure Grm = ppRegEqns (StandardNames); open Grm; open Grammars;

- val Eqns = [(0,Plus[Times[Atom "a",Var 1,Var 4],Var 2]),
	      (1,Plus[Times[Atom "a",Var 1,Atom "b"],Times[]])]; printEqns Eqns;

 Equation system: 
   x0 = (a;x1;x4 + x2)
   x1 = (a;x1;b + 1)

- fun falseEnv i = false and trueEnv i = true
Example:

Assuming all free variables denote the empty language, the least solution has the following non-empty components:

- printEqns (nfEmpty trueEnv Eqns); 

 Equation system: 
   x0 = 0
   x1 = (a;x1;b + 1)
Example:

Assuming the non-recursion variables denote languages not containing the empty word, removing the empty word from the languages defined by Eqns gives the languages defined by the grammar

- printEqns (subtractUnits falseEnv Eqns); 

 Equation system: 
   x0 = (a;(1 + x1);x4 + x2)
   x1 = a;(1 + x1);b
Example:
- structure NamesG = struct
                       type alph = string
                       fun atomName (a : alph) = a : string
                       fun varName 0 = "S" | varName 1 = "E" 
                         | varName i = "x"^ Int.toString i 
                     end;
- structure Grm = ppRegEqns(NamesG); open Grm; 

- val eqs = [(0,Plus[Times[Var 1,Atom "=",Var 1],Times[Atom "[",Var 0,Atom "]"]]),
             (1,Plus[Times[Atom "[",Var 1,Atom "*",Var 1,Atom "]"],Atom "a",Atom "b"])]
- printEqns eqs;

 Equation system: 
   S = (E;=;E + [;S;])
   E = ([;E;*;E;] + a + b)
The sets of reduced left contexts of E in sentence forms of the component languages are defined by the right-linear grammar:
- printEqns (leftcontexts eqs (Var 1));

 Equation system: 
   x2 = ([;x2 + x3 + E;=;x3)
   x3 = (1 + [;x3 + [;E;*;x3)

- printEqns (solveRightlinearEqns (leftcontexts eqs (Var 1)))

 Equation system: 
   x2 = ([)*;(E;= + 1);([;E;* + [)*
   x3 = ([;E;* + [)*