A Finite Automata Library for Standard ML
Introduction to the Automata Library
This Automata Library collects modules dealing with computations in
Kleenean algebras KA = (K,+,;,*,0,1), possibly with some constants for
elements of K. The most familiar Kleenean algebra is the algebra of
regular languages over a given finite alphabet. It is the
free Kleenean algebra over the alphabet, but there are other
interesting interpretations as well, like the algebra of binary
relations over a set, where ; is relational composition and * the
transitive closure.
Source code
Content
Roughly, the library contains programs to
- construct, combine, transform and display finite automata, transducers and pattern matchers
- translate between finite automata and regular expressions,
- evaluate regular expressions in a Kleenean algebra and build the matrix algebra over a Kleenean algebra,
- transform systems of regular recursion equations to context-free
grammars (in Greibach normal form),
- determine properties of the least solution by formal languages of a system of regular
recursion equations.
In more detail, this library provides
-
a signature KA for Kleenean algebras, and a signature pKA for polymorphic Kleenean algebras, whose universe is parametric in some type; moreover, a functor RegMod generating an evaluation of regular expressions in a Kleenean algebra;
-
a structure RegExp of regular expressions, with functions to simplify them using algebraic laws of Kleenean algebra; moreover, a functor ppRegExp to print regular expressions using printnames for letters and variables as provided by the user;
-
a signature FM for finite automata, and a functor ppFm generating from an alphabet provided by the user a structure for finite automata over the alphabet, with functions to build them from basic automata, functions to compute from an automaton an equivalent deterministic or minimal deterministic one, and graphical display of automata based on AT & T's program graphviz/dot.
-
a functor RegExpFm with translations between regular expressions and finite automata, and a functor ppRegExpFm which additionally generates print- and display functing from printnames provided by the user;
-
a functor kaMatrix generating the matrix algebra over a Kleenean algebra, which again is a Kleenean algebra. (However, the abstract notion of finite automaton over a Kleenean algebra mentioned above is not implemented). Similarly, a functor pkaMatrix for the matrix algebra over a polymorphic Kleenean algebra. (This is actually used to implement the solution of right-linear equation systems by regular expressions).
-
signatures LABEL and NAMES with some standard printable labels and variable names, and a functor Names that adds standard printnames for variables to user-provided ones for letters;
The Library also has an implementation of finite automata extended by a counter, which can be used to generate pattern matchers for regular pattern search:
-
a functor ppCFm, similar to ppFm, that generates a structure for finite automata with a counter, which count the input position and remember the occurrence positions of patterns found in an input;
-
functors simpleKMPautomaton and KMPautomaton that generate Knuth-Morris-Pratt pattern matchers for single-pattern-search;
-
a functor Matcher generating from a regular expression a search automaton for regular-pattern-search;
The Library also provides some (preliminary and incomplete) treatment of systems of regular recursion equations, or EBNF-grammars, i.e. grammars in extended Backus-Naur-form. In particular, it contains
-
a structure RegEqns for systems of regular rexursion equations, containing transformations that preserve the least solution in all continuous Kleenean algebras, and a functor ppRegEqns for printing equation systems with user-provided names for variables and constants;
-
a structure Grammars for EBNF-grammars, containing further transformations that are valid in the Kleenean algebra of all (context-free) languages; among those are
-
transformation to Greibach-normal form,
-
tests on emptyness or membership of the empty word in the least solution;
-
computations of first and follow as needed in the construction of LL- and LR-parsers for context free languages (but these are too slow to be useful except for demonstration).
This package was written to support teaching of finite automata and formal languages. Efficiency was not a main interest, but the minimization algorithm is O(n*log(n)) if the set of final states is O(log(n)); hashing the final states would make it O(n*log(n)) in general.
Note:
-
Finite automata with epsilon-transitions are not supported directly. One can enter such an automaton in the form of a system of right-linear recursion equations (in which variables as summands correspond to expsilon-transitions), and transform the equation system to an automaton.
-
The structure RegEqnsFm is very preliminary and ought to be replaced by a package RegEqnsRecFm that combines (mutually) recursive finite automata and systems of regular recursion equations.
Please send bug-reports and comments to:
Hans Leiß.
Last Modified: March 9th, 2000, January 22th, 2010