[Japanese]

A Programming Language Processor

This is a processor for a programming language called Minila (Mini-language) in which only integers are used as data types, and assignment statements, if-statements, while-statements, for-statements and sequential compositions are only available as language constructs.

Ex: Let us consider the following program written in Minila. 

x := 12903;
y := 7735;
while x != y
  do
@@  if x < y@then y := y - x;@
            
else x := x - y;
    fi
  od

This computes the greatest common divisor of 12903 and 7735. This program has  two assignment statements and one while-statement that are sequentially composed. In the while-statement, there is one if-statement.

The processor consits of a scanner (lexical analyzer), a parser, an interpreter, a code generator and a virtual machine.


A Scanner

It converts a string into a list of tokens. The type token used for tokens is as follows:

datatype token =
    Semc             (* ; *)
  | Lpar             (* ( *)
  | Rpar             (* ) *)
  | Mul              (* * *)
  | Div              (* / *)
  | Plus             (* + *)
  | Minus            (* - *)
  | Lt               (* < *)
  | Gt               (* > *)
  | Eq               (* = *)
  | Neq              (* != *)
  | And              (* && *)
  | Or               (* || *)
  | Assign           (* := *)
  | If               (* if *)
  | Then             (* then *)
  | Else             (* else *)
  | Fi               (* fi *)
  | While            (* while *)
  | For              (* for *)
  | Do               (* do *)
  | Od               (* od *)
  | Num of int       (* [0-9]+ *)
  | Var of string    (* [a-zA-Z][a-zA-Z0-9]* except for keywords *)
  | Undef of string  (* undefined tokens *)

The delimiters are as follows:

' ', ';', '(', ')', '*', '/', '+', '-', '<', '>', '=', '!=', '&&', '||', ':='

The reserved words (keywords) are as follows:

"if", "then", "else", "fi", "while", "for", "do", "od"

Ex: The string

"x := 12903; y := 7735; while x != y do if x < y then y := y - x; else x := x - y; fi od"

is converted into the list of tokens

[Var "x", Assign, Num 12903, Semc, Var "y", Assign, Num 7735, Semc, While,@Var "x", Neq, Var "y", Do, If, Var "x", Lt, Var "y", Then, Var "y",@Assign, Var "y", Minus, Var "x", Semc, Else, Var "x", Assign, Var "x",@Minus, Var "y", Semc, Fi, Od]


A Parser

It generates a parse tree from a list of toknes. The Minila gramar is as follows:

  P ::= S
 
 S ::=   eps                           (empty statement)
       | var ':=' E ';'                (assignment statement)
       | 'if' E 'then' S 'else' S 'fi' (if-statement)
       | 'while' E 'do' S 'od'         (while-statement)
       | 'for' var E E 'do' S 'od'     (for-statement)
       | S S                           (sequentially composed statement)
 
 E ::=   num
       | '-' num
       | var
       | '-' var
       | '(' E ')'
       | '-' '(' E ')'
       | E '*' E | E '/' E
       | E '+' E | E '-' E
       | E '<' E | E '>' E | E '=' E | E '!=' E
       | E '&&' E | E '||' E

P stands for programs, S for statements, and E for expressions.

The empty statement does nothing. The assignment statement "var := E" sets the variable var to the result obtained by calculating the expression E. The if-statement "if E then S1 else S2 fi" executes the statement S1 if the result of E is not 0 and the statement S2 otherwise (that is, if the result of E is 0). The while-statement "while E do S od" repeatedly executes the statement S as long as the result of E is not 0. The for-statement "for var E1 E2 do S od" sets the variable var to the result of E1 and repeated executes the statement S as long as the value in the variable var is less than or equal to the result of E2. The sequentilly composed statement "S1 S2" executes S1 and then S2 in this order.

The parse tree of a statement is represented as the list of (sub-)parse trees. The parse tree of the empty statement is represented as the empty list. Each of assignment statements, if-statements, while-statements and for-statements is represented as a singleton list. The parse tree of the sequentially composed statement "S1 S2" is represented as the list obtained by concatenating the two lists representing the parse trees of the two statements S1 and S2.

The type stm for parse trees of assignments, if-statements, while-statements and for-statements is as follows:

datatype stm 
  AssignNode of exp*exp
| IfNode of exp*(stm list)*(stm list)
| WhileNode of exp*(stm list)
| ForNode of exp*exp*exp*(stm list)

The type exp for parse trees of expressions is as follows:

datatype exp =
  NumNode of int
| VarNode of string
| UminusNode of exp
| MulNode of exp*exp
| DivNode of exp*exp
| PlusNode of exp*exp
| MinusNode of exp*exp
| LtNode of exp*exp
| GtNode of exp*exp
| EqNode of exp*exp
| NeqNode of exp*exp
| AndNode of exp*exp
| OrNode of exp*exp

The grammar of expressions is modified as follows so that the recursive decent parsing method can be used: 

F   ::= ['-'] num | ['-'] var | ['-'] '(' E ')'
E1  ::= F E11
E11 ::= eps | '*' F E11 | '/' F E11
E2  ::= E1 E22
E22 ::= eps | '+' E1 E22 | '-' E1 E22
E3  ::= E2 E33
E33 ::= eps | '<' E2 E33 | '>' E2 E33 | '=' E2 E33 | '!=' E2 E33
E4  ::= E3 E44
E44 ::= eps | '&&' E3 E44 | '||' E3 E44
E   ::= E4

Ex: From the following list of tokens

[Var "x", Assign, Num 12903, Semc, Var "y", Assign, Num 7735, Semc, While,@Var "x", Neq, Var "y", Do, If, Var "x", Lt, Var "y", Then, Var "y",@Assign, Var "y", Minus, Var "x", Semc, Else, Var "x", Assign, Var "x",@Minus, Var "y", Semc, Fi, Od]

the following parse tree is generated

[AssignNode(VarNode "x", NumNode 12903),
 AssignNode(VarNode "y", NumNode 7735),
 WhileNode(NeqNode(VarNode "x", VarNode "y"),
           [IfNode(LtNode(VarNode "x", VarNode "y"),
                   [AssignNode(VarNode "y",
                               MinusNode(VarNode "y", VarNode "x"))],
                   [AssignNode(VarNode "x",
                               MinusNode(VarNode "x", VarNode "y"))])])]


An Interpreter

It interprets a parse tree as follows. The interpreter uses an envirnment to remember what integers are assigned to variables. Pairs of variable identifiers (strings) and integers are registered into the environment. Let HD be the head of the list representing the parse tree.

When HD is "AssignNode (VarNode var, EN)", the interpreter calcluates the parse tree EN of an expression and registers the pair of the variable identifier (string) var and the result into the environment.

When HD is "IfNode (EN,SN1,SN2)", the interpreter calculates the parse tree EN of an expression, and interpretes the parse tree SN1 of a statement if the result is not 0 and the parse tree SN2 of a statement otherwise.

When HD is "WhileNode (EN,SN)", the interpreter repeatedly interprets the parse tree SN of a statement as long as the reusult obtained by calculating the parse tree EN of an expression is not 0.

When HD is "ForNode (VarNode var,EN1,EN2,SN)", the interpreter first calculates the parse tree EN1 of an expression, register the pair of the variable identifier var and the result into the environment, and repeatedly interprets the parse tree SN fo an statement as long as the value in the variable var is less than or equal to the result obtained  by calculating the parse tree EN2 of an expression.

As the result, the interpreter returns (variable,integer)-pairs found in the environment (a list of pairs of variables identifiers and integers).

Ex: When the interpreter interprets the parse tree

[AssignNode(VarNode "x", NumNode 12903),
 AssignNode(VarNode "y", NumNode 7735),
 WhileNode(NeqNode(VarNode "x", VarNode "y"),
           [IfNode(LtNode(VarNode "x", VarNode "y"),
                   [AssignNode(VarNode "y",
                               MinusNode(VarNode "y", VarNode "x"))],
                   [AssignNode(VarNode "x",
                               MinusNode(VarNode "x", VarNode "y"))])])]

it returns the following list of pairs:

[("x", 17), ("y", 17)]

It shows that the greates common divisor of 12903 and 7735 is 17.


A Code Generator

It generates a list of instructions for a virtual machine from a parse tree.  The virtual machine uses a stack of integers and an environment in which pairs of variable identifiers (strings) and integers are registered. Let Stk and Env denote the stack and environment. Let Top and Snd be the top and second elements of Stk. Let pc be a program counter. After every instruction except for Jump, JumpOnCond and Quit is executed, pc is incremented. The type command used for instructions is as follows:

datatype command =
  Push of int        (* Push an integer on Stk. *)
| Load of string     
(* Push an integer found in Env with a variable (string) on Stk. *)
| Store of string    (* Register a variable and Top into Env, and pop Stk. *)
| MulMinusOne        (* Let n be Top. Pop Stk and push -n. *)
| Multiply           (* Let n1 and n2 be Top and Snd. Pop Stk twice and push n2*n1. *)
| Divide             (* Let n1 and n2 be Tiop and Snd. Pop Stl twice and push n2/n1. *)
| Add                (* Let n1 and n2 be Tiop and Snd. Pop Stl twice and push n2+n1. *)
| Subtract           (* Let n1 and n2 be Tiop and Snd. Pop Stl twice and push n2-n1. *)
| LessThan           (* Let n1 and n2 be Tiop and Snd. Pop Stl twice and push lt(n2,n1). *)
| GreaterThan@   @@(* Let n1 and n2 be Tiop and Snd. Pop Stl twice and push gt(n2,n1). *)
| Equal              (* Let n1 and n2 be Tiop and Snd. Pop Stl twice and push eq(n2,n1). *)
| NotEqual           (* Let n1 and n2 be Tiop and Snd. Pop Stl twice and push neq(n2,n1). *)
| And                (* Let n1 and n2 be Tiop and Snd. Pop Stl twice and push and(n2,n1). *)
| Or                 (* Let n1 and n2 be Tiop and Snd. Pop Stl twice and push or(n2,n1). *)
| Jump of int        (* Add an integer to pc. *)
| JumpOnCond of int  (* Let n be Top. Pop Stk. If n<>0, add an integer to pc. If n=0, pc is incremented.*)
| Quit               (* Quit *)

where lt(n2,n1) is 1 if n2<n1 and 0 otherwise. gt(n2,n1) is 1 if n2>n1 and 0 otherwise. eq(n2,n1) is 1 if n2=n1 and 0 otherwise. neq(n2,n1) is 1 if n2<>n1 and 0 otherwise. and(n2,n1) is 1 if n2<>0 and n1<>0, and 0 otherwise. or(n2,n1) is 1 if n2<>0 or n1<>0, and 0 otherwise.

From "AssignNode (VarNode var, EN), the following list of instructions is generated:

CL @ [Store var]

where CL is the list of instructions for the parse tree EN of an expression.

From "IfNode (EN, SN1, SN2)", the following list of instructions is generated:

CL1 @ [JumpOnCond 2, Jump (The distance to the top of CL3)] @ CL2 @ [Jump (The distance to the next to CL3)] @ CL3

where CL1, CL2 and CL3 are the lists of instructions for the parse tree EN of an expressions and the parse trees SN1 and SN2 of statements.

From "WhileNode (EN,SN)", the following list of instructions is generated:

CL1 @ [JumpOnCond 2, Jump (The distance to the next  of the Jump instruction after CL2)] @ CL2 @ [Jump -1 * (The distance to the next to CL1)]

where CL1 and CL2 are the lists of instructions for the parse tree EN of an expression and the parse tree SN of a statement.

From "ForNode (VarNode var,EN1,EN2,SN)", the following list of instructions is generated:

CL1 @ [Store v, Load var] @ CL2 @ [LessThan, Load var] @ CL2 [Equal, Or, JumpOnCond 2, Jump (The distance to the next of the final Jump instruction)] @ CL3 @ [Load var, Push 1, Add, Store var, Jump -1 * (The distance to the previous Load instruction of the first CL2)]

where CL1, CL2 and CL3 are the list of instructions for the parse trees EN1 and EN2 of expressions and the parse tree SN of a statement.

Quit is added at the end to the list of instructions generated from a parse tree.

Lists of instructions from parse trees of expressions: The list of instructions for "NumNode n" is [Push n]. The list of instructions for "VarNode var" is [Load var]. The list of instructions for "UminusNode EN" is CL @ [MulMinusOne], where CL is the list of instructions for the parse tree EN of an expression. The list of instructions for "MulNode (EN1,EN2)" is CL1 @ CL2 @ [Multiply], where CL1 and CL2 are the lists of instructions for the parse trees EN1 and EN2 of expressions. From the other parse trees of expressions, lists of instructions are generated likewise.

Ex: From the following parse tree

[AssignNode(VarNode "x", NumNode 12903),
 AssignNode(VarNode "y", NumNode 7735),
 WhileNode(NeqNode(VarNode "x", VarNode "y"),
           [IfNode(LtNode(VarNode "x", VarNode "y"),
                   [AssignNode(VarNode "y",
                               MinusNode(VarNode "y", VarNode "x"))],
                   [AssignNode(VarNode "x",
                               MinusNode(VarNode "x", VarNode "y"))])])]

the following list of instructions is generated:

[Push 12903, Store "x", Push 7735, Store "y", Load "x", Load "y", NotEqual,
 JumpOnCond 2, Jump 16, Load "x", Load "y", LessThan, JumpOnCond 2, Jump 6,
 Load "y", Load "x", Subtract, Store "y", Jump 5, Load "x", Load "y",
 Subtract, Store "x", Jump ~19, Quit]


A Virtual Machine 

It executes a list of instructions and returns (variable,integer)-pairs found in the environment (a lits of pairs of strings and integers) as the result.

Ex: The result of the following list of instructions

[Push 12903, Store "x", Push 7735, Store "y", Load "x", Load "y", NotEqual,
 JumpOnCond 2, Jump 16, Load "x", Load "y", LessThan, JumpOnCond 2, Jump 6,
 Load "y", Load "x", Subtract, Store "y", Jump 5, Load "x", Load "y",
 Subtract, Store "x", Jump ~19, Quit]

executed by the virtual machine is the following list:

[("x", 17), ("y", 17)]


Programs

The parts of for-statement are unimplemented in the interpreter and code generator.

Env.sig
Env.sml
Scanner.sig

Scanner.sml
Parser.sig
Parser.sml
Interpreter.sig
Interpreter.sml
Compiler.sig
Compiler.sml
VirtualMachine.sig
VirtualMachine.sml


Execution Examples

- use "examples.sml";
[opening file "examples.sml"]
...
> val it =
    [AssignNode(VarNode "x", NumNode 1), AssignNode(VarNode "i", NumNode 1),
     WhileNode(OrNode(LtNode(VarNode "i", NumNode 10),
                      EqNode(VarNode "i", NumNode 10)),
               [AssignNode(VarNode "x", MulNode(VarNode "i", VarNode "x")),
                AssignNode(VarNode "i", PlusNode(VarNode "i", NumNode 1))])] :
  stm list
> val it =
    [AssignNode(VarNode "x", NumNode 12903),
     AssignNode(VarNode "y", NumNode 7735),
     WhileNode(NeqNode(VarNode "x", VarNode "y"),
               [IfNode(LtNode(VarNode "x", VarNode "y"),
                       [AssignNode(VarNode "y",
                                   MinusNode(VarNode "y", VarNode "x"))],
                       [AssignNode(VarNode "x",
                                   MinusNode(VarNode "x", VarNode "y"))])])] :
  stm list
> val it =
    [AssignNode(VarNode "n", NumNode 200000000),
     AssignNode(VarNode "a", NumNode 1),
     WhileNode(LtNode(MulNode(MulNode(NumNode 4, VarNode "a"), VarNode "a"),
                      VarNode "n"),
               [AssignNode(VarNode "a", MulNode(NumNode 2, VarNode "a"))]),
     AssignNode(VarNode "b", MulNode(NumNode 2, VarNode "a")),
     WhileNode(NeqNode(PlusNode(VarNode "a", NumNode 1), VarNode "b"),
               [AssignNode(VarNode "d",
                           DivNode(MinusNode(VarNode "b", VarNode "a"),
                                   NumNode 2)),
                IfNode(GtNode(MulNode(PlusNode(VarNode "a", VarNode "d"),
                                      PlusNode(VarNode "a", VarNode "d")),
                              VarNode "n"),
                       [AssignNode(VarNode "b",
                                   MinusNode(VarNode "b", VarNode "d"))],
                       [AssignNode(VarNode "a",
                                   PlusNode(VarNode "a", VarNode "d"))])])] :
  stm list
> val it = [("x", 3628800), ("i", 11)] : (string * int) list
> val it = [("x", 17), ("y", 17)] : (string * int) list
> val it = [("n", 200000000), ("a", 14142), ("b", 14143), ("d", 1)] :
  (string * int) list
> val it =
    [Push 1, Store "x", Push 1, Store "i", Load "i", Push 10, LessThan,
     Load "i", Push 10, Equal, Or, JumpOnCond 2, Jump 10, Load "i", Load "x",
     Multiply, Store "x", Load "i", Push 1, Add, Store "i", Jump ~17, Quit] :
  command list
> val it =
    [Push 12903, Store "x", Push 7735, Store "y", Load "x", Load "y", NotEqual,
     JumpOnCond 2, Jump 16, Load "x", Load "y", LessThan, JumpOnCond 2, Jump 6,
     Load "y", Load "x", Subtract, Store "y", Jump 5, Load "x", Load "y",
     Subtract, Store "x", Jump ~19, Quit] : command list
> val it =
    [Push 200000000, Store "n", Push 1, Store "a", Push 4, Load "a", Multiply,
     Load "a", Multiply, Load "n", LessThan, JumpOnCond 2, Jump 6, Push 2,
     Load "a", Multiply, Store "a", Jump ~13, Push 2, Load "a", Multiply,
     Store "b", Load "a", Push 1, Add, Load "b", NotEqual, JumpOnCond 2,
     Jump 28, Load "b", Load "a", Subtract, Push 2, Divide, Store "d",
     Load "a", Load "d", Add, Load "a", Load "d", Add, Multiply, Load "n",
     GreaterThan, JumpOnCond 2, Jump 6, Load "b", Load "d", Subtract,
     Store "b", Jump 5, Load "a", Load "d", Add, Store "a", Jump ~33, Quit] :
  command list
> val it = [("x", 3628800), ("i", 11)] : (string * int) list
> val it = [("x", 17), ("y", 17)] : (string * int) list
> val it = [("n", 200000000), ("a", 14142), ("b", 14143), ("d", 1)] :
  (string * int) list
[closing file "examples.sml"]
> val it = () : unit
-