Project: Language Interpreter
Overview
In this project, you will implement an interpreter for the small C-like
language from the last project. The language supports variables, int and
bool types, equality and comparison operations, math and boolean operations,
control flow, and printing, all while maintaining static type-safety and being
Turing complete.
The language consists of expressions from the expr type and
statements from the stmt type. These algebraic types can be used to
represent the full space of well-formed programs. Their definitions are
found in the ast.ml file.
Compilation, Tests, and Running
You can run the interpreter on a source file using:
dune exec bin/main.exe -- <filename>
This executes the program and prints the results of any print statements. Additional options:
-R: Prints a report of the final variable bindings produced by your
evaluator.
-U: Converts the program’s abstract syntax tree to a string and
prints it.
You are encouraged to use the provided executable extensively while testing your evaluator.
The Evaluator
The evaluator must be implemented in eval.ml in accordance with the
signatures for eval_expr and eval_stmt. The functions should be
left in the order they are provided, as your implementation of
eval_stmt will rely on eval_expr.
Environments
Both evaluator functions operate over an environment, represented as an association list:
(string * value)
Each entry maps a variable name to its current value. Earlier entries override later ones. Statements may modify the environment, while expressions may not.
The OCaml List module includes several helpful functions for working with association lists.
Error Handling
The formal semantics describe only valid programs. Your implementation must explicitly handle error cases by raising the appropriate exceptions as follows:
Division by zero (
DivByZeroError)Type errors (
TypeError) for example,1 + trueDeclaration errors (
DeclareError), including:- redeclaring an existing variable
- assigning to an undeclared variable
- reading an undeclared variable
Part 1: eval_expr
- Type:
environment -> expr -> value - Description: Evaluates an expression under a given environment
and returns a value (
Int_ValorBool_Val). Expressions never modify the environment.
Expression Rules
Integers
Integer literals evaluate to
Int_Valof the same value.Booleans
Boolean literals evaluate to
Bool_Valof the same value.Identifiers
Evaluated by lookup in the environment. If no binding exists, raise
DeclareError.Arithmetic Operators (+, -, *, /, ^)
- Operate only on integers
- Return an
Int_Val - Division by zero raises
DivByZeroError - Non‑integer operands raise
TypeError - Exponentiation results are floored if non‑integral
Boolean Operators (&&, ||)
- Operate only on booleans
- Return
Bool_Val - Non‑boolean operands raise
TypeError
Logical Not (!)
- Operand must evaluate to a boolean
- Returns the negated value
- Otherwise raises
TypeError
Relational Operators (>, <, >=, <=)
- Operate only on integers
- Return
Bool_Val - Non‑integer operands raise
TypeError
Equality Operators (==, !=)
- Both operands must be the same type
- Valid for both int and bool
- Return
Bool_Val - Mismatched types raise
TypeError
Part 2: eval_stmt
- Type:
environment -> stmt -> environment - Description: Evaluates a statement under a given environment and returns an updated environment. Statements do not return values; instead, they modify program state.
Statement Rules
NoOp
Performs no action and returns the environment unchanged.
Seq
Executes two statements in sequence:
- Evaluate the first statement to produce a new environment
- Evaluate the second under that environment
Declare
- Introduces a new variable
- Redeclaring a variable raises
DeclareError - int variables initialize to 0
- bool variables initialize to false
Assign
- Assigns a new value to an existing variable
- Assigning to an undeclared variable raises
DeclareError - Type mismatches raise
TypeError - Updates and returns the environment on success
If
- Guard expression must evaluate to a boolean
- Executes either the if or else body accordingly
- Returns the resulting environment
- Non‑boolean guards raise
TypeError
While
- Guard expression must evaluate to a boolean
- Repeatedly evaluates the body while the guard is true
- Returns the final environment
- Non‑boolean guards raise
TypeError
For
- Contains an identifier, start expression, end expression, and body
- Start and end expressions must evaluate to integers
- The loop variable is automatically incremented by one each iteration
- The loop runs inclusively from start to end
- Returns the current environment if the range is empty
Print Statement
The print statement outputs the evaluated expression followed by a newline.
Supported types: int, bool
Output format:
- Integers print as numbers (for example, 10)
- Booleans print as true or false
Important Printing Note
You must not use print_string, print_int, or related OCaml
functions for program output. These will not be captured by the test
system. Instead, you must use the following wrapper functions:
print_output_string : string -> unit
print_output_int : int -> unit
print_output_bool : bool ->
print_output_newline : unit -> unit
- Use these functions only for handling the Print statement
- Standard print functions may be used freely for debugging
Turning in the Assignment
To submit your assignment, create a zip file of a DIRECTORY named
project5-handin containing ONLY the project files that you
edited. Then submit that file to the appropriate folder on D2L.
Grading Rubric (50 points)
This rubric explains how your interpreter will be evaluated. Points are awarded based on automated test results, your own unit tests, and overall code quality.
25 points — Automated Testing
- 25 points: All or nearly all automated tests pass successfully.
- 18 points: The majority of automated tests pass.
- 8 points: A small portion of automated tests pass.
- 0 points: Few or none of the automated tests pass.
- 25 points: All or nearly all automated tests pass successfully.
15 points — Unit Tests
- 15 points: Unit tests provide strong coverage, use correct expectations, and are clearly written.
- 10 points: Unit tests are mostly effective, with only minor gaps or issues.
- 5 points: Unit tests are limited in scope or uneven in quality.
- 0 points: Unit tests are minimal, ineffective, or missing.
- 15 points: Unit tests provide strong coverage, use correct expectations, and are clearly written.
4 points — Code Readability and Style
- 4 points: Code is clear, well organized, consistently styled, and properly formatted using a code formatter.
- 2 points: Code shows some effort toward clarity but is often difficult to follow or inconsistently formatted.
- 0 points: Code is extremely difficult to read or does not follow basic style conventions.
- 4 points: Code is clear, well organized, consistently styled, and properly formatted using a code formatter.
4 points — Documentation and Comments
- 4 points: Documentation is clear and complete; all files and top-level functions are appropriately documented.
- 2 points: Documentation is minimal or inconsistent, with several missing or unclear comments.
- 0 points: Documentation is missing or provides no meaningful guidance.
- 4 points: Documentation is clear and complete; all files and top-level functions are appropriately documented.
2 points — Adherence to Additional Specifications
- 2 points: All additional specifications are fully met.
- 1 point: One minor specification is not satisfied.
- 0 points: Multiple or major specification requirements are not met.
- 2 points: All additional specifications are fully met.
Appendix: Operational Semantics
The remainder of this document provides the formal operational semantics for expressions and statements. These semantics describe correct evaluations only. Error cases are intentionally omitted and must be handled as specified above in your implementation.
The formal rules correspond directly to the behavior expected from:
eval_expr(expression semantics)eval_stmt(statement semantics)
Use these rules as a reference to guide your implementation.
Environments
Both parts define judgments that make use of environments E. The rules show two operations on environments:
E(x) means to look up the value of x maps to in the environment E. This operation is undefined if there is no mapping for x in E. We write ⋅ for the empty environment. Lookup on this environment is always undefined; that is, ⋅(x) is undefined for all x.
The second operation is written E[x ↦ v]. It defines a new environment that is the same as E but maps x to v. It thus overrides any prior mapping for x in E. So, E[x ↦ 1] is an environment that maps x to 1 but is undefined for all other variables. The environment ⋅[x ↦ 1][y ↦ 2] is an environment that maps x to 1 and y to 2. As such, if E = ⋅[x ↦ 1][y ↦ 2] then E(x) = 1. That is, looking up x in the second example environment produces its mapped-to value, 1.
Error conditions
The semantics here defines only correct evaluations. It says nothing about what happens when, say, you have a type error. For example, for the rules below there is no value v for which you can prove the judgment ⋅ ⊢ 1 + true ⇒ v. In your implementation, erroneous programs will cause an exception to be raised, as indicated above.
Expression Semantics
This part of the formal semantics corresponds to the implementation of your
eval_expr function. That function has type environment -> expr -> value. In
the semantics, we write the judgment E ⊢ e ⇒ v, where E
represents the environment, e represents the expression, and v represents
the final value.
Abstract syntax grammar
Expressions and values are defined by the following grammar:
$$ \begin{array}{lrcl} \mathrm{Integers} & n & \mathit{is} & \mathrm{any\ integer}\\ \mathrm{Booleans} & b & \mathtt{::=} & \mathbf{true} \mid \mathbf{false} \\ \mathrm{Values} & v & \mathtt{::=} & n \mid b \\ \mathrm{Expressions} & e & \mathtt{::=} & v \mid\ !e \mid e \oplus e \mid e \odot e \mid e == e \mid e\ != e \\ \end{array} $$
Here, we write ⊕ to represent any operator involving a pair of integers. This could be addition (+), comparison (≤), multiplication (×), etc. We write ⊙ to represent any operator involving a pair of booleans. This could be boolean-and (&&), boolean-or (||), etc. We do not go into specifics here about what these actually do; they should match your intuition.
Rules
Here are the axioms.
$$ \frac{ \begin{array}{l} E(x) = v \end{array}} {E \vdash x \Rightarrow v}\text{[Id]} $$
$$ \frac{ } {E \vdash n \Rightarrow n}\text{[Int]} $$
$$ \frac{ } {E \vdash \texttt{true} \Rightarrow \; \textbf{true}}\text{[Bool-True]} $$
$$ \frac{ } {E \vdash \texttt{false} \Rightarrow \; \textbf{false}}\text{[Bool-False]} $$
Here are the rest of the rules for expressions.
$$ \frac{ \begin{array}{l} E \vdash e_1 \Rightarrow v\\ E \vdash e_2 \Rightarrow v\\ \end{array}} {E \vdash e_1 \; \texttt{==} \; e_2 \Rightarrow \; \texttt{true}}\text{[Eq-True]} $$
$$ \frac{ \begin{array}{l} E \vdash e_1 \Rightarrow v_1\\ E \vdash e_2 \Rightarrow v_2\\ v_1 \neq v_2 \end{array}} {E \vdash e_1 \; \texttt{==} \; e_2 \Rightarrow \; \texttt{false}}\text{[Eq-False]} $$
$$ \frac{ \begin{array}{l} E \vdash e_1 \Rightarrow v_1\\ E \vdash e_2 \Rightarrow v_2\\ v_1 \neq v_2 \end{array}} {E \vdash e_1 \; \texttt{!=} \; e_2 \Rightarrow \; \texttt{true}}\text{[NotEq-True]} $$
$$ \frac{ \begin{array}{l} E \vdash e_1 \Rightarrow v\\ E \vdash e_2 \Rightarrow v\\ \end{array}} {E \vdash e_1 \; \texttt{!=} \; e_2 \Rightarrow \; \texttt{false}}\text{[NotEq-False]} $$
$$ \frac{ \begin{array}{l} E \vdash e_1 \Rightarrow n_1\\ E \vdash e_2 \Rightarrow n_2\\ \end{array}} {E \vdash e_1 \; \oplus \; e_2 \Rightarrow \; n_1 \oplus n_2}\text{[BinOp-Int]} $$
$$ \frac{ \begin{array}{l} E \vdash e_1 \Rightarrow b_1\\ E \vdash e_2 \Rightarrow b_2\\ \end{array}} {E \vdash e_1 \; \odot \; e_2 \Rightarrow \; b_1 \odot b_2}\text{[BinOp-Bool]} $$
$$ \frac{ \begin{array}{l} E \vdash e_1 \Rightarrow b_1\\ \end{array}} {E \vdash \texttt{!} e_1 \Rightarrow \; \neg \; b_1}\text{[Unary-Not]} $$
Statement Semantics
This part of the formal semantics corresponds to the implementation of your
eval_stmt function. That function has type environment -> stmt -> environment. In the semantics, we write the judgment E ⊢ s ⇒ E′, where E represents the input environment, s represents the statement
to execute, and E′ represents the final output environment. Statements are
different from expressions in that they do not “return” anything themselves;
instead, their impact occurs by modifying the environment (by assigning to
variables).
Abstract syntax grammar
Statements are defined by the following grammar:
$$\begin{array}{lrcl} \mathrm{Statements} & s & \mathtt{::=} & \mathtt{nop}\\ & & \mid & s \; \mathtt{;} \; s\\ & & \mid & \texttt{if} \; e \; s_1 \; s_2 \\ & & \mid & \texttt{while} \; e \; s \\ & & \mid & \texttt{for} \; x \; e_1 \; e_2 \; s\\ & & \mid & \mathtt{int}\;x\\ & & \mid & \mathtt{bool}\;x\\ & & \mid & x = e \\ \end{array}$$
In the project itself there is also a statement form for printing; we leave that out of the formal semantics.
Variables
In our language, you must declare a variable, with its type, before you use it. When declared, it will be initialized to a default value. You cannot declare the same variable twice.
$$ \frac{ \begin{array}{l} E(x) \text{~is~not~defined} \end{array}} {E \vdash \; \texttt{int} \; x \texttt{;} \; \Rightarrow \; E[x \mapsto 0]}\text{[Declare-Int]} $$
$$ \frac{ \begin{array}{l} E(x) \text{~is~not~defined} \end{array}} {E \vdash \; \texttt{bool} \; x \texttt{;} \; \Rightarrow \; E[x \mapsto \textbf{false}]}\text{[Declare-Bool]} $$
Assigning to variables must respect their declared type. In particular, you cannot assign to a variable you have not declared, and you cannot write a boolean to a variable declared as an int, or vice versa.
$$ \frac{ \begin{array}{l} E(x) = n_1\\ E \vdash e \Rightarrow n_2\\ \end{array}} {E \vdash x \; \texttt{=} \; e \texttt{;} \; \Rightarrow \; E[x \mapsto n_2]}\text{[Assign-Int]} $$
$$ \frac{ \begin{array}{l} E(x) = b_1\\ E \vdash e \Rightarrow b_2\\ \end{array}} {E \vdash x \; \texttt{=} \; e \texttt{;} \; \Rightarrow \; E[x \mapsto b_2]}\text{[Assign-Bool]} $$
Control Flow
Here are the rules for the different control flow constructs.
$$ \frac{} {E \vdash \; \texttt{nop} \; \Rightarrow \; E}\text{[Nop]} $$
$$ \frac{ \begin{array}{l} E \vdash s_1 \Rightarrow E_1\\ E_1 \vdash s_1 \Rightarrow E_2\\ \end{array}} {E \vdash s_1 \texttt{;} \; s_2 \Rightarrow E_2}\text{[Sequence]} $$
$$ \frac{ \begin{array}{l} E \vdash e \Rightarrow \; \textbf{true}\\ E \vdash s_1 \Rightarrow E_1\\ \end{array}} {E \vdash \; \texttt{if} \; e \; s_1 \; s_2 \; \Rightarrow E_1}\text{[If-True]} $$
$$ \frac{ \begin{array}{l} E \vdash e \Rightarrow \; \textbf{false}\\ E \vdash s_2 \Rightarrow E_1\\ \end{array}} {E \vdash \; \texttt{if} \; e \; s_1 \; s_2 \; \Rightarrow E_1}\text{[If-False]} $$
$$ \frac{ \begin{array}{l} E \vdash e \Rightarrow \; \textbf{true}\\ E \vdash s \Rightarrow E_1\\ E_1 \vdash \; \texttt{while} \; e \; s \; \Rightarrow E_2 \end{array}} {E \vdash \; \texttt{while} \; e \; s \; \Rightarrow E_2}\text{[While-True]} $$
$$ \frac{ \begin{array}{l} E \vdash e \Rightarrow \; \textbf{false}\\ \end{array}} {E \vdash \; \texttt{while} \; e \; s \; \Rightarrow E}\text{[While-False]} $$
$$ \frac{ \begin{array}{l} E \vdash e_1 \Rightarrow n_1\\ E \vdash e_2 \Rightarrow n_2\\ E[x \mapsto n_1] \vdash s \Rightarrow E_1\\ E_1 \vdash x \; \texttt{=} \; x + 1 \Rightarrow E_2\\ E_2 \vdash \; \texttt{for} \; x \; n_1 \; n_2 \; s \; \Rightarrow E'\\ n_1 \leq n_2 \end{array}} {E \vdash \; \texttt{for} \; x \; e_1 \; e_2 \; s \; \Rightarrow E'}\text{[For-Go]} $$
$$ \frac{ \begin{array}{l} E \vdash e_1 \Rightarrow n_1\\ E \vdash e_2 \Rightarrow n_2\\ n_1 > n_2 \end{array}} {E \vdash \; \texttt{for} \; x \; e_1 \; e_2 \; s \; \Rightarrow E[x \mapsto n_1]}\text{[For-Stop]} $$