Project: Language Interpreter

Overview

In this project, you will implement an interpreter for the small C-like language from the last project. The language supports variables, int and bool types, equality and comparison operations, math and boolean operations, control flow, and printing, all while maintaining static type-safety and being Turing complete.

The language consists of expressions from the expr type and statements from the stmt type. These algebraic types can be used to represent the full space of well-formed programs. Their definitions are found in the ast.ml file.

Compilation, Tests, and Running

You can run the interpreter on a source file using:

  dune exec bin/main.exe -- <filename>

This executes the program and prints the results of any print statements. Additional options:

-R: Prints a report of the final variable bindings produced by your evaluator.

-U: Converts the program’s abstract syntax tree to a string and prints it.

You are encouraged to use the provided executable extensively while testing your evaluator.

The Evaluator

The evaluator must be implemented in eval.ml in accordance with the signatures for eval_expr and eval_stmt. The functions should be left in the order they are provided, as your implementation of eval_stmt will rely on eval_expr.

Environments

Both evaluator functions operate over an environment, represented as an association list:

(string * value)

Each entry maps a variable name to its current value. Earlier entries override later ones. Statements may modify the environment, while expressions may not.

The OCaml List module includes several helpful functions for working with association lists.

Error Handling

The formal semantics describe only valid programs. Your implementation must explicitly handle error cases by raising the appropriate exceptions as follows:

Division by zero (DivByZeroError)
Type errors (TypeError) for example, 1 + true
Declaration errors (DeclareError), including:
- redeclaring an existing variable
- assigning to an undeclared variable
- reading an undeclared variable

Part 1: `eval_expr`

Type: environment -> expr -> value
Description: Evaluates an expression under a given environment and returns a value (Int_Val or Bool_Val). Expressions never modify the environment.

Expression Rules

Integers

Integer literals evaluate to Int_Val of the same value.
Booleans

Boolean literals evaluate to Bool_Val of the same value.
Identifiers

Evaluated by lookup in the environment. If no binding exists, raise DeclareError.
Arithmetic Operators (+, -, *, /, ^)
- Operate only on integers
- Return an Int_Val
- Division by zero raises DivByZeroError
- Non‑integer operands raise TypeError
- Exponentiation results are floored if non‑integral
Boolean Operators (&&, ||)
- Operate only on booleans
- Return Bool_Val
- Non‑boolean operands raise TypeError
Logical Not (!)
- Operand must evaluate to a boolean
- Returns the negated value
- Otherwise raises TypeError
Relational Operators (>, <, >=, <=)
- Operate only on integers
- Return Bool_Val
- Non‑integer operands raise TypeError
Equality Operators (==, !=)
- Both operands must be the same type
- Valid for both int and bool
- Return Bool_Val
- Mismatched types raise TypeError

Part 2: `eval_stmt`

Type: environment -> stmt -> environment
Description: Evaluates a statement under a given environment and returns an updated environment. Statements do not return values; instead, they modify program state.

Statement Rules

NoOp

Performs no action and returns the environment unchanged.
Seq

Executes two statements in sequence:
1. Evaluate the first statement to produce a new environment
2. Evaluate the second under that environment
Declare
- Introduces a new variable
- Redeclaring a variable raises DeclareError
- int variables initialize to 0
- bool variables initialize to false
Assign
- Assigns a new value to an existing variable
- Assigning to an undeclared variable raises DeclareError
- Type mismatches raise TypeError
- Updates and returns the environment on success
If
- Guard expression must evaluate to a boolean
- Executes either the if or else body accordingly
- Returns the resulting environment
- Non‑boolean guards raise TypeError
While
- Guard expression must evaluate to a boolean
- Repeatedly evaluates the body while the guard is true
- Returns the final environment
- Non‑boolean guards raise TypeError
For
- Contains an identifier, start expression, end expression, and body
- Start and end expressions must evaluate to integers
- The loop variable is automatically incremented by one each iteration
- The loop runs inclusively from start to end
- Returns the current environment if the range is empty
Print Statement

The print statement outputs the evaluated expression followed by a newline.
- Supported types: int, bool
- Output format:
  - Integers print as numbers (for example, 10)
  - Booleans print as true or false

Important Printing Note

You must not use print_string, print_int, or related OCaml functions for program output. These will not be captured by the test system. Instead, you must use the following wrapper functions:

print_output_string : string -> unit
print_output_int : int -> unit
print_output_bool : bool ->
print_output_newline : unit -> unit

Use these functions only for handling the Print statement
Standard print functions may be used freely for debugging

Turning in the Assignment

To submit your assignment, create a zip file of a DIRECTORY named project5-handin containing ONLY the project files that you edited. Then submit that file to the appropriate folder on D2L.

Grading Rubric (50 points)

This rubric explains how your interpreter will be evaluated. Points are awarded based on automated test results, your own unit tests, and overall code quality.

25 points — Automated Testing
- 25 points: All or nearly all automated tests pass successfully.
- 18 points: The majority of automated tests pass.
- 8 points: A small portion of automated tests pass.
- 0 points: Few or none of the automated tests pass.
15 points — Unit Tests
- 15 points: Unit tests provide strong coverage, use correct expectations, and are clearly written.
- 10 points: Unit tests are mostly effective, with only minor gaps or issues.
- 5 points: Unit tests are limited in scope or uneven in quality.
- 0 points: Unit tests are minimal, ineffective, or missing.
4 points — Code Readability and Style
- 4 points: Code is clear, well organized, consistently styled, and properly formatted using a code formatter.
- 2 points: Code shows some effort toward clarity but is often difficult to follow or inconsistently formatted.
- 0 points: Code is extremely difficult to read or does not follow basic style conventions.
4 points — Documentation and Comments
- 4 points: Documentation is clear and complete; all files and top-level functions are appropriately documented.
- 2 points: Documentation is minimal or inconsistent, with several missing or unclear comments.
- 0 points: Documentation is missing or provides no meaningful guidance.
2 points — Adherence to Additional Specifications
- 2 points: All additional specifications are fully met.
- 1 point: One minor specification is not satisfied.
- 0 points: Multiple or major specification requirements are not met.

Appendix: Operational Semantics

The remainder of this document provides the formal operational semantics for expressions and statements. These semantics describe correct evaluations only. Error cases are intentionally omitted and must be handled as specified above in your implementation.

The formal rules correspond directly to the behavior expected from:

eval_expr (expression semantics)
eval_stmt (statement semantics)

Use these rules as a reference to guide your implementation.

Environments

Both parts define judgments that make use of environments E. The rules show two operations on environments:

E(x) means to look up the value of x maps to in the environment E. This operation is undefined if there is no mapping for x in E. We write ⋅ for the empty environment. Lookup on this environment is always undefined; that is, ⋅(x) is undefined for all x.
The second operation is written E[x ↦ v]. It defines a new environment that is the same as E but maps x to v. It thus overrides any prior mapping for x in E. So, E[x ↦ 1] is an environment that maps x to 1 but is undefined for all other variables. The environment ⋅[x ↦ 1][y ↦ 2] is an environment that maps x to 1 and y to 2. As such, if E = ⋅[x ↦ 1][y ↦ 2] then E(x) = 1. That is, looking up x in the second example environment produces its mapped-to value, 1.

Error conditions

The semantics here defines only correct evaluations. It says nothing about what happens when, say, you have a type error. For example, for the rules below there is no value v for which you can prove the judgment ⋅ ⊢ 1 + true ⇒ v. In your implementation, erroneous programs will cause an exception to be raised, as indicated above.

Expression Semantics

This part of the formal semantics corresponds to the implementation of your eval_expr function. That function has type environment -> expr -> value. In the semantics, we write the judgment E ⊢ e ⇒ v, where E represents the environment, e represents the expression, and v represents the final value.

Abstract syntax grammar

Expressions and values are defined by the following grammar:

$$ \begin{array}{lrcl} \mathrm{Integers} & n & \mathit{is} & \mathrm{any\ integer}\\ \mathrm{Booleans} & b & \mathtt{::=} & \mathbf{true} \mid \mathbf{false} \\ \mathrm{Values} & v & \mathtt{::=} & n \mid b \\ \mathrm{Expressions} & e & \mathtt{::=} & v \mid\ !e \mid e \oplus e \mid e \odot e \mid e == e \mid e\ != e \\ \end{array} $$

Here, we write ⊕ to represent any operator involving a pair of integers. This could be addition (+), comparison (≤), multiplication (×), etc. We write ⊙ to represent any operator involving a pair of booleans. This could be boolean-and (&&), boolean-or (||), etc. We do not go into specifics here about what these actually do; they should match your intuition.

Rules

Here are the axioms.

$$ \frac{ \begin{array}{l} E(x) = v \end{array}} {E \vdash x \Rightarrow v}\text{[Id]} $$

$$ \frac{ } {E \vdash n \Rightarrow n}\text{[Int]} $$

$$ \frac{ } {E \vdash \texttt{true} \Rightarrow \; \textbf{true}}\text{[Bool-True]} $$

$$ \frac{ } {E \vdash \texttt{false} \Rightarrow \; \textbf{false}}\text{[Bool-False]} $$

Here are the rest of the rules for expressions.

$$ \frac{ \begin{array}{l} E \vdash e_1 \Rightarrow v\\ E \vdash e_2 \Rightarrow v\\ \end{array}} {E \vdash e_1 \; \texttt{==} \; e_2 \Rightarrow \; \texttt{true}}\text{[Eq-True]} $$

$$ \frac{ \begin{array}{l} E \vdash e_1 \Rightarrow v_1\\ E \vdash e_2 \Rightarrow v_2\\ v_1 \neq v_2 \end{array}} {E \vdash e_1 \; \texttt{==} \; e_2 \Rightarrow \; \texttt{false}}\text{[Eq-False]} $$

$$ \frac{ \begin{array}{l} E \vdash e_1 \Rightarrow v_1\\ E \vdash e_2 \Rightarrow v_2\\ v_1 \neq v_2 \end{array}} {E \vdash e_1 \; \texttt{!=} \; e_2 \Rightarrow \; \texttt{true}}\text{[NotEq-True]} $$

$$ \frac{ \begin{array}{l} E \vdash e_1 \Rightarrow v\\ E \vdash e_2 \Rightarrow v\\ \end{array}} {E \vdash e_1 \; \texttt{!=} \; e_2 \Rightarrow \; \texttt{false}}\text{[NotEq-False]} $$

$$ \frac{ \begin{array}{l} E \vdash e_1 \Rightarrow n_1\\ E \vdash e_2 \Rightarrow n_2\\ \end{array}} {E \vdash e_1 \; \oplus \; e_2 \Rightarrow \; n_1 \oplus n_2}\text{[BinOp-Int]} $$

$$ \frac{ \begin{array}{l} E \vdash e_1 \Rightarrow b_1\\ E \vdash e_2 \Rightarrow b_2\\ \end{array}} {E \vdash e_1 \; \odot \; e_2 \Rightarrow \; b_1 \odot b_2}\text{[BinOp-Bool]} $$

$$ \frac{ \begin{array}{l} E \vdash e_1 \Rightarrow b_1\\ \end{array}} {E \vdash \texttt{!} e_1 \Rightarrow \; \neg \; b_1}\text{[Unary-Not]} $$

Statement Semantics

This part of the formal semantics corresponds to the implementation of your eval_stmt function. That function has type environment -> stmt -> environment. In the semantics, we write the judgment E ⊢ s ⇒ E^′, where E represents the input environment, s represents the statement to execute, and E^′ represents the final output environment. Statements are different from expressions in that they do not “return” anything themselves; instead, their impact occurs by modifying the environment (by assigning to variables).

Abstract syntax grammar

Statements are defined by the following grammar:

$$\begin{array}{lrcl} \mathrm{Statements} & s & \mathtt{::=} & \mathtt{nop}\\ & & \mid & s \; \mathtt{;} \; s\\ & & \mid & \texttt{if} \; e \; s_1 \; s_2 \\ & & \mid & \texttt{while} \; e \; s \\ & & \mid & \texttt{for} \; x \; e_1 \; e_2 \; s\\ & & \mid & \mathtt{int}\;x\\ & & \mid & \mathtt{bool}\;x\\ & & \mid & x = e \\ \end{array}$$

In the project itself there is also a statement form for printing; we leave that out of the formal semantics.

Variables

In our language, you must declare a variable, with its type, before you use it. When declared, it will be initialized to a default value. You cannot declare the same variable twice.

$$ \frac{ \begin{array}{l} E(x) \text{~is~not~defined} \end{array}} {E \vdash \; \texttt{int} \; x \texttt{;} \; \Rightarrow \; E[x \mapsto 0]}\text{[Declare-Int]} $$

$$ \frac{ \begin{array}{l} E(x) \text{~is~not~defined} \end{array}} {E \vdash \; \texttt{bool} \; x \texttt{;} \; \Rightarrow \; E[x \mapsto \textbf{false}]}\text{[Declare-Bool]} $$

Assigning to variables must respect their declared type. In particular, you cannot assign to a variable you have not declared, and you cannot write a boolean to a variable declared as an int, or vice versa.

$$ \frac{ \begin{array}{l} E(x) = n_1\\ E \vdash e \Rightarrow n_2\\ \end{array}} {E \vdash x \; \texttt{=} \; e \texttt{;} \; \Rightarrow \; E[x \mapsto n_2]}\text{[Assign-Int]} $$

$$ \frac{ \begin{array}{l} E(x) = b_1\\ E \vdash e \Rightarrow b_2\\ \end{array}} {E \vdash x \; \texttt{=} \; e \texttt{;} \; \Rightarrow \; E[x \mapsto b_2]}\text{[Assign-Bool]} $$

Control Flow

Here are the rules for the different control flow constructs.

$$ \frac{} {E \vdash \; \texttt{nop} \; \Rightarrow \; E}\text{[Nop]} $$

$$ \frac{ \begin{array}{l} E \vdash s_1 \Rightarrow E_1\\ E_1 \vdash s_1 \Rightarrow E_2\\ \end{array}} {E \vdash s_1 \texttt{;} \; s_2 \Rightarrow E_2}\text{[Sequence]} $$

$$ \frac{ \begin{array}{l} E \vdash e \Rightarrow \; \textbf{true}\\ E \vdash s_1 \Rightarrow E_1\\ \end{array}} {E \vdash \; \texttt{if} \; e \; s_1 \; s_2 \; \Rightarrow E_1}\text{[If-True]} $$

$$ \frac{ \begin{array}{l} E \vdash e \Rightarrow \; \textbf{false}\\ E \vdash s_2 \Rightarrow E_1\\ \end{array}} {E \vdash \; \texttt{if} \; e \; s_1 \; s_2 \; \Rightarrow E_1}\text{[If-False]} $$

$$ \frac{ \begin{array}{l} E \vdash e \Rightarrow \; \textbf{true}\\ E \vdash s \Rightarrow E_1\\ E_1 \vdash \; \texttt{while} \; e \; s \; \Rightarrow E_2 \end{array}} {E \vdash \; \texttt{while} \; e \; s \; \Rightarrow E_2}\text{[While-True]} $$

$$ \frac{ \begin{array}{l} E \vdash e \Rightarrow \; \textbf{false}\\ \end{array}} {E \vdash \; \texttt{while} \; e \; s \; \Rightarrow E}\text{[While-False]} $$

$$ \frac{ \begin{array}{l} E \vdash e_1 \Rightarrow n_1\\ E \vdash e_2 \Rightarrow n_2\\ E[x \mapsto n_1] \vdash s \Rightarrow E_1\\ E_1 \vdash x \; \texttt{=} \; x + 1 \Rightarrow E_2\\ E_2 \vdash \; \texttt{for} \; x \; n_1 \; n_2 \; s \; \Rightarrow E'\\ n_1 \leq n_2 \end{array}} {E \vdash \; \texttt{for} \; x \; e_1 \; e_2 \; s \; \Rightarrow E'}\text{[For-Go]} $$

$$ \frac{ \begin{array}{l} E \vdash e_1 \Rightarrow n_1\\ E \vdash e_2 \Rightarrow n_2\\ n_1 > n_2 \end{array}} {E \vdash \; \texttt{for} \; x \; e_1 \; e_2 \; s \; \Rightarrow E[x \mapsto n_1]}\text{[For-Stop]} $$

Project: Language Interpreter

Overview

Compilation, Tests, and Running

The Evaluator

Environments

Error Handling

Part 1: eval_expr

Expression Rules

Part 2: eval_stmt

Statement Rules

Important Printing Note

Turning in the Assignment

Grading Rubric (50 points)

Appendix: Operational Semantics

Environments

Error conditions

Expression Semantics

Abstract syntax grammar

Rules

Statement Semantics

Abstract syntax grammar

Variables

Control Flow

Part 1: `eval_expr`

Part 2: `eval_stmt`