CPSC 310 OCaml Style Guide
This is a mirror of the style guide here: CS3110 OCaml Style Guide
File Submission
80 Column Limit
No line of code may have more than 80 columns. Using more than 80 columns causes your code to wrap around to the next line, which is devastating to readability.
No Tab Characters
Do not use the tab character (0x09). Instead, use spaces to control indenting. This is because the width of a tab is not uniform across all computers, and what looks good on your machine may look ugly on mine, especially if there are mixed spaces and tabs.
Code Must Compile
Any code you submit must compile under OCaml without errors or warnings. If it does not compile, we will not grade it. That means you will not receive any points for that problem. There is no excuse for it to not compile. Never submit anything that you have changed, no matter how small the change, without checking that it still compiles. You should treat compiler warnings as errors.
Comments
Comments Go Above the Code They Reference
Example:
(* Sums a list of integers. *)
let sum = List.fold_left (+) 0
Avoid Useless Comments
Avoid comments that merely repeat the code they reference or state the obvious. Comments should state the invariants, the non-obvious, or any references that have more information about the code. The preceding example contains a comment that is obvious and should be omitted.
Avoid Over-commenting
Very many or very long comments in the code body are often more distracting than helpful. Long comments may appear at the top of a file if you wish to explain the overall design of the code or refer to any sources that have more information about the algorithms or data structures. All other comments in the file should be as short as possible. A good place for a comment is just before a function declaration. Judicious choice of variable names can help minimize the need for comments.
Line Breaks
Empty lines should only be included between value declarations within
a struct block, especially between function declarations. It is not
necessary to put empty lines between other declarations unless you are
separating the different types of declarations (such as structures,
types, exceptions and values). Unless function declarations within a
let block are long, there should be no empty lines within a let
block. There should never be an empty line within an expression.
Multi-line Commenting
When comments are printed on paper, the reader lacks the advantage of
syntax highlighting. Multiline comments can be distinguished from code
by preceding each line of the comment with a * similar to the
following:
(* This is one of those rare but long comments
 * that need to span multiple lines because
 * the code is unusually complex and requires
 * extra explanation. *)
let complicatedFunction () = ...
Naming and Declarations
Naming Conventions
The best way to tell at a glance something about the type of a variable is to use the standard OCaml naming conventions. The following are the preferred rules that are followed by the standard OCaml libraries:
| Token | OCaml Naming Convention | Example |
|---|---|---|
| Variables | Symbolic or initial lower case. Use embedded caps for multiword names. Underscores are also occasionally used. | getItem |
| Constructors | Initial upper case. Use embedded caps for multiword names. Historic exceptions are true and false. Rarely are symbolic names like :: used. | EmptyQueue |
| Types | All lower case. Use underscores for multiword names. | priority_queue |
| Signatures | All upper case. Use underscores for multiword names. | PRIORITY_QUEUE |
| Structures | Initial upper case. Use embedded caps for multiword names. | PriorityQueue |
| Functors | Same as for structures, except Fn completes the name. | PriorityQueueFn |
These conventions are not enforced by the compiler, though violations of the variable/constructor conventions ought to cause warning messages because of the danger of a constructor turning into a variable when it is misspelled.
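As a rough illustration of the table above, the sketch below applies each convention once. Every name in it (PRIORITY_QUEUE, ListQueue, insert, getMin, and so on) is hypothetical and chosen only for this example; the sorted-list representation is an assumption made to keep it short.

```ocaml
(* Signature: all upper case with underscores. *)
module type PRIORITY_QUEUE = sig
  type priority_queue                 (* type: lower case, underscores *)
  exception EmptyQueue                (* constructor: initial upper case *)
  val empty : priority_queue
  val insert : int -> priority_queue -> priority_queue
  val getMin : priority_queue -> int  (* variable: embedded caps *)
end

(* Structure: initial upper case with embedded caps. *)
module ListQueue : PRIORITY_QUEUE = struct
  type priority_queue = int list      (* kept sorted in ascending order *)
  exception EmptyQueue
  let empty = []
  let insert x q = List.sort compare (x :: q)
  let getMin q =
    match q with
      [] -> raise EmptyQueue
    | x :: _ -> x
end
```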
Use Meaningful Names
Another way of conveying information is to use meaningful variable
names that reflect their intended use. Choose words or combinations
of words describing the value. Variable names may be one letter in
short let blocks. Functions used ephemerally in a fold, filter, or
map are often bound to the name f. Here is an example for short
variable names:
let d = Unix.localtime(Unix.time()) in
let m = d.Unix.tm_min in
let s = d.Unix.tm_sec in
let f n = (n mod 3) = 0 in
List.filter f [m;s]
Type Annotations
Top-level functions and values should be declared with types. Consider the difference between the following:
let foo x = x+1
let foo(x:int):int = x+1
Mutable Variables
Mutable variables are at odds with the philosophy of functional programming and should be used sparingly. They are used primarily to maintain local state in a closure. Other uses should be avoided. In particular, global mutable variables cause many problems. First, it is difficult to ensure that a global mutable variable is in the proper state, since it might have been modified outside the function or by a previous execution of the algorithm. This is especially problematic with concurrent threads. Second, and more importantly, having global mutable variables makes it more likely that your code is non-reentrant. Without proper knowledge of the ramifications, declaring global mutable variables can extend beyond bad style to incorrect code.
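A minimal sketch of the acceptable use mentioned above, local state kept inside a closure, might look like the following. The name makeCounter is hypothetical; the point is only that the ref cell cannot be reached from outside the function it returns.

```ocaml
(* makeCounter is a hypothetical helper: each call creates a fresh
   counter whose state is visible only to the closure it returns. *)
let makeCounter () : unit -> int =
  let count = ref 0 in
  fun () ->
    count := !count + 1;
    !count
```

Two counters created this way do not interfere with each other, which is exactly the property a global mutable variable lacks.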
Renaming
You should rarely need to rename values; in fact, renaming is a sure way to
obfuscate code. Renaming should be backed up with a very good reason.
One instance where renaming is common and even encouraged is to alias
external structures referenced by the current structure. Here the
external structure is aliased to a one- or two-letter name at the top
of the struct block. This serves two purposes: it shortens the name
of the structure and it documents the structures you use. Here is an
example:
struct
  module H = Hashtbl
  module A = Array
  ...
end
Order of Declarations in a Structure
When declaring elements in a structure, you should first alias the structures you intend to use, followed by the types, followed by exceptions, and lastly list all the value declarations for the structure. Here is an example:
struct
  module L = List
  type foo = unit
  exception InternalError
  let first list = L.nth list 0
end
Every declaration within the structure should be indented the same amount.
Indenting
Indent by two spaces
Long expressions
Long expressions can be broken up and the parts aligned, as in the second example. Either is acceptable.
let x = "Long line..."^
  "Another long line."

let x = "Long line..."^
        "Another long line."
Match expressions
Match expressions should be indented as follows:
match expr with
  pat1 -> ...
| pat2 -> ...
If the code for each case is long or requires multiple lines, it should be indented as follows:
match expr with
  pat1 ->
    ...
| pat2 ->
    ...
If expressions
If expressions should be indented according to one of the following schemes:
if exp1 then exp2
else if exp3 then exp4
else if exp5 then exp6
else exp7

if exp1 then
  exp2
else exp3

if exp1 then exp2 else exp3

if exp1 then exp2
else exp3
Comments
Comments should be indented to the level of the line of code that follows the comment.
Parentheses
Over-Parenthesizing
Parentheses have many semantic purposes in OCaml, including constructing tuples, grouping sequences of side-effect expressions, forcing a non-default parse of an expression, and grouping structures for functor arguments. Their usage is very different from C or Java. Avoid unnecessary parentheses when their presence makes your code harder to understand.
Match expressions
Wrap inner nested match expressions with parentheses. This avoids a
common error: without them, the cases that follow the inner match are
parsed as part of it. A let...in around the inner match does not
delimit it, so the parentheses are still needed there.
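The sketch below shows a parenthesized inner match. The function grandparentOf and the association-list representation are hypothetical, chosen only for this example. Without the parentheses, the final None case would be read as a case of the inner match rather than the outer one.

```ocaml
(* grandparentOf is a hypothetical function; parents maps a name to
   the name of that person's parent. *)
let grandparentOf (parents : (string * string) list) (name : string)
    : string =
  match List.assoc_opt name parents with
    Some parent ->
      (match List.assoc_opt parent parents with
         Some grandparent -> grandparent
       | None -> "unknown grandparent")
  | None -> "unknown parent"
```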
Block Styles
Blocks of code such as let...in should be indented as follows.
let foo bar =
  let p = 4 in
  let q = 38 in
  bar * (p+q)
Blocks of code such as struct...end and sig...end should be
indented as follows.
module type S =
  sig
    type t
    type u
    val x : t
  end
Pattern Matching
No Incomplete Pattern Matches
Incomplete pattern matches are flagged with compiler warnings, which are tantamount to errors for grading purposes. If your program exhibits this behavior, the problem will get no points.
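As a small illustration, the match below has one case per constructor, so the compiler has nothing to warn about. The type light and the function next are hypothetical names used only here.

```ocaml
(* light and next are hypothetical names used only for illustration. *)
type light = Red | Yellow | Green

(* Every constructor of light has a case, so the compiler reports no
   non-exhaustive match warning. *)
let next (l : light) : light =
  match l with
    Red -> Green
  | Green -> Yellow
  | Yellow -> Red
```

Deleting any one of the three cases would make the match incomplete and trigger the warning described above.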
Use Pattern Matching in Function Arguments
Tuples, records and datatypes can be deconstructed using pattern matching. If you simply deconstruct the function argument before you do anything useful, it is better to pattern match in the function argument. Consider these examples:
Bad
let f arg1 arg2 =
  let x = fst arg1 in
  let y = snd arg1 in
  let z = fst arg2 in
  ...

let f arg1 =
  let x = arg1.foo in
  let y = arg1.bar in
  let baz = arg1.baz in
  ...
Good
let f (x,y) (z,_) = ...

let f {foo=x; bar=y; baz=baz} = ...
Function Arguments Should Not Use Values for Patterns
You should only deconstruct values with variable names and/or
wildcards in function arguments. If you want to pattern match against
a specific value, use a match expression or an if expression. We
include this rule because there are too many errors that can occur
when you don't do this just right. Thus of the following two
examples, you should use the latter:
let rec fact = function
    0 -> 1
  | n -> n * fact(n-1)

let rec fact n =
  if n=0 then 1
  else n * fact(n-1)
Avoid Unnecessary Projections
Prefer pattern matching to projections with function arguments or value declarations. Using projections is okay as long as it is infrequent and the meaning is clearly understood from context. The above rule shows how to pattern-match in the function arguments. Here is an example for pattern matching with value declarations.
Bad
let v = someFunction() in
let x = fst v in
let y = snd v in
x + y
Good
let (x,y) = someFunction() in
x + y
Combine nested match Expressions
Rather than nest match expressions, you can combine them by pattern
matching against a tuple, provided the tests in the match
expressions are independent. Here is an example:
Bad
let d = Unix.localtime(Unix.time()) in
match d.Unix.tm_mon with
  0 -> (match d.Unix.tm_mday with
          1 -> print_string "Happy New Year"
        | _ -> ())
| 6 -> (match d.Unix.tm_mday with
          4 -> print_string "Happy Independence Day"
        | _ -> ())
| 9 -> (match d.Unix.tm_mday with
          10 -> print_string "Happy Metric Day"
        | _ -> ())
| _ -> ()
Good
let d = Unix.localtime(Unix.time()) in
match (d.Unix.tm_mon, d.Unix.tm_mday) with
  (0, 1) -> print_string "Happy New Year"
| (6, 4) -> print_string "Happy Independence Day"
| (9, 10) -> print_string "Happy Metric Day"
| _ -> ()
Avoid the use of List.hd and List.tl
The functions List.hd and List.tl are used to deconstruct list
types. However, they raise an exception if the list is empty. It is
better to avoid them altogether. It is usually easy to achieve the
same effect with pattern matching. If you cannot manage to avoid them,
you should handle any exceptions that they might raise.
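A minimal sketch of the pattern-matching alternative follows. Both functions are hypothetical examples, not library functions; each handles the empty list explicitly, so no exception can be raised.

```ocaml
(* firstOr is a hypothetical helper: it returns the first element of a
   list, or the supplied default when the list is empty. *)
let firstOr (default : 'a) (items : 'a list) : 'a =
  match items with
    [] -> default
  | first :: _ -> first

(* sumPairs adds neighbouring elements without List.hd or List.tl.
   A trailing unpaired element is kept unchanged. *)
let rec sumPairs (items : int list) : int list =
  match items with
    a :: b :: rest -> (a + b) :: sumPairs rest
  | leftover -> leftover
```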
Factoring
Avoid breaking expressions over multiple lines
If a tuple consists of more than two or three elements, you should
consider using a record instead of a tuple. Records have the advantage
of placing each name on a separate line and still looking good.
Constructing a tuple over multiple lines makes for ugly code. Other
expressions that take up multiple lines should be done with care. The
best way to transform code that constructs expressions over multiple
lines to something that has good style is to factor the code using a
let expression. Consider the following:
Bad
let first (x,_,_) = x
let second (_,y,_) = y
let third (_,_,z) = z
let rec euclid (m,n) : (int * int * int) =
  if n=0 then (1, 0, m)
  else (second (euclid (n, m mod n)), first (euclid (n, m mod n)) - (m / n) *
    second (euclid (n, m mod n)), third (euclid (n, m mod n)))
Better
(* first, second, and third are defined as above *)
let rec euclid (m,n) : (int * int * int) =
  if n=0 then (1, 0, m)
  else (second (euclid (n, m mod n)),
        first (euclid (n, m mod n)) - (m / n) * second (euclid (n, m mod n)),
        third (euclid (n, m mod n)))
Best
let rec euclid (m,n) : (int * int * int) =
  if n=0 then (1, 0, m)
  else
    let q = m / n in
    let r = m mod n in
    let (u, v, g) = euclid (n, r) in
    (v, u - q*v, g)
Do not factor unnecessarily
Bad
let x = input_line stdin in
match x with
  ...
Good
match input_line stdin with
  ...
Bad (provided y is not a large expression)
let x = y*y in x+z
Good
y*y + z
Verbosity
Don't Rewrite Library Functions
The OCaml library has a great number of functions and data structures
-- use them! Often students will recode List.filter, List.map,
and similar functions. A more subtle situation for recoding is all
the fold functions. Writing a function that recursively walks down the
list should make vigorous use of List.fold_left or
List.fold_right. Other data structures often have a folding
function; use them whenever they are available.
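To make the contrast concrete, the sketch below shows a hand-written recursion next to versions built from library functions. All three function names are hypothetical and exist only for this example.

```ocaml
(* Hand-written recursion that re-implements what the library offers. *)
let rec countEvensByHand (items : int list) : int =
  match items with
    [] -> 0
  | x :: rest ->
      (if x mod 2 = 0 then 1 else 0) + countEvensByHand rest

(* The same computation expressed with List.filter and List.length. *)
let countEvens (items : int list) : int =
  List.length (List.filter (fun x -> x mod 2 = 0) items)

(* A fold replaces an explicit walk that carries an accumulator. *)
let longest (words : string list) : int =
  List.fold_left (fun best w -> max best (String.length w)) 0 words
```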
Misusing if Expressions
Remember that the type of the condition in an if expression is
bool. In general, the type of an if expression is 'a, but in the
case that the type is bool, you should not be using if at all.
Consider the following:
| Bad | Good |
|---|---|
| if e then true else false | e |
| if e then false else true | not e |
| if beta then beta else false | beta |
| if not e then x else y | if e then y else x |
| if x then true else y | x \|\| y |
| if x then y else false | x && y |
| if x then false else y | not x && y |
| if x then y else true | not x \|\| y |
Misusing match Expressions
The match expression is misused in two common situations. First,
match should never be used in place of an if expression (that's
why if exists). Note the following:
match e with
  true -> x
| false -> y

if e then x else y
The latter is much better. Another situation where if expressions
are preferred over match expressions is as follows:
match e with
  c -> x (* c is a constant value *)
| _ -> y

if e=c then x else y
The latter is definitely better. The other misuse is using match
when pattern matching with a let declaration is enough. Consider the
following:
let x = match expr with (y,z) -> y

let (x,_) = expr
The latter is better.
Other Common Misuses
Here are some other common mistakes to watch out for:
| Bad | Good |
|---|---|
| l::[] | [l] |
| length + 0 | length |
| length * 1 | length |
| E * E (E is a big expression) | let x = E in x*x |
| if x then f a b c1 else f a b c2 | f a b (if x then c1 else c2) |
Don't Rewrap Functions
When passing a function as an argument to another function, don't rewrap the function unnecessarily. Here's an example:
List.map (fun x -> sqrt x) [1.0; 4.0; 9.0; 16.0]

List.map sqrt [1.0; 4.0; 9.0; 16.0]
The latter is better. Another case for rewrapping a function is often associated with infix binary operators. To prevent rewrapping the binary operator, put parentheses around the operator to refer to the function form, as in the following example:
List.fold_left (fun x y -> x + y) 0

List.fold_left (+) 0
The latter is better.
Avoid Computing Values Twice
If you compute a value twice, you're wasting CPU time and making your
program ugly. The best way to avoid computing values twice is to
create a let expression and bind the computed value to a variable
name. This has the added benefit of letting you document the purpose
of the value with a name.
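A short sketch of this advice, using a hypothetical distance function: each difference is computed once, bound to a name, and then reused.

```ocaml
(* distance is a hypothetical example. Without the let bindings, each
   difference would be written out (and computed) twice. *)
let distance ((x1, y1) : float * float) ((x2, y2) : float * float) : float =
  let dx = x2 -. x1 in
  let dy = y2 -. y1 in
  sqrt (dx *. dx +. dy *. dy)
```

The names dx and dy also document what the intermediate values mean, which the repeated subexpressions would not.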