Writing Interpreters
To implement the semantics of some programming language, an interpreter will consist of at least a few componenets:
.---------. .--------. .-------------. .-------------. | lexer | -> | parser | -> | type checker | -> | interpreter | '---------' '--------' '--------------' '-------------' lexer.mll parser.mly typecheck.ml interpreter.ml parser.mli ast.mli
This blog post is basically a write of of my work on assignment two of Stanford's CS242 class, where we'll explore the process of writing an interpreter for the simply typed lambda calculus.
The Simply Typed Lambda Calculus
Lexing
OCaml has a run-time library for lexers generated by ocamllex
that's used in this
project to generate a lexer by giving names to a few regular expressions:
rule token = parse | "(*" _* "*)" { token lexbuf } | [' ' '\t' '\n'] { token lexbuf } | '.' { DOT } | '(' { LPAREN } | ')' { RPAREN } | "fn" { FN } | "int" { TY_INT } | "->" { ARROW } | "=>" { FATARROW } | ... | ['0'-'9']+ as i { INT (int_of_string i) } | '-'['0'-'9']+ as i { INT (int_of_string i) } | eof { EOF } | _ { raise (Error (Printf.sprintf "At offset %d: unexpected character.\n" (Lexing.lexeme_start lexbuf))) }
Where our entry point is named "token" and basically becomes an OCaml function that
takes one argument (since we didn't give it any arguments of our own, there is still
the implicit last argument of type Lexing.lexbug
, where characters are read from).
We then pass this new lexical analyzer function as an argument to the parser...
Parsing
Alongside OCaml's very nice lexer library, the standard library also provides the
command-line tool ocamlyacc
that generates a parse from some specification of a
context-free grammar with attached semantic actions. The grammar used by our OCaml
yacc
variant lives in parser.mly
.