Sunday, March 29, 2009

An update on the Moscow ML for LLVM

The Moscow ML LLVM project is having a blast these days. First, we got autotools into the project, so everything is now built from autotools. Then we eliminated a lot of magic in the code base. Importantly, you can now reorder C functions without fear of the run-time breaking down and getting out of sync with the compiler.

We killed the threaded code interpreter. It is presumably fast, but it also hampers our further development with LLVM, so it had to go. The code base ended up much cleaner.

Then we coded up a simple opcode tracer that can write out the sequence of opcodes as they are executing. If you can simplify an error to the point where a trace can be used, then it tends to be easy to identify the problem with the tracer. It did uncover a couple of bugs.

The final nail in the coffin is the regression framework. We imported the regression framework from mlton and proceeded to add the tests that makes sense for mosml. We have 2 failing tests out of 88 at the moment. The original mosml passes all 88 and our version fails in 2 of those. Both these are in I/O, so it is narrowed down to a few commits at the moment. It is wickedly cool to have a working regression framework again.

We also killed the C preprocessor on all Standard ML files. Autoconf gives us what we need, so it is fairly easy to generate the correct files.

Looking forward

In the future, we intend to clean up the run-time some more and systematically begin getting more test cases to pass correctly, and construct the correct install-targets. We also need to get the standard ml front-end code into the game. Then we add 2003 Basis support and begin simplifying the back-end such that we can add LLVM on top if it. Another nail is the eradication of all C-compiler warnings and a simplification of the opcodes.

Saturday, March 28, 2009

Graphviz talk slides

So I did a small talk (in danish) on what Graphviz can do for you in the local LUG. But I have been a slacker and did not put up the slides yet. This remedies it: Graphviz talk slides

Wednesday, March 04, 2009

LLVM Update

People tracking along will know that I am in the process of writing an LLVM backend for Moscow ML. There are several ways in which one can do this, so one has to be decided upon and tried out. It is however clear we need a way to manipulate and output LLVMs IR, so I set upon writing an set of bindings.

These bindings are not using an FFI to call into the C++ LLVM code. Rather they produce the IR assembly from an internal SML-like data structure. This approach has the disadvantage that it is more work to maintain. On the other hand, the OCaml bindings passes around llvalues all over the place so very few bugs will be captured at compile time (we could do oh so much better had we had dependent types).

I have written LLVM as a datatype in SML and then written about 40 percent of a type checker for it. The process is very simple as soon as one has converted LLVM rules into SML datatypes. Here is a part of the Type module

datatype t =
    T_Integer of int
  | T_Real of float_ty
  | T_Fun of {return: t, params: t list}
  | T_Struct of {elements: t list, packed: bool}
  | T_Array of {ty: t, length: int}
  | T_Pointer of t
  | T_Vector of {ty: t, length: int}
  | T_Opaque
  | T_Void
  | T_Label
  | T_Top

which is a straightforward adaptation of the Language Reference of LLVM. I could simplify it more, but at the moment there is little need for that.

Type checking is carried out by a set of combinators. They probably form a monad (Haskell guys can relate this to a variant of the Either monad with a Writer on it as well. It should be straightforward to write it down as one or use a transformer stack to achieve the same. I have not given it much thought though).

The checking basis is the datatype

datatype t_check = Ok | Wrong of string list

Immediately, a helper is introduced

fun type_fail str = Wrong [str]

And a function for running a checker

fun run checker_fun ty =
 case checker_fun ty of
   Ok => ()
 | Wrong errors =>
     <<Print out type checking errors>>

Note that run will have type (ty -> t_check) -> ty -> unit. The question is then how to produce type checkers. This is done with combinators but of course:

(* Logical or for types, consider monadizing *)
fun or checker1 checker2 ty =
case checker1 ty of
 Ok => Ok
| Wrong err1 => case checker2 ty of
         Ok => Ok
       | Wrong err2 => Wrong (List.concat [err1, err2])

The 'or' checker tries the first checker and if it fail it tries the second, picking up any errors on the way. It is also easy to assert that a given type must be the base type of a vector type:

fun vectorized checker1 vty =
case vty of
 T_Vector {ty, ...} => checker1 ty
| _ => type_fail "Checked type is not a vector type"

Perhaps one could be giving a better error message here by capturing the eventual error output from checker1, but this has not been done yet.

Primitive checks are just doint what they are supposed to do. The assertion for an integer looks like the following:

fun assert_int ty =
case ty of
 T_Integer _ => Ok
| _ => type_fail "Type is not of integer type."

Usage

An addition in LLVM is either an integer, a float or a vector of these. Aha:

val assert_int_float = or assert_int assert_float
val assert_add = or assert_int_float (vectorized assert_int_float)

Hopefully this shows the succinctness of the approach.

EDIT: Layout was bad. Fixed

About Me

My Photo
Lambda-loving CS Geek. Likes metal music. Likes dogs, cats. Does not like pictures of dogs and cats (unless they are lambdacats!)

Has an unhealthy coffee addiction. Calls himself the coffee zombie in the morning (BEEEEANS!)

Has a neverending curiosity gene, loves intelligence and passion.