Wednesday, August 23, 2017

Writing a Simple DSL Compiler with Delphi [0. Introduction]

Part 0: Introduction

Some time ago I was listening to the Hanselminutes podcast where a guy described how he wrote a simple interpreter in Go language (YOU should write an interpreter with Thorsten Ball). I was not really interested in writing an interpreter - I did that a long time ago - but a thought crossed my mind. I asked myself whether I can do something better - write a compiler. (Or, rather, a kinda-compiler. You'll see.)

What I wanted to do is to take a small language, parse it, generate an abstract syntax tree, and then convert this into one large anonymous function calling other anonymous functions calling other anonymous functions and so on and so on. It is hard to explain, so let me try using a very simple example.
Let's say I have a simple program written in "my" language which adds two numbers together.

add(a,b) {
  return a + b

Simple, C-like syntax. It is not hard to guess that function add sums together two numbers and returns a result.

I want my code to convert this into something like this (and I'm simplifying here a lot, in reality the code will be more complicated):

function CodegenAdd(a, b: integer): integer;
  Result := a + b;

prog :=
  function (params: TArray): integer
    Result := CodegenAdd(params[0], params[1]);  

To execute the program, I can then call prog([1,2]) and I'll get back a result 3.

I'm calling this a compiler because what I get from it is a compiled anonymous function (prog) which I can later execute as many times as I want. I can also provide a different parameters to each call. It is, however, a compiler for a very specific architecture - a set of functions (like the CodegenAdd) which could in this scenario represent a very specific machine instruction set.

It's more understandable now why I called my compiler a kinda-compiler. It does compile the program into a CPU-executable format, but it uses a very specific runtime - a set of functions compiled into the Delphi program and it cannot be saved as a stand-alone executable. All limitations aside, this still proves to be a powerful technique, executing code faster than a simple interpreter for the same language. But I'm skipping ahead.

This article is just an introduction into a series in which I will slowly present my project. As of the moment, the following articles are planned for:
  1. The Language
  2. Abstract Syntax Tree (AST)
  3. Tokenizer
  4. Parser
  5. The framework
  6. AST Dumper

  7. AST Compiler
  8. AST Interpreter
  9. Speeding up execution and extending the language
  10. Speeding up interpreter
The impatient can find all the code and demo project on the GitHub.


  1. I didn't really understand the whole idea. If at the very end we need to use Delphi compiler to run the code, isn't it easier not to "compile", but just "translate" your language into Delphi? Or is it just a mental experiment?

    1. Indeed, this is mostly an experiment. This approach can, however, be used if you want to execute some scripting code faster than a normal interpreter could.