Tuesday, October 03, 2017

Writing a Simple DSL Compiler with Delphi [5. Framework]

This article describes the compiler framework used in my toy language project. If you are new to this series, I recommend starting with this post.

We now have a working parser that converts a string of code into an abstract syntax tree (AST). It is, however, not yet time to write about the most interesting piece, the compiler, as we should first do some integration and testing.

My toy compiler uses a very simple framework that is accessed with the ISimpleDSLCompiler interface (unit SimpleDSLCompiler). Relevant parts of this interface are shown below:

  ISimpleDSLCompiler = interface ['{7CF78EC7-023B-4571-B310-42873921B0BC}']
    // getter and setter methods omitted
    function  Codegen: boolean;
    function  Compile(const code: string): boolean;
    function  Parse(const code: string): boolean;
    property AST: ISimpleDSLAST read GetAST;
    property Code: ISimpleDSLProgram read GetCode;
    property ASTFactory: TSimpleDSLASTFactory read GetASTFactory write SetASTFactory;
    property CodegenFactory: TSimpleDSLCodegenFactory read GetCodegenFactory write SetCodegenFactory;
    property ParserFactory: TSimpleDSLParserFactory read GetParserFactory write SetParserFactory;
    property TokenizerFactory: TSimpleDSLTokenizerFactory read GetTokenizerFactory write SetTokenizerFactory;
  end;

The framework exposes functions to parse the input (Parse), generate the executable (Codegen), and do both in one step (Compile), but it has no idea how to do any of that. All functionality is implemented externally, in tokenizer, parser, and code generator engines that are created by factory methods (TokenizerFactory etc.).

To keep configuration simple, TSimpleDSLCompiler.Create sets up default factories which create the typical engine classes. If you want to plug in your own implementation of a specific step, you can do that by setting the appropriate XXXFactory property before calling any of the functions in this interface. We will use this functionality to implement an "AST Dumper" in the next instalment of this blog.
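To make the factory idea concrete, here is a minimal, self-contained sketch of the same pattern. It does not use the SimpleDSL units: IEngine, TEngineFactory, TDefaultEngine, TCustomEngine and TMiniCompiler are hypothetical stand-ins invented for this illustration, and the factory is assumed to be an anonymous function type (the real TSimpleDSLxxxFactory types may be declared differently).

```pascal
program FactoryDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

type
  // Hypothetical stand-in for an engine interface (tokenizer, parser ...)
  IEngine = interface
    function Name: string;
  end;

  // Assumption: a factory is a function reference returning the engine
  TEngineFactory = reference to function: IEngine;

  TDefaultEngine = class(TInterfacedObject, IEngine)
    function Name: string;
  end;

  TCustomEngine = class(TInterfacedObject, IEngine)
    function Name: string;
  end;

  // Hypothetical miniature of the compiler: it only knows the factory
  TMiniCompiler = class
  strict private
    FEngineFactory: TEngineFactory;
  public
    constructor Create;
    function Run: string;
    property EngineFactory: TEngineFactory
      read FEngineFactory write FEngineFactory;
  end;

function TDefaultEngine.Name: string;
begin
  Result := 'default';
end;

function TCustomEngine.Name: string;
begin
  Result := 'custom';
end;

constructor TMiniCompiler.Create;
begin
  inherited Create;
  // The default factory is installed in the constructor
  FEngineFactory :=
    function: IEngine
    begin
      Result := TDefaultEngine.Create;
    end;
end;

function TMiniCompiler.Run: string;
begin
  // A fresh engine is created on every call, via whichever factory is set
  Result := FEngineFactory().Name;
end;

var
  compiler: TMiniCompiler;
begin
  compiler := TMiniCompiler.Create;
  try
    Writeln(compiler.Run); // prints "default"
    // Plug in a different implementation before the next call
    compiler.EngineFactory :=
      function: IEngine
      begin
        Result := TCustomEngine.Create;
      end;
    Writeln(compiler.Run); // prints "custom"
  finally
    compiler.Free;
  end;
end.
```

Because the engine is created only when it is needed, swapping the factory affects every later call without touching the compiler class itself.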

constructor TSimpleDSLCompiler.Create;
begin
  inherited Create;
  ASTFactory := CreateSimpleDSLAST;
  CodegenFactory := CreateSimpleDSLCodegen;
  ParserFactory := CreateSimpleDSLParser;
  TokenizerFactory := CreateSimpleDSLTokenizer;
end;

Let's take a quick look at all three API functions. The most important, Compile, does nothing besides calling the parser and, if the code was parsed correctly, the code generator. Nothing special here.

function TSimpleDSLCompiler.Compile(const code: string): boolean;
begin
  Result := Parse(code);
  if Result then
    Result := Codegen;
end;

The second one, Parse, creates the parser and tokenizer engines, prepares the AST, and calls the parser's Parse method. Most of it is just plumbing, with all of the real work being done in parser.Parse.

function TSimpleDSLCompiler.Parse(const code: string): boolean;
var
  parser   : ISimpleDSLParser;
  tokenizer: ISimpleDSLTokenizer;
begin
  LastError := '';
  parser := ParserFactory();
  tokenizer := TokenizerFactory();
  FAST := ASTFactory();
  Result := parser.Parse(code, tokenizer, FAST);
  if not Result then begin
    // drop the partial AST and copy the error text from the parser
    FAST := nil;
    LastError := (parser as ISimpleDSLErrorInfo).ErrorInfo;
  end;
end;
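The cast to ISimpleDSLErrorInfo works because an object can implement more than one interface. The following self-contained sketch shows that mechanism in isolation. IMiniParser, IMiniErrorInfo and TMiniParser are hypothetical names, and the GUIDs were made up for this example only; it is not how the real parser is implemented.

```pascal
program ErrorInfoDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

type
  // Hypothetical engine interface
  IMiniParser = interface ['{6F0C1B5A-3C2E-4D7B-9A41-0E5D2B7C8F11}']
    function Parse(const code: string): boolean;
  end;

  // Hypothetical error reporting interface; "as" needs a GUID on the target
  IMiniErrorInfo = interface ['{A2D4E6F8-1B3C-4D5E-8F70-9A1B2C3D4E5F}']
    function ErrorInfo: string;
  end;

  // One object implements both interfaces
  TMiniParser = class(TInterfacedObject, IMiniParser, IMiniErrorInfo)
  strict private
    FLastError: string;
  public
    function Parse(const code: string): boolean;
    function ErrorInfo: string;
  end;

function TMiniParser.Parse(const code: string): boolean;
begin
  // Toy rule for the sketch: only empty input is an error
  Result := Trim(code) <> '';
  if Result then
    FLastError := ''
  else
    FLastError := 'Empty input';
end;

function TMiniParser.ErrorInfo: string;
begin
  Result := FLastError;
end;

var
  parser: IMiniParser;
begin
  parser := TMiniParser.Create;
  if not parser.Parse('') then
    // Query the same object for its second interface
    Writeln('Error: ' + (parser as IMiniErrorInfo).ErrorInfo);
end.
```

Keeping error reporting in a separate interface means the engine interface stays small, and callers that do not care about error text never have to see it.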

The last one, Codegen, is equally simple. After some error checking, it creates a code generation engine and calls its Generate function, passing in the AST. We have not examined the code generator yet, so for now it will suffice to say that a code generator exposes one function, Generate, which converts ISimpleDSLAST into ISimpleDSLProgram.

function TSimpleDSLCompiler.Codegen: boolean;
var
  codegen  : ISimpleDSLCodegen;
begin
  LastError := '';
  if not assigned(FAST) then
    Exit(SetError('Nothing to do'))
  else begin
    codegen := CodegenFactory();
    Result := codegen.Generate(FAST, FCode);
    if not Result then begin
      FCode := nil;
      LastError := (codegen as ISimpleDSLErrorInfo).ErrorInfo;
    end;
  end;
end;

All this allows us to call the compiler in a very simple manner:

compiler := CreateSimpleDSLCompiler;
if not compiler.Compile(CMultiProcCode) then
  Writeln('Compilation/codegen error: ' + (compiler as ISimpleDSLErrorInfo).ErrorInfo);

In the next instalment we'll see how we can dump the generated AST into a textual form by replacing the CodegenFactory.

