Tuesday, November 02, 2010

Blaise Pascal Magazine Rerun #1: Introduction to Enumerators

This article was originally written for the Blaise Pascal Magazine and was published in Issue #3.

Hello, everybody. My name is Primoz and I’m addicted to enumerators. Don’t worry, you haven’t stumble into a middle of the Anonymous Enumerator Addicts meeting. It’s just that enumerators are one of my favourite topics.

In the previous issue, Hallvard Vassbotn covered traditional for loops in length. Since the beginning of the Pascal language, for statement represented one of three ways of creating loops (other two are, of course, while and repeat-until). Things have stayed that way for quite some time. Only in Delphi 2005 have we, Win32 programmers, acquired new looping construct – for-in. This is the article on this extremely useful addition to the Delphi language. (Unless it is called Object Pascal again. I’m never sure.)

Iteration over containers

The new kind of iteration was introduced to simplify use of containers supporting IEnumerable interface in the .NET world. Luckily, Borland people (at that moment Delphi was still developed by Borland), decided to introduce this concept to the Win32 compiler too. Thus, the for-in statement was born.

You may have some problems finding documentation for the for-in statement in the current help browser. Either search for “Declarations and Statements” or enter this address in the “URL” field of the help browser: ms-help://borland.bds5/devcommon/declarationsandstatements_xml.html (Delphi 2007 only). For loops are covered near the end of that help topic. If you don’t want to read official documentation, don’t worry. You’ll learn everything here (and in future installments of the Enumerators series).

The simplest way to understand for-in loops is to look at the example. One of the traditional uses of the classical for loop is to iterate over TStringList.

procedure ExamineList(S: TStringList);
var
  i: integer;
begin
  for i := 0 to S.Count - 1 do
    DoSomethingWith(S[i]);
end;

What is going on here? We have declared a counter named i. For statement will assign values from 0 to “number of elements in S minus 1” to that counter. For each value of i, i-th element of the string list will be accessed and will be passed to the DoSomethingWith procedure.

I know that this is fairly basic stuff but please keep with me for a moment. I have a point to make.

This kind of code is so familiar that we never even stop to think about it. So let’s take a deep breath, pause for the moment and look at the code again. What do we see? What assumptions are hidden there?

Firstly, we have to know in advance that TStringList can be accessed as an array. Secondly, we have to know that indices into that pseudo-array start with 0 and end with “number of elements minus 1”. Thirdly we have to know that number of elements in a string list can be accessed by calling the Count method.

It’s true that there are cases when we really have to know all that and that the knowledge itself is useful, but is it really required? Even more, doesn’t all that code obscure our original intent? What is the intent of the code above? To pass all elements of the string list to the DoSomethingWith function! That is the intent that we can much more naturally express with the for-in loop.

procedure ExamineList(S: TStringList);
var
  el: string;
begin
  for el in S do
    DoSomethingWith(el);
end;

Now the intention is clear. For each element in the string list, execute DoSomethingWith with the element as a parameter. There is an important distinction from the previous example. Instead of telling the compiler what to do, we’re telling it what we want to achieve. To me, this is an important change of viewpoint.

101 use of the for-in loop

The for-in loop can iterate over some fairly different structures. Iteration variable can be of many different types depending on the container we are iterating over. (In the previous example, iteration variable was of type string and container was TStringList.) For example, we can iterate over a string, in which case the iteration variable will be set to all characters in this string in turn. The following function makes a copy of a source string in a very inefficient manner.

function StringCopySlow(S: string): string;
var
  ch: char;
begin
  Result := '';
  for ch in S do
    Result := Result + ch;
end;

Also supported are all kind arrays, from single-dimensional to dynamic. Following code finds a minimum element in the array.

function Minimum(param: array of real): real;
var
  r: real;
begin
  Assert(High(param) >= 0);
  Result := param[0];
  for r in param do
    if r < Result then
      Result := r;
end;

Again, the intent of this code is clear – go over the elements of the array and if an element is smaller than the current minimum, set the minimum value to the value of that element.

For-in loop can also work on a set (“set of” type declaration), but that’s something that you’ll rarely see in practice, although it can be really useful. The following code demonstrates the technique, but it is fairly contrived and not very realistic.

type
  TMyStuff = (stOne, stFortyTwo);
  TMySet = set of TMyStuff;

procedure TheAnswer;
var
  st: TMyStuff;
  ms: TMySet;
begin
  ms := [];
  for st := stOne to stFortyTwo do
    Include(ms, st);
  Exclude(ms, stOne);
  for st in ms do
    DoSomethingWith(st);
end;

This code initializes ms variable to an empty string and then uses standard for loop to add all elements of TMyStuff to it. It would be much nicer if compiler allowed us to write “for st in TMyStuff” but, alas, that is not possible. It then excludes stOne from the set (leaving the ms variable set to [stFortyTwo]) and calls DoSomethingWith for all elements in the ms (namely, for stFortyTwo).

All those uses of for-in loop are great, but its greatest power lies in its openness. We can add iteration support to every class, interface or record. In fact, I’ve already demonstrated that in my first example. TStringList is a class that is extended in such way.

Enumerators and the VCL

Simple search of “GetEnumerator” in the VCL sources (later you’ll see why that search expression) shows many places where for-in support is implemented. We’ve already seen that TStringList supports for-in loops. In the same unit (Classes.pas) we also find support for TList, TInterfaceList, TCollection, and TComponent. The latter is very interesting as you can iterate over all components in a form (or any other descendent of TComponent) in a very simple manner.

procedure TForm1.FormCreate(Sender: TObject);
var
  comp: TComponent;
begin
  for comp in Self do
    if comp is TLabel then
      TLabel(comp).Font.Name := 'Calibri';
end;

The other important source of for-in supporting objects is ComCtrls.pas with support for iteration over nodes in a tree view, items in a TListItem and buttons in a toolbar. You can also iterate over menu items in a menu (Menus.pas), actions in an action list (ActnList.pas) and fields in a TFields container (DB.pas). There are also some other classes of lesser importance that have enumeration support.

To some, this may not seem very extensive but as you’ll see next month, it is very simple to add for-in support to existing VCL classes. Therefore, we can easily add what Borland/CodeGear left out.

Rolling up your own

Before we start messing with the VCL, let’s take a look at how to extend our own classes with the for-in support. Suppose we have a very simple class that stores TFont objects.

type
  TFontList = class
  private
    FStorage: TObjectList;
  public
    constructor Create;
    destructor Destroy; override;
    procedure Add(font: TFont);
  end;

constructor TFontList.Create;
begin
  inherited Create;
  FStorage := TObjectList.Create(false);
end;

destructor TFontList.Destroy;
begin
  FreeAndNil(FStorage);
  inherited Destroy;
end;

procedure TFontList.Add(font: TFont);
begin
  FStorage.Add(font);
end;

With that code, we can create TFontList container and add TFont objects to it, but we cannot access those stored objects. Instead of coding standard pseudo-array interface (Count method and default array property) we can add support for for-in iteration.

To use the for-in loop on a class, this class must contain a public method called GetEnumerator. This method must return a class, interface or record.

The class, interface, or record returned by GetEnumerator must contain a public method MoveNext that returns Boolean and a public read-only property called Current. The type of this property is the type of iterator variable used in the for-in loop. In our case, this will be TFont.

In our case, modified TFontList declaration will be.

type
  TFontList = class;

  TFontListEnumerator = class
  private
    FIndex: integer;
    FList: TFontList;
  public
    constructor Create(aList: TFontList);
    function GetCurrent: TFont;
    function MoveNext: boolean;
    property Current: TFont read GetCurrent;
  end;

  TFontList = class
  private
    FStorage: TObjectList;
  public
    constructor Create;
    destructor Destroy; override;
    procedure Add(font: TFont);
    function GetEnumerator: TFontListEnumerator;
  end;

As you can see, TFontList returns TFontListEnumerator and TFontListEnumerator declares a field of the TFontList type. To break this circular relationship, we have to tell the compiler that TFontList is a class that will be declared later.

TFontList constructor, destructor and Add method are kept unchanged and the new GetEnumerator method is very simple. It just creates new object of the TFontListEnumerator type and passes Self to it. This object will be automatically destroyed when it is no longer used.

function TFontList.GetEnumerator: TFontListEnumerator;
begin
  Result := TFontListEnumerator.Create(Self);
end;

TFontListEnumerator is slightly more complicated. Constructor stores the reference to the owner (TFontList) and initializes iteration index. GetCurrent uses the current value of this index to fetch TFont from TFontList’s internal storage and MoveNext tries to move to the next place in this internal storage. When last element has been reached and processed, MoveNext returns False.

constructor TFontListEnumerator.Create(aList: TFontList);
begin
  FIndex := -1;
  FList := aList;
end;

function TFontListEnumerator.GetCurrent: TFont;
begin
  Result := TFont(FList.FStorage[FIndex]);
end;

function TFontListEnumerator.MoveNext: boolean;
begin
  Result := FIndex < (FList.FStorage.Count - 1);
  if Result then
    Inc(FIndex);
end;

Please note that this code accesses TFontList internal storage directly and that this is not a recommended way. I’ve only done it here to keep the example short. If at all possible, access enumerated class with public or protected methods. In most cases you’ll already have them ready as you’ll be adding for-in support to traditionally written classes.

There’s one thing you have to keep in mind when writing an enumerator and that is that the MoveNext is called before the first GetCurrent call. That’s why we have initialized FIndex to -1 and not to 0. You can assume that the compiler-generated loop looks very similar to this pseudocode.

enumerator := list.GetEnumerator;
while enumerator.MoveNext do
  do something with enumerator.Current;
enumerator.Free;

With the addition of the enumerator, you can now access items in the TFontList.

procedure TForm1.FormCreate(Sender: TObject);
var
  fl: TFontList;
  font: TFont;
begin
  fl := TFontList.Create;
  try
    fl.Add(Self.font);
    for font in fl do
      Caption := Caption + '/' + font.Name;
  finally
   FreeAndNil(fl);
 end;
end;

This code is equivalent to the fragment below.

procedure TForm1.FormCreate(Sender: TObject);
var
  enumerator: TFontListEnumerator;
  fl: TFontList;
  font: TFont;
begin
  fl := TFontList.Create;
  try
    fl.Add(Self.font);
    enumerator := fl.GetEnumerator;
    while enumerator.MoveNext do
      Caption := Caption + '/' + (enumerator.Current).Name;
    enumerator.Free;
  finally
    FreeAndNil(fl);
  end;
end;

This is not a pseudocode, but a completely valid and compilable Delphi code. As you can see, you can access the enumerator by yourself, without the help of the for-in loop, as it is completely standard Delphi class.

Enough for today. In the next issue I’ll discuss advanced enumeration topics – how to implement more than one enumerator for a class, how to use enumerators with parameters, and how to enhance VCL with custom enumerators.

2 comments:

  1. You forgot one way to make loops ;)

    GOTO

    :D

    ReplyDelete
  2. This option should be forgotten forever

    ReplyDelete