Tuesday, March 26, 2013

Using Generics to Manipulate Enumerated Types

I’d like to present a short class written by my colleague at work. Why? Because I just love how it uses the generics syntax to simplify some operations on enumerated types.

Like me, he hated the hoops one must jump through to convert an instance of an enumerated type to a string, to check whether an integer read from some storage is a valid representation of an enumerated value, and to enumerate (no pun intended) over such a type. Unlike me, he did something about it and wrote a Spring-inspired Range<T>.

The code fragment below represents a simplified version of this type. I’ve cleaned out various directives (static, overload) and private methods to make it less visually noisy. A link to the full source can be found at the bottom of this post.

type
  Range<T: record> = record
  public
    class function Clip(const value: Integer): T;
    class function Clip(const value: T): T;
    class function Ensure(const value: Integer; const min, max: T): T;
    class function Ensure(const value, min, max: T): T;
    class function FromInt(const value: Integer): T;
    class function Enum: RangeEnum<T>;
    class function GetValueOrDefault(const value: Integer): T;
    class function IsValid(const value: Integer): Boolean;
    class function IsValid(const value: T): Boolean;
    class function Max: T;
    class function Min: T;
    class function ToInt(const value: T): Integer;
    class function ToString(const value: T): string;
  end;

The usage is best explained through examples and to do that we need an enumerated type. Let’s say we have a very simple type TMyEnum and a variable of this type.

type
  TMyEnum = (enum1, enum2, enum3);

var
  e: TMyEnum;

Min and Max return exactly the same information as the Low() and High() functions. They are mainly used internally, because the compiler does not accept Low(T) and High(T) on a generic type parameter inside the Range<T> implementation. Range<T> uses RTTI to get the min/max value. Because of that, it doesn’t support enumerated types with explicit values (TWontWork = (enum1 = 1, enum2 = 2, enum3 = 4)), which don’t contain runtime type information.

Example: Range<TMyEnum>.Min = enum1. Range<TMyEnum>.Max = enum3.
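To make the RTTI lookup more concrete, here is a minimal sketch of how the bounds of an enumerated type could be read from its type data. EnumBounds<T>, MinOrd and MaxOrd are names I made up for illustration; this is not the Range<T> source and the real implementation may differ.

```pascal
program EnumBoundsSketch;

{$APPTYPE CONSOLE}

uses
  System.TypInfo;

type
  TMyEnum = (enum1, enum2, enum3);

  // EnumBounds<T> is a hypothetical helper, not the Range<T> source.
  EnumBounds<T: record> = record
  public
    class function MinOrd: Integer; static;
    class function MaxOrd: Integer; static;
  end;

class function EnumBounds<T>.MinOrd: Integer;
begin
  // The type data of an ordinal type stores its lowest ordinal value.
  Result := GetTypeData(TypeInfo(T))^.MinValue;
end;

class function EnumBounds<T>.MaxOrd: Integer;
begin
  // ... and its highest ordinal value.
  Result := GetTypeData(TypeInfo(T))^.MaxValue;
end;

begin
  Writeln(EnumBounds<TMyEnum>.MinOrd); // Ord(enum1)
  Writeln(EnumBounds<TMyEnum>.MaxOrd); // Ord(enum3)
end.
```

The sketch returns ordinals rather than values of type T, which keeps it free of any cast from Integer to the generic parameter.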

FromInt and ToInt are conversion functions that work exactly the same as hard casting.

Example: Range<TMyEnum>.FromInt(2) = TMyEnum(2) = enum3.

ToString converts an instance of an enumerated type into its string representation. It uses Delphi’s TValue to do the conversion.

Example: e := enum2; Range<TMyEnum>.ToString(e) = “enum2”.
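The three conversions can also be tried outside of Range<T>. The following is only a sketch built on TValue from System.Rtti and GetEnumName from System.TypInfo; whether Range<T> performs FromInt and ToInt this way is an assumption on my part, since the post only states that ToString goes through TValue.

```pascal
program EnumConvertSketch;

{$APPTYPE CONSOLE}

uses
  System.TypInfo, System.Rtti;

type
  TMyEnum = (enum1, enum2, enum3);

var
  e: TMyEnum;

begin
  // Integer -> enum: the RTTI counterpart of the hard cast TMyEnum(2).
  e := TValue.FromOrdinal(TypeInfo(TMyEnum), 2).AsType<TMyEnum>;

  // Enum -> integer: the counterpart of Ord(e).
  Writeln(TValue.From<TMyEnum>(e).AsOrdinal);

  // Enum -> name, once through TValue and once through GetEnumName.
  Writeln(TValue.From<TMyEnum>(e).ToString);
  Writeln(GetEnumName(TypeInfo(TMyEnum), Ord(e)));
end.
```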

Ensure clips the input value (either presented as an integer or as a typed value) into some range of values.

Example: Range<TMyEnum>.Ensure(enum3, enum1, enum2) = enum2.

Clip clips the input value to the full range of valid values for that type. It is great for validating input.

Example: Range<TMyEnum>.Clip(3) = enum3.

IsValid verifies that the value lies inside the range of valid values. GetValueOrDefault returns the value if it is valid or Default(T) if it is not.
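As a rough illustration of validation and clipping, the sketch below checks an integer against the RTTI bounds and clamps it with EnsureRange from System.Math. EnumCheck<T> and its method bodies are my own assumptions, not the code from Range<T>.

```pascal
program EnumClipSketch;

{$APPTYPE CONSOLE}

uses
  System.TypInfo, System.Math, System.Rtti;

type
  TMyEnum = (enum1, enum2, enum3);

  // EnumCheck<T> is a hypothetical helper; names and bodies are assumptions.
  EnumCheck<T: record> = record
  public
    class function IsValid(const value: Integer): Boolean; static;
    class function Clip(const value: Integer): T; static;
    class function GetValueOrDefault(const value: Integer): T; static;
  end;

class function EnumCheck<T>.IsValid(const value: Integer): Boolean;
var
  data: PTypeData;
begin
  data := GetTypeData(TypeInfo(T));
  Result := (value >= data^.MinValue) and (value <= data^.MaxValue);
end;

class function EnumCheck<T>.Clip(const value: Integer): T;
var
  data: PTypeData;
begin
  data := GetTypeData(TypeInfo(T));
  // Clamp the ordinal into [MinValue, MaxValue], then convert it to T.
  Result := TValue.FromOrdinal(TypeInfo(T),
    EnsureRange(value, data^.MinValue, data^.MaxValue)).AsType<T>;
end;

class function EnumCheck<T>.GetValueOrDefault(const value: Integer): T;
begin
  if IsValid(value) then
    Result := TValue.FromOrdinal(TypeInfo(T), value).AsType<T>
  else
    Result := Default(T);
end;

begin
  Writeln(EnumCheck<TMyEnum>.IsValid(3));                // 3 is out of range
  Writeln(Ord(EnumCheck<TMyEnum>.Clip(3)));              // Ord(enum3)
  Writeln(Ord(EnumCheck<TMyEnum>.GetValueOrDefault(7))); // Ord(Default(TMyEnum))
end.
```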

My favorite (besides ToString and Clip) is Enum, which returns an enumerator for the type.

Example: for e in Range<TMyEnum>.Enum do is functionally equivalent to for e := Low(TMyEnum) to High(TMyEnum) do.
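For readers curious how a for..in loop over an enumerated type can be made to work, here is one possible shape of such an enumerator. TEnumWalker<T> and its All function are hypothetical names; the RangeEnum<T> used by Range<T> may be built quite differently.

```pascal
program EnumLoopSketch;

{$APPTYPE CONSOLE}

uses
  System.TypInfo, System.Rtti;

type
  TMyEnum = (enum1, enum2, enum3);

  // Hypothetical enumerator record; not the RangeEnum<T> from the post.
  TEnumWalker<T: record> = record
  private
    FIndex, FLast: Integer;
    function GetCurrent: T;
  public
    class function All: TEnumWalker<T>; static;
    function GetEnumerator: TEnumWalker<T>;
    function MoveNext: Boolean;
    property Current: T read GetCurrent;
  end;

class function TEnumWalker<T>.All: TEnumWalker<T>;
var
  data: PTypeData;
begin
  data := GetTypeData(TypeInfo(T));
  Result.FIndex := data^.MinValue - 1; // the first MoveNext steps onto MinValue
  Result.FLast := data^.MaxValue;
end;

function TEnumWalker<T>.GetEnumerator: TEnumWalker<T>;
begin
  Result := Self; // the record serves as its own enumerator
end;

function TEnumWalker<T>.MoveNext: Boolean;
begin
  Inc(FIndex);
  Result := FIndex <= FLast;
end;

function TEnumWalker<T>.GetCurrent: T;
begin
  Result := TValue.FromOrdinal(TypeInfo(T), FIndex).AsType<T>;
end;

procedure PrintAll;
var
  e: TMyEnum;
begin
  for e in TEnumWalker<TMyEnum>.All do
    Writeln(GetEnumName(TypeInfo(TMyEnum), Ord(e)));
end;

begin
  PrintAll;
end.
```

The class function is there because for..in needs an expression that yields a value with a GetEnumerator method; a bare type name is not enough.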

If you want to play with Range<T> or just explore the implementation, you can download it from here.

21 comments:

  1. Honestly, it is hard to figure how all those code is preferred in comparison to good old plain pascal, using low(TMyEnum)..high(TMyEnum) code, and direct RTTI call.

    Is "Range<TMyEnum>.Min" better than "low(TMyEnum)"?

    Is "for e in Range<TMyEnum>.Enum" more explicit than "for e := Low(TMyEnum) to High(TMyEnum)"?

    {$troll on}
    Just for the fun to have slower execution, and bigger generated code size?
    ;)
    {$troll off}

    Such syntax sugar could make sense...

    But my point is that I like so much the way pascal handles enumerates - much better than C#, for instance. I definitively miss declaration of constants like "EnumName: array[TMyEnum] of string". And in this case, you can use "for e := low(EnumName) to high(EnumName) do writeln(EnumName[e])" which is better than "for e in Range<TMyEnum>.Enum" in this case.

    I prefer using "pascalish" syntax most of the time, and rely on generics only if it is worth it.

    For instance, object pascal class variables TMyClassClass = class of TMyClass are quite powerful, and avoid using generics in most cases. It saves a lot of generated code size, when you use some methods with a lot of diverse classes: TMyClassClass will let the code be generated only once, whereas generics will duplicate almost the same asm, just for nothing. This is due to lack of optimization of the Delphi compiler, of course - but if I can avoid it...

    Another problem with generics is that the Delphi syntax is not the same as FPC's... so if you want to port your code to a non supported target, you will need to spam your libraries with $ifdefs...

    Of course, I'm one of those guys who enjoy reading the Linux or SQlite3 source code, and know how asm looks like. Perhaps I'm too old for all this. Or I'll have to switch to the embedded world. ;)

    Replies
    1. Arnaud, mostly I agree with you. As I've said before - this is not my code and I would mostly use Enum, ToString and Clip. Other methods I don't find that much useful.

      As for the enumerator - sure, I prefer the for..in syntax every time.

    2. this old-school for loop sucks :-)

      http://qc.embarcadero.com/wc/qcmain.aspx?d=106212

    3. >Honestly, it is hard to figure how all those code is preferred in comparison to
      >good old plain pascal,
      >Is "Range<TMyEnum>.Min" better than "low(TMyEnum)"?

      Yes. One's modern syntax and less LISPy. ;-)

      >Is "for e in Range<TMyEnum>.Enum" more explicit than "for e := Low(TMyEnum) to
      >High(TMyEnum)"?

      Yes, yes, a thousand times yes! :-) Iteration is something we gained a taste of many years ago yet never realized its full potential. It's not just the for... in loop (which should make the for...to loop a deprecated construct because you don't need both). Put iterators everywhere and we'd realize its benefits. Give us a suite of functions that work with anything iterable and we'd have a lot of power now only reserved for the functional or dynamic languages.

      Imagine iterating through database records... imagine a sort routine that sorts anything iterable... imagine min/max functions that give the min/max of anything iterable (we've already got three min/max functions and still can't do that). Then it's a short jump to list comprehensions.

      It's not the syntax that's the gain; it's the ability to work with anything iterable (and lots and lots of code can be expressed that way) that's the benefit. Unfortunately, EMBT hasn't shown much interest in finishing what they started, refactoring the standard library and putting iterators everywhere.

      >But my point is that I like so much the way pascal handles enumerates - much
      >better than C#,

      We can't have duplicate names... except we can, but you need to add a compiler flag. We can't use RTTI with custom assigned values.

      Code like this is ungainly:

      for i := Ord(Low(TyourEnumType)) to Ord(High(TyourEnumType)) do
        Combobox1.Items.Add(GetEnumName(TypeInfo(TyourEnumType), i));

      DWScript has it right (as usual):

      type TEnum = (enOne, enTwo, enThree);

      PrintLn(TEnum.enOne.Name);
      PrintLn(enTwo.Name);

      for var e in TEnum do
        PrintLn(e.Name);

      Python's finally going to be getting enums in the forthcoming 3.4 version; the syntax is very similar. Treating enums as a class/objects has its advantages.


      >for instance. I definitively miss declaration of constants like "EnumName:
      >array[TMyEnum] of string".

      All you need is an enumerated type that can be treated as integers. Or just a bunch of constants/vars that map to integers.

      >And in this case, you can use "for e := low(EnumName) to high(EnumName) do
      >writeln(EnumName[e])" which is better than "for e in Range<TMyEnum>.Enum" in this case.

      Neither seems as good as

      For item in EnumName:
      WriteLn(item)

      :-) If you just want to loop over the items in an array the whole enum thing doesn't serve any point.

      >For instance, object pascal class variables TMyClassClass = class of TMyClass
      >are quite powerful,

      Trying to read up on this made my head hurt. :-) It seems rather than being powerful it's a workaround due to Delphi not having types as first class objects.

  2. Why did you call it "Range"?

  3. Why do you write
    Range< T > = class
    class function ToInt(I: T): Integer;
    end;
    ??
    I would prefer
    MyRange = class
    class function ToInt< T >(I: T): Integer;
    end;
    Compare
    Range< TMyEnum >.ToInt(enOne)
    MyRange.ToInt(enOne)

    Replies
    1. Design choice, I don't know exactly why (not my code). This way you can declare

      type TMyRange = Range<TMyEnum>;

      BTW, your second example should be

      MyRange.ToInt<TMyEnum>(enOne);

  4. These are enumerated types rather than enumerable types. There's a huge difference.

    I personally don't think the record constraint adds much. You still have to add a runtime check that the type is an enum. I mean, Range.Min isn't meaningful.

    I use a class named TEnumPreference<T> in a similar vein to this. It represents a user preference stored to, for example, registry. T is replaced with an enumerated type at instantiation. It's a nice way to avoid a pile of casting and ord.

    I think Range is a poor choice of name.

    Another of my faves is this class:

    TArray = class(Generics.Collections.TArray)
    private
    class function Comparison(SortType: TSortType): TComparison; static;
    class function Comparer(const Comparison: TComparison): IComparer; static;
    public
    class procedure Swap(var Left, Right: T); static;
    class procedure Reverse(var Values: array of T); static;
    class function Reversed(const Values: array of T): TArray; static;
    class function Contains(const Values: array of T; const Item: T; out ItemIndex: Integer): Boolean; overload; static;
    class function Contains(const Values: array of T; const Item: T): Boolean; overload; static;
    class function IndexOf(const Values: array of T; const Item: T): Integer; static;
    class function Sorted(var Values: array of T; SortType: TSortType; Index, Count: Integer): Boolean; overload; static;
    class function Sorted(var Values: array of T; SortType: TSortType): Boolean; overload; static;
    class function Sorted(var Values: array of T; const Comparison: TComparison; Index, Count: Integer): Boolean; overload; static;
    class function Sorted(var Values: array of T; const Comparison: TComparison): Boolean; overload; static;
    class function Sorted(GetValue: TFunc; const Comparison: TComparison; Index, Count: Integer): Boolean; overload; static;
    class procedure Sort(var Values: array of T; SortType: TSortType; Index, Count: Integer); overload; static;
    class procedure Sort(var Values: array of T; SortType: TSortType); overload; static;
    class procedure Sort(var Values: array of T; const Comparison: TComparison; Index, Count: Integer); overload; static;
    class procedure Sort(var Values: array of T; const Comparison: TComparison); overload; static;
    class function Copy(const Source: array of T; Index, Count: Integer): TArray; overload; static;
    class function Copy(const Source: array of T): TArray; overload; static;
    class function Concatenated(const Source1, Source2: array of T): TArray; overload; static;
    class function Concatenated(const Source: array of TArray): TArray; overload; static;
    class procedure Initialise(var Values: array of T; const Value: T); static;
    class procedure Zeroise(var Values: array of T); static;
    class function GetHashCode(const Values: array of T): Integer; overload; static;
    class function GetHashCode(Values: Pointer; Count: Integer): Integer; overload; static;
    end;

    Replies
    1. "These are enumerated types rather than enumerable types."

      Oooops, what a stupid mistake. Fixed! Thanks!

      "I personally don't think the record constraint adds."

      Presumably the 'record' constraint is here so that you can only use types with RTTI for the Range<> parameter. At least that's how it was explained to me.

      "I think Range is a poor choice of name."

      Accepting better suggestions ...

    2. "Presumably the 'record' constraint is here so that you can only use types with RTTI for the Range<> parameter."

      Actually that's not it. I've only just read this enough to take it on board. The record constraint applies a type constraint to the generic parameter that ensures it is a value type. So, integer types, character types, real types, enumerated types, pointer types, sets, static arrays, procedural types. Er, I think that covers it. Documentation here: http://docwiki.embarcadero.com/RADStudio/en/Constraints_in_Generics#Record_Constraint

  5. I suppose that the record constraint stops some obvious mistakes at compile time. But you'll still need some runtime checks to stop instantiation with types that are not enumerations. I actually have a similar type in my code. It looks like this:

    TEnumeration = record
    strict private
    class function TypeInfo: PTypeInfo; inline; static;
    class function TypeData: PTypeData; inline; static;
    public
    class function IsEnumeration: Boolean; static;
    class function ToOrdinal(Enum: T): Integer; inline; static;
    class function FromOrdinal(Value: Integer): T; inline; static;
    class function MinValue: Integer; inline; static;
    class function MaxValue: Integer; inline; static;
    class function InRange(Value: Integer): Boolean; inline; static;
    class function EnsureRange(Value: Integer): Integer; inline; static;
    end;

    So, there's my suggestion for a name, `TEnumeration`. Or even `Enumeration` to go against the age old grain!

  6. Oh, your blog code ate my generic syntax!

  7. those classes lack GetEnumerator method :

    http://qc.embarcadero.com/wc/qcmain.aspx?d=106212

  8. I wonder if those classes could also support "set of T" type related functions, ToString and FromString.

    Cause Delphi stock functions of RTTI are buggy

  9. @Dmitry No GetEnumerator in my TEnumeration because the syntax is too messy. You cannot put the `GetEnumerator` on TEnumeration. You need another layer. The loop would need to be

    for enum in TEnumeration.Enumerator do

    Make me yak!

  10. Like C#, Delphi does not have a constraint for enumeration types:
    In C# I used the clumsy "where T : struct, IComparable, IFormattable, IConvertible" and I'm still thinking "I should have left those out":
    http://bo.codeplex.com/SourceControl/latest#.NET/C#/Library/bo.Extensions/bo.Extensions..EnumExtensions.cs

  11. I'd be interested in that code (EnumHelper.pas) but it's not available (404). Does anyone have a copy?

  12. Hello,
    The link to the source is broken. Do you have a valid one please?
    Regards Peter

    Replies
    1. That source is lost in the mists of time. I have created a new one based on our current code: https://www.dropbox.com/s/cixpqemiaeqdnbt/EnumHelper.pas?dl=1. You will also need https://www.dropbox.com/s/l59l9931amvc1n6/System.Reflection.pas?dl=1.

  13. Legend! Thanks so much Primoz
