Saturday, January 10, 2015

Implementing Record Assignment Operator [2]

Yesterday I hinted at having a working (and easy to use) solution which allows you to detect a record assignment (or copying, if you want) and to access both records (left and right part of the assignment operation). You can also modify record data when an assignment is detected. I also mentioned that my approach is very ugly, unstable, likely to cause problems in future Delphis and so on. You should read this article for the technical details, not to really use my code in an application. You have been warned!

Before I jump to the hardcore Delphi stuff, I’d like to mention a few different approaches that various smart people have already implemented. Most of them slipped below my radar in the past – and I’m guessing this may be the case for you too – so I’m listing them here in case you want to do more research on the topic.

Alternatives

The oldest solution – and the only one I was aware of – was written by a fellow Slovenian Delphi programmer Iztok Kacin. On his blog he posted a series of articles about writing a very fast clone of a TValue record. He discovered that the biggest slowdown comes from a very slow CopyRecord method (System.pas) and he devised a way to replace CopyRecord for his record with a faster version with runtime patching. As CopyRecord is also called during record assignment, his approach can be easily used to implement an assignment operator.

Another solution comes from Stefan Glienke. His DSharp library (wonderful stuff, much recommended!) contains a DSharp.Core.CopyOperator unit which I was not aware of although I’m a regular DSharp user. His code looks quite clean and doesn’t depend on any extremely low-level hacks like patching the exe. I’m still trying to understand it so I won’t go into more detail now. Expect an article on that topic soon …

A third solution (or at least a hint about it) was posted as a comment to the previous post. I’m not yet sure what Sergworks meant so I’m not really able to comment on his solution.

Also in the comments to the previous post, Jorn E. Angeltveit mentioned a completely legit Delphi approach – declare an implicit operator for assigning a pointer to the record and then write all your assignments in the form “rec2 := @rec1;”. That’s a clean but unfortunately very error-prone solution. It would be just too easy to forget to write the @, and then Delphi would revert to “normal” copying. Maybe this would be a good solution if it were combined with run-time detection of record copying (see the next section). The implicit operator could then copy/modify the record while a “normal” copy would trigger an exception. You can probably expect an article about that approach, too.
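To make the pointer-based idea more concrete, here is a minimal sketch of it. The names TRec, PRec and Data are made up for illustration and are not taken from the comment. The sketch assumes the compiler accepts the untyped result of @ as the PRec argument of the Implicit operator (the default {$TYPEDADDRESS OFF} setting), so treat it as an untested illustration rather than a finished solution.

```pascal
program PointerAssignSketch;

{$APPTYPE CONSOLE}

type
  PRec = ^TRec;

  // Hypothetical record, used only to illustrate the approach.
  TRec = record
    Data: Integer;
    // Called when a pointer to a record is assigned to a record,
    // as in "rec2 := @rec1;".
    class operator Implicit(Source: PRec): TRec;
  end;

class operator TRec.Implicit(Source: PRec): TRec;
begin
  // This is the place where the copy can be observed or modified.
  Writeln('Custom copy of record [', Source^.Data, ']');
  Result.Data := Source^.Data;
end;

var
  rec1, rec2, rec3: TRec;
begin
  rec1.Data := 42;
  rec2 := @rec1; // goes through the Implicit operator
  rec3 := rec1;  // the @ is missing: plain copy, the operator is not called
  Writeln(rec2.Data, ' ', rec3.Data);
end.
```

The last assignment shows the weak spot mentioned above: forgetting the @ compiles without any complaint and silently bypasses the operator.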

Detecting a Copy Operation

Let’s return to my code, first to the (relatively) clean part. It turns out that it is quite simple to detect a record copying operation.

  1. Add an IInterface field to a record.
  2. Write a class which implements its own refcount.
  3. Assign an instance of this class to the IInterface field.
  4. Check the refcount in _AddRef. If it is higher than 1, a copy operation is in progress.

You can download a sample application here. It contains all the required units and two projects. In this section we’ll be looking at the CatchAssignment.dpr project, which implements a simple assignment guard.

First, let’s take a look at our modified test record. The important parts are the FAssign field and the code in the constructor that initializes it.

type
  TRec = record
  private
    FAssign: IInterface;
    FData: integer;
    FDestructor: IGpAutoExecute;
  public
    constructor Create(data: integer);
  end;

{ TRec }

constructor TRec.Create(data: integer);
var
  s: string;
begin
  FData := data;
  Writeln('Record [', FData, '] created');

  FAssign := TAssignGuard.Create(
    procedure(Self: TObject)
    begin
      Writeln('Copy operation in progress ...');
    end);

  s := Format('Record [%d] destroyed', [FData]);
  FDestructor := AutoExecute(
    procedure
    begin
      Writeln(s);
    end);
end;

The TAssignGuard class is implemented in the GpRecAssignOverload unit. It is a simple TObject descendant which implements all required methods for reference counting. Basically, it is a clone of TInterfacedObject with some small modifications, most important of which lies in the _AddRef method.

function TAssignGuard._AddRef: Integer;
begin
  Result := TInterlocked.Increment(FRefCount);
  if Result > 1 then
    if assigned(FOnAssign) then
      FOnAssign(Self);
end;

The principle behind this guard is simple. When a record is assigned to another record, all interface fields are copied via the IntfCopy method, which increments the refcount of each interface (otherwise the interface would be destroyed as soon as one of the records went out of scope). So a normal interface, used by only one record, has a refcount of 1, and an interface shared between more than one record has a refcount of 2 or more.
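The refcount behaviour can also be observed in isolation. The following console program is a minimal sketch of the principle and not the code from GpRecAssignOverload: TCountingGuard, TGuarded and Demo are hypothetical names, and the guard simply reports when its refcount rises above 1 instead of calling a handler.

```pascal
program RefCountSketch;

{$APPTYPE CONSOLE}

uses
  System.SyncObjs; // TInterlocked

type
  // Hypothetical stand-in for TAssignGuard, reduced to the bare minimum.
  TCountingGuard = class(TObject, IInterface)
  private
    FRefCount: Integer;
  protected
    function QueryInterface(const IID: TGUID; out Obj): HResult; stdcall;
    function _AddRef: Integer; stdcall;
    function _Release: Integer; stdcall;
  end;

  // Hypothetical record with one plain field and one interface field.
  TGuarded = record
    Data: Integer;
    Guard: IInterface;
  end;

function TCountingGuard.QueryInterface(const IID: TGUID; out Obj): HResult;
begin
  if GetInterface(IID, Obj) then
    Result := S_OK
  else
    Result := E_NOINTERFACE;
end;

function TCountingGuard._AddRef: Integer;
begin
  Result := TInterlocked.Increment(FRefCount);
  // A refcount above 1 means a second owner has appeared,
  // which for a record field means the record was copied.
  if Result > 1 then
    Writeln('Refcount is ', Result, ', the record is being copied');
end;

function TCountingGuard._Release: Integer;
begin
  Result := TInterlocked.Decrement(FRefCount);
  if Result = 0 then
    Destroy;
end;

procedure Demo;
var
  a, b: TGuarded;
begin
  a.Data := 42;
  a.Guard := TCountingGuard.Create; // first reference, refcount becomes 1
  b := a;                           // record copy adds a second reference
end;

begin
  Demo;
end.
```

Only the assignment b := a should trigger the message; storing the freshly created guard in a.Guard is the first and only reference at that point.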


Test code is just the same as yesterday, just the output is different.

procedure Test;
var
  rec1, rec2: TRec;
begin
  rec1 := TRec.Create(42);
  Write('1>'); Readln;
  rec2 := rec1;
  Write('2>'); Readln;
  rec2 := TRec.Create(17);
  Write('3>'); Readln;
end;


[Screenshot: console output of the test program]


There’s not much you can do in TAssignGuard as you don’t have access to the records it is used in. You could pass the original record to the TAssignGuard in the record constructor, but there’s just no way to access the left part of the assignment operation (rec2 in “rec2 := rec1;”). Because of that limitation, this form is best used to guard against record copying – just raise an exception inside the TAssignGuard handler and you’re done.


Accessing Both Records During Copy


OK, but yesterday I promised more than a simple guard. How does yesterday’s demo work? (BTW, it is included in the code archive under the name RecordAssignment.dpr.)


To answer that, I’ll have to describe a bit of what happens when the code executes “rec2 := rec1;”.


Firstly, the CopyRecord method is called. It receives pointers to both records, rec1 and rec2.


Secondly, CopyRecord detects that it’s dealing with a complex record (one containing refcounted entities). Then it executes a “field by field” copy – it walks the RTTI information for the record field by field and copies each in an appropriate manner.


Thirdly, when CopyRecord copies an interface (for example our IInterface), it calls IntfCopy. As in the TAssignGuard example above, we can then catch the operation in the _AddRef method.


The tricky part now is how to access the record addresses that were passed to CopyRecord (pointers to rec1 and rec2). Well, this is the dirty, unsupported, easy-to-break part I keep mentioning. We have to step down to assembler and mess with registers and the stack.

procedure GrabRegisters(var rec1, rec2: pointer);
asm
{$IFDEF WIN32}
  mov [edx], esi
  mov [eax], ebx
{$ENDIF}
{$IFDEF WIN64}
  mov rax, [rbp+$D8]
  mov [rbp+$28], rax
  mov rax, [rbp+$D0]
  mov [rbp+$30], rax
{$ENDIF}
end;

In the Win32 code, we are lucky as both addresses are still available in the ebx and esi registers.


In the Win64 code, we have to find values that were pushed to the stack some time before (in CopyRecord) and access them.


For OS X, Android and iOS, I have no idea. I don’t intend to support this solution and I didn’t want to research that option.


There’s a good likelihood that even the Win32 and Win64 versions won’t work in all Delphi versions. I have tested them only in XE2. Any change in the RTL (in CopyRecord or in IntfCopy) or in the compiler may break GrabRegisters.
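Because the code was checked against a single compiler only, one cheap precaution is to emit a compile-time warning everywhere else. The fragment below is a sketch of that idea and is not part of the downloadable code; it assumes it is placed in the unit that declares GrabRegisters. CompilerVersion 23.0 corresponds to Delphi XE2.

```pascal
// Warn when compiling with anything other than Delphi XE2
// (CompilerVersion 23.0), the only version the offsets were tested on.
{$IF CompilerVersion <> 23.0}
  {$MESSAGE WARN 'GrabRegisters was tested only with Delphi XE2'}
{$IFEND}
```

A warning does not make the hack any safer, but it at least draws attention to it when the project is moved to a different compiler.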


Back to the code … TAssignRec<T>._AddRef is very similar to TAssignGuard._AddRef except that it first calls GrabRegisters to get the addresses of both records (i.e. pointers to them).

function TAssignRec<T>._AddRef: Integer;
var
  rec1, rec2: ^T;
begin
  // Fetch the addresses of both records taking part in the assignment.
  GrabRegisters(pointer(rec1), pointer(rec2));
  Result := TInterlocked.Increment(FRefCount);

  if Result > 1 then
    if assigned(FOnAssign) then
      FOnAssign(rec1^, rec2^);
end;

Another thing to keep in mind is that the field implementing this “operator” should be the last one in the record.

type
  TRec = record
  private
    FData : integer;
    FAssignOp: IInterface; // should be last!
  public
    constructor Create(data: integer);
  end;

This way all other fields will already be copied once the assignment operator code is triggered and you’ll be able to work with them.


Conclusion


As I’ve said before, I don’t recommend using this code in production. If you would, however, like to play with it or even enhance it into something more usable, feel free to do so. I’m releasing it under a “completely free” license and you can use it as you wish.

3 comments:

  1. Different approaches but equally satisfactory results.
    My blog: http://www.raysoftware.cn/?p=518

    Replies
    1. @Primoz: interesting article, and interesting approach. I have some code that would look a lot cleaner if it used this, but where it resolves one problem, it creates some new ones too.
      So I'd have to play with it a bit to know whether it's really going to be an overall improvement, or whether the tricky stuff adds hard-to-debug problems, and whether the code remains understandable.

      Anyway, I like the "completely free" license. The underlying way of thinking is very much appreciated: to help fellow developers out by sharing knowledge without asking anything in return at all.

      @锐王: I see that the code manages to have a parameterless constructor and destructor for records which implicitly fire, and it lets you set up an assignment operator.

      The code behind it seems to be quite complex, and I had to use Google Translate for the blog, which doesn't convert all text, and the comments in the code are Chinese too. So at this point I don't fully understand how or why it works.

      https://translate.google.com/translate?hl=nl&sl=zh-CN&tl=en&u=http%3A%2F%2Fwww.raysoftware.cn%2F%3Fp%3D518

      With google translate I scanned through the rest of your blog, which looks very interesting.. did you ever consider writing your blog posts in English? English is not my native language either, but it's handy to communicate with people that have similar interests.

  2. This is my code: http://www.raysoftware.cn/wp-content/uploads/2014/09/Source.zip
    English is not my native language either, and my English is very bad.
    Reading the code makes it easier to understand.
    Operator_Initialize is constructor
    Operaor_Finalize is destructor
    Operator_Assign is Assign

    Limitation:
    the record variable cannot be a global variable, an object's field, or in a dynamic array.
