Wednesday, February 10, 2021

Readers-writer lock - Part 4: Improving TLightweightMREW

While the TLightweightMREW is quite handy, it is not perfect. There's a weird assymmetry in it. On all operating systems that Delphi can compile for, read locks are reentrant (recursive) while write locks are not. In other words, if a thread already owns a read lock, it can call BeginRead again and it will succeed. Write locks are different. If a thread already owns a write lock and calls BeginWrite again, it will either deadlock (on Windows) or raise an exception (on other supported platforms).

This is, however, relatively simple to fix. I have implemented a simple wrapper for the TLightweightMREW lock in TLightweightMREWEx. This new record uses internal TLightweightMREW to provide locking and adds some simple logic to implement write lock reentrancy. The implementation and accompanying test program rwReentrantWriter can be found at https://github.com/gabr42/examples/tree/master/Reader-writer%20lock.

The TLightweightMREWEx record implements the same public functions as TLightweightMREW. Functions from the xxxRead family are simply forwarded to internal FRWLock: TLightweightMREW while the xxxWrite family of functions is reimplemented inside the new record:

type
  TLightweightMREWEx = record
  private
    FRWLock: TLightweightMREW;
    [volatile] FLockOwner: TThreadID;
    FLockCount: integer;
  public
    class operator Initialize(out Dest: TLightweightMREWEx);
    procedure BeginRead; inline;
    function TryBeginRead: Boolean; {$IF defined(LINUX) or defined(ANDROID)}overload;{$ENDIF} inline;
    {$IF defined(LINUX) or defined(ANDROID)}
    function TryBeginRead(Timeout: Cardinal): Boolean; overload; inline;
    {$ENDIF LINUX or ANDROID}
    procedure EndRead; inline;
    procedure BeginWrite;
    function TryBeginWrite: Boolean; {$IF defined(LINUX) or defined(ANDROID)}overload;
    function TryBeginWrite(Timeout: Cardinal): Boolean; overload;
    {$ENDIF LINUX or ANDROID}
    procedure EndWrite;
  end;

class operator TLightweightMREWEx.Initialize(out Dest: TLightweightMREWEx);
begin
  Dest.FLockOwner := 0;
end;

To understand the introduced logic, let's examine the new BeginWrite method:

procedure TLightweightMREWEx.BeginWrite;
begin
  if FLockOwner = TThread.Current.ThreadID then
    Inc(FLockCount)
  else begin
    FRWLock.BeginWrite;
    FLockOwner := TThread.Current.ThreadID;
    FLockCount := 1;
  end;
end;

When a record is not locked for writing, FLockOwner is zeroed out. When BeginWrite is called, first test fails and the else branch calls BeginWrite. When it returns, we can be sure that the write lock was granted. The code can now safely modify (as not other thread could have passed BeginWrite) the FLockOwner and FLockCount fields to mark that the current threads owns the lock.

If two threads call BeginWrite at the same time, only one will win while another will block so that also doesn't represent a problem.

If the first thread (the one owning the lock) calls BeginWrite again, the first if condition will be True and the code will simply increment the lock count. This can again only happen in one thread (the one owning the lock) so we can safely increment the lock count.

The EndWrite call does the same in reverse:

procedure TLightweightMREWEx.EndWrite;
begin
  if FLockOwner <> TThread.Current.ThreadID then
    raise Exception.Create('Not an owner');

  Dec(FLockCount);
  if FLockCount = 0 then begin
    FLockOwner := 0;
    FRWLock.EndWrite;
  end;
end;

A safety check at the beginning verifies that EndWrite was indeed called from the thread that owns the lock. After that the lock count is decremented and if the new lock count is 0, FLockOwner is cleared and the lock is released. As all the code up to the final line (the call to FRWLock.EndWrite) can only execute from one thread at a time, all variable manipulations are safe.

The two TryBeginWrite functions are implemented almost the same as BeginWrite except that we have to take care of managing the function result:

function TLightweightMREWEx.TryBeginWrite: Boolean;
begin
  if FLockOwner = TThread.Current.ThreadID then begin
    Inc(FLockCount);
    Result := true;
  end
  else begin
    Result := FRWLock.TryBeginWrite;
    if Result then begin
      FLockOwner := TThread.Current.ThreadID;
      FLockCount := 1;
    end;
  end;
end;

Tests

The testing code in rwReentrantWriter runs four simple and fast tests followed by two 10-second stress tests. If everything is OK, the program writes out OK and waits for Enter to be pressed.

Both stress tests constantly try to access write lock from ten threads. At the end of the test the code logs the number of acquired write locks for each thread. If everything is working correctly, we would expect about the same number of locks acquired in each thread.

For example, on my Windows notebook the test logs:

.....

47358 48721 45511 46172 46296 51432 47468 48406 48987 48025

46470 46667 46641 46801 47584 46641 46110 47101 46892 47417

OK

On Ubuntu 18.04 running under the WSL subsystem on the same machine the numbers are lower, but still quite balanced:

.....

10289 9195 8792 9629 8816 11759 7611 9334 7284 9163

8478 8595 9077 9948 9627 8729 8259 8282 11158 9247

OK


13 comments:

  1. Important note for SRW locks under Windows, Shared/Read locks cant NOT be called recursively. MSDN Documentation: "Shared mode SRW locks should not be acquired recursively as this can lead to deadlocks when combined with exclusive acquisition."

    ReplyDelete
    Replies
    1. Indeed. That is why recursive calls to TLightweightMREW.BeginWrite deadlock on Windows.

      Delete
  2. Forgot to say, Thanks for the article, great read.

    ReplyDelete
  3. Ryzen 5800x
    42419 46735 53009 57453 51471 50607 58571 49059 41342 51803
    51314 48449 51431 50943 51249 50296 50754 50342 50058 51073

    ReplyDelete
  4. Maybe a strange question, what about a ReadLock which can be upgraded to a WriteLock and DownGraded again?

    Suppose i go in a loop through some items (ReadLock), then i see an item which i want to alter, so what do i do now? i exit, and go to a writelocked section of code, but my item can be locked by another process when i exit the readlock, and enter the writelock. So something like lock.StartWriteLock, lock.EndWriteLock, and the readlock continues?

    or is my thinking all wrong about this?

    ReplyDelete
    Replies
    1. None of the OS implementations (Windows, Posix) supports that and I don't think this can be implemented as a "patch on".

      IIRC, original TMREWSync supports upgrading read locks to write locks but I never used that and I can be mistaken. You'll have to test it first.

      Delete
    2. You have two options, I believe.

      1) Use normal critical sections.

      2) Do the 'test, lock, test again' cycle, so:
      - Read through the items
      - If an item is found, exit and writelock
      - Read through the items again to see if the item is still there

      Delete
  5. I have a question regarding procedure TLightweightMREWEx.BeginWrite; Is it threadsafe to inspect and increment FLockCount in such manner? The same question is about FLockOwner. What if multiple threads call BeginWrite simultaneously?

    ReplyDelete
  6. A read to FLockOwner is atomic on all supported platorms. Given that, we have three situations: 1) the lock is unowned, 2) the lock is owned but not by the threads calling BeginWrite, 3) one of the threads calling BeginWrite owns the lock.

    1) If will fail for both threads, they will enter TryBeginWrite and one will win. Other will wait.

    2) If will fail for both threds, they will enter TryBeginWrite and both will wait.

    3) If will succeed for one thread, the other will enter TryBeginWrite and wait.

    ReplyDelete
    Replies
    1. Thank You for Your answer. It is clear now. And one more question: TLightweightMREWEx still does not allow to take read lock after taking write lock in the same thread. Is there any reason for such behaviour?

      Delete
    2. As the article states ;) - I don't see how that could be implemented as an add-on to the system primitive.

      Delete
  7. It needs one more addition :-) The RAII interface. There should be functions (and typecasts) that return TInterfacedObject as IUnknown, so that the following pattern becomes possible:

    procedure MyCode;
    begin
    // some long computations
    LightMREW.BeginWriteIntf(); // invisible ARC var here
    // code to update state, to write data
    end;

    ReplyDelete