When Chris Rolliston implemented a simple threaded queue class using TMonitor, I was quite ecstatic. Finally I will be able to compare my lock-free queue with a threaded implementation! (That’s something I wanted to do for a long time but I never found a willpower to write a locking queue.)
My elation continued while I was coding my test app and while I tested my own queue. And then I plugged in Chris’ queue and … everything went downhill :( Immediately, my test app started crashing. At first it looked like the bug was in Chris’ code, but when I enabled debug DCUs, debugger pointed to a very unlikely location – TMonitor. Unlikely because of two reason – because I know that Delphi people take quality to the heart and test the code and because I know Allen is an experienced guy who writes excellent code.
Please note that I’m not saying that there’s a bug in TMonitor – at least at the moment I’m not. It just looks that way. I would have to do some heavy debugging to prove this and at the moment my head is too full of pollen and antihistamine to do such work. That’s why I’m asking experienced Delphi programmers to jump in and help me find the true reason of the problem - TMonitor, TSimpleThreadedQueue or my test program.
To run the test program you would need fairly fresh OmniThreadLibrary (at least release 899) and Delphi XE. Download the source, make sure that path includes OmniThreadLibrary, recompile and run. Most probably it would help to have four cores to exhibit the problem although I was able to reproduce it on only one.
Anyway, run the application and click the “Bug?” button. If you see something like
popping up, then you have repeated the problem.
In my testing, clicking Break opens debugger in System.pas. If you get SimpleThreadedQueue.pas or speedtest1.pas instead, then most probably you don’t have debug DCUs enabled.
The test code in StartTest2 creates one writer and two reader threads. In my testing the problem occurs when one writer and one reader already completed their work and only the last reader is running. When this last reader calls Dequeue, TMonitor.Wait gets called. This one calls RemoveWaiter and crash occurs.
This is all I know about the problem at this moment. If you feel capable and can spare yourself some time, please download the app and try to help me finding the real cause for this crash. Thanks!
Updated 2011-10-11
This bug was fixed in Delphi XE2.
Gabr, see this link
ReplyDeletestackoverflow.com/questions/4856306/tthreadedqueue-not-capable-of-multiple-consumers
May be related, the error in TThreadedQueue seems to be in the TMonitor.
I will look into your code later today.
When the queue contains only 2 elements, Walker gets initialised with FWaitQueue.Next.Next (ie FWaitQueue), isn't it ?
ReplyDeleteAs a consequence, the while loop is never executed, and the queue either gets screwed up (FWaitQueue := Last is executed, but FWaitQueue.Next is never fixed up), or it stays untouched...
I have the same problem under Delphi 2010...
I have written a locking queue some time ago. I tested it on quad core processor and found out that in almost all cases it is just slightly slower then you own lock free implementation. There is very small difference. Much of the difference comes from TAnyValue being slower most of the time then TOmniValue, because I use interfaces. I wrote about this improvement briefly here: http://www.cromis.net/blog/2011/03/new-version-of-simplestorage-and-other-updates/
ReplyDeleteYou can find the code on my download page. I also have the test somewhere still and can mail it to you if you like.
Oh the code is in the threading unit from download.
ReplyDeleteMaybe the "while Walker <> FWaitQueue do" loop could be replaced by something like this:
ReplyDeleterepeat
if Walker = @WaitingThread then
begin
Last.Next := Walker.Next;
Break;
end;
Last := Walker;
Walker := Walker.Next;
until Last = FWaitQueue.Next;
(tested only with a pen and a sheet of paper !)
Hi,
ReplyDeleteI tested your code with the TThreadedQueue and the same error appeared. This is the same error as I reported in the link given above. I'm not the man to pinpoint what's going on in TMonitor. Looks like a job for Allen Bauer.
Funnily enough, I actually asked a similar question back in February: https://forums.embarcadero.com/thread.jspa?messageID=320898&tstart=0#320898. Leaving it for a bit, a few months later I looked at TMonitor again and came to the conclusion it did actually work if used more simply than I was originally attempting to use it. Obviously not though!
ReplyDelete@Leif: Yes, it looks like this usage of TMonitor doesn't support multiple consumers. Too bad - especially as it makes TThreadedQueue useless. :(
ReplyDelete@Iztok: I'll make sure to test with your queue too.
@Thierry: Your guess is as good as mine (at this moment).
This report might be interesting:
ReplyDeleteReport No: 78415 Status: Open
AV in TInternalConditionVariable.RemoveWaiter
http://qc.embarcadero.com/wc/qcmain.aspx?d=78415
QCWIN:Defect_No=78415
Sorry, now I see that the discussion Chris pointed to also mentions this report.
ReplyDelete@Moritz: Yes, that seems to be the same problem. Shame on Embarcadero for not fixing this with a hotfix!
ReplyDelete@Chris, Moritz, gabr
ReplyDeleteSo the same bug appears in SynObjs.pas AND in System.pas... **Sigh** Code duplication...
In the end, 2 bugs in 2 different units could be solved with just one shot ! That's pretty good news !
That's a 2009 bug, ouch, I guess it means that TMonitor hasn't been used much (if any) in production code out there since it was introduced... Give us our 4 bytes per instance back! ;)
ReplyDeleteWhen I initially posted the bug back in 2009, I've also attached a patch that fixes at least the error in SyncObjs.pas. That patch, however, disappeared from the report (presumably because it got overridden by an attachment that got added later). I'll have a look if I can re-attach the patch.
ReplyDeleteIs XE or 2010 required to compile/run the test app? Or could it work with Delphi 2009?
ReplyDelete@Michel: I think the test app should work in 2009 (but I may be wrong).
ReplyDeleteWith Delphi 2009 (after commenting out TStopWatch related code and changing Line 79 to "if not TMonitor.Wait(FQueue, LongWord(Timeout)) then Exit(wrTimeout);", no exception is thrown. Instead the application freezes, the log shows:
ReplyDeleteDebug Output: Exiting writer Process speedtest.exe (3928)
Debug Output: Awaited writers Process speedtest.exe (3928)
Debug Output: Exiting reader Process speedtest.exe (3928)
Looks like it is broken in a different way?
Hi!
ReplyDeleteHere's the patch that can be applied to SyncObjs.pas:
diff --git a/SyncObjs.pas b/SyncObjs.pas
index 4239a0b..f5382d9 100644
--- a/SyncObjs.pas
+++ b/SyncObjs.pas
@@ -953,7 +953,7 @@ begin
begin
WaitQueue := LockQueue;
try
- Last := WaitQueue.Next;
+ Last := WaitQueue;
Walker := Last.Next;
while Walker <> WaitQueue do
begin
@@ -969,7 +969,10 @@ begin
if Walker.Next = Walker then
WaitQueue := nil
else
- WaitQueue := Last;
+ begin
+ WaitQueue := Walker.Next;
+ Last.Next := WaitQueue;
+ end;
finally
UnlockQueue(WaitQueue);
end;
System.TMonitor.RemoveWaiter can be patched accordingly.
Hi!
ReplyDeleteHere's the patch that can be applied to SyncObjs.pas:
diff --git a/SyncObjs.pas b/SyncObjs.pas
index 4239a0b..f5382d9 100644
--- a/SyncObjs.pas
+++ b/SyncObjs.pas
@@ -953,7 +953,7 @@ begin
begin
WaitQueue := LockQueue;
try
- Last := WaitQueue.Next;
+ Last := WaitQueue;
Walker := Last.Next;
while Walker <> WaitQueue do
begin
@@ -969,7 +969,10 @@ begin
if Walker.Next = Walker then
WaitQueue := nil
else
- WaitQueue := Last;
+ begin
+ WaitQueue := Walker.Next;
+ Last.Next := WaitQueue;
+ end;
finally
UnlockQueue(WaitQueue);
end;
System.TMonitor.RemoveWaiter can be fixed accordingly.
Hi
ReplyDeleteWe seem to have this problem affecting us with Delphi XE.
It is affecting the datasnap callbacks in our application.
we can simulate with a small app, is there any way we can do these changes in Delphi XE system.pas and compile the DCU
Hi! I have created a patch for TMonitor in Delphi XE, based on Christian Gudrian's comment above. It can be downloaded from here: http://href.hu/x/hdu3
ReplyDelete