Wednesday, December 28, 2011

How to Find a Missing _Release

For quite some time, thread pool in OmniThreadLibrary had a weird problem – if you scheduled more than 80 tasks in one go, FastMM would report memory leaks when application was terminated. It bothered me, of course, but I could never find the real reason behind the problem. And so I left the code “as is” and said to myself that nobody would encounter it in practice, anyway.

And then Žarko came and wrote about me on his blog :( In case you are living in a dark cave and didn’t hear of him – Žarko runs immensely popular Delphi site at About.com. His testing (and most of all his highly reproducible test case) gave me a new energy to pursue the problem.

The reason for the memory leak was a reference count leak in TOmniInterfaceDictionary, which is now fixed in the SVN. I also found some very tricky race conditions in TOmniTask.Execute and TOmniTask.Terminate and a cause for weird "A call to an OS function failed" error in DispatchEvent. All in all, I had two very productive bug-hunting nights. But that’s not why I started writing this article. I wanted to tell you a debugging story titled “How to find a missing Release.”

I started at the end. Žarko’s test case produced (at least on some computers) a leak report which clearly stated that the background task worker, TOmniWorker, did not always get released, so I started with that and checked the reference count of the TOmniWorker objects after the thread pool was finished with them. [Almost all internals in OmniThreadLibrary are interface-based. That simplified the development a lot but also complicated problem finding. Still, I  believe that it was a good choice to take.] And behold! – only last ten-and-something (this happened two weeks ago and I’m now recreating it from my faulty memory) had proper reference count at this point, first (let’s say) 60 had reference count that was too large. It was off just by one, but that was enough – task was not released when it should have been.

I then spent quite some time measuring reference count at different points in the task’s execution. All to no avail – reference count was fine as far as I could trace the progress into the OTL innards, but when the task should be released, it was not fine anymore.

Again, I was lost, but then I remembered a little used function in the Delphi debugger – namely that it can log call stack that led to the execution of the current line. [Oh how I wish that I could also access this call stack with a RTL call!]

image

Then I wrote my own reference counting methods (_AddRef and _Release) in the TOmniTask class and put a breakpoint in both methods. Breakpoint was set to not break, to log the current value of the Result pseudo-variable and to log the call stack.

Breakpoint was put on a code branch that would only be executed if I turned on a special flag variable. This way, most TOmniTask objects were not logged, but I could selectively log refcount changes for two tasks – the first one (which I knew will not be destroyed properly) and the last one (which I knew that will be destroyed).

When the program terminated, I was left with a very long Event Log, full of debugger entries:

Breakpoint Call Stack: Log 10 stack frames Process app_11_ThreadPool.exe (7828)
    OtlTaskControl.TOmniTask._AddRef
    System._IntfAddRef(???)
    OtlTaskControl.TOmniThread.Create(TOmniTask($1F4C490) as IOmniTask)
    OtlTaskControl.TOmniTaskControl.Run
    OtlThreadPool.TOmniThreadPool.Create('GlobalOmniThreadPool')
    OtlThreadPool.CreateThreadPool('GlobalOmniThreadPool')
    OtlThreadPool.GlobalOmniThreadPool
    test_11_ThreadPool.TfrmTestOtlThreadPool.FormCreate($1ED4320)
    Forms.TCustomForm.DoCreate
    Forms.TCustomForm.AfterConstruction
Breakpoint Expression Result: 2

Breakpoint Call Stack: Log 10 stack frames Process app_11_ThreadPool.exe (7828)
    OtlTaskControl.TOmniTask._AddRef
    System._IntfCopy(???,???)
    :00409d98 @IntfCopy + $C
    OtlTaskControl.TOmniTaskControl.Run
    OtlThreadPool.TOmniThreadPool.Create(???)
    OtlThreadPool.CreateThreadPool('GlobalOmniThreadPool')
    OtlThreadPool.GlobalOmniThreadPool
    test_11_ThreadPool.TfrmTestOtlThreadPool.FormCreate($1ED4320)
    Forms.TCustomForm.DoCreate
    Forms.TCustomForm.AfterConstruction
Breakpoint Expression Result: 3

Breakpoint Call Stack: Log 10 stack frames Process app_11_ThreadPool.exe (7828)
    OtlTaskControl.TOmniTask._Release
    System._IntfClear(???)
    :00409d87 @IntfClear + $13
    OtlTaskControl.TOmniTaskControl.Run
    OtlThreadPool.TOmniThreadPool.Create(???)
    OtlThreadPool.CreateThreadPool('GlobalOmniThreadPool')
    OtlThreadPool.GlobalOmniThreadPool
    test_11_ThreadPool.TfrmTestOtlThreadPool.FormCreate($1ED4320)
    Forms.TCustomForm.DoCreate
    Forms.TCustomForm.AfterConstruction
Breakpoint Expression Result: 2

Then I separated this output into two refcount logs – one for the first task and one for the last (during which I wished I had also logged a Self address which would make my work much simpler) and at the end I compared both logs in two Notepads, opened side by side.

And there it was – a call to TOmniInterfaceDictionary.Resize that was present only in one log and which incremented refcount by one. :(

The rest of my evening was simple – write some unit tests, fix the Resize (and two other places in TOmniInterfaceDictionary where I messed things up) and finally the thread pool was working fine!

The moral of the story? Unit tests are good (but I’m still not writing them on a regular basis – no way!). Reference counting produces hard to find bugs (but I’m still using it). Delphi offers powerful debugging capabilities (but I would like to have more). And the most important of all – open source code is great, because other people can find bugs and write reproducible tests! :)

12 comments:

  1. Nice article Gabr and nice example of multithreading by Zarko :)

    ReplyDelete
  2. Two Notepad instances side by side? I winced a bit when I read that. If you've got two long, complicated text files that are mostly similar and you need to check them for differences, you really ought to be using Beyond Compare.

    It's not free, but it's not expensive either and it's worth every penny. I use it at work and for my personal projects, and it's a great way to simplify tasks like this. Plus, it's written in Delphi. :)

    ReplyDelete
  3. No need for Beyond Compare, PsPad can do that and it is free. And it's also written in Delphi :)

    ReplyDelete
  4. There's always a need for Beyond Compare. It's the software developer's Swiss Army Knife. It's not free but it's worth far more than it costs.

    ReplyDelete
  5. There are routines in the JCL that let you access the call stack. Have you tried those before?

    ReplyDelete
  6. @Lachlan, I know, and I have my own and there's MadExcept ... but I'd still like to have something built into the RTL.

    ReplyDelete
  7. @Mason, I'm using ExamDiff for text comparison purposes but in this case the low tech solution was the best one as they were other differences in the refcount log, for example pointer addresses.

    ReplyDelete
  8. >something built into the RTL.
    There is partial support, see:

    Exception.CleanUpStackInfoProc
    Exception.GetExceptionStackInfoProc
    Exception.GetStackInfoStringProc

    ReplyDelete
  9. I would imagine that the reason there's no built-in RTL function for retrieving a stack trace is that there's no formal definition of a stack frame in Win32. Just look at how often the debugger gets your stack trace wrong if you're inside a function that uses inline assembly. So any stack trace you get from any tool or library is just a best-guess based on its developer's understanding of how things usually work.

    That's one of the things I like about Win64. In order to make the new exception model work, Microsoft set a formal, defined standard for stack frames, and they enforce it by having the exception unwinder kill non-conforming processes on sight if they raise an exception.

    Having a formal and enforced definition of a stack frame should make stack trace tools a lot easier to create (and more reliable!) than in Win32. So it wouldn't surprise me to see a stack trace function show up in the RTL soon, though it may not be available for Win32.

    ReplyDelete
  10. @Mason, yes, I fully agree.

    ReplyDelete
  11. Thank you, Gabr! You have resolved my issue too:
    http://otl.17slon.com/forum/index.php/topic,261.0.html

    And the article itself is very informative too. I've never clicked the Advanced button in the Breakpoint properties window before.

    ReplyDelete
  12. KDiff3 is free, open source, and does almost everything BeyondCompare does, if not everything.

    There is no possibility to remain a professional developer very long and not come to need a good comparison and merge tool.

    Try KDiff3.

    Warren

    ReplyDelete