Friday, February 06, 2009

Adding connection pool mechanism to OmniThreadLibrary

I have a problem.

I have this thread pool, which needs to be enhanced a little. And I don’t know how to do it.

Well, actually I know. I have at least three approaches. I just can’t tell which one is the best :(

I’m talking about the OmniThreadLibrary – I know you guessed that already. The problem is the thread pool in OTL doesn’t allow for per-thread resource initialization and that’s something that you need when you’re implementing a connection pool (more background info here). I started adding this functionality but soon found out that I don’t have a good idea on how to implement it. Oh, I have few ideas, they are just not very good :(

Current

At the moment, OTL thread pool functionality is exposed through an interface. This interface is pretty high-level and doesn’t allow the programmer to mess with the underlying thread management. All thread information is hidden in the implementation section. There’s a notification event that’s triggered when pool thread is created or destroyed, but it is only a notification and is triggered asynchronously (and possibly with a delay).

In short:

type
TOTPWorkerThread = class(TThread)
end;

TOmniThreadPool = class(TInterfacedObject, IOmniThreadPool)
end;

Event handlers

The first idea was to add OnThreadInitialization/OnThreadCleanup to the IOmniThreadPool. [Actually something similar already exists - OnWorkerThreadCreated_Asy and OnWorkerThreadDestroyed_Asy – but those events are part of the previous implementation and will be removed very soon.] Those two events would receive a TThread parameter and do the proper initialization there.

There are some big problems though. Let’s say you’ll be implementing database connection pool. You’ll have to open a database connection in OnThreadInitialization. Where would you store that info then? In an external structure, indexed by the TThread? Ugly! Even worse – how would you access the database info from the task that will be executing in the thread pool? By accessing that same structure? Eugh!

Rejected.

Thread subclassing

A better idea is to implement a subclassed thread class in your own code and then tell the thread pool to use this thread class when creating new thread object. You’d then manage database connection in overridden Initialize/Cleanup methods.

type
TDBConnectionPoolThread = class(TOTPWorkerThread)
strict private
FDBConnection: TDBConnectionInfo;
protected
function Initialize: boolean; override;
procedure Cleanup; override;
end;

GlobalOmniThreadPool.ThreadClass := TDBConnectionPoolThread;

Looks much better but there’s again a problem – I’d have to expose TOTPWorkerThread object in the interface section and that’s just plain ugly. Worker thread mechanism should be hidden. Only few people would ever be interested in it.

Thread data subclassing

An even better idea is to add an empty

TOTPWorkerThreadData = class
end;

definition to the interface section of the thread pool unit. IOmniThreadPool would contain a property ThreadDataClass which would point to this definition. And each worker thread would create/destroy an instance of this class in its Execute method.

You’d add database management as

type
TDBConnectionPoolThreadData = class(TOTPWorkerThreadData)
strict private
FDBConnection: TDBConnectionInfo;
protected
constructor Create;
destructor Destroy; override;
end;

GlobalOmniThreadPool.ThreadDataClass := TDBConnectionPoolThreadData;

[Maybe the constructor has to be virtual here? I never know until I try.]

There’s still a question of accessing this information from the task (and it goes the same for the previous attempt – I just skipped the issue then). I’d have to extend the IOmniTask interface with a method to access per-thread data.

Thread data with interfaces

While writing this mind dump a new idea crossed my mind – what if thread data would be implemented as an interface, without the need for subclassing the thread or thread data? In a way it is a first idea just reimplemented to remove all its problems.

Task interface would be extended with thread data access definitions, approximately like this:

type
IOtlThreadData = interface
end;

IOtlTask = interface
property ThreadData: IOtlThreadData;
end;

Thread pool would get a property containing a factory method.

type
TCreateThreadDataProc = function: IOtlThreadData;

IOmniThreadPool = interface
property ThreadDataFactory: TCreateThreadDataProc;
end;

This factory method would be called when thread is created to initialize thread data. Each task would get assigned that same interface into its ThreadData property just before starting its execution in a selected thread. Task would then access ThreadData property to retrieve this information.

In the database connection pool scenario, you’d have to write a connection interface, object and factory.

type
IDBConnectionPoolThreadData = interface(IOtlThreadData)
property ConnectionInfo: TDBConnectionInfo read GetConnectionInfo;
end;

TDBConnectionPoolThreadData = class(TInterfacedObject, IDBConnectionPoolThreadData )
strict private
FDBConnection: TDBConnectionInfo;
protected
constructor Create;
destructor Destroy; override;
end;

function CreateConnectionPoolThreadData: IDBCOnnectionPoolThreadData;
begin
Result := TDBConnectionPoolThreadData.Create;
end;

GlobalThreadPool.ThreadDataFactory := CreateConnectionPoolThreadData;

This approach requires slightly more work from the programmer but I like it most as it somehow seems the cleanest of them all (plus it is implemented with interfaces which is pretty much the approach used in all OTL code).

So, dear reader, what do you think? If you have better idea, or see a big problem with any of those implementations that I didn’t think of, please do tell in the comments!

2 comments:

  1. Not sure.
    The "Thread data with interfaces" seems to be quite good, though I'm not a big fan of interfaces...

    Why not try to implement a kind of "Model Gui Mediator Pattern" (1) mixed with a "Relationship Manager Pattern"? (2)
    Mediator i.e. Manager = the OT Pool
    GUI = Task Object
    Model = Data classes (DBConnectionData)

    With (1), you can ensure a DBConnectionData object can adapt itself i.e. react to the situation (update the DB when Task has modified data, rollback... automatically restore broken connection) and update the task (e.g. because of a DatabaseEvent).

    With (2), you can provide the Task object with more "Data" so it's able to handle situations that cannot be responded automatically by the DBConnectionData.
    Furthermore, any Task-class could be combined with any Data-Class you need. Even a "shared" DBConnectionData should be possible.

    I could figure out pools where there's a mix of Data classes and a corresponding mix of Tasks:
    - Task one, requiring Database connection
    - Task 2, requiring Internet connection
    - Task 3, requiring admin user privileges from the OS

    Just my 2 cents... hope I'm not "too wrong" ;)

    Both patterns ar described here

    ReplyDelete
  2. In fact I did implement that last approach. It is already in the repository.

    I don't see exactly how that pattern applies here (but then I'm not a big fan of patterns).

    The last one is easily solvable with three thread pools, each using different per-thread data.

    ReplyDelete