Monday, February 01, 2010

Parallel.ForEach.Aggregate

Totally out-of-band posting (I’m still working on the second part of the “blocking collection” trilogy), posted just because I’m happy that the code works.

This code.

procedure TfrmParallelAggregateDemo.btnCountParallelClick(Sender: TObject);
var
numPrimes: integer;
begin
numPrimes :=
Parallel.ForEach(1, inpMaxPrime.Value)
.Aggregate(
procedure (var aggregate: int64; value: int64) begin
aggregate := aggregate + value;
end)
.Execute(
function (const value: TOmniValue): TOmniValue begin
if IsPrime(value) then
Result := 1
else
Result := 0;
end);
Log('%d primes from 1 to %d', [numPrimes, inpMaxSummand.Value]);
end;

Anonymous methods are simply great. They make the code unreadable, but they are oh so useful!

Everything is in the trunk. Checkout and enjoy.

19 comments:

  1. Anonymous20:29

    I can understand it makes you happy that code works.
    Look ma, all in one statement!
    Can you understand that people who see this code will weep?

    ReplyDelete
  2. Well ... if you would care enough to introduce yourself, I would care more to write a long answer.

    In short - doing this without the Parallel framework would cause the program to grow by a factor of 5 to 10 and that would not make it any more readable.

    ReplyDelete
  3. ... and if you replace the numPrimes reference in the log statement with the Parallel.ForEach then you can ditch the local var and temp assignment too (removing 3 lines) ... ;)

    ... and of course If IsPrime(Value) etc should be replaced with Result := IfThen(IsPrime(Value, 1, 0)), which removes another 4 lines from the proc... ;)

    It's often possible to make code very terse and compact, yet do a lot of work. The real trick is reducing the eye-watering on first contact with the code. I find the literal inclusion of the anonymous method code the most unsettling for reading code that uses them, and it can be difficult to separate the 'normal' code from the 'anonymous' code...

    Raymond.

    ReplyDelete
  4. I understand your sentiments exactly and I also have the same problem. Anonymous procs are really hard to read and make the code much more complicated.

    On the other hand I stand by what I have said before. Rewriting the same problem in a "normal" threaded way makes it totally unreadable. You loose the core of the problem in a maze of twisty little statements, all alike. Try coding it and you'll see.

    ReplyDelete
  5. So here's your challenge :-)

    How do you write this code so that it still uses the Parallel support, but reduces eye-watering while reading the code?

    Instead of literal anonymous procedures, if those 'anonymous' procedures were structured more like they were 'normal' procedures in the code, yet called in the same anonymous context, I think this would immensely improve how readable, understandable and maintainable this sort of code is...

    ReplyDelete
  6. I'm open to any suggestions ...

    ReplyDelete
  7. We'e probably in language change land here, but something like this just seems to be so much better:

    procedure TfrmParallelAggregateDemo.ParallelAggregator(var aggregate: TOmniValue; const value: TOmniValue); Anonymous;
    begin
    aggregate.AsInteger := aggregate.AsInteger + value.AsInteger;
    end

    function TfrmParallelAggregateDemo.ParalellIsPrime(const value: TOmniValue): TOmniValue; Anonymous;
    begin
    if IsPrime(value) then
    Result := 1
    else
    Result := 0;
    end;

    procedure TfrmParallelAggregateDemo.btnCountParallelClick(Sender: TObject);
    var
    numPrimes: integer;
    begin
    numPrimes :=
    Parallel.ForEach(1, inpMaxPrime.Value)
    .NumTasks(inpNumCPU.Value)
    .Aggregate(ParallelAggregator)
    .Execute(ParalellIsPrime);
    Log('%d primes from 1 to %d', [numPrimes, inpMaxSummand.Value]);
    end;

    To me, this code is so much more readable, and only used a few extra lines (many of which are white space). The new keyword Anonymous means "Don't call this procedure directly, but treat it as if it had been typed in as a literal anonymous method".

    You could even experiment with an anonymous parameter keyword to allow a procedure to be passed to a method and have the signature of the parameter control whether the it is given a method reference, or an anonymous method.

    What do you think?

    ReplyDelete
  8. and it would look even better if all the indenting had not been stripped when I posted that last comment :-)

    ReplyDelete
  9. Anonymous03:01

    Perhaps the IDE could apply different formatting to anonymous methods. That might make them at least stand out more as being removed from regular code.

    ReplyDelete
  10. @Raymond: Now that would be unreadable. The main "thing" about anonymous methods is that they capture the current context and variables. I can implement ForEach/Aggregate with normal methods (and I should really add this functionality too), but it would not offer the same level of flexibility.

    ReplyDelete
  11. I understand how anonymous methods capture the current context and variables.

    I am not sure why packaging the logic that operates on a context into a more manageable container rather then pasting the code in place is not an improvement. Capturing a context should not rely on a clumsy 'paste the code here' model. The compiler is quite smart to figure it out as in the example above.

    Anonymous methods are powerful - I don't think they need to result in obfuscated code!

    ReplyDelete
  12. This comment has been removed by a blog administrator.

    ReplyDelete
  13. I just don't think that will be readable.

    Imagine:

    procedure Something; anonymous;
    begin
    DoSomethingWith(thisVariable); //captured variable
    end;

    // lots and lots of code

    procedure SomethingElse;
    var thisVariable: integer;
    begin
    SomeProc(Something);
    end;

    Because Something and SomethingElse are far apart, you have to jump here and there to understand the code, which variables are used and from where they are captured. Verrrry bad - at least in my opinion.

    ReplyDelete
  14. Then you'd probably keep it close to the context

    Anyway, it's just a thought...

    ReplyDelete
  15. @Raymond: I think gabr's code is much more readable and comprehensible than a version scattered through several functions/methods which you have to skim up and down. Mind you this kind of mentality is what sanitized that ugly DOM API and let web developers curse less (jQuery!). I hope we see Delphi community adopting this aggressively. Events are passe :)

    ReplyDelete
  16. @Raymond: I've reformatted the code. Better now?

    ReplyDelete
  17. @karatas: Events? No, I think this is almost a macro model, but much smarter. Anyway, your and Gabr's comments are correct regarding having this spread around being a bad idea. Perhaps local anonymous function that share the context, but retain encapsulation of the logic would be better?

    @Gabr: Yes - that does look a little better.

    I don't think I'm the only developer who feels some laws have been violated when they read code like this. Perhaps it's just a matter of getting used to it, perhaps not :)

    ReplyDelete
  18. Victor10:34

    Yes, the reformatted code has improved the readability a lot.
    Also, I remember in the earlier code you used Aggregate.AsInteger, which is now removed, that also helps. Stuff like AsInteger makes me think about variants, yuck.
    I am the 'anonymous' above, by the way.

    Just another idea to improve readability: you could define the aggregate and execute methods in local variables, and still have the advantages of anonymous code. But just seeing that code in a separate statement from the parallel.foreach would help me read and understand it better.
    I also have difficulty with the 'chaining style' in passing arguments to parallel.foreach; I would prefer:
    Parallel.ForEach(FromValue, ToValue, AggregateCode, ExecuteCode);

    ReplyDelete
  19. .AsInteger was removed because I added another set of Aggregate and Execute overloads that work with integers instead of TOmniValues.

    Chaining style will stay. The problem here is the combinatorial explosion. At this moment, OtlParallel implements four different Execute methods and four different Aggregate methods. Not every combination of them makes sense, but still there would have to be about eight ForEach(low, high, ...) methods and about eight ForEach(enumerator, ...) methods defined to cover all bases.

    Thanks for introducing yourself. It is much nicer to talk to a real person.

    ReplyDelete