Monday, April 23, 2007

When changing semantics, make sure that existing code will break

This is something I learned the hard way - by many times making the same mistake.

Let's say you have a function AdjustBitrate that accepts one parameter representing new bitrate in kb/s (whatever that may mean).  It's prototype is simple:

procedure TBaseClass.AdjustBitrate(newBitrate_kb_s: integer);

Due to a changed requirements, you have to redesign this function so that it will take not a new bitrate, but a bitrate increase (relative to the existing bitrate). What do you do?

procedure TBaseClass.AdjustBitrate(bitrateIncrease_kb_s: integer);

Wrong! Just the mistake I was talking about. You changed the semantics of a method without breaking existing code!

Just think of what is your next task - you have to find all places where AdjustBitrate is called and fix them to pass bitrate increase as a parameter instead of target bitrate. Without the help of the compiler you'll surely miss one or two. True, Ctrl-Shift-Enter (or Search, Find References) in new IDEs can help a lot, but it is still not perfect. It won't show you other projects by your colleagues, which are unaware of this change in the semantics. Or even worse - what if this is a part of some 3rd party library that many other programmers are using? How will you force them all to update?

The answer is simple - your change must break existing code. In this case, the simplest solution is to rename the method:

procedure TBaseClass.AdjustBitrateBy(bitrateIncrease_kb_s: integer);

That's all. The compiler will warn all programmers using your code that AdjustBitrate does not exist and then they'll find AdjustBitrateBy and notice the change in parameters (hope, hope).

Sometimes you're completely sure that no code outside the current project can be using the method you're working on (maybe it is a private method). In that case, I usually rename the method (for example, Add becomes xAdd), fix all call sites for this method (compiler will report errors everywhere I'm trying to use Add), and then rename it back (xAdd -> Add).

Technorati tags: , ,

Friday, April 20, 2007

Using descriptive variable names

If I've learned something in my programming career, it is a fact that your coding style changes all the time. [Not all the time for the better, I have to admit - there was a weird phase when if..then..else alignment ... no, I'm not yet able to discuss this. Too scary.] It is not only alignment that is affected (hanging begin..end etc) but the way how you split stuff into classes, units, and methods, and how you name entities.

Recently I made a change in the latter. I'm a big fan of using long entity names (even my 'for' variables are usually named iSomething), but recently noticed that I cannot always pack enough meaning into a name. I was doing some DVB transport stream manipulations and noticed that most of the time entity names only conveyed half of their real-world semantics. For example, I had an originalBitrate variable but there was no way of telling if this bitrate is specified in bits per second, bytes per millisecond or maybe kilobits per second. Or I had a VideoStart property and I had no idea if it is specified in milliseconds or PCR units (basic time unit in transport streams) or even as a byte offset from the start of the transport stream. I had to look into my documentation or even into the code to see how the entity in question was initialized.

That was clearly Not Good and I needed a better way. I started decorating names with descriptive suffixes.

Nowadays I'm using originalBitrate_kb_s and VideoStart_PCR and I can immediately tell that former is stored in kilobytes per second and latter in PCR units. I'm using this approach any time that simple entity name is not enough to describe its contents. For example, I'm using _pct suffix when variable holds a percentage of something and _UTC when TDateTime field contents are stored not as a local but as an universal time and even _ref when a pointer/object variable is not an owner of some data but only holds and external reference to it.

While suffixes are good for documenting purposes they also improve the code readability. I can immediately tell that assignment bitrate_b_ms := someOtherBitrate_B_s is wrong and that someTime := otherTime_UTC is wrong or suspicious. I can even tell that the formula Result := base_PCR + MSToPCR(offset_B / bitrate_B_ms) makes some sense. [B] / [B/ms]  gives [ms], which MSToPCR somehow converts to [PCR] units which are then added to some base timestamp, also stored in [PCR] format. Check.

Decorators can be helpful, but still you should use them sparingly. Using a variable bitrate_kb_s_unverified_data_reported_from_external_dll is not such a good idea. Comments are still useful when you have to document that level of semantics. 

Technorati tags: , ,

Wednesday, April 11, 2007

The most important Delphi setting

I have my favourite Delphi setting. A setting that is not enabled by default, but which I always turn on. A setting that many developer's don't use and many can't live without. A setting that you should use. Always.

I'm not telling you which setting this is. Yet.

Look at this code fragment first.

  i := 1;                
while (s[i] <> s[i+1]) and (i < Length(s)) do
i := i + 1;
There is a problem in this code. Can you spot it?

This small fragment accesses s[Length(s) + 1], which is an undefined value and may even lie outside of allocated virtual memory and cause an Access Violation during program execution. This happens because of a simple bug - tests in the while loop should be reversed so that i < Length(s) proposition is tested first.

  i := 1;                
while (i < Length(s)) and (s[i] <> s[i+1]) do
i := i + 1;

The really big problem with this code is that Delphi doesn't report any error when you run it. At least with default settings it doesn't.

And now we came to my favourite Delphi setting - Range checking.

Select Project.Options and check Range checking.

Click OK and rebuild the project. Run it again.

See! Delphi can find such errors, it just needs a little motivation!

I strongly recommend to enable range checking for all new projects. Close all projects in Delphi, then select Projects.Default Options.Delphi for Win32 and check Range checking (in BDS 2006; instructions for other IDE versions may differ).

If range checking is so good, then why is it not turned on by default? I suspect that it was originally disabled because of speed issues on slow CPUs. Range checking can quickly add 10% to the execution time if your code uses string and array indices a lot. Nowadays this is hardly noticeable but by enabling range checking, old code may break in very interesting ways. Especially if it contains some range-related bugs ...

If you want to execute some time-critical code without range checking, you can put {$R-} before it (to disable rangle checking) and {$R+} after. Or, if you like more self-descriptive commands, {$RANGECHECKS OFF} and {$RANGECHECKS ON}. And please don't try to guess what the time-critical part of your program is, use a profiler.


Technorati tags: , ,

Friday, April 06, 2007