Saturday, September 30, 2006

There's no firewall like PF

[No Delphi content here, sorry.]

If you are, by any chance, using Open/FreeBSD for firewalling, be sure to read three excellent articles on PF writen by the Daniel Hartmeier. Yes, the maker of PF himself.

Firewall Ruleset Optimization

Testing Your Firewall

Firewall Management

Tuesday, September 26, 2006

GpStructuredStorage internals

[This is a second article in the 'embedded file system' series. If you missed the first part, you can read it here.]

GpStructuredStorage compound file is organized in 1 KB blocks. First block contains header, then the content alternates between a file allocation table (FAT) fragment block and 256 blocks managed by the preceeding FAT fragment. Each block can be represented by a number - header block is block #0, first fat fragment is block #1 and so on.

[header:HEADER:1024]
[fat entry:FATENTRY:1024]
256 x [block:FOLDER/FILE:1024]
[fat entry:FATENTRY:1024]
256 x [block:FOLDER/FILE:1024]
...
[fat entry:FATENTRY:1024]
<=256 x [block:FOLDER/FILE:1024]

Header starts with a signature, which must always be 'GpStructuredStorage file'#13#10#26#0.

HEADER
[signature:32] // PChar
[unused:964]
[storage attribute file:4]
[storage attribute file size:4]
[first FAT block:4]
[first unused block:4]
[root folder:4]
[root folder size:4]
[version:4] // storage system version
Header ends in:
  • block number and size of the internal file containing global storage attributes
  • block number of the first FAT block (always 1)
  • block number of the first unused file block
  • block number and size of the root folder
  • storage system version (at the moment $01000200, or 1.0.2.0)

Each FAT fragment contains 256 32-bit numbers, one for each of the 256 following file blocks. This number is either 0 if block is last in the FAT chain, or it is a number of the next file block in the chain. Each block is linked into exactly one chain. It can either belong to a file/folder or to a empty block list. Address of the first block in the empty list is stored in the header ([first unused block] entry).

FATENTRY
256*[next block:4] // next-pointers for this block; 0 = unused
FAT structure defines the limitations of my file format - last block must have number that is less than 4294967294. As block are 1 KB in size, total file size cannot exceed 4294967294 * 1 KB, which is slightly less than 4 TB. Enough for all practical purposes, I think.

Folders are simple - each folder is just a file. It contains file information records and is terminated with two 0 bytes.

FOLDER //potentially split over several blocks
[FILE_INFO]
[FILE_INFO]
...
[FILE_INFO]
[0:2]
File information record is a variable length record containing file name (up to 64 KB), attributes, length, and address of the first file block (additional blocks are reached by following the FAT chain).
FILE_INFO
[file name length:2]
[file name:1..65535]
[file attributes:ATTRIBUTES:4]
[file length:4] // 4 GB per file
[first file block:4]
At the moment, only two attributes are defined. One specifies that file is actually a subfolder, and another designates a special file containing file attributes (for discussion of attributes see the previous article).
ATTRIBUTES
$0001 = attrIsFolder
$0002 = attrIsAttributeFile
That's just about everything that is to tell about the compound file format. Armed with this knowledge, one can easily write a compound file browser/repair utility. 

Friday, September 22, 2006

Writing an embedded file system

Once upon a time, Julian M. Bucknall wrote an interesting article for The Delphi Magazine. Well, he wrote more than one and they are all interesting and (that's the best part) he's still writing them, but I particularly liked that one.

The article name is Manage The Damage (it had to be Julian's Jimmy Somerville phase) and it was published in Issue 69 (May 2001). Yes, quite an old stuff - at least in computing terms. Inside, Julian described an implementation of an embedded file system i. e. a file system that rests inside another storage (a file in an ordinary file system, for example). That's such a useful concept that even Microsoft recognizes it (think OLE structured storage).

In short, embedded file system (or compound file, or structured storage) allows you to store many unconnected or loosely connected pieces of information inside one physical file in a structured way. For example, it can be used to store different parts of configuration settings for a large application, or it can be used for data acquisition to store many data packets inside one file, or it can be used as a simple database to store text fragments or code snippets.

Although I liked the idea behind Manage The Damage, I couldn't live with the implementation. It was too limited for my taste (embedded file system was limited to 32 MB) and implementation was a mess of pointers, which made the code almost nonportable to .NET.

And then, as all fairy tales start, I decided to write my own implementation.

Why not use OLE structured storage, you'd ask? Well, I like to know how my data is stored (Have you ever tried to look inside an OLE structure storage file with a hex editor? I did. Not a pretty sight.), I want the implementation to be fast and simple to use from Win32 and potentially .NET. Besides that, it sounded like an interesting challenge.

So how did I fare? Good, if I'm the one to answer. There were no pointers killed while writing the code, total size of the file system is limited only to 4 TB (4 GB for files stored inside the compound file) and file internals are easy to understand (well, at least to me ;) ).

The code was used in some commercial projects. Also, GExperts use it for snippet storage (CodeLibrarian.fs file). It seems to be stable and mostly bug-free and as such I'm releasing it to the public, with the usual string attached.

For the brave, here's the code and test suite: GpStructuredStorage.

If you're still interested, continue reading. I'll show few examples on how the GpStructuredStorage can be used.

A small how-to

Next fragment creates compound file and then reopens it for reading. Note that the compound file is implemented as an interface and doesn't need explicit destructor calls as such.

Instead of a file name, one can also send a TStream or descendant to the .Initialize method.

storage: IGpStructuredStorage;
storage := CreateStructuredStorage;
storage.Initialize(CStorageFile, fmCreate);
// write and read here
storage := CreateStructuredStorage;
storage.Initialize(CStorageFile, fmOpenRead);
// from now on, only reading is allowed

Now that we have the storage interface, we can create a file and then read it.

strFile: TStream
strFile := storage.OpenFile('/folder/file.dat', fmCreate);
try
// write to strFile
finally FreeAndNil(strFile); end;
strFile := storage.OpenFile('/folder/file.dat', fmOpenRead);
try
// read from strFile
finally FreeAndNil(strFile); end;

There is no need to create a /folder manually - every file access automagically creates all required folders.


Still, you are free to do it the old way.

storage.CreateFolder('/folder/subfolder');
if not storage.FolderExists('/folder/subfolder') then
//panic

Of course, there is also a FileExists function.


File enumeration is simplified to the max.

files: TStringList;
files := TStringList.Create;
try
storage.FileNames('/folder', files);
finally FreeAndNil(files); end;

(To enumerate folders, one would use FolderNames instead of FileNames.)


Additional information on file or folder can be access via FileInfo property:

FileInfo['/full/path/to/file.dat']


Currently, FileInfo only exports file's size (FileInfo[].Size) and file attributes (FileInfo[].Attribute).


Attributes offer you a way to store additional string info for each file and folder. Unlimited number of attributes can be stored and the only limitation is that both attribute name and value must be stringified.


storage.FileInfo['/folder/file.dat'].Attribute['author'] := 'Gp';

At the end, I must mention that it is also possible to Move and Delete files/folders and Compact (defragment) the file system.


If I have convinced you, go and use the stuff. If not, wait for the next episode.


Next in the embedded file system series




  • Internals of a GpStructuredStorage file
  • Total Commander plugin to look into GpStructuredStorage files.


Coming soon to a feed near you.

Specialization

A human being
should be able to change a diaper,
plan an invasion,
butcher a hog,
conn a ship,
design a building,
write a sonnet,
balance accounts,
build a wall,
set a bone,
comfort the dying,
take orders,
give orders,
cooperate,
act alone,
solve equations,
analyze a new problem,
pitch manure,
program a computer,
cook a tasty meal,
fight efficiently,
die gallantly.

Specialization is for insects.

- Robert A. Heinlein

 

UPDATE: Fixed the author (sometimes I'm soooo stupid).

Thursday, September 21, 2006

Sir! Do you need a list? Cheap, just for you!

Actually, it is free and comes with only one string attached.

I have just uploaded new version of my lists unit. At this moment it contains ten useful classes:

  • TGpIntegerList is a TList-compatible class used to store integers.
  • TGpInt64List is a TList-compatible class used to store int64 numbers.
  • TGpIntegerObjectList is a TList-compatible class used to store integers and associated objects.
  • TGpIntegerObjectList is a TList-compatible class used to store int64 numbers and associated objects.
  • TGpCountedStringList is a TStringList descendant that has each string item associated with an integer counter (internally stored in the Objects property).
  • TGpTMethodList is a list of TMethod records.
  • TGpObjectRingBuffer is a fixed-sized ring buffer of TObject references.
  • TGpObjectMap is one-dimensional array, indexed by objects and containing objects.
  • TGpObjectObjectMap is two-dimensional array, indexed by objects and containing objects.
  • TGpDoublyLinkedList is doubly-linked list of TGpDoublyLinkedListObject descendants.

Abuse it at will. Additions welcome.

Wednesday, September 20, 2006

Failed to find standard type 'TObject'? C'mon, DevCo!

I have just installed BDS Hotifx rolloup - the one that everybody else on the planet is blogging about - and now my BDS claims that I don't know how to write a class!

C'mon, DevCo, you can do better than that!

 LATER: OK, third restart of BDS helped. No idea what's going on but I certainly hate it when syntax checker (or whatever they call it) goes amok.

Monday, September 04, 2006

If you'll be visiting Amsterdam this weekend ...

... you can meet me on IBC. Most of the time I'll be on stand 2.111. Look for the guy named Primoz - that's me.

Saturday, September 02, 2006

Can't register already registered component

 

Few days ago an avid Delphi programmer posted this image on the Slovenian Delphi forum:

I've been seeing similar errors from Delphi 5 times. This error usually appears when package in question uses some units that are also used by another (already loaded) package and this second package is not listed in the requires section of the first package.

Following steps helped me to solve such mysteries in the past:

  • Uninstall all custom packages.
  • Restart Delphi.
  • Open each custom package, rebuild and install it. During that process, Delphi usually notices the problem mentioned above, warns about it and updates the requires list.

The other approach that may help is a batch file that rebuilds and reinstalls all custom packages. But that is just a workaround if you can't find the real cause of the problem.

BTW, the colleague has found the problematic package but not by following my procedure (he did try it but without success).

BTW2, he is also an author of interesting open-sourced mail read si.Mail.