Delphi uses a Pascal-like syntax for .dfm files which are meant to store data about forms and various controls, but also other resources and miscellaneous data. These files are compiled to a special binary format and then used to construct and initialize the forms when the application starts.
I wanted to parse these files to retrieve some of the more useful data from them, but I wasn't able to find any documentation about the format, so I analyzed it myself, which means you don't have to. What follows is a description of the format and the data model it stores. Some basic knowledge of the principles of data storage is expected.
The file's header consists of the 4-byte signature (TPF0 in ASCII), followed by a string that identifies the type of the object, and a string containing the name of the object. All strings are stored in a variable-length format, prefixed by their length as a single byte and with no terminating characters. Some of the stored strings (like the first two) are also symbols, meaning they refer to existing entities within the program. In that case, their name should be a valid identifier (a letter or underscore followed by letters, digits or underscores) or multiple identifiers joined by a dot (.).
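To make the header layout concrete, here is a minimal sketch in Python (chosen for brevity, not the .NET code I actually use). The function names are made up for illustration, and decoding strings as Latin-1 is an assumption, since the encoding is not something I verified.

```python
import io
import re

SIGNATURE = b"TPF0"

# One identifier, or several identifiers joined by dots.
SYMBOL_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)*")


def read_exact(stream, count):
    """Read exactly `count` bytes or fail."""
    data = stream.read(count)
    if len(data) != count:
        raise EOFError("unexpected end of data")
    return data


def read_string(stream):
    """Read a string prefixed by its length as a single byte."""
    (length,) = read_exact(stream, 1)
    # Assumption: single-byte characters, decoded here as Latin-1.
    return read_exact(stream, length).decode("latin-1")


def is_symbol(text):
    """Check the identifier rule described above."""
    return SYMBOL_RE.fullmatch(text) is not None


def read_header(stream):
    """Return (type name, object name) of the main object."""
    if read_exact(stream, 4) != SIGNATURE:
        raise ValueError("missing TPF0 signature")
    type_name = read_string(stream)
    object_name = read_string(stream)
    for symbol in (type_name, object_name):
        if not is_symbol(symbol):
            raise ValueError(f"not a valid symbol: {symbol!r}")
    return type_name, object_name


# Hand-built sample: signature, then two length-prefixed strings.
sample = b"TPF0" + bytes([6]) + b"TForm1" + bytes([5]) + b"Form1"
header = read_header(io.BytesIO(sample))
print(header)
```

The sample bytes are constructed by hand from the description, not taken from a real file.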
The main object's name is followed by its data properties. Each property starts with its name as a symbol, followed by its value. If the name is empty, there is no value and there are no more properties. The value of the property starts with its type identifier, a single byte with a specific value, followed by the data itself (multi-byte numbers are little-endian). These are the types that I encountered, and how they may be mapped to .NET types:
0x02: signed byte (1 byte); System.SByte.
0x03: signed int16 (2 bytes); System.Int16.
0x04: signed int32 (4 bytes); System.Int32.
0x05: double-precision float (8 bytes); System.Double.
0x06: a string; System.String.
0x07: a symbol; System.String.
0x08: the value false (0 bytes); System.Boolean.
0x09: the value true (0 bytes); System.Boolean.
0x0A: a binary blob, prefixed by its length (int32); System.Byte[].
0x0B: a list of symbols, terminated by a zero-length symbol; System.String[].
0x0D: the value null (0 bytes); System.Object.
0x0E: a list of items, each item is prefixed by its type, either 0x00 ending the list, or 0x01 for a dictionary (no other item types known), followed by a standard list of properties; System.Collections.Generic.Dictionary<string, object>[].
0x01: a list of typed values, the form is same as a property value; System.Object[].
0x00: terminates 0x01 (0 bytes).
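The type table above can be turned into a small decoder. The following is only a sketch in Python that follows the sizes exactly as listed; the function names are hypothetical, Latin-1 decoding is an assumption, and any type byte not in the table is rejected.

```python
import io
import struct

# Fixed-size numeric types: type byte -> little-endian struct format.
NUMERIC = {0x02: "<b", 0x03: "<h", 0x04: "<i", 0x05: "<d"}


def read_exact(stream, count):
    data = stream.read(count)
    if len(data) != count:
        raise EOFError("unexpected end of data")
    return data


def read_string(stream):
    (length,) = read_exact(stream, 1)
    return read_exact(stream, length).decode("latin-1")  # encoding assumed


def read_properties(stream):
    """Read name/value pairs until an empty name is found."""
    props = {}
    while True:
        name = read_string(stream)
        if name == "":
            return props
        props[name] = read_value(stream)


def read_value(stream, type_id=None):
    """Decode one value; `type_id` is passed when already consumed."""
    if type_id is None:
        (type_id,) = read_exact(stream, 1)
    if type_id in NUMERIC:
        fmt = NUMERIC[type_id]
        return struct.unpack(fmt, read_exact(stream, struct.calcsize(fmt)))[0]
    if type_id in (0x06, 0x07):  # string or symbol
        return read_string(stream)
    if type_id in (0x08, 0x09):  # false / true
        return type_id == 0x09
    if type_id == 0x0A:  # binary blob with int32 length
        (length,) = struct.unpack("<i", read_exact(stream, 4))
        return read_exact(stream, length)
    if type_id == 0x0B:  # list of symbols, ended by an empty one
        symbols = []
        while (symbol := read_string(stream)) != "":
            symbols.append(symbol)
        return symbols
    if type_id == 0x0D:  # null
        return None
    if type_id == 0x0E:  # list of dictionaries
        items = []
        while (item_type := read_exact(stream, 1)[0]) != 0x00:
            if item_type != 0x01:
                raise ValueError(f"unknown item type 0x{item_type:02X}")
            items.append(read_properties(stream))
        return items
    if type_id == 0x01:  # list of typed values, ended by 0x00
        values = []
        while (item_type := read_exact(stream, 1)[0]) != 0x00:
            values.append(read_value(stream, item_type))
        return values
    raise ValueError(f"unknown value type 0x{type_id:02X}")


def pstr(text):
    """Helper for building samples: a length-prefixed string."""
    return bytes([len(text)]) + text.encode("latin-1")


sample = (
    pstr("Width") + b"\x03" + struct.pack("<h", 800)
    + pstr("Caption") + b"\x06" + pstr("Hi")
    + pstr("Visible") + b"\x09"
    + pstr("BorderIcons") + b"\x0B" + pstr("biSystemMenu") + pstr("")
    + pstr("Items") + b"\x01" + b"\x02\x01" + b"\x02\x02" + b"\x00"
    + pstr("")  # empty name: no more properties
)
props = read_properties(io.BytesIO(sample))
print(props)
```

The sample property list is hand-built from the description rather than extracted from a real form, so treat it as an illustration of the layout only.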
The list of properties is followed by a list of nested objects, each starting with its type and name (as symbols), followed by its own list of properties and nested objects. If the nested object's type name is empty, the list of the objects should be immediately terminated.
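The nesting described above maps naturally onto a recursive reader. Here is a rough sketch in Python; the names and the dictionary layout of the result are my own choices for illustration, and to keep it short the value reader handles only integers, strings, symbols and booleans.

```python
import io
import struct


def read_exact(stream, count):
    data = stream.read(count)
    if len(data) != count:
        raise EOFError("unexpected end of data")
    return data


def read_string(stream):
    (length,) = read_exact(stream, 1)
    return read_exact(stream, length).decode("latin-1")  # encoding assumed


def read_simple_value(stream):
    """Subset of the value types, enough for the sample below."""
    (type_id,) = read_exact(stream, 1)
    if type_id == 0x02:
        return struct.unpack("<b", read_exact(stream, 1))[0]
    if type_id == 0x03:
        return struct.unpack("<h", read_exact(stream, 2))[0]
    if type_id == 0x04:
        return struct.unpack("<i", read_exact(stream, 4))[0]
    if type_id in (0x06, 0x07):
        return read_string(stream)
    if type_id in (0x08, 0x09):
        return type_id == 0x09
    raise ValueError(f"value type 0x{type_id:02X} not handled in this sketch")


def read_properties(stream):
    props = {}
    while (name := read_string(stream)) != "":
        props[name] = read_simple_value(stream)
    return props


def read_object(stream):
    """Read one object; return None when the type name is empty."""
    type_name = read_string(stream)
    if type_name == "":
        return None  # end of the enclosing list of nested objects
    node = {
        "type": type_name,
        "name": read_string(stream),
        "properties": read_properties(stream),
        "children": [],
    }
    while (child := read_object(stream)) is not None:
        node["children"].append(child)
    return node


def read_file(data):
    stream = io.BytesIO(data)
    if read_exact(stream, 4) != b"TPF0":
        raise ValueError("missing TPF0 signature")
    return read_object(stream)


def pstr(text):
    """Helper for building samples: a length-prefixed string."""
    return bytes([len(text)]) + text.encode("latin-1")


sample = (
    b"TPF0" + pstr("TForm1") + pstr("Form1")
    + pstr("Width") + b"\x03" + struct.pack("<h", 800) + pstr("")
    + pstr("TButton") + pstr("Button1")
    + pstr("Caption") + b"\x06" + pstr("OK") + pstr("")
    + pstr("")  # ends Button1's nested objects
    + pstr("")  # ends Form1's nested objects
)
tree = read_file(sample)
print(tree)
```

The sample is built by hand from the description: a form with one property and a single nested button.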
Well, that's all there is to it, or at least everything I was able to find. If you happen to find other value types or structures, feel free to let me know (with examples).
Since I intended to use this knowledge in .NET, here is some code that should be able to read these files and convert them to a tree-like structure. The result is mostly similar to JSON, with a few more types, and explicit names and types of objects, so perhaps YAML might be a better choice if you want to convert it to a standardized format.