March 12, 2022

Delphi form data (TPF0) binary format

Delphi uses a Pascal-like syntax for .dfm files, which store data about forms and their controls, as well as other resources and miscellaneous data. These files are compiled to a special binary format, which is then used to construct and initialize the forms when the application starts.

I wanted to parse these files to retrieve some of the more useful data from them, but I wasn't able to find any documentation about the format, so I analyzed it myself, which means you don't have to. What follows is therefore a description of the format and the data model it stores. Some basic knowledge of the principles of data storage is expected.

February 14, 2022

The Models of CONCURrent SGML

Nowadays, the old glory of SGML is almost forgotten, as it has been replaced by XML in all but a handful of places, but the language itself still has some very impressive capabilities that, I dare say, might yet make it a viable option for some very specific cases in the future. One of those features is "CONCUR", a.k.a. the original namespaces. This comparison is inaccurate of course, so I will first take a few paragraphs to describe what this feature is, and how it relates to the whole SGML ecosystem.

December 25, 2021

Identifiers for some file formats

I managed to find a gold mine when it comes to URIs and identifiers: the DocBook notations module. DocBook is an older SGML/XML-based standard developed by OASIS and used for writing documentation, and with its DTD comes a module that “imports” several then-common notations for use with unparsed entities.

December 21, 2021

What Flavour is Your Function?

One programming article definitely worth reading is What Color is Your Function? by Bob Nystrom. It is a good old rant about the design of several programming languages, one that separates them into good languages and bad languages, depending on the presence of a certain feature.

The article describes a feature called “colored functions” and then sorts languages by whether they have some mechanism akin to it. A callback mechanism for asynchronous programming is basically the only application of colored functions it covers, but I believe pointing out other similar mechanisms is worth doing.

December 12, 2021

XML Lite (definitely not XML 2.0)

For years after the conception of XML, it was customary for programmers to point out its perceived flaws and suggest fixes to the language. This trend has somewhat declined in recent years, as developers have either learnt to live with XML, or discovered that they can use something else.

I do not share these feelings towards XML. I find it suitable for many purposes, and I think much of the criticism stemmed from misunderstanding. Yet even I would like to add some features to the language, based on my experience. Let's take a look at them.

November 29, 2021

Finding the UUID for almost anything

Universally Unique Identifiers are a nifty way of obtaining identifiers for resources, objects, or concepts, without the need for a central assigning authority. Arguably the largest public use of UUIDs is in Microsoft's products, where they (known as GUIDs) identify classes or interfaces within COM. For example, 450d8fba-ad25-11d0-98a8-0800361b1103 identifies the My Documents folder, accessible via the shell.

What's less known is the fact that UUIDs have a specific structure and are not necessarily composed of random numbers. The earliest generated UUIDs used time as one guarantee of uniqueness (the UUID in the example above was created in 1997); these are version 1 UUIDs. Nowadays, you can still use them if you want to preserve the creation time in the identifier, but the most common are version 4 UUIDs that consist almost entirely of (pseudo-)random bytes.
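
The creation time mentioned above can be read back out of a version 1 UUID. The following is a minimal sketch using Python's standard uuid module; the helper name uuid1_created is made up for illustration, and the result is only as trustworthy as the clock of the machine that generated the identifier.

```python
import uuid
from datetime import datetime, timedelta, timezone

# Version 1 timestamps count 100 ns intervals from the start of the
# Gregorian calendar, 15 October 1582.
GREGORIAN_EPOCH = datetime(1582, 10, 15, tzinfo=timezone.utc)


def uuid1_created(u: uuid.UUID) -> datetime:
    """Return the creation time embedded in a version 1 UUID (hypothetical helper)."""
    if u.version != 1:
        raise ValueError("only version 1 UUIDs carry a timestamp")
    # u.time is the 60-bit timestamp; integer division by 10 yields microseconds.
    return GREGORIAN_EPOCH + timedelta(microseconds=u.time // 10)


my_documents = uuid.UUID("450d8fba-ad25-11d0-98a8-0800361b1103")
created = uuid1_created(my_documents)
print(my_documents.version, created.year)  # prints "1 1997"
```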

Usually, one associates the generation of UUIDs with some random process that produces different identifiers each time it is invoked. A lesser-known version of UUIDs, however, makes it possible to produce identifiers for certain resources deterministically, that is, based solely on some input data and producing the same result each time.
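
To illustrate the deterministic case, here is a small sketch that assumes the version in question is the name-based kind (version 5, which hashes a namespace UUID together with a name using SHA-1). The function name uuid_for_url and the example URLs are placeholders, not something taken from the post.

```python
import uuid


def uuid_for_url(url: str) -> uuid.UUID:
    """Derive a name-based (version 5) UUID from a URL (hypothetical helper)."""
    # NAMESPACE_URL is the namespace UUID predefined for URL names.
    return uuid.uuid5(uuid.NAMESPACE_URL, url)


first = uuid_for_url("https://example.org/")
second = uuid_for_url("https://example.org/")
other = uuid_for_url("https://example.org/other")

# The same input always yields the same identifier; a different input
# yields a different one.
print(first == second, first == other, first.version)  # prints "True False 5"
```

Anyone who knows the namespace and the name can recompute the same identifier, which is what removes the need for a registry.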

July 10, 2021

HTTPS redirection in Apache HTTP Server, properly

HTTP is the standard text-based protocol for retrieving documents from web servers, but, like most network protocols, it has the unfortunate issue of using a transport method that is insecure. The packets that correspond to an HTTP message may be observed by routers before they reach their destination, and the whole message may be reassembled and inspected by malicious devices. Thus plain HTTP is commonly wrapped in TLS, resulting in HTTPS, wherein messages are encrypted and can be decrypted only by the actual endpoints.

HTTPS is not absolutely necessary, however. As a user, you don't need it when browsing static content, unless you want to keep anyone along the way from learning what browser you use and what content you view, or you want to verify the identity of the server. HTTPS is good to have, as there will always be users who need it, but it shouldn't be forced upon everyone else, because it does bring some inconveniences. The server has to present a valid certificate, it has to be signed, it must not have expired, and it is always possible that the version of TLS in use becomes obsolete. When any of these conditions fails, web clients start warning the user or stop the website from being accessed at all.

This is where Upgrade-Insecure-Requests steps in. It is a header sent by all modern browsers, informing the server that the client wishes to use HTTPS whenever possible. It can be turned off, however, and most other tools (command-line programs or libraries in programming languages) do not send it by default, so you remain free to choose the protocol yourself.
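
The decision this header enables can be sketched in a few lines of Python. This is not Apache configuration, only an illustration of the logic under the assumption that a request is redirected to HTTPS solely when the client sent Upgrade-Insecure-Requests: 1; the function name and parameters are hypothetical.

```python
from typing import Mapping, Optional


def https_redirect_target(headers: Mapping[str, str], host: str, path: str) -> Optional[str]:
    """Return the HTTPS URL to redirect to, or None to serve plain HTTP."""
    # Header names are case-insensitive, so normalise them before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    if lowered.get("upgrade-insecure-requests", "").strip() == "1":
        return f"https://{host}{path}"
    # The client did not ask for an upgrade: leave the request alone.
    return None


browser = {"Upgrade-Insecure-Requests": "1"}
plain_tool = {"User-Agent": "example-tool"}
print(https_redirect_target(browser, "example.org", "/index.html"))
print(https_redirect_target(plain_tool, "example.org", "/index.html"))
```

A server that behaves this way should also send Vary: Upgrade-Insecure-Requests with its responses, so that caches do not hand the redirect to clients that never asked for it.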