July 26, 2022

Thought on the Future of Computers

I am going to make a prediction about the future of computing ‒ not about some distant future, but about what might plausibly arrive within a few decades. It is also not a particularly creative prediction, because the process that might eventually lead to it has already started. Still, it is interesting to imagine, so here it is:

Nowadays, the core of a computer is the CPU, the unit which runs the programs that you execute, performs arithmetic operations, accesses memory, and so on. I envision the future of processing to be separated into four distinct units, perhaps bundled into a single chip, perhaps each provided by a dedicated card. Ideally, these units could be accessed and operated through a single instruction set, allowing programmers to combine them into complex and efficient programs. The units would be:

  • The logic unit (LU). This is more or less an evolution of the CPU, still just as minimal a core for running basic programs as it is today. Its purpose is to offer implementations of arithmetic and logic operations, with predictable and consistent results. The parts of this "unit" nowadays are the ALU (arithmetic logic unit) and the FPU (floating-point unit), which I imagine will be integrated even more tightly into the CPU in the future. The support for vectorization is also going to increase, and full support for other "datatypes" like strings might be added.
  • The analog unit (AU). In contrast to the LU, the results of this unit are unpredictable and inconsistent, but highly efficient at the same time. With analog operations, one could perform complex arithmetic tasks like matrix multiplication, solving equations, or evaluating complicated functions which would otherwise take many operations and CPU cycles. By chaining basic operations, one could build a "program" operating on currents and voltages, and let the universe itself execute it. While the results may be imprecise, that doesn't matter at all in some applications like video playback, and the results could still be made more precise through the LU, with the AU's output serving as an initial imprecise but sufficiently close guess.
  • The highly-parallelized unit (HPU). This is an evolution of the GPU, able to perform operations on large-scale data, organized in a discrete vector space. It is suitable for the same tasks as today, like graphics processing or AI, thanks to its ability to run a single program in parallel (such as a texture mapping function, or a single layer in a neural network) over an immense number of spatially-connected inputs.
  • The quantum unit (QU). The future of quantum computing is not going to replace every computer with a qubit-based one, but once it becomes feasible, quantum units would provide everything needed to solve problems in ways that are simply impossible with traditional algorithms. The qubit exists in a superposition of 0 and 1, the values of a traditional bit, manipulated using quantum extensions of standard logic operations, and even some quantum-specific operations. A computer performing such operations would in a way be present in exponentially many different states simultaneously. We can then arrange the computation so that the undesired states interfere destructively, amplifying the desired one until it dominates at the point of measurement. Algorithms taking advantage of this fact have uses in cryptography, such as Shor's algorithm for factoring large integers (and breaking some ciphers as a result), but some general computing problems could be "miraculously" solved using quantum computing as well. Grover's algorithm, for instance, can search an unstructured collection in time proportional to the square root of its size, instead of linearly proportional to the size as one would expect ‒ and this quadratic speedup is known to be optimal for unstructured search.
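The LU-refines-AU idea from the analog unit above can be sketched in ordinary software today: treat an imprecise but cheap value as the starting guess for an exact iterative method. In this sketch the "analog" result is simulated by deliberately perturbing the true square root; the perturbation and step count are illustrative assumptions, not part of any real hardware interface:

```python
import math

def analog_sqrt(x):
    # stand-in for an imprecise "analog" result:
    # correct only to about one decimal place
    return round(math.sqrt(x), 1) + 0.04

def refine_sqrt(x, guess, steps=3):
    # exact "LU" refinement: Newton's method converges quadratically,
    # so a few steps from a rough guess yield full precision
    for _ in range(steps):
        guess = 0.5 * (guess + x / guess)
    return guess

rough = analog_sqrt(2.0)         # fast, imprecise answer
exact = refine_sqrt(2.0, rough)  # a few cheap digital steps fix it
```

Because the convergence is quadratic, the number of correct digits roughly doubles with each step, so even a crude analog guess is quickly polished to machine precision.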

These are the possibilities that I think would be most likely, but there are others. Further units could be envisioned, for example a neural unit better suited for modelling neural networks (but I think this work can be spread over the HPU and AU), or a hypercomputational unit which would, theoretically, be able to solve uncomputable problems. At the moment, though, the latter is more of a fantasy than a possibility, as we know of no physical process that would enable us to build such a unit (unlike quantum computing).
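Grover's square-root speedup mentioned above can even be simulated classically (at exponential cost, which is the whole point). A minimal sketch, assuming an oracle that merely flips the sign of the target item's amplitude:

```python
import math

def grover_probabilities(n_items, target):
    # start in the uniform superposition over all items
    amp = [1 / math.sqrt(n_items)] * n_items
    # the optimal iteration count grows with the square root of n_items
    for _ in range(round(math.pi / 4 * math.sqrt(n_items))):
        amp[target] = -amp[target]         # oracle: phase flip on the target
        mean = sum(amp) / n_items
        amp = [2 * m - a for m, a in ((mean, a) for a in amp)]  # diffusion:
        # inversion about the mean amplitude
    return [a * a for a in amp]            # measurement probabilities

probs = grover_probabilities(16, target=3)
# after round(pi/4 * 4) = 3 iterations, item 3 dominates the measurement
```

With 16 items, only 3 iterations push the target's measurement probability above 95%, whereas a classical search would inspect 8 items on average.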

June 14, 2022

RDF-star and prototypes

Updating the RDF specification to create a new version is not an easy task. While getting new features added in is (relatively) not that hard, the cloud of related specifications, such as SPARQL, SHACL or OWL, is pretty large, and the whole ecosystem of applications and databases made to work with it is even larger. As such, getting new features actually out in the field is definitely not trivial, and some of the issues that may arise as part of that are still apparent today ‒ RDF 1.1 came out in 2014, while the latest version of SPARQL is still from 2013, and as a result, plain literals in SPARQL are still not quite the same as string literals, even though RDF 1.1 doesn't make the distinction anymore.
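The literal mismatch can be shown concretely. In RDF 1.1 the following two Turtle statements express the very same triple, while a pre-1.1 SPARQL engine may still treat the two literals as distinct terms (the prefix and names here are illustrative):

```turtle
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:  <http://example.org/> .

ex:book ex:title "RDF 1.1 Primer" .               # plain literal
ex:book ex:title "RDF 1.1 Primer"^^xsd:string .   # same literal in RDF 1.1
```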

A different approach is to develop a new version sideways, i.e. to create a variant of RDF with just your own favourite new feature. This is similar to how XML is defined, as a set of specifications whereby the core XML specification, to the surprise of many, doesn't actually define anything related to namespaces or datatypes. The RDF community did a similar thing and created RDF-star, which is why this flew under my radar for so long.
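For the record, the feature in question: RDF-star extends the triple syntax so that a triple itself can appear as the subject or object of another triple, which makes statements about statements direct instead of requiring reification. A minimal Turtle-star sketch (the vocabulary is illustrative):

```turtle
@prefix ex: <http://example.org/> .

<< ex:alice ex:knows ex:bob >> ex:statedBy  ex:carol ;
                               ex:certainty 0.9 .
```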

April 1, 2022

The Glory of Semi-Structured Data

A not-so-recent trend in information storage has been the increasing use of an interesting blend of structured and unstructured data. Such a style of expressing information is convenient for users, and with the advent of neural networks and prompt-based programming, also manageable for automatic processing.

March 12, 2022

Delphi form data (TPF0) binary format

Delphi uses a Pascal-like syntax for .dfm files which are meant to store data about forms and various controls, but also other resources and miscellaneous data. These files are compiled to a special binary format and then used to construct and initialize the forms when the application starts.

I wanted to parse these files to retrieve some of the more useful data therefrom, but I wasn't able to find any documentation about the format, so I analyzed it myself so that you don't have to. What follows is therefore a description of the format and the data model it stores. Some basic knowledge of the principles of data storage is expected.
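Before the full description, a small sketch of the very start of the format as I reconstructed it: a binary .dfm begins with the four-byte signature TPF0, followed by Delphi ShortStrings (one length byte, then that many bytes) holding the root object's class name and instance name. The blob in the example is synthetic, not taken from a real file:

```python
import io

def read_shortstring(f):
    # Delphi ShortString: one length byte followed by that many bytes
    length = f.read(1)[0]
    return f.read(length).decode("latin-1")

def read_header(data):
    f = io.BytesIO(data)
    if f.read(4) != b"TPF0":
        raise ValueError("not a binary DFM stream")
    class_name = read_shortstring(f)
    object_name = read_shortstring(f)
    return class_name, object_name

# synthetic example: class "TForm", instance "Form1"
blob = b"TPF0" + b"\x05TForm" + b"\x05Form1"
print(read_header(blob))
```

The property list that follows the header is where the interesting data lives; that part is covered by the description below.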

February 14, 2022

The Models of CONCURrent SGML

Nowadays, the old glory of SGML is almost forgotten, as it got replaced by XML in all but a handful of locations, but the language itself still has some very impressive capabilities that, I dare say, might yet make it a viable option for some very specific cases in the future. One of those features is "CONCUR", a.k.a. the original namespaces. This comparison is inaccurate of course, so I will first take a few paragraphs to describe what this feature is, and how it relates to the whole SGML ecosystem.

December 25, 2021

Identifiers for some file formats

I managed to find a gold mine when it comes to URIs and identifiers: the DocBook notations module. DocBook is an older SGML/XML-based standard, developed by OASIS, used for writing documentation, and with its DTD comes a module that “imports” several then-common notations for use with unparsed entities.

December 21, 2021

What Flavour is Your Function?

One programming article worth reading is definitely What Color is Your Function? by Bob Nystrom. It is a good old rant about the design of several programming languages, one that separates them into two groups, good languages and bad languages, depending on the presence of a certain feature.

The gist of the article is a description of a feature called “colored functions”, followed by an analysis of languages depending on whether they have some mechanism akin to it. In the article, a callback mechanism for asynchronous programming is basically the only source of these colored functions, but I believe other similar mechanisms are worth pointing out.
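Python's async/await is a textbook instance of the coloring Nystrom describes, and a small sketch shows the seam between the two worlds (the function names here are made up for illustration):

```python
import asyncio

async def fetch_value():
    # a "red" (async) function: its result is only reachable via await
    await asyncio.sleep(0)
    return 42

def blocking_caller():
    # a "blue" (sync) function cannot await; merely calling fetch_value()
    # produces a coroutine object, so crossing the color boundary
    # requires spinning up an event loop
    return asyncio.run(fetch_value())

print(blocking_caller())
```

The asymmetry is exactly the article's point: red functions may call blue ones freely, but a blue function can only reach a red one through heavyweight machinery like `asyncio.run`.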