March 14, 2021

URI scheme for identifying linked data entities by their identifier

It is good practice in RDF to identify entities with a URI scheme that lets one eventually arrive at a description of the identified entity. However, individual URI patterns carry no semantics of their own, and there is no uniform standard for identifying an entity within a particular dataset by its identifier. This is an attempt to devise one. The resulting URI should behave like a "link" that connects to the dataset and finds the actual URIs used for the entity. In theory, it could even be used directly to identify the entity itself.

Identifying resources works by traversing inverse functional properties. Such a property (an instance of owl:InverseFunctionalProperty) behaves like a function from its range to its domain: each value belongs to at most one resource. A particular value of the property thus serves as a primary key for the resource that has it, and the pair of property and value uniquely identifies that resource.
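To make the lookup concrete, here is a minimal sketch in plain Python, not tied to any RDF library. The triples, the example.org URIs and the resolve helper are all made up for illustration; foaf:mbox is used because FOAF declares it as an inverse functional property.

```python
# Toy triple store: (subject, predicate, object) tuples. All data is hypothetical.
FOAF_MBOX = "http://xmlns.com/foaf/0.1/mbox"  # an owl:InverseFunctionalProperty in FOAF
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"  # an ordinary property, not inverse functional

TRIPLES = [
    ("http://example.org/people/alice", FOAF_MBOX, "mailto:alice@example.org"),
    ("http://example.org/people/alice", FOAF_NAME, "Alice"),
    ("http://example.org/people/bob", FOAF_MBOX, "mailto:bob@example.org"),
]


def resolve(triples, prop, value):
    """Return the set of subjects that have `value` for the property `prop`.

    If `prop` is inverse functional, every subject in the result denotes
    the same entity, so the (prop, value) pair acts like a primary key.
    """
    return {s for s, p, o in triples if p == prop and o == value}


found = resolve(TRIPLES, FOAF_MBOX, "mailto:alice@example.org")
print(found)
```

If the lookup returned more than one subject for an inverse functional property, OWL semantics would treat those URIs as names for the same entity rather than as a conflict.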

March 13, 2021

More about datatypes in RDF

Datatypes are arguably among the most confusing and underutilized tools in RDF. To understand datatypes, we first need to understand literals.

Literals

While RDF and XML are very different in many aspects, they share some core concepts. XML at its core doesn't really have standard datatypes like you'd find in many programming languages and other data languages like JSON. Instead, you have text (or character data if you will) which might get a specific meaning via other facilities, but to an external observer, everything is only text (or whitespace if you want to go into details).

RDF is very similar to XML in that a literal is simply a piece of (character) data. Unlike in XML, however, it is also possible to assign a datatype to the literal. The notion of a plain (untyped) literal changed somewhat in RDF 1.1, which made xsd:string the implicit datatype (more on that later). Specific serialization formats may define syntax for other common typed literals, such as numbers or booleans, but all of them are still backed by text. Thus a literal is simply a piece of text, optionally with something that identifies its datatype (ignoring language-tagged literals for now).
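The idea that every literal is text plus a datatype can be sketched in a few lines of Python. The Literal class and the from_turtle_shorthand helper below are hypothetical names invented for this illustration, not part of any library, and the helper covers only two of the shorthand forms that Turtle defines (integers and booleans).

```python
import re
from dataclasses import dataclass

XSD = "http://www.w3.org/2001/XMLSchema#"


@dataclass(frozen=True)
class Literal:
    """A literal is a lexical form (text) paired with a datatype URI."""
    lexical: str
    # RDF 1.1: a literal written without a datatype is an xsd:string.
    datatype: str = XSD + "string"


def from_turtle_shorthand(token: str) -> Literal:
    """Expand a bare Turtle token into the typed literal it stands for.

    Only integers and booleans are handled in this sketch.
    """
    if token in ("true", "false"):
        return Literal(token, XSD + "boolean")
    if re.fullmatch(r"[+-]?[0-9]+", token):
        return Literal(token, XSD + "integer")
    raise ValueError(f"unsupported shorthand: {token!r}")


plain = Literal("hello")
number = from_turtle_shorthand("42")
print(plain)
print(number)
```

Note that the shorthand 42 still ends up as the text "42" with a datatype attached; the serialization only saves typing and does not introduce a native number type.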