February 22, 2021

Dynamic Self-contained Infinite XML

Imagine a window showing some sort of feed. There are multiple panels displaying various messages as they arrive, plus some boxes that show the status of other things, or the time, for example. A system like this is usually built from calls to different APIs, various types of messages, a presentation layer, and so on. What if I told you, however, that the same feat could be pulled off with nothing but standardized formats, no client-side scripting, and a single XML document delivered in a single HTTP request?

Overview

We're gonna use HTTP as the method for producing the resulting document and its resources, MIME for describing the various dynamic parts of the document, and XML for combining them all together. Presenting the XML document to the end user (with XSLT, for instance) is outside the scope of this article.

The technique relies on two standard technologies: multipart messages and external XML entities. Both are well described and standardized, so in theory any tool that fully implements them should be able to process the whole document. In practice, such a tool will most likely be hard to come by, as this is a combination of two advanced aspects of their respective technologies.

Infinite entities

An XML document may include other XML fragments through external parsed entities:

<?xml version="1.0"?>
<!DOCTYPE form [  
  <!ENTITY e1 SYSTEM "http://example.org/entity1.ent">
]>
<form>&e1;</form>

This document references and includes a file located at http://example.org/entity1.ent (which should be served as application/xml-external-parsed-entity). Usually, when a parser finds a reference to an external parsed entity, it downloads the resource and splices its content into the document in place of the reference.
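To see the entity mechanism at work without touching the network, here is a minimal Python sketch using the standard xml.sax module. The entity content, the ENTITIES table and the class names are made up for illustration; external general entities are switched off by default in Python's SAX parser, so the sketch enables them explicitly and answers the lookup from memory.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, EntityResolver, feature_external_ges
from xml.sax.xmlreader import InputSource

DOCUMENT = b"""<?xml version="1.0"?>
<!DOCTYPE form [
  <!ENTITY e1 SYSTEM "http://example.org/entity1.ent">
]>
<form>&e1;</form>"""

# Hypothetical entity content, served from memory instead of the network.
ENTITIES = {"http://example.org/entity1.ent": b"<input name='user'/>"}


class InMemoryResolver(EntityResolver):
    """Answers external entity lookups from the ENTITIES table."""

    def resolveEntity(self, publicId, systemId):
        source = InputSource(systemId)
        source.setByteStream(io.BytesIO(ENTITIES[systemId]))
        return source


class TagRecorder(ContentHandler):
    """Records element names in the order the parser reports them."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def startElement(self, name, attrs):
        self.tags.append(name)


def parse_with_entities(document):
    recorder = TagRecorder()
    parser = xml.sax.make_parser()
    # External general entities are disabled by default; opt in explicitly.
    parser.setFeature(feature_external_ges, True)
    parser.setContentHandler(recorder)
    parser.setEntityResolver(InMemoryResolver())
    parser.parse(io.BytesIO(document))
    return recorder.tags


tags = parse_with_entities(DOCUMENT)
```

The recorder sees the elements of the entity as if they had been written directly inside <form>, which is exactly the seamless inclusion described above.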

This is enough to describe an infinite XML document: if the server never terminates the content of the external entity, the content of <form> never ends either, and the element keeps growing in size. Here is also the first caveat: a SAX-based XML parser will never reach </form>, so from its point of view the document is never complete and anything after the entity reference is lost. Processing such a document therefore calls for a DOM or XPath-based model, where the content can be loaded lazily when a visitor reaches the external entity reference.
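The never-ending entity itself can still be consumed piece by piece. The following Python sketch, offered only as an illustration, feeds chunks of an entity stream into the standard library's XMLPullParser and hands out each top-level element as soon as its end tag arrives. The synthetic entity-stream root element and the function names are assumptions of the sketch (an external parsed entity has no single root of its own), and the chunks stand in for data arriving over HTTP.

```python
from xml.etree.ElementTree import XMLPullParser


def iter_entity_elements(chunks):
    """Yield each top-level element of an entity stream once it is complete."""
    parser = XMLPullParser(events=("start", "end"))
    # A fragment may contain many top-level elements, so wrap the stream in a
    # synthetic root that is opened here and never closed.
    parser.feed("<entity-stream>")
    root = None
    depth = 0
    for chunk in chunks:
        parser.feed(chunk)
        for event, elem in parser.read_events():
            if event == "start":
                depth += 1
                if root is None:
                    root = elem  # the synthetic root
            else:
                depth -= 1
                if depth == 1:  # a direct child of the root just closed
                    yield elem
                    root.clear()  # forget delivered children to bound memory


def demo_chunks():
    # Chunk borders deliberately fall in the middle of an element.
    yield "<msg>first</msg><msg>sec"
    yield "ond</msg><msg>third</msg>"


messages = [elem.text for elem in iter_entity_elements(demo_chunks())]
```

Because the generator only advances when the caller asks for the next element, the same loop works whether the source ends after three messages or never ends at all.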

Dynamic entities

While infinite entities were in essence a natural extension of the XML specification, dynamic (i.e. changing) entities require the application to understand MIME. In this case, the server produces a stream of files wrapped inside multipart/x-mixed-replace:

Content-Type: multipart/x-mixed-replace; boundary="boundary" 

--boundary 
Content-Type: application/xml-external-parsed-entity 

<time>16:35</time>
--boundary 
Content-Type: application/xml-external-parsed-entity 

<time>16:36</time>
--boundary 
Content-Type: application/xml-external-parsed-entity 

<time>16:37</time>
--boundary
and so on (a closing --boundary-- would end the stream)

This content type denotes a stream of files in which each newly arrived part discards and replaces the previous one. An application that understands this could, in theory, rebuild the document every time a new version of the entity arrives.
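As a rough illustration of the client side, the Python sketch below splits a multipart/x-mixed-replace body into parts as the bytes arrive and keeps only the most recent one. It is a simplification under stated assumptions: lines end in CRLF, every part has at least one header, and a part counts as complete only once the next delimiter has been seen. The function names and the sample stream are invented for the example.

```python
def iter_replacement_parts(chunks, boundary):
    """Yield (headers, body) for every completed part of a multipart stream."""
    delimiter = b"\r\n--" + boundary.encode("ascii")
    buffer = b"\r\n"  # lets the very first delimiter match like all the others
    seen_first = False
    for chunk in chunks:
        buffer += chunk
        while True:
            index = buffer.find(delimiter)
            end = index + len(delimiter)
            # Two more bytes are needed to tell "--boundary" from "--boundary--".
            if index < 0 or len(buffer) < end + 2:
                break
            segment = buffer[:index]
            closing = buffer[end:end + 2] == b"--"
            buffer = buffer[end:]
            if seen_first:
                yield split_part(segment)
            seen_first = True  # whatever preceded the first delimiter is preamble
            if closing:
                return


def split_part(segment):
    # Drop the rest of the delimiter line (transport padding), then split the
    # header block from the body at the first empty line.
    _, _, rest = segment.partition(b"\r\n")
    head, _, body = rest.partition(b"\r\n\r\n")
    headers = {}
    for line in head.decode("ascii").split("\r\n"):
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers, body


# Sample stream; the chunk border falls in the middle of a delimiter.
STREAM = [
    b"--boundary\r\nContent-Type: application/xml-external-parsed-entity\r\n"
    b"\r\n<time>16:35</time>\r\n--bou",
    b"ndary\r\nContent-Type: application/xml-external-parsed-entity\r\n"
    b"\r\n<time>16:36</time>\r\n--boundary--\r\n",
]

latest = None
for headers, body in iter_replacement_parts(STREAM, "boundary"):
    latest = body  # each new part discards and replaces the previous one
```

A real client would re-render the affected part of the document inside the loop instead of just overwriting a variable.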

This technique is not without its issues either: a simple parser may not understand the multipart type and could either reject the content or try to read the whole message as a single XML entity. This can be worked around by placing malformed XML before the first boundary in an attempt to stop the parser, or by using the initial and final parts of the message (the text before the first boundary and after the last one) to produce a marker element for further processing.

Putting it all together

This already does the trick if the server is configured to produce these files from different locations, but it is also possible to wrap this in a single response.

For this purpose, multipart/related is used to denote that the individual files are interlinked. The first object should be the whole XML document (as application/xml).

Normally, any additional files should follow, but since we already have two potentially infinite streams, we have to break them into smaller parts and send them interspersed in the response stream.

The message/partial type is used to break up a long file into individual messages. It is up to the server to decide the length of the segments and their order or priority.

The individual files would then start with a Content-ID header, assigning each of them a unique identifier within the message. The cid: URI scheme is then used in the definitions of the external XML entities to refer to these identifiers.
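The link between a cid: URI and a Content-ID header can be sketched in a few lines of Python with the standard email package. This is only an illustration under simplifying assumptions: the message is finite and fully received, no message/partial reassembly is attempted, and the sample message, the clock@example.org identifier and the function names are all invented.

```python
import email
from urllib.parse import unquote

# A small, finite multipart/related message: the XML document comes first,
# followed by the entity it refers to through a cid: URI.
RAW_MESSAGE = (
    b'Content-Type: multipart/related; boundary="parts"; type="application/xml"\r\n'
    b"\r\n"
    b"--parts\r\n"
    b"Content-Type: application/xml\r\n"
    b"\r\n"
    b'<!DOCTYPE feed [<!ENTITY clock SYSTEM "cid:clock@example.org">]>'
    b"<feed>&clock;</feed>\r\n"
    b"--parts\r\n"
    b"Content-Type: application/xml-external-parsed-entity\r\n"
    b"Content-ID: <clock@example.org>\r\n"
    b"\r\n"
    b"<time>16:35</time>\r\n"
    b"--parts--\r\n"
)


def index_by_content_id(raw_message):
    """Map each Content-ID (without angle brackets) to the bytes of its part."""
    message = email.message_from_bytes(raw_message)
    parts = {}
    for part in message.walk():
        content_id = part.get("Content-ID")
        if content_id:
            parts[content_id.strip().strip("<>")] = part.get_payload(decode=True)
    return parts


def resolve_cid(uri, parts):
    """Return the content of the part a cid: URI points to."""
    scheme, _, value = uri.partition(":")
    if scheme.lower() != "cid":
        raise ValueError("not a cid: URI: " + uri)
    return parts[unquote(value)]


parts_by_id = index_by_content_id(RAW_MESSAGE)
clock_entity = resolve_cid("cid:clock@example.org", parts_by_id)
```

An entity resolver plugged into the XML parser could call resolve_cid for every system identifier that starts with cid: and hand the result back as the entity's content.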

All in all, the client should be able to handle the following:

  • Parse the XML document into a lazy structure; do not expand external entities up front.
  • Load the whole MIME message and produce "virtual" streams of the individual objects (referred to via cid).
  • Read the content streams and buffer data that has not been processed yet. Keep only the last fully constructed object in the case of multipart/x-mixed-replace.
  • Navigating to an external entity reference shall either jump into its current value (but keep the reference) for multipart/x-mixed-replace, or start parsing the external XML fragment in its corresponding stream. New nodes should be inserted before the external entity reference, and further visits to it will continue parsing its stream.
  • Multiple references to the same entity have to read from the same stream, so either a buffer has to be present for every occurrence of the entity reference, or all locations have to be tracked and updated at once.
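The last requirement can be illustrated with a small Python sketch. It takes a slightly different route from the two options in the list: one shared buffer per entity, with every reference keeping its own read position. The SharedEntityStream class and its cursor method are hypothetical names, and plain strings stand in for parsed nodes.

```python
class SharedEntityStream:
    """Lets several entity references read the same underlying stream."""

    def __init__(self, source):
        self._source = iter(source)  # nodes parsed from the entity's stream
        self._items = []             # everything read from the source so far

    def cursor(self):
        """Return an independent iterator that starts at the first node."""
        position = 0
        while True:
            if position == len(self._items):
                # This cursor is at the front: pull one more node from the
                # source so that every other cursor can see it later as well.
                try:
                    self._items.append(next(self._source))
                except StopIteration:
                    return
            yield self._items[position]
            position += 1


stream = SharedEntityStream(["<msg>1</msg>", "<msg>2</msg>", "<msg>3</msg>"])
first_reference = stream.cursor()
second_reference = stream.cursor()

seen_by_first = [next(first_reference), next(first_reference)]
seen_by_second = next(second_reference)
```

The buffer grows for as long as the stream lives, so a real implementation would need a policy for dropping nodes that every reference has already consumed.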

Note that even the actions described here as modifying can be performed on a read-only representation of a document. Specifically in the case of .NET, it is possible to use the XPathNavigator to present a view of the full document but with parts dynamically constructed from other sources during navigation. It is also possible to lazily resolve the entities during parsing, with a custom XmlResolver that simply produces a custom processing instruction for every external entity reference.
