An Identifier by any Name

Gerry Tyra - April 2015

Abstract:

When any message is transmitted, there has to be some mechanism to tell the receiver what type of message is coming in. Without it, the message is just digital noise. However, the mechanism does not need to be immediately readable by a person.

Introduction:

A newly received message must be identified before the contents can be correctly interpreted. There are many ways to make this determination, some being more flexible than others.

Rigid, Flexible, Extensible, or Brute Force:

First, let's address and then dismiss sorting by time and space. It is possible to pair each message type with a network port number. But there are only so many ports available, and the mapping must be rigorously maintained. Similarly, the data stream can be time-division multiplexed, with each message type assigned its own time slot, but this has the same management issues. Both methods have been used and are viable where only a limited number of messages are planned. However, as the projected complexity of the messaging system increases, explicitly designating the message type in some sort of header becomes the preferred approach.

Part of the issue is how much data to put into the header. A bloated header can become a major limiting factor in the system, consuming more network resources than the message it describes.

Another consideration is the number of attributes that the message identifier needs to differentiate. Based on previous discussions, the minimum set is a message identifier, a derivation (the derived message version) and a revision number. As a matter of general management, a group designation is also useful to differentiate between messages with fundamentally different applications.

Bits and Bytes:

A field of n bits can uniquely identify at most 2^n messages. As a general rule, a header will have a fixed size and interpretation, and its representation will usually be word aligned. Failing to word align increases the cost of processing each header received. At the same time, if the header is too long, there is a network bandwidth cost associated with transferring the messages.

If a byte is used as the basic element, the group, message ID, derivation and revision will require 4 bytes. But the number of derivations and revisions may never approach the 256 values that a one-byte designator allows. Yet there may be well over 256 message types, with or without a group designation.
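As a minimal sketch of this byte-per-field layout, the fragment below packs the four values into a 4-byte header. Network byte order and the names `pack_header` and `unpack_header` are assumptions made for illustration; they are not taken from the prototype code.

```python
import struct

# Four unsigned bytes: group, message ID, derivation, revision.
# "!" selects network byte order with no padding between fields.
HEADER_FORMAT = "!BBBB"


def pack_header(group, message, derivation, revision):
    """Pack four one-byte fields into a 4-byte header."""
    return struct.pack(HEADER_FORMAT, group, message, derivation, revision)


def unpack_header(data):
    """Return (group, message, derivation, revision) from the first 4 bytes."""
    return struct.unpack(HEADER_FORMAT, data[:4])
```

Any field value above 255 makes `struct.pack` raise `struct.error`, which is exactly the one-byte limit discussed above.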

Sparse Population:

Four bytes would result in over four billion possible message definitions. The reality, even for a mature system, would more likely be in the thousands. The result is a sparse matrix of usable messages. In this case, a vector indexed directly by the identifier wastes space, while direct enumeration is extremely cumbersome and difficult to maintain.
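To put rough numbers on the sparsity, the sketch below compares a dense table with one slot per possible 32-bit value against a map that holds only the defined messages. The figure of 5,000 defined messages and the 8 bytes per slot are illustrative assumptions, not measurements.

```python
ID_BITS = 32
POSSIBLE_IDS = 2 ** ID_BITS      # 4,294,967,296 possible identifiers
DEFINED_MESSAGES = 5_000         # assumed size of a mature message set
BYTES_PER_SLOT = 8               # assumed: one pointer-sized slot per entry

# A dense table needs a slot for every possible identifier, used or not.
dense_table_bytes = POSSIBLE_IDS * BYTES_PER_SLOT

# Fraction of the identifier space that is actually in use.
occupancy = DEFINED_MESSAGES / POSSIBLE_IDS

# A map stores only the identifiers that are actually defined.
registry = {}
registry[0x01020304] = "example message definition"  # hypothetical entry
```

Under these assumptions the dense table would need 32 GiB for an occupancy of roughly one slot in a million, while the map grows only with the number of defined messages.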

The alternative is to apply maps to the numbering system. A number of maps were already used in the prototype code as a matter of practical coding. Using maps allows a much broader range of identifier values at only a small organizational cost.

As a simplification, consider using an unsigned short (16 bits) for each of the group, message, derivation and revision, for a total of 64 bits. First, the group and message fields together provide over 4 billion base message identifiers. While this isn't infinite, compared to the JTIDS message set for Link-16 it might as well be. And while it is unlikely that any message definition will survive long enough to go through 65,536 derivations or revisions, having the greater range simplifies the issue of rollover in the value.
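One way to picture the four 16-bit fields is as a single 64-bit integer built by shifting and OR-ing. The sketch below assumes the field order group, message, derivation, revision from most to least significant; that order, and the names `make_key` and `split_key`, are illustrative choices only.

```python
FIELD_BITS = 16
FIELD_MAX = (1 << FIELD_BITS) - 1  # 65,535, the largest unsigned short


def make_key(group, message, derivation, revision):
    """Concatenate four 16-bit fields into one 64-bit integer key."""
    for value in (group, message, derivation, revision):
        if not 0 <= value <= FIELD_MAX:
            raise ValueError("field does not fit in 16 bits")
    return (group << 48) | (message << 32) | (derivation << 16) | revision


def split_key(key):
    """Recover (group, message, derivation, revision) from a 64-bit key."""
    return tuple((key >> shift) & FIELD_MAX for shift in (48, 32, 16, 0))
```

For example, `make_key(1, 2, 3, 4)` yields `0x0001000200030004`, and `split_key` reverses the operation.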

But let us approach the problem from a different direction: use a 32-bit integer as the main ID, and map each active value to a quadruplet of values representing the group, message, derivation and revision.
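A minimal sketch of that mapping follows. The main IDs, the field values and the names `MessageIdentity`, `MAIN_ID_MAP` and `identify` are all made up for illustration.

```python
from typing import NamedTuple


class MessageIdentity(NamedTuple):
    """The quadruplet that a main ID stands for."""
    group: int
    message: int
    derivation: int
    revision: int


# Only active main IDs appear in the map; every value here is hypothetical.
MAIN_ID_MAP = {
    1001: MessageIdentity(group=1, message=7, derivation=0, revision=2),
    1002: MessageIdentity(group=1, message=7, derivation=1, revision=0),
    5000: MessageIdentity(group=3, message=1, derivation=0, revision=0),
}


def identify(main_id):
    """Return the quadruplet for a main ID, or None if it is unassigned."""
    return MAIN_ID_MAP.get(main_id)
```

The header carries only the 32-bit main ID, while the individual fields in the quadruplet are free to use whatever range is convenient.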

The Keepers of the Numbers:

In any messaging system, some accepted “authority” has to control the allocation of message numbers. After all, two different developers must not be able to create two different messages with the same ID. Such a conflict is not acceptable under any circumstances.

Given that there must be an assigning authority, it is simple enough for the authority to provide the main ID, as well as a mapping. It would also be possible to allow some level of delegation of this authority by giving blocks of main IDs to a secondary authority that is, for example, responsible for a specific message group.
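The delegation idea can be sketched as an allocator that hands out either single main IDs or whole blocks of them. The class name `Authority` and its methods are hypothetical, and a real authority would also need to record the mapping for each ID it issues.

```python
class Authority:
    """Hands out main IDs from a half-open range [start, stop)."""

    def __init__(self, start, stop):
        self._next = start
        self._stop = stop

    def allocate(self):
        """Return the next unused main ID in this authority's range."""
        if self._next >= self._stop:
            raise RuntimeError("ID range exhausted")
        main_id = self._next
        self._next += 1
        return main_id

    def delegate(self, count):
        """Carve off a block of `count` IDs for a secondary authority."""
        if self._next + count > self._stop:
            raise RuntimeError("not enough IDs left to delegate")
        block = Authority(self._next, self._next + count)
        self._next += count
        return block
```

With `root = Authority(0, 2**32)` and `nav = root.delegate(1000)`, the secondary authority issues IDs 0 through 999 and the next call to `root.allocate()` returns 1000. Because the blocks never overlap, no two authorities can issue the same ID.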

Summary:

Mapping functions are needed to manage sparse message numbering systems. This opens the possibility of using much larger ranges for some of the identifying values without having to resort to significantly larger IDs in the message headers. A map will go from a sparse integer to a target message class as easily as it will from concatenated values that result in a 64-bit number.
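The point can be illustrated with two tables that differ only in their keys. The message classes, key values and the `message_class` helper below are hypothetical.

```python
class StatusReport:
    """Hypothetical message class."""


class PositionUpdate:
    """Hypothetical message class."""


# Keyed by a sparse 32-bit main ID.
BY_MAIN_ID = {1001: StatusReport, 5000: PositionUpdate}

# Keyed by four 16-bit fields concatenated into a 64-bit integer.
BY_WIDE_KEY = {
    0x0001_0007_0000_0002: StatusReport,
    0x0003_0001_0000_0000: PositionUpdate,
}


def message_class(table, key):
    """Look up the class registered for a key; None if the key is unknown."""
    return table.get(key)
```

The lookup code is identical for both tables; only the width of the key carried in the header changes.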

Either way, the allocating authority is functionally the same, and the mapping is not any more complex.