Fallacy of Human Readability

Gerry Tyra - April 2015

Abstract:

People searching for greater productivity in software development, frequently people with little hands-on experience in the area they are controlling, push for improved human readability in the software and/or the messaging system. The argument presented here is that such an attempt is counterproductive and, in some senses, delusional.

Introduction:

If you are a maintenance programmer trying to trace some obscure bug that managed to get through acceptance testing back into the code, you definitely want access to the source code with all of the comments. Similarly, if you are tracking message traffic through a system, for test or debug, being able to discern the message's content is useful. What is questioned here is the usefulness of having a messaging system work in a format that is “readable” (e.g., XML).

It Isn't Even Ones and Zeros:

The first fallacy is that a “human readable” message is actually human readable. The message going down a wire is a series of electrical differentials, usually encoded to improve reliability. Once received and at “rest”, the data is a different series of electrical potentials. These electrical states are interpreted in context as representing some binary value. Even the stored states are not strictly binary: most flash memory devices store two or more bits per cell.

Once the binary value is available, a few people can understand the value in binary or hex format (a combination of practice and frequent use), but most people will need a further abstraction.

From Binary to Display:

For the moment, let us assume that we are looking at an XML message. We start with a string of bytes, which represent characters. The first question is whether ASCII has been used for the representation, or some other standard. Once this is known, the data has to be marshaled and organized for display or printing. Then the data is translated to pixels. Only at this point is a person able to interpret the data.
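
The encoding question can be illustrated with a minimal Python sketch. The four byte values are invented for the example; the point is only that the same bytes yield different "readable" results, or none at all, depending on the standard assumed by the reader.

```python
# Four arbitrary bytes (hypothetical data, chosen only for illustration).
raw = bytes([0x41, 0x4C, 0x54, 0xE9])

# The same bytes under two different single-byte character sets.
as_latin1 = raw.decode("latin-1")
as_cp437 = raw.decode("cp437")

# The same bytes read as one big-endian unsigned integer.
as_integer = int.from_bytes(raw, "big")

# Strict ASCII cannot represent the last byte at all.
try:
    raw.decode("ascii")
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

print(repr(as_latin1), repr(as_cp437), as_integer, ascii_failed)
```

The first three bytes read the same under both character sets, while the last one does not, and a strict ASCII reader rejects the message outright.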

Even if we step into the realm of science fiction, and a bit of science, a direct interface to a brain would still require several levels of interpretation and translation.

Who is Talking to Whom?:

By now, the reader has hopefully realized that direct human readability is an illusion. The rational question becomes how to most efficiently get data from the source to the “consumer”?

If we look at text on a web page, it is data prepared by a person, directly or indirectly, and is meant to be read by another person. Hence, the use of HTML or XML as a transport mechanism is reasonable. Similarly, the efforts to define a standard for digital text documents for long-term archival storage have real merit.

On the other hand, the picture that pops up as part of the web page that you are looking at is not transported as an ASCII string. Rather, it is sent as an HTML reference to a formatted picture file, which is stored and transferred as compressed binary.

Now consider a radar or barometric altimeter in an aircraft, assuming that it is digital at all. It produces a binary number representing the aircraft's altitude in some units. To levy a requirement at this point to provide an XML representation is both unnecessary and counterproductive. The translation from a simple binary value to an XML representation is extensive. A friend described it as two native English speakers on the phone talking in French and German so that a possible Russian listener would have an easier time understanding them.
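
A rough Python sketch of the two representations follows. The altitude value, the 32-bit field width, and the XML element and attribute names are all assumptions made for the example; no particular avionics message format is implied.

```python
import struct
import xml.etree.ElementTree as ET

altitude_ft = 35000  # hypothetical altimeter reading

# Binary form: one unsigned 32-bit integer in network byte order.
binary_msg = struct.pack("!I", altitude_ft)

# XML form: element and attribute names are invented for this sketch.
xml_msg = f'<altitude units="ft">{altitude_ft}</altitude>'.encode("ascii")

# Consumer side: recover the number from each form.
(binary_value,) = struct.unpack("!I", binary_msg)
xml_value = int(ET.fromstring(xml_msg).text)

print(len(binary_msg), "bytes as binary,", len(xml_msg), "bytes as XML")
```

Both consumers end up with the same integer. The XML path is several times larger on the wire and needs a parser plus a text-to-integer conversion to get back to the value the sensor started with.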

Once the data is available in a recognizable message format, the flight control processors want the value; human readability is irrelevant. Similarly, several other systems on the aircraft will need the numerical value. And, yes, one of those “systems” is the pilot. But the point here is that the pilot is only one of many, and the only one that has a real use for human readability.

Testing:

As a simple example, consider the readily available tool “Wireshark” for analyzing Ethernet packets. In its default configuration, it will present the various headers, with the ability to parse and sort by various aspects contained in the header.

It will also display the content of the data packet in both hexadecimal and ASCII. If the data payload of the message were in XML, and assuming that ASCII was used for the original encoding, the ASCII field would be directly readable. But, without any formatting, extracting understandable data will still be tedious.
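
The side-by-side hexadecimal and ASCII view can be approximated with a short Python sketch. This is a generic hex dump, not Wireshark's own code, and the XML payload is invented for the example.

```python
def hex_dump(payload: bytes, width: int = 16) -> str:
    """Render bytes as offset, hex column, and printable-ASCII column."""
    pad = width * 3 - 1  # width of a full hex column, spaces included
    lines = []
    for offset in range(0, len(payload), width):
        chunk = payload[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Non-printable bytes are shown as '.', as most dump tools do.
        ascii_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:04x}  {hex_part.ljust(pad)}  {ascii_part}")
    return "\n".join(lines)


# Hypothetical XML payload.
sample = b'<altitude units="ft">35000</altitude>'
print(hex_dump(sample))
```

The characters are all there in the right-hand column, but the tags are chopped across fixed-width rows, which is why reading a real message this way is tedious.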

However, Wireshark allows for plug-ins, which can enhance the interpretation of data packets. So, it is reasonable to expect that for a given project, someone would create a plug-in to make an XML message, based on a known schema, more readable.

However, if a “human readable” format needs a plug-in, and a compacted format can auto generate a comparable plug-in to expand a natively encoded message (see the accompanying paper “To C or Not to C? That is the Question”), what is the advantage of the human readable version?
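
The idea of generating a decoder from a message description can be sketched in Python. This is neither a Wireshark plug-in nor the method of the accompanying paper; the layout table, field names, and units are hypothetical, and the sketch only shows that one table can drive both the compact encoding and its expansion into labeled text.

```python
import struct

# Hypothetical message layout: (field name, struct format code, units).
ALTITUDE_REPORT = [
    ("source_id", "H", ""),
    ("altitude", "i", "ft"),
    ("climb_rate", "h", "ft/min"),
]


def make_decoder(layout):
    """Build a packer and a text decoder from a single layout table."""
    # "!" selects network byte order with no padding between fields.
    packer = struct.Struct("!" + "".join(code for _, code, _ in layout))

    def decode(payload: bytes) -> str:
        values = packer.unpack(payload)
        return "\n".join(
            f"{name}: {value} {units}".rstrip()
            for (name, _, units), value in zip(layout, values)
        )

    return packer, decode


packer, decode = make_decoder(ALTITUDE_REPORT)
message = packer.pack(7, 35000, -500)
print(len(message), "bytes on the wire")
print(decode(message))
```

The message stays compact in transit, and the labeled view is produced only when a person actually asks to look at it.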

Efficiency:

As has been explained in much greater detail in “An Alternative Approach to Avionics: KISS”, data transfer efficiency is of critical importance to the operation of an aircraft. And that data transfer has to occur in an organized manner, as the consumer of the data has to understand the encoding standards of the producer of the data.

An established set of standards and processes is important, for without it there can be no communication. At the same time, those standards must allow for timely growth and change. Without a mechanism for growth, the users of any standard will find or create other standards to meet their immediate operational needs.

Forcing data into a standard with little option for change or growth results in excessive complexity, as a result of compromises in the design (Link-16/STANAG 5516 is a case in point).

Summary:

Effective communication requires standards that can be implemented and that are conducive to change and extension. At the same time, such standards must embrace the reality of the environment in which they will be used.

As such, it is argued that the limited virtues of a human readable format do not justify its excessive costs.