Fractus Development: November 2010

Rarely in software development does a single technique boost both the productivity of the programmer and the efficiency of the implementation. One very generic issue that applies to any non-trivial program is serialization: how to represent an arbitrarily complex data structure as a series of bits which can be written to disk, a socket or another I/O system, and then reconstructed later. This is complicated by the common requirement that the source and sink of the serialized data differ, especially in network programming. Differences in endianness and implementation language present tricky challenges.

XML achieves the former goal, boosting the productivity of the programmer, through the myriad libraries available for nearly any popular language. These provide methods for manipulating documents as an object model (DOM - "Document Object Model") or simply as a stream of elements (SAX - "Simple API for XML"). Each "record" (an intentionally vague term) is intuitively serialized as an element and its children, allowing data to be represented as a tree (i.e., 1:N) or as a row (i.e., 1:1). As a simple example, consider an instance of a model for some CD (less the songs' musical data):

    <cd artist="Elliott Smith" title="Figure 8">
        <track title="Son of Sam" index="1" bitrate="160" duration="3:04" />
        <!-- the other tracks go here -->
    </cd>
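To make the DOM approach concrete, here is a minimal sketch that reads the CD document above with the XML parser bundled in the JDK. The class and method names (`CdDomExample`, `describe`) are made up for illustration and are not part of Fractus.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class CdDomExample {
    // The sample document from the text, with only the first track.
    static final String XML =
        "<cd artist=\"Elliott Smith\" title=\"Figure 8\">"
      + "<track title=\"Son of Sam\" index=\"1\" bitrate=\"160\" duration=\"3:04\" />"
      + "</cd>";

    /** Parses the document and returns "artist - album: first track title". */
    static String describe(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Element cd = doc.getDocumentElement();            // the <cd> root element
            NodeList tracks = cd.getElementsByTagName("track");
            Element first = (Element) tracks.item(0);         // the 1:N children
            return cd.getAttribute("artist") + " - " + cd.getAttribute("title")
                    + ": " + first.getAttribute("title");
        } catch (Exception e) {
            throw new IllegalStateException("could not parse the document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(XML)); // prints "Elliott Smith - Figure 8: Son of Sam"
    }
}
```

A handful of library calls is all it takes, which is exactly the productivity argument for XML.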

Before such an extensible (no pun intended) and widely relevant standard existed for serializing data, and especially before powerful commodity computers became cheap, programmers resorted to binary protocols that they designed themselves on a case-by-case basis. Using XML, a programmer ends up generating the text 'index="..."' for every record. Because a CD wouldn't contain more than 255 tracks, the index could just as easily be represented as a single byte at a predetermined offset from the start of the record. Yet in a UTF-16 encoded XML file, 'index="255"' requires 11*2 = 22 bytes. Not very efficient.
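The arithmetic is easy to check. The sketch below measures the attribute text in UTF-16 and compares it with a one-byte binary field; the class and method names are hypothetical and only serve the illustration.

```java
import java.nio.charset.StandardCharsets;

class IndexSizeExample {
    /** Size in bytes of the attribute text index="N" when stored as UTF-16 (no byte order mark). */
    static int textSize(int index) {
        String attribute = "index=\"" + index + "\"";
        return attribute.getBytes(StandardCharsets.UTF_16BE).length;
    }

    /** The same value as a hand-designed binary record might store it: one unsigned byte. */
    static byte[] binary(int index) {
        if (index < 0 || index > 255) {
            throw new IllegalArgumentException("index does not fit in one byte: " + index);
        }
        return new byte[] { (byte) index };
    }

    /** Reads the unsigned value back out of the one-byte field. */
    static int decode(byte[] field) {
        return field[0] & 0xFF;
    }

    public static void main(String[] args) {
        System.out.println("text: " + textSize(255) + " bytes, binary: "
                + binary(255).length + " byte"); // prints "text: 22 bytes, binary: 1 byte"
    }
}
```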

XML has other drawbacks. Who is to say that a certain piece of data which has a 1:1 relationship with its parent element should be an attribute and not a separate child element? Both would produce logically valid solutions. The attribute, or child element as the case may be, is a logical part of its parent. Having a separate child element would, later on, allow for the child element to have its own child elements if this were necessary. But just using a single attribute would be more efficient. Then, if a programmer changed the definition, would other programmers be happy with writing separate parsers for different versions of a similar document? If a Document Type Definition is not made for a given XML subset (as my experience has shown is often the case), how does a programmer know what elements and attributes can be added to generate a valid document?

These issues with XML are usually trivial, depending on how the XML is used. Representing an image as <image name="hello.jpg"><row number="0"><pixel red="0"... would, of course, be ridiculous. However, even representing a very large CD collection may be practical with XML, because disk space is cheap and programmers' time is not. JSON offers a (potentially) much more efficient solution to nearly the same kinds of problems. But what about serializing data which is already binary? The typical solution is to use Base64 encoding, or similar, which inflates the data by about a third and so isn't terribly efficient compared to writing raw bytes in an agreed-upon endianness.
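As a rough illustration of that overhead, this sketch uses the JDK's `java.util.Base64` (available since Java 8) to compare a raw payload with its Base64 form. The 300-byte payload is an arbitrary stand-in for real binary data, and the class name is hypothetical.

```java
import java.util.Base64;

class Base64OverheadExample {
    /** Number of bytes the payload occupies once Base64-encoded. */
    static int encodedLength(byte[] raw) {
        return Base64.getEncoder().encode(raw).length;
    }

    public static void main(String[] args) {
        byte[] raw = new byte[300]; // stand-in for binary data such as a key or an audio frame
        // Every 3 raw bytes become 4 text characters.
        System.out.println(raw.length + " raw bytes -> " + encodedLength(raw)
                + " Base64 bytes"); // prints "300 raw bytes -> 400 Base64 bytes"
    }
}
```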

Protocol Buffers, a very innovative and extremely useful solution developed by Google for its internal serialization needs, offer a great alternative to text-based serialization: they keep much of its convenience without significantly compromising efficiency.

Protocol Buffers have also dramatically improved my productivity with implementing the Fractus protocol. I will jump right in with an example of the definition of a simple message. This is how I represented a public key to be sent over a socket:



message PublicKey {
    required string encoding = 1;
    required bytes public_key = 2;
}

A detailed explanation of the Protocol Buffer definition language can be found on Google's Protocol Buffer website. It's very straightforward to follow.

This way, the public key itself can be in any format, though X.509 is the representation used now. The encoding field is simply the string "X.509", which Protocol Buffers stores as UTF-8. Very efficient and yet still extensible.
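To see why this is compact, here is a hand-rolled sketch of how such a message is laid out according to the documented Protocol Buffers wire format: each field is a key of `(field_number << 3) | wire_type`, and strings and bytes use wire type 2, a varint length followed by the raw payload. This is only an illustration with hypothetical names; in real code the classes generated by protoc do this for you, and the three-byte key is a placeholder rather than real key material.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

class PublicKeyWireSketch {
    /** Writes a non-negative int as a base-128 varint, least significant group first. */
    static void writeVarint(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80); // continuation bit set: more groups follow
            value >>>= 7;
        }
        out.write(value);
    }

    /** Writes one length-delimited field (wire type 2): key, length, payload. */
    static void writeLengthDelimited(ByteArrayOutputStream out, int fieldNumber, byte[] payload) {
        writeVarint(out, (fieldNumber << 3) | 2);
        writeVarint(out, payload.length);
        out.write(payload, 0, payload.length);
    }

    /** Lays out a PublicKey message: field 1 is the encoding, field 2 the key bytes. */
    static byte[] encode(String encoding, byte[] publicKey) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeLengthDelimited(out, 1, encoding.getBytes(StandardCharsets.UTF_8));
        writeLengthDelimited(out, 2, publicKey);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] message = encode("X.509", new byte[] { 1, 2, 3 }); // placeholder key bytes
        System.out.println(message.length + " bytes"); // prints "12 bytes"
    }
}
```

The overhead is four bytes here: one key byte and one length byte per field. Everything else is the data itself.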

Using this definition is extremely simple. "protoc" is the Protocol Buffer compiler provided by Google. It's even obtainable through APT if you're on Ubuntu: `sudo apt-get install protobuf-compiler` will provide you with the utility needed to turn this definition into Java, C++ and Python with (if you wish) a single command. Of course, this utility can also be downloaded directly from Google's Protocol Buffer website mentioned above.

Each generated message comes with its own "Builder," which includes components for requisite parts of the message, such as enumerations (which naturally differ according to each language's native representation). Then it's as simple as using the builder to construct a message and serializing it, eventually to an array of bytes, which the receiver can deserialize in any language without much hassle.

Here's how I made a public key message:



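        // Needs com.google.protobuf.ByteString and java.io.ByteArrayOutputStream.
        // Build the message from the encryption manager's key data.
        ProtocolBuffer.PublicKey pk =
                ProtocolBuffer.PublicKey.newBuilder()
                .setEncoding(em.getEncodingFormat())
                .setPublicKey(ByteString.copyFrom(em.getPublicKey()))
                .build();

        // Serialize the message into an in-memory byte stream.
        ByteArrayOutputStream os = new ByteArrayOutputStream();
        pk.writeTo(os);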

The "em" object is the "encryption manager" which, among other things, keeps track of the user's public key data. "os," the ByteArrayOutputStream, now contains a cross-platform, serialized version of our public key. There's still one problem.

Protocol buffers natively provide only an RPC interface, not an implementation. So if you are using the messages as part of a network application, as I am, you have to do that part yourself. I chose to simply write my own wrapper because the RPC interface was very limited. It wraps the serialized protocol buffers and adds its own message type descriptor, which Google does not include. The descriptor is simply two bytes prepended to the message, which the receiving end uses to identify the message type. For details, see the FractusMessage class in my GitHub repository. The client receiving the messages first reads these two bytes and, based on the current state of the connection (such as whether it's authenticated, and to whom), passes the message to a strategy object which determines what to do with it, such as changing the icon in the buddy list or displaying an instant message.
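The idea can be sketched in a few lines. This is not the actual FractusMessage class: the type codes, the big-endian byte order, the names and the string-returning dispatch are all assumptions made for illustration, and a real socket reader would also need some way to know where each message ends.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

class TypedFrameSketch {
    // Hypothetical type codes; the real descriptors live in the Fractus source.
    static final short TYPE_PUBLIC_KEY = 1;
    static final short TYPE_INSTANT_MESSAGE = 2;

    /** Prepends the two-byte type descriptor (big-endian) to a serialized message. */
    static byte[] wrap(short type, byte[] serialized) {
        return ByteBuffer.allocate(2 + serialized.length)
                .putShort(type)
                .put(serialized)
                .array();
    }

    /** Reads the descriptor back from the first two bytes of a frame. */
    static short typeOf(byte[] frame) {
        return ByteBuffer.wrap(frame).getShort();
    }

    /** Returns the serialized message that follows the descriptor. */
    static byte[] payloadOf(byte[] frame) {
        return Arrays.copyOfRange(frame, 2, frame.length);
    }

    /** Stand-in for handing the frame to a strategy object chosen by type. */
    static String dispatch(byte[] frame) {
        switch (typeOf(frame)) {
            case TYPE_PUBLIC_KEY:      return "store public key";
            case TYPE_INSTANT_MESSAGE: return "display instant message";
            default:                   return "unknown message type";
        }
    }

    public static void main(String[] args) {
        byte[] frame = wrap(TYPE_INSTANT_MESSAGE, new byte[] { 10, 20, 30 });
        System.out.println(dispatch(frame)); // prints "display instant message"
    }
}
```

In the real client the choice of strategy also depends on the connection state, which this sketch leaves out.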

Protocol buffers can dramatically increase your productivity as a programmer and decrease the amount of resources your applications use in networking, memory and CPU. Benchmarks are widely available and usually place protocol buffers well ahead of text-based formats such as XML. They were a clear choice for Fractus, which transfers text data and will transfer audio and video in the near future as well. Using a uniform system for serializing all such kinds of data is important for writing clean, maintainable and high-performing code.

Tuesday, November 16, 2010

How I used Protocol Buffers in Fractus
