Monday, March 2, 2009

XML

It's an old topic. But watch out, it is just the beginning ...

1. The Got Milk Commercial


Imagine a Got Milk Commercial as follows.

"A grandma walks slowly through a supermarket with a shopping cart. Suddenly her cell phone rings. She pulls the phone out of her pocket and flips it open. The camera zooms in on the cell phone screen. It shows the message 'Got Milk?', signed by the refrigerator."

It is a Got Milk commercial. But it also shows that we have entered the era of mass messaging. Every device with a network pulse can, and wants to, communicate effectively with every other device. The grandma's cell phone detects that it is in the supermarket, so it sends a message (through some messaging service) back home to the refrigerator. A few seconds later, the refrigerator replies with the "Got Milk?" message.

What does any of this have to do with XML? And why is this weird-looking data format so attractive that almost every industry uses it to store, present, and exchange information?

In short, it is due to its claimed (and accepted) advantages: it is simple, open, extensible, and interoperable.

The messages exchanged in the Got Milk commercial above are likely to be in XML format.
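As a rough illustration, the refrigerator's reply could be built with the DOM and transformer APIs that ship with the JDK. This is only a sketch: the class name `GotMilkMessage` and the element names (`message`, `from`, `to`, `text`) are invented for this example and do not come from any real messaging service.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical message builder; the element names are assumptions made for this sketch.
class GotMilkMessage {

    // Builds a small XML message and returns it as a String.
    static String build(String from, String to, String text) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element message = doc.createElement("message");
            doc.appendChild(message);
            appendText(doc, message, "from", from);
            appendText(doc, message, "to", to);
            appendText(doc, message, "text", text);

            // Serialize the DOM tree; the XML declaration is omitted to keep the output short.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not build the message", e);
        }
    }

    // Adds a child element holding plain text; the DOM escapes special characters on output.
    private static void appendText(Document doc, Element parent, String name, String value) {
        Element child = doc.createElement(name);
        child.setTextContent(value);
        parent.appendChild(child);
    }

    public static void main(String[] args) {
        System.out.println(build("refrigerator", "grandma-cell-phone", "Got Milk?"));
    }
}
```

Any receiver with an XML parser can read such a message without knowing anything else about the sender, which is the "open" and "interoperable" part of the claim.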

So it would be politically incorrect not to stress the importance of considering XML as the packaging mechanism in a web messaging environment. For example, when exchanging messages in a "Messaging in the Cloud" environment, each message is more "open" if it is packaged in XML format.

"XML provides a basic syntax that can be used to share information between different kinds of computers, different applications, and different organizations without needing to pass through many layers of conversion."
-- Wikipedia (http://en.wikipedia.org/wiki/XML).

XML is widely used on the web, for example in XHTML pages, RSS, and the Atom Syndication Format.

For messaging in a web environment, RESTful or not, XML should be evaluated and considered as the message packaging format until a better mechanism is developed.


2. Security Revisited.

In the previous article, Security Made Easy, we showed some Java APIs for signing and encrypting a piece of data. We did not address how the pieces of information should be packaged. By now, you have probably thought about how that could be done.

A high-level rule is to place all the security information (the message metadata), such as algorithms, parameters, secret key, signature value, and encoding rules (for the signature and the encrypted data), in child elements of the message's "header" element.

The encrypted (and encoded) data is then placed in a child element of the message's "body" element.

The "piece of data" to be signed and encrypted can (and should) be isolated into a byte[] to avoid any XML canonicalization issues.
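The header/body rule above can be sketched as follows. The class `SecureEnvelope` and every element name in it are assumptions made for this example; they do not follow the XML Signature or XML Encryption specifications. The sketch takes the signature and the encrypted data as ready-made byte[] values (producing them is a separate step) and only shows the packaging. It uses `java.util.Base64`, which is available in Java 8 and later.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Base64;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical envelope: security metadata goes in <header>, the payload in <body>.
class SecureEnvelope {

    // Packages already-computed signature and ciphertext bytes into one XML message.
    static String pack(String signatureAlgorithm, byte[] signature,
                       String cipherAlgorithm, byte[] encryptedData) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element message = child(doc, null, "message", null);

            Element header = child(doc, message, "header", null);
            child(doc, header, "signatureAlgorithm", signatureAlgorithm);
            child(doc, header, "signatureValue",
                    Base64.getEncoder().encodeToString(signature));
            child(doc, header, "cipherAlgorithm", cipherAlgorithm);
            child(doc, header, "encoding", "base64");

            Element body = child(doc, message, "body", null);
            child(doc, body, "encryptedData",
                    Base64.getEncoder().encodeToString(encryptedData));

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not pack the message", e);
        }
    }

    // Reads one Base64-encoded element back into the original bytes.
    static byte[] readBytes(String xml, String elementName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            String text = doc.getElementsByTagName(elementName).item(0).getTextContent();
            return Base64.getDecoder().decode(text.trim());
        } catch (Exception e) {
            throw new IllegalStateException("could not read " + elementName, e);
        }
    }

    // Creates an element, optionally with text, and attaches it to the parent (or the document).
    private static Element child(Document doc, Element parent, String name, String text) {
        Element element = doc.createElement(name);
        if (text != null) {
            element.setTextContent(text);
        }
        if (parent == null) {
            doc.appendChild(element);
        } else {
            parent.appendChild(element);
        }
        return element;
    }

    public static void main(String[] args) {
        byte[] signature = {1, 2, 3, 4};       // placeholder, not a real signature
        byte[] encrypted = {10, 20, 30, 40};   // placeholder, not real ciphertext
        System.out.println(pack("SHA256withRSA", signature,
                "AES/CBC/PKCS5Padding", encrypted));
    }
}
```

Because the signature is computed over the raw byte[] and not over the XML text, the receiver can decode `encryptedData`, decrypt it, and verify the signature without caring how the surrounding XML was indented or re-serialized along the way.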

Specifications for XML signature and encryption have been around for quite some time. In practice, however, you may choose to define a few simple elements as described above, to avoid the issues (such as canonicalization) discussed at the links below.

http://en.wikipedia.org/wiki/XML_Signature
http://www.cs.auckland.ac.nz/~pgut001/pubs/xmlsec.txt


3. Base64 Encoding.

The signature or encryption output is usually produced as a byte[]. The byte[] has to be encoded before it can be set as the value of an XML element. Base64 encoding/decoding is often used to "translate" a byte[] into a String and vice versa.

At the time of writing, Java does not provide a public, supported Base64 implementation. The following links point to source code from JXTA.

Base64 encoding for Java:
https://jxta-jxse.dev.java.net/source/browse/jxta-jxse/trunk/impl/src/net/jxta/impl/util/BASE64OutputStream.java
https://jxta-jxse.dev.java.net/source/browse/jxta-jxse/trunk/impl/src/net/jxta/impl/util/BASE64InputStream.java
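
For readers on a newer JDK: Java 8 and later include `java.util.Base64` in the standard library, which was not available when this post was written. A minimal round-trip sketch with that class (the class name `Base64Demo` is just for illustration) might look like this:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

// Minimal Base64 round trip using java.util.Base64 (requires Java 8 or later).
class Base64Demo {

    // byte[] -> String that is safe to use as the text of an XML element
    static String encode(byte[] raw) {
        return Base64.getEncoder().encodeToString(raw);
    }

    // String -> the original byte[]
    static byte[] decode(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        byte[] raw = "Got Milk?".getBytes(StandardCharsets.UTF_8);
        String encoded = encode(raw);
        byte[] decoded = decode(encoded);
        System.out.println(encoded);
        System.out.println(Arrays.equals(raw, decoded)); // prints "true"
    }
}
```

The encoded String contains only letters, digits, '+', '/', and '=', so it can be placed in an XML element without any escaping.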


4. Summary.

XML provides a mechanism for packaging a message so that it is self-contained. A self-contained message can travel through "the cloud" without losing any of the information required to process its content.

A self-contained message can live much longer than one that's not.
