XmlReader exception on UTF-8 XML strings with Byte-Order-Mark (BOM)

Here is the story. Our ThermalLabel SDK lets you get an XML string representing a thermal label that you created through code or with our Visual ThermalLabel Editor Add-on. The method that creates that XML is called GetXmlTemplate, and the one you use to “load” a template is called LoadXmlTemplate.

Recently, a customer came back to us reporting that if you write code like this:

tLabel2.LoadXmlTemplate(tLabel1.GetXmlTemplate());

the following exception was raised:

System.Xml.XmlException : Data at the root level is invalid. Line 1, position 1.

That was shocking! The XML string generated by the GetXmlTemplate method should be readable by the LoadXmlTemplate method! The interesting part of this bug is that if you save the XML content to a file, read it back, and pass it to the load method, it works just fine. So why did it work with a file and not with a plain string? Let’s try to explain it a bit.

The GetXmlTemplate method returns a String. To get an XML representation of the label we serialize all objects to XML using an XmlWriter on a MemoryStream. We then turn the writer's output into a String by decoding the bytes from the MemoryStream like this:

return Encoding.UTF8.GetString(memStream.ToArray());

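The key is in the UTF8 property of the Encoding class. It returns an object of type UTF8Encoding, and one of that type's constructors lets you specify “whether to provide a Unicode byte order mark (BOM)”.

The Encoding’s UTF8 property gets an instance of UTF8Encoding with BOM enabled! That means an XmlWriter that uses this encoding (it is also the default for XmlWriterSettings) writes the three bytes [239, 187, 191], which are the BOM for UTF-8, at the start of the MemoryStream. The GetString method does not strip them; it decodes them into the single character U+FEFF, so the returned string starts with a BOM character.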
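To see the BOM end up in the string, here is a minimal sketch. It assumes the writer is created with XmlWriter.Create over a MemoryStream, as described above; the names BomDemo and WriteXmlToString and the <label> element are made up for illustration and are not part of the SDK.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class BomDemo
{
    // Writes a tiny document through an XmlWriter backed by a MemoryStream,
    // then decodes the raw bytes with Encoding.UTF8.GetString.
    public static string WriteXmlToString(Encoding encoding)
    {
        var settings = new XmlWriterSettings { Encoding = encoding };
        using (var memStream = new MemoryStream())
        {
            using (var writer = XmlWriter.Create(memStream, settings))
            {
                writer.WriteStartElement("label"); // hypothetical root element
                writer.WriteEndElement();
            }
            return Encoding.UTF8.GetString(memStream.ToArray());
        }
    }

    public static void Main()
    {
        // The preamble of Encoding.UTF8 is the three-byte UTF-8 BOM.
        Console.WriteLine(BitConverter.ToString(Encoding.UTF8.GetPreamble())); // EF-BB-BF

        string withBom = WriteXmlToString(Encoding.UTF8);
        string withoutBom = WriteXmlToString(new UTF8Encoding(false));

        Console.WriteLine(withBom[0] == '\uFEFF'); // the BOM survived decoding
        Console.WriteLine(withoutBom[0] == '<');   // no BOM, the XML starts right away
    }
}
```

Passing new UTF8Encoding(false) to the writer is one way to avoid producing the BOM in the first place, if you control the code that generates the XML.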

Now the error description of the XmlException makes sense, as the first character (the BOM) is not valid at the start of an XML string! The error happens when you create an XmlReader object on top of a StringReader object, i.e. something like this:

XmlReader reader = XmlReader.Create(new StringReader(myXmlString), xmlSett);

NOTE: The error is thrown when you invoke the Read method of the XmlReader. Here xmlSett is an XmlReaderSettings instance.
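A small reproduction of the failure might look like the sketch below. BomReproDemo and CanReadFromString are hypothetical names, the <label> document is a stand-in for a real template, and default XmlReaderSettings are assumed.

```csharp
using System;
using System.IO;
using System.Xml;

public static class BomReproDemo
{
    // Returns true if the XML parses through a StringReader-backed XmlReader.
    public static bool CanReadFromString(string xml)
    {
        var settings = new XmlReaderSettings();
        try
        {
            using (var reader = XmlReader.Create(new StringReader(xml), settings))
            {
                while (reader.Read()) { } // parsing happens here, not in Create
            }
            return true;
        }
        catch (XmlException ex)
        {
            Console.WriteLine(ex.Message);
            return false;
        }
    }

    public static void Main()
    {
        const string xml = "<?xml version=\"1.0\" encoding=\"utf-8\"?><label />";

        Console.WriteLine(CanReadFromString(xml));            // plain string parses
        Console.WriteLine(CanReadFromString("\uFEFF" + xml)); // leading BOM character fails
    }
}
```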

To solve the issue, you should check for a BOM at the start of the XML string and remove it before creating the StringReader, or handle it some other way. In our case, as we explicitly generate XML strings using UTF-8 encoding, we solved the issue by handing the XmlReader a byte stream instead of a StringReader, so that its encoding detection recognizes and skips the BOM:

XmlReader reader = XmlReader.Create(new MemoryStream(Encoding.UTF8.GetBytes(myXmlString)), xmlSett);
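Both options can be sketched side by side. This is only an illustration under the same assumptions as the snippet above: BomFixDemo, RootNameViaStream and RootNameViaTrim are hypothetical names, and the <label> document is a stand-in for a real template.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class BomFixDemo
{
    // Stream-based approach: re-encode the string to bytes and let the
    // XmlReader's encoding detection consume the BOM.
    public static string RootNameViaStream(string xml)
    {
        var settings = new XmlReaderSettings();
        using (var stream = new MemoryStream(Encoding.UTF8.GetBytes(xml)))
        using (var reader = XmlReader.Create(stream, settings))
        {
            reader.MoveToContent(); // skip the XML declaration, stop at the root element
            return reader.Name;
        }
    }

    // Alternative: strip a leading U+FEFF and keep using a StringReader.
    public static string RootNameViaTrim(string xml)
    {
        var settings = new XmlReaderSettings();
        string cleaned = xml.TrimStart('\uFEFF');
        using (var reader = XmlReader.Create(new StringReader(cleaned), settings))
        {
            reader.MoveToContent();
            return reader.Name;
        }
    }

    public static void Main()
    {
        string xmlWithBom = "\uFEFF<?xml version=\"1.0\" encoding=\"utf-8\"?><label />";

        Console.WriteLine(RootNameViaStream(xmlWithBom)); // label
        Console.WriteLine(RootNameViaTrim(xmlWithBom));   // label
    }
}
```

The stream version relies on the string really being UTF-8 friendly XML, which holds in our case because we generate it ourselves; trimming the character is the more direct choice when you only ever work with strings.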

2 Responses to XmlReader exception on UTF-8 XML strings with Byte-Order-Mark (BOM)

  1. Rohit Gupta says:

    what is this xmlSett ? XmlReaderSettings ?
