XML to JSON conversion with Json.NET
This is a repost that originally appeared on the Couchbase Blog: XML to JSON conversion with Json.NET.
XML data can be converted to JSON, which can be loaded into Couchbase Server (Couchbase Server 5.0 beta now available). Depending on the source of the data, you might be able to use a tool like Talend. But you may also want to write a simple C# .NET application with Newtonsoft’s Json.NET to do it.
XML data
For the purposes of this tutorial, I’m going to use a very simple XML example. If your XML is more complex (multiple attributes, for instance), then your approach will also have to be more complex. (Json.NET can handle all XML to JSON conversions, but it follows a specific set of conversion rules.) Here’s a sample piece of data:
var xml = @"
<Invoice>
<Timestamp>1/1/2017 00:01</Timestamp>
<CustNumber>12345</CustNumber>
<AcctNumber>54321</AcctNumber>
</Invoice>";
Notice that I’ve got this XML as a hardcoded string in C#. In a real-life situation, you would likely be pulling XML from a database, a REST API, XML files, etc.
Once you have the raw XML, you can create an XmlDocument object (XmlDocument lives in the System.Xml namespace).
XmlDocument doc = new XmlDocument();
doc.LoadXml(xml);
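If the XML lives in a file rather than in a hardcoded string, XmlDocument can read it directly from disk. Here’s a minimal sketch of both options. The XmlLoadSketch class and its method names are placeholders I made up for illustration; only XmlDocument, LoadXml, and Load come from System.Xml.

```csharp
using System.Xml;

// Hypothetical helper class; the names are placeholders for illustration.
public static class XmlLoadSketch
{
    // Parse XML that is already in memory as a string.
    public static XmlDocument FromString(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc;
    }

    // Read and parse XML from a file on disk.
    public static XmlDocument FromFile(string path)
    {
        var doc = new XmlDocument();
        doc.Load(path);
        return doc;
    }
}
```

Either way you end up with the same kind of XmlDocument object, so the conversion step doesn’t need to know where the XML came from.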
Conversion with Json.NET
Once you have an XmlDocument object, you can use Json.NET to convert that object into a JSON representation.
var json = JsonConvert.SerializeXmlNode(doc, Formatting.None, true);
In this example, I’m asking Json.NET to serialize an XML node:
- I used Formatting.None. If I wanted to display the actual JSON, it might be better to use Formatting.Indented.
- The last true specifies that I want to omit the root object. In the XML above, you can think of <Invoice></Invoice> as the root object. I just want the values of the Invoice object. If I didn’t omit the root node, the resultant JSON would look like: {"Invoice":{"Timestamp":"1/1/2017 00:01","CustNumber":"12345","AcctNumber":"54321"}}
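To compare the effect of the two arguments, here’s a small sketch that runs the same XML through SerializeXmlNode with different settings. The XmlToJsonSketch class and its ToJson helper are hypothetical names of my own; the sketch assumes the Newtonsoft.Json package is referenced.

```csharp
using System;
using System.Xml;
using Newtonsoft.Json;

// Hypothetical helper class, for illustration only.
public static class XmlToJsonSketch
{
    // Parse an XML string and serialize it to JSON with the given options.
    public static string ToJson(string xml, Formatting formatting, bool omitRootObject)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        return JsonConvert.SerializeXmlNode(doc, formatting, omitRootObject);
    }

    public static void Demo()
    {
        var xml = "<Invoice><Timestamp>1/1/2017 00:01</Timestamp>"
                + "<CustNumber>12345</CustNumber><AcctNumber>54321</AcctNumber></Invoice>";

        // Root kept: the fields are wrapped in an "Invoice" property.
        Console.WriteLine(ToJson(xml, Formatting.None, false));

        // Root omitted: only the fields of the Invoice element remain.
        Console.WriteLine(ToJson(xml, Formatting.None, true));

        // Same content as the previous line, spread over multiple lines for reading.
        Console.WriteLine(ToJson(xml, Formatting.Indented, true));
    }
}
```

Note that every value comes out as a JSON string at this point, because the XML itself carries no type information.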
Saving the Json result
Finally, let’s put the JSON into Couchbase. The easiest way to do this would be to again call on JsonConvert to deserialize the JSON into a C# object. That object would then be used with Couchbase’s bucket.Insert(…) method.
object transactObject1 = JsonConvert.DeserializeObject(json);
bucket.Insert(Guid.NewGuid().ToString(), transactObject1);
With this method, the JSON would be stored in Couchbase like so: {"Timestamp":"1/1/2017 00:01","CustNumber":"12345","AcctNumber":"54321"}
That might be fine, but oftentimes you’re going to want more control of the format. With Json.NET, we can deserialize to a given class, instead of just object. Let’s create an Invoice class like so:
public class Invoice
{
    public DateTime Timestamp { get; set; }
    public string CustNumber { get; set; }
    public int AcctNumber { get; set; }
}
Notice that there is some type information now. The Timestamp is a DateTime and the AcctNumber is an int. The conversion will still work, but the result will be different, according to Json.NET’s conversion rules. (Also check out the full Json.NET documentation if you aren’t familiar with it already.)
Invoice transactObject2 = JsonConvert.DeserializeObject<Invoice>(json);
bucket.Insert(Guid.NewGuid().ToString(), transactObject2);
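If you’d like to check what the typed deserialization produces before anything is sent to Couchbase, you can inspect the object directly. The sketch below repeats the Invoice class so that it stands alone, and feeds it the JSON from the conversion step as a literal. TypedInvoiceSketch is a hypothetical name, and no Couchbase calls are involved.

```csharp
using System;
using Newtonsoft.Json;

public class Invoice
{
    public DateTime Timestamp { get; set; }
    public string CustNumber { get; set; }
    public int AcctNumber { get; set; }
}

// Hypothetical helper class, for illustration only.
public static class TypedInvoiceSketch
{
    // Deserialize the XML-derived JSON into the typed Invoice class.
    public static Invoice Parse(string json)
    {
        return JsonConvert.DeserializeObject<Invoice>(json);
    }

    public static void Demo()
    {
        var json = "{\"Timestamp\":\"1/1/2017 00:01\",\"CustNumber\":\"12345\",\"AcctNumber\":\"54321\"}";
        var invoice = Parse(json);

        // Timestamp is now a DateTime, so date members are available.
        Console.WriteLine(invoice.Timestamp.Year);

        // AcctNumber is now an int, so arithmetic works on it.
        Console.WriteLine(invoice.AcctNumber + 1);

        // CustNumber was declared as a string and stays one.
        Console.WriteLine(invoice.CustNumber);
    }
}
```

If a value can’t be converted to the declared type (a non-numeric account number, say), expect Json.NET to throw during deserialization, so real-world data may need some validation first.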
The result of that insert differs from the first one in a few ways:
- Notice that the timestamp field is different: it’s stored in a more standardized way.
- The acctNumber field value is not in quotes, indicating that it’s being stored as a number.
- Finally, notice that the field names are different (camel case, such as acctNumber). This is due to the way the JSON fields are named by default when the object is serialized. You can specify different names by using the JsonProperty attribute.
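Here’s a rough sketch of what the JsonProperty attribute looks like in use. It serializes with plain JsonConvert.SerializeObject rather than through the Couchbase SDK, so treat it as an approximation of what gets stored. The NamedInvoice class and the field names customerNumber and accountNumber are my own made-up examples.

```csharp
using System;
using Newtonsoft.Json;

// Hypothetical variant of the Invoice class with explicit JSON field names.
public class NamedInvoice
{
    [JsonProperty("timestamp")]
    public DateTime Timestamp { get; set; }

    [JsonProperty("customerNumber")]
    public string CustNumber { get; set; }

    [JsonProperty("accountNumber")]
    public int AcctNumber { get; set; }
}

// Hypothetical helper class, for illustration only.
public static class JsonPropertySketch
{
    // Serialize with Json.NET directly; the attribute names become the JSON field names.
    public static string Serialize(NamedInvoice invoice)
    {
        return JsonConvert.SerializeObject(invoice);
    }

    public static void Demo()
    {
        var invoice = new NamedInvoice
        {
            Timestamp = new DateTime(2017, 1, 1, 0, 1, 0),
            CustNumber = "12345",
            AcctNumber = 54321
        };
        Console.WriteLine(Serialize(invoice));
    }
}
```

One caveat: Json.NET uses the same attribute names when reading JSON. A class renamed this way would no longer pick up CustNumber or AcctNumber from the XML-derived JSON, so you may want one class for reading and another for writing, or element names that already match.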
That’s it
One more minor thing to point out: I used Guid.NewGuid().ToString() to create arbitrary keys for the documents. If you have value(s) in the XML data that you want to use for a key, you could/should use those value(s) instead.
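As one possible way to do that, the sketch below builds a key from the customer and account numbers in the XML and falls back to a GUID when either is missing. The invoice::customer::account key format and the DocumentKeySketch class are assumptions of mine for illustration, not a Couchbase convention you have to follow.

```csharp
using System;
using System.Xml;

// Hypothetical helper class, for illustration only.
public static class DocumentKeySketch
{
    // Build a document key from values in the XML, or fall back to a random GUID.
    public static string KeyFor(XmlDocument doc)
    {
        var cust = doc.SelectSingleNode("/Invoice/CustNumber")?.InnerText;
        var acct = doc.SelectSingleNode("/Invoice/AcctNumber")?.InnerText;

        if (string.IsNullOrWhiteSpace(cust) || string.IsNullOrWhiteSpace(acct))
        {
            // Not enough information for a meaningful key.
            return Guid.NewGuid().ToString();
        }

        // Assumed key format; pick whatever is unique for your data.
        return $"invoice::{cust}::{acct}";
    }
}
```

Keep in mind that a key derived from the data is only useful if it is unique per document. An insert with a key that already exists will fail rather than overwrite, so a customer with several invoices would need something more in the key (an invoice number, for instance).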
This blog post was inspired by an email conversation with a Couchbase user. If you have any suggestions on tools, tips, or tricks to make this process easier, please let me know. Or, contact me if there’s something you’d like to see me blog about! You can email me or contact me @mgroves on Twitter.