Introduction to XML and XmlDocument with C#

This tutorial is all about using XML in C# programs. Starting with an introduction to XML, we then look at parsing and loading XML data.

Introduction to Programming with C#

This article is part of a series of articles. Please use the links below to navigate between the articles.

  1. Learn to Program in C# - Full Introduction to Programming Course
  2. Introduction to Programming - C# Programming Fundamentals
  3. Introduction to Object Oriented Programming for Beginners
  4. Introduction to C# Object-Oriented Programming Part 2
  5. Application Flow Control and Control Structures in C#
  6. Guide to C# Data Types, Variables and Object Casting
  7. C# Collection Types (Array,List,Dictionary,HashTable and More)
  8. C# Operators: Arithmetic, Comparison, Logical and more
  9. Using Entity Framework & ADO.Net Data in C# 7
  10. What is LINQ? The .NET Language Integrated Query
  11. Error and Exception Handling in C#
  12. Advanced C# Programming Topics
  13. All About Reflection in C# To Read Metadata and Find Assemblies
  14. What Are ASP.Net WebForms
  15. Introduction to ASP.Net MVC Web Applications and C#
  16. Windows Application Development Using .Net and Windows Forms
  17. Assemblies and the Global Assembly Cache in C#
  18. Working with Resources Files, Culture & Regions in .Net
  19. The Ultimate Guide to Regular Expressions: Everything You Need to Know
  20. Introduction to XML and XmlDocument with C#
  21. Complete Guide to File Handling in C# - Reading and Writing Files

XML, or Extensible Markup Language, stores data in a form where it can easily be retrieved and shared. XML is commonly used for sharing data between two disconnected systems - for example, two organisations running different software applications.

XML in C# is made easy with the .Net Framework's toolkit of XML classes. You can easily read data in, parse the nodes and values, deserialise an XML document into a C# class and vice versa.

Reading XML with the XmlReader class

There are two main methods for reading XML with C#: the XmlDocument class and the XmlReader class.

XmlDocument reads the entire XML content into memory and then lets you navigate back and forward in it as you please, or even query the document using the XPath technology.

The XmlReader is a faster and less memory-consuming alternative. It is a forward-only reader: it lets you run through the XML content one node at a time, look at the value and then move on to the next node. It consumes very little memory because it only holds the current node, and because you decide which nodes to act on as they stream past, you only process the relevant elements, which keeps it fast.

To read XML documents with an XmlReader, simply follow this template.

C#
// XmlReader implements IDisposable, so a using block releases the underlying stream.
using (XmlReader xmlReader = XmlReader.Create("https://mydomain.com/fem.xml"))
{
  while (xmlReader.Read())
  {
    // Only act on <Dollar> start tags that carry attributes.
    if (xmlReader.NodeType == XmlNodeType.Element && xmlReader.Name == "Dollar" && xmlReader.HasAttributes)
    {
      Console.WriteLine(xmlReader.GetAttribute("currency") + ": " + xmlReader.GetAttribute("rate"));
    }
  }
}

We start by creating the XmlReader, using the static Create() method. We then loop through the document using the Read() method, which advances the reader to the next node and returns false once the end of the file is reached. Inside the loop, we can now use one of the many properties and methods on the XmlReader to access data from the current node. In this case, we are selecting all the elements with the name Dollar and extracting the attributes currency and rate.
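
If you want to try the same loop without a network connection, the reader can also be created over a string. The following is a minimal sketch rather than a definitive implementation: the RateReader class, the ReadRates method and the sample XML are all made up for illustration, since the structure of the real feed is not shown here.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class RateReader
{
    // Hypothetical sample data, invented for this sketch.
    public const string SampleXml =
        "<rates>" +
        "<Dollar currency=\"USD\" rate=\"1.00\" />" +
        "<Dollar currency=\"CAD\" rate=\"1.36\" />" +
        "<Other currency=\"EUR\" rate=\"0.92\" />" +
        "</rates>";

    // Streams through the XML and collects "currency: rate" for every <Dollar> element.
    public static List<string> ReadRates(string xml)
    {
        var results = new List<string>();

        using (var stringReader = new StringReader(xml))
        using (XmlReader xmlReader = XmlReader.Create(stringReader))
        {
            while (xmlReader.Read())
            {
                if (xmlReader.NodeType == XmlNodeType.Element
                    && xmlReader.Name == "Dollar"
                    && xmlReader.HasAttributes)
                {
                    results.Add(xmlReader.GetAttribute("currency") + ": " + xmlReader.GetAttribute("rate"));
                }
            }
        }

        return results;
    }

    public static void Main()
    {
        foreach (string line in ReadRates(SampleXml))
        {
            Console.WriteLine(line);
        }
    }
}
```

The Other element is skipped because its name does not match, which shows how the filtering inside the loop decides what gets processed.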

Reading XML with the XmlDocument class

This next code block uses the XmlDocument class to read the entire XML contents into memory.

C#
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.Load("https://mydomain.com/fem.xml");
foreach(XmlNode xmlNode in xmlDoc.DocumentElement.ChildNodes[2].ChildNodes[0].ChildNodes)
  Console.WriteLine(xmlNode.Attributes["currency"].Value + ": " + xmlNode.Attributes["rate"].Value);

Since the object is now in memory we can iterate through the nodes using foreach (or for) loops.

This is only possible because we know the structure of the XML document, and it's not very flexible, pretty or easy to change later on. However, the way you navigate an XML document very much depends on the XML source and the data you need.

Using the XmlNode class gives us access to a lot of other useful information about the XML, for instance, the name of the tag, the attributes, the inner text and the XML itself.
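
As a rough illustration of those members, the sketch below loads a small, made-up fragment from a string and prints what XmlNode exposes for it. The NodeInspector class and the sample XML are assumptions for this example only.

```csharp
using System;
using System.Xml;

public static class NodeInspector
{
    // Loads an XML string and returns the first child of the root element.
    public static XmlNode LoadFirstChild(string xml)
    {
        var xmlDoc = new XmlDocument();
        xmlDoc.LoadXml(xml);
        return xmlDoc.DocumentElement.FirstChild;
    }

    public static void Main()
    {
        // Hypothetical sample fragment.
        XmlNode node = LoadFirstChild("<persons><person age=\"32\">Paisley Carson</person></persons>");

        Console.WriteLine(node.Name);                     // the name of the tag
        Console.WriteLine(node.Attributes["age"].Value);  // an attribute value
        Console.WriteLine(node.InnerText);                // the text inside the element
        Console.WriteLine(node.OuterXml);                 // the XML of the node itself
    }
}
```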

Using XPath with the XmlDocument class

We saw how we can use loops and calls to the ChildNodes property to access data from the XmlDocument. It was simple, but only because the example was simple. It is, however, difficult to manage when the XML structure gets more complicated and it didn't do much good for the readability of our code.

Another way to navigate XML is to use XPath queries.

The XmlDocument class has several methods which take an XPath query as a parameter and then return the resulting XmlNode(s).

In the following example, we will use the SelectSingleNode() method to get the myCustomValue element, which holds the page title, from this XML document.

xml
<?xml version="1.0" encoding="utf-8"?>
<myTestClass xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <myString>Hello World</myString>
  <myInt>1234</myInt>
  <myNode>
    <myCustomValue value="test">Page Title</myCustomValue>
  </myNode>
  <myArray>
    <item><title>qwerty</title></item>
    <item><title>asdfgh</title></item>
    <item><title>zxcvbn</title></item>
    <item><title>123456</title></item>
  </myArray>
</myTestClass>

If you look at the XML, you will see that there is a <myCustomValue> element as a child element of the <myNode> element, which is then a child element of the <myTestClass> element, the root. That path can be described like this in XPath: /myTestClass/myNode/myCustomValue.

We simply write the names of the elements we're looking for, separated with a forward-slash (/), which states that each element should be a child of the element named before the slash. Using this XPath is as simple as this:

C#
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.Load("https://mydomain.com/fem.xml");
XmlNode titleNode = xmlDoc.SelectSingleNode("/myTestClass/myNode/myCustomValue");
if(titleNode != null)
  Console.WriteLine(titleNode.InnerText);

We can also use the SelectNodes() method to find all the item nodes as a list and then print out information about them.

C#
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.Load("https://mydomain.com/fem.xml");
XmlNodeList itemNodes = xmlDoc.SelectNodes("/myTestClass/myArray/item");
foreach(XmlNode itemNode in itemNodes)
{
  XmlNode titleNode = itemNode.SelectSingleNode("title");
  if(titleNode != null)
    Console.WriteLine(titleNode.InnerText);
}

The SelectNodes() method takes an XPath query as a string, just like we saw in the previous example, and then it returns a list of XmlNode objects in an XmlNodeList collection.

Writing XML with the XmlWriter class

As with reading XML files, there are two main methods for writing XML files back to disk: the XmlWriter class and the XmlDocument class.

The difference between the two is mainly about memory consumption. The XmlWriter uses less memory than XmlDocument. Another important difference is that when using XmlDocument, you can read an existing file, manipulate it and then write back the changes. With XmlWriter, you will have to write the entire document each time.

Here's an example of writing XML using the XmlWriter class:

C#
// Disposing the writer at the end of the using block flushes and closes the file.
using (XmlWriter xmlWriter = XmlWriter.Create("test.xml"))
{
  xmlWriter.WriteStartDocument();
  xmlWriter.WriteStartElement("persons");

  xmlWriter.WriteStartElement("person");
  xmlWriter.WriteAttributeString("age", "32");
  xmlWriter.WriteString("Paisley Carson");
  xmlWriter.WriteEndElement(); // </person>

  xmlWriter.WriteStartElement("person");
  xmlWriter.WriteAttributeString("age", "22");
  xmlWriter.WriteString("Owen Broughton");
  xmlWriter.WriteEndElement(); // </person>

  xmlWriter.WriteEndElement(); // </persons>
  xmlWriter.WriteEndDocument();
}

And this is the result written to disk (the XML declaration is omitted and the elements are indented here for readability):

xml
<persons>
  <person age="32">Paisley Carson</person>
  <person age="22">Owen Broughton</person>
</persons>
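
By default, XmlWriter does not add line breaks or indentation. If you want the output itself to be laid out like the listing above, one option is to pass an XmlWriterSettings object to Create(). This is a sketch under a few assumptions: the IndentedWriterSketch class and its method names are invented for the example, and it writes to a StringBuilder instead of a file so the result is easy to inspect.

```csharp
using System;
using System.Text;
using System.Xml;

public static class IndentedWriterSketch
{
    // Builds the persons document as an indented string.
    public static string BuildPersonsXml()
    {
        var settings = new XmlWriterSettings
        {
            Indent = true,             // put child elements on their own lines
            IndentChars = "  ",        // two spaces per level
            NewLineChars = "\n",       // fixed newline so the output is the same on every platform
            OmitXmlDeclaration = true  // leave out the <?xml ...?> line for this illustration
        };

        var output = new StringBuilder();
        using (XmlWriter xmlWriter = XmlWriter.Create(output, settings))
        {
            xmlWriter.WriteStartElement("persons");
            WritePerson(xmlWriter, "32", "Paisley Carson");
            WritePerson(xmlWriter, "22", "Owen Broughton");
            xmlWriter.WriteEndElement(); // </persons>
        }

        return output.ToString();
    }

    // Writes one <person age="...">name</person> element.
    private static void WritePerson(XmlWriter xmlWriter, string age, string name)
    {
        xmlWriter.WriteStartElement("person");
        xmlWriter.WriteAttributeString("age", age);
        xmlWriter.WriteString(name);
        xmlWriter.WriteEndElement();
    }

    public static void Main()
    {
        Console.WriteLine(BuildPersonsXml());
    }
}
```

The same settings object can be passed to XmlWriter.Create("test.xml", settings) when writing to a file.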

Writing XML with the XmlDocument class

Here is the code for writing the same data to an XML file using the XmlDocument class. Rather than writing the markup out sequentially, it builds the document as a tree of node objects in memory and then saves it.

C#
XmlDocument xmlDoc = new XmlDocument();
XmlNode rootNode = xmlDoc.CreateElement("persons");
xmlDoc.AppendChild(rootNode);

XmlNode personNode = xmlDoc.CreateElement("person");
XmlAttribute attribute = xmlDoc.CreateAttribute("age");
attribute.Value = "32";
personNode.Attributes.Append(attribute);
personNode.InnerText = "Paisley Carson";
rootNode.AppendChild(personNode);

personNode = xmlDoc.CreateElement("person");
attribute = xmlDoc.CreateAttribute("age");
attribute.Value = "22";
personNode.Attributes.Append(attribute);
personNode.InnerText = "Owen Broughton";
rootNode.AppendChild(personNode);

xmlDoc.Save("test.xml");

As you can see, this is a bit more object-oriented than the XmlWriter approach, and it does require a bit more code, but imagine a situation where you just need to go into an existing XML document and change a few values. Using the first method, you would have to first read all the information using an XmlReader, store it, change it, and then write the entire information back using the XmlWriter.

The XmlDocument holds everything in memory so updating an existing XML file becomes a lot simpler.

C#
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.Load("test.xml");
XmlNodeList personNodes = xmlDoc.SelectNodes("/persons/person");
foreach(XmlNode personNode in personNodes)
{
    int age = int.Parse(personNode.Attributes["age"].Value);
    personNode.Attributes["age"].Value = (age + 1).ToString();
}
xmlDoc.Save("test.xml");

Serialization and Deserialization

Serialization is the process of saving class member data into an XML document which can be transmitted over the internet or saved to a text file. Deserialization is the reverse process - loading class member data from an XML document.

Only public classes and their public properties and fields can be serialised - methods and private members are not serialised and so cannot be deserialized. A class that is not public cannot be serialised; the code will compile, but creating the XmlSerializer for it throws an InvalidOperationException at runtime.

When you serialise to XML, the resulting file can be used to transmit data across the Internet via web services (a web service project will automatically serialise a class without your knowledge and deserialize it back into a class at the other end). It can also be used to save custom user or application preferences, save the state of an application when it is closed or export data for other programs or archiving.

Let's start with a simple test class comprising a couple of public fields, private fields and a method. For this example, fields have not been encapsulated to keep the code simple.

C#
public class myTestClass
{
  public string myString = "Hello World";
  public int myInt = 1234;
  public string[] myArray = new string[4];
  private int myPrivateInt = 4321;

  public string myMethod()
  {
    return "Hello World";
  }
}

If you just have a simple class with a default constructor (no parameters) and you do not need to control serialisation, then you need not do anything special. XmlSerializer cannot work with a class that has no parameterless constructor, so add one if it is missing. If you need to control how serialisation is performed, you can decorate members with attributes such as [XmlElement], [XmlAttribute] and [XmlIgnore], or implement the IXmlSerializable interface for full control. (The ISerializable interface applies to binary and SOAP serialisation rather than to XmlSerializer.)

We can create some simple code to convert the class above into an XML document using an XmlSerializer class and a StreamWriter.

C#
using System.IO;
using System.Xml.Serialization;

class Program
{
  static void Main()
  {
    // create new object and populate test data
    myTestClass test = new myTestClass();
    test.myArray[0] = "qwerty";
    test.myArray[1] = "asdfgh";
    test.myArray[2] = "zxcvbn";
    test.myArray[3] = "123456";

    // these lines do the actual serialization
    XmlSerializer mySerializer = new XmlSerializer(typeof(myTestClass));
    using (StreamWriter myWriter = new StreamWriter("c:/myTestClass.xml"))
    {
      mySerializer.Serialize(myWriter, test);
    } // the writer is closed here
  }
}

In this example, we first create an XmlSerializer, using the typeof operator to create a serializer specific to the myTestClass class. We then create a StreamWriter pointing to an XML file on the C drive. Next, we call the Serialize method of the serializer, passing in the writer and the object itself. Finally, the writer is closed.

This will result in an XML file like this:

xml
<?xml version="1.0" encoding="utf-8"?>
<myTestClass xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <myString>Hello World</myString>
  <myInt>1234</myInt>
  <myArray>
    <string>qwerty</string>
    <string>asdfgh</string>
    <string>zxcvbn</string>
    <string>123456</string>
  </myArray>
</myTestClass>

Deserialization

Deserialization is the reverse process. We will load in an XML document, pass the data to the deserializer and produce an instance of the class populated with the data.

C#
static void Main()
{
  myTestClass test;

  XmlSerializer mySerializer = new XmlSerializer(typeof(myTestClass));
  // open the file written by the serialization example; the using block closes it afterwards
  using (FileStream myFileStream = new FileStream("c:/myTestClass.xml", FileMode.Open))
  {
    test = (myTestClass)mySerializer.Deserialize(myFileStream);
  }
}

You need to cast the Deserialize result into the correct class for the data being received since it will by default return an object.
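
To see both directions working together without touching the file system, the two steps can be combined into an in-memory round trip. This is only a sketch: the MyTestData class and the XmlRoundTrip helper are hypothetical names introduced for the example, and StringWriter and StringReader stand in for the file streams used above.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical class for this sketch; it must be public for XmlSerializer to accept it.
public class MyTestData
{
    public string MyString = "Hello World";
    public int MyInt = 1234;
    public string[] MyArray = new string[0];
}

public static class XmlRoundTrip
{
    // Serialises the object to an XML string.
    public static string ToXml(MyTestData value)
    {
        var serializer = new XmlSerializer(typeof(MyTestData));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, value);
            return writer.ToString();
        }
    }

    // Rebuilds an object from an XML string; the cast is needed because Deserialize returns object.
    public static MyTestData FromXml(string xml)
    {
        var serializer = new XmlSerializer(typeof(MyTestData));
        using (var reader = new StringReader(xml))
        {
            return (MyTestData)serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        var original = new MyTestData { MyInt = 42, MyArray = new[] { "qwerty", "asdfgh" } };

        string xml = ToXml(original);
        Console.WriteLine(xml);

        MyTestData copy = FromXml(xml);
        Console.WriteLine(copy.MyInt + ", " + copy.MyArray.Length + " items");
    }
}
```

Note that a StringWriter reports its encoding as UTF-16, so the XML declaration in the string will say utf-16 rather than utf-8.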

If you are working with ASP.Net and you are using any session state mode other than in-process (InProc), such as State Server or SQL Server, then you will need to mark each class stored in the session as "Serializable" to prevent errors. You can also implement the ISerializable interface if you need more control.

C#
Session["sessionTest"] = test;

When the above code adds an instance of myTestClass to the ASP.Net session state (configured to use State Server or SQL Server), a SerializationException is raised at runtime:

Type 'myTestClass' in Assembly 'App_Code.nrkp4p10, Version=0.0.0.0, Culture=neutral, PublicKeyToken=null' is not marked as serializable.

This exception can be avoided by adding the class attribute Serializable to the declaration.

C#
[Serializable]
public class myTestClass
{
  public string myString = "Hello World";
  public int myInt = 1234;
  public string[] myArray = new string[4];
  private int myPrivateInt = 4321;

  public string myMethod()
  {
    return "Hello World";
  }
}

You can now add this class to the cache where it will be automatically serialised and deserialized when needed.

Automatically Generate Classes from XML

There is a command-line tool which you can use to generate C# classes from an XML document: the XML Schema Definition tool (xsd.exe), which can infer a schema from an XML file and then generate classes from that schema. This is especially useful if the XML document is very big and contains lots of complex types.

XML Web Services

XML Web Services provide a means of communication between distributed systems and enable true platform independence. Web services typically use a form of XML and SOAP to exchange data between the client and the server.

XML Web services allow data to be presented wrapped in an XML envelope for transport over networks, either LAN or WAN but mainly over the Internet. Web services expose business logic functions to web-enabled clients through the use of a protocol. The most common message formats are Simple Object Access Protocol (SOAP), which is an XML-based protocol, and JavaScript Object Notation (JSON), which is a lightweight data format rather than a protocol.

Web services provide information about themselves so that clients can write applications that talk to the service and use the data. This information is provided using a Web Services Description Language (WSDL) document.

The main advantage of Web Services is that they allow programs written in different languages to exchange data with each other in a standards-compliant way over the internet, and to work with your database through managed code rather than directly.


XML Web Services are best described as self-contained, modular applications that can be invoked across the Internet. They can handle anything from a simple task, such as retrieving a date/time string or stock quote, to a complex one, such as online ordering and B2B communications with Punchout and cXML.

XML Web Services can only access your business logic and database through the managed code within the service. This structure protects your data integrity and preserves existing business logic.

Applications can make use of multiple web services as a data source; each web service can be from a different supplier.


It is important to note that an XML Web Service written in .Net can be consumed by a web application written in PHP and vice-versa. An XML Web Service can be written using almost any programming language and consumed by any other. The only requirement is that they can serialise and deserialize data in an agreed format.

You can create a new XML Web Service project from within Visual Studio or Visual Web Developer and the project will provide a skeleton service that will display "Hello World".

C#
using System;
using System.Linq;
using System.Web;
using System.Web.Services;
using System.Web.Services.Protocols;
using System.Xml.Linq;

[WebService(Namespace = "http://tempuri.org/")]
[WebServiceBinding(ConformsTo = WsiProfiles.BasicProfile1_1)]
// To allow this Web Service to be called from script, using ASP.NET AJAX, uncomment the following line.
// [System.Web.Script.Services.ScriptService]
public class Service : System.Web.Services.WebService
{
  public Service () {

    // Uncomment the following line if using designed components
    // InitializeComponent();
  }

  [WebMethod]
  public string HelloWorld() {
    return "Hello World";
  }
}

You may notice that the HelloWorld method is marked with a WebMethod attribute. Only methods marked in this way will be exposed by the XML Web Service. You can create as many methods as you require, and they will remain hidden unless WebMethod is specified.

When creating a new XML Web Service, the standard template is a good start, but there are a few things that you should change, the first of which is the namespace URI. This should be changed to the base URL of the location of the published web service. Secondly, if your web service is going to be called from client-side script using ASP.Net AJAX, you should uncomment the System.Web.Script.Services.ScriptService line. You should also delete (or cannibalise) the HelloWorld method.

For this tutorial, we are going to create a web service with two methods, one to convert Celsius to Fahrenheit and another to do the opposite.

The first step is to create two methods to handle the conversions.

C#
[WebMethod]
public double toCelsius(string Fahrenheit)
{
  double degrees = System.Double.Parse(Fahrenheit);
  return (degrees - 32) * 5 / 9;
}

[WebMethod]
public double toFahrenheit(string Celsius)
{
  double degrees = System.Double.Parse(Celsius);
  return (degrees * 9 / 5) + 32;
}

And that's it! You can compile and run this web service now and Visual Studio will launch your web browser pointed at the service's landing page. From there you can invoke the methods. When you develop your own web services, don't forget to validate user input.
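
As one possible way of validating the input, the conversion logic can be moved into helpers that use double.TryParse instead of double.Parse, so that bad input is reported instead of throwing a FormatException. This is a sketch, not the only approach: the TemperatureConverter class and its method names are made up for the example, and parsing with the invariant culture is an assumption about how clients will format numbers.

```csharp
using System;
using System.Globalization;

public static class TemperatureConverter
{
    // Returns false instead of throwing when the text is not a valid number.
    public static bool TryToCelsius(string fahrenheit, out double celsius)
    {
        double degrees;
        if (!double.TryParse(fahrenheit, NumberStyles.Float, CultureInfo.InvariantCulture, out degrees))
        {
            celsius = 0;
            return false;
        }

        celsius = (degrees - 32) * 5 / 9;
        return true;
    }

    // The same check for the opposite conversion.
    public static bool TryToFahrenheit(string celsius, out double fahrenheit)
    {
        double degrees;
        if (!double.TryParse(celsius, NumberStyles.Float, CultureInfo.InvariantCulture, out degrees))
        {
            fahrenheit = 0;
            return false;
        }

        fahrenheit = (degrees * 9 / 5) + 32;
        return true;
    }

    public static void Main()
    {
        double result;
        if (TryToCelsius("212", out result))
            Console.WriteLine(result); // 100

        if (!TryToCelsius("not a number", out result))
            Console.WriteLine("Invalid input");
    }
}
```

A web method could call these helpers and return an error to the caller when they report false.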

Comments


  1. RA

    On Saturday 28th of July 2012, Rajeev said

    Wonderful explanation … I really appreciate the author for explain in the simple words . Great work :)

  2. EF

    On Wednesday 25th of April 2012, Efrat said

    Thank you very much for the article. very helpful.
    !!!

  3. MA

    On Wednesday 4th of April 2012, Mani said

    This article is great!

    I learned a lot and I believe that if every teacher in the world would teach that way, we would all have a Ph.D

    Thanks again,

    Mani

  4. CA

    On Sunday 25th of March 2012, Carlos said

    Really easy to understand, thank you.

  5. MA

    On Tuesday 3rd of January 2012, Mani said

    this page is so informative for layman like me.

  6. SO

    On Thursday 17th of November 2011, someuser said

    Very useful article.

    If you get an exception like I was getting at:

    XmlSerializer mySerializer = new XmlSerializer(typeof(myTestClass));

    Make sure, the type and public properties in that type all have the parameterless constructor.

  7. MA

    On Tuesday 4th of October 2011, Manikandan said

    Its very useful for very beginner of dot.net

  8. UD

    On Wednesday 7th of September 2011, utpal das said

    Can you please explain the exception "the writer is closed or in error state". I get this when i deserialize a saved file and populate my class, and again i try to serialize with the same name.

  9. AM

    On Wednesday 22nd of June 2011, Amey said

    Thanks a lot for this wonderful article.Keep it up.......

  10. BN

    On Wednesday 23rd of February 2011, Byron Nelson said

    Very nice little demo- exactly what I needed! I can't believe saving & retrieving structures is so easy. The times have sure changed.

  11. AC

    On Wednesday 24th of February 2010, Aaron Cardoz said

    What about the [NonSerialized] decoration of a member.

    This can also be used to exclude it from being serialized.

    1. Tim Trott

      On Wednesday 24th of February 2010, Tim Trott  Post Author replied

      While [System.NonSerialized] can be used to exclude public fields from being serialised, it cannot be used on properties; [XmlIgnore] however can be used on either type.

      You would normally use NonSerialized to serialise to binary or SOAP, and XmlIgnore if you were serialising to XML with an XmlSerializer.

  12. GE

    On Thursday 21st of January 2010, geggio said

    Is it possible to exclude some field from being serialized? The example would be the case when I don't want the string myString in the output xml.
    thanks

    1. Tim Trott

      On Friday 22nd of January 2010, Tim Trott  Post Author replied

      Yes it is possible to exclude a property from being serialised, simply add [XmlIgnore] before the declaration.

      Example:

      C#
      [Serializable]
      public class Product
      {
          public string StyleCode;
          public string Image;
          [XmlIgnore]
          public string Description;
          public decimal Price;
      }

      This will exclude Description from appearing in the serialised XML

  13. SC

    On Tuesday 19th of January 2010, Sanjay Chatterjee said

    Article is too good to read. I am totally clear to serialization concept now. Before I was confused. Thanks for writing in such a straight forward way. Keep it up.

  14. DE

    On Friday 27th of February 2009, Deven said

    Good One. Carry on the good work

  15. AN

    On Sunday 23rd of November 2008, anant said

    thanks its too good to read & understand

  16. PR

    On Thursday 23rd of October 2008, Prasad said

    Thank You for the Article