Simple XML Parser in C# using XmlDocument

In this article we will look at how to read a website's Sitemap.xml with C# and parse its contents using a simple XML parser.

By Tim Trott | C# ASP.Net MVC | September 14, 2009

An XML Sitemap is a structured XML file that lists the pages of a website, along with some metadata about each one, so that search engine crawlers can index them. The basic sitemap structure looks like this:

xml
<urlset xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
  <url>
    <loc>https://lonewolfonline.net/</loc>
    <priority>1.0</priority>
    <lastmod>2010-09-14</lastmod>
    <changefreq>daily</changefreq>
  </url>
  <url>
    <loc>https://lonewolfonline.net/simple-xml-parser/</loc>
    <priority>0.5</priority>
    <lastmod>2009-09-14</lastmod>
    <changefreq>monthly</changefreq>
  </url>
</urlset>

Individual <url> nodes are wrapped inside the containing <urlset> node. Each <url> represents a page on the site. Inside each <url> node are four child nodes.

The <loc> node represents the page URL.

The <priority> node represents the webmaster-defined priority of the page relative to the other pages on the same site, from 0.0 to 1.0.

The <lastmod> node represents the date on which the page was last modified.

The <changefreq> node indicates how often the page is updated and suggests to the search engine how often to crawl it again.
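As a rough illustration of how these four nodes could map onto C# types, here is a minimal sketch. The SitemapEntry class and its FromText method are hypothetical names made up for this example and are not part of the sample project. It assumes that only <loc> is always present, so the other values are treated as optional.

```csharp
using System;
using System.Globalization;

// Hypothetical type for illustration; not part of the article's sample project.
public class SitemapEntry
{
    public string Loc { get; set; }
    public decimal? Priority { get; set; }
    public DateTime? LastMod { get; set; }
    public string ChangeFreq { get; set; }

    // Builds an entry from the raw text of the four child nodes.
    // Optional values may be null or empty when a node is absent.
    public static SitemapEntry FromText(string loc, string priority, string lastmod, string changefreq)
    {
        return new SitemapEntry
        {
            Loc = loc,
            // Invariant culture so that "0.5" parses the same way on every machine
            Priority = string.IsNullOrWhiteSpace(priority)
                ? (decimal?)null
                : decimal.Parse(priority, CultureInfo.InvariantCulture),
            LastMod = string.IsNullOrWhiteSpace(lastmod)
                ? (DateTime?)null
                : DateTime.Parse(lastmod, CultureInfo.InvariantCulture),
            ChangeFreq = changefreq
        };
    }
}
```

For the second entry in the sitemap above, SitemapEntry.FromText("https://lonewolfonline.net/simple-xml-parser/", "0.5", "2009-09-14", "monthly") would give a Priority of 0.5 and a LastMod of 14 September 2009.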


Writing a Simple XML Parser in C#

For this example, I am creating a small console application and outputting the results to the screen. I am also reading the sitemap from a file, but you could just as easily download it from a website instead.
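If you would rather download the sitemap than read it from disk, one possible approach is sketched below. It assumes HttpClient is available (.NET Framework 4.5 or later, or any .NET Core / .NET version). The SitemapLoader class and its method names are hypothetical and are not taken from the sample project.

```csharp
using System.Net.Http;
using System.Threading.Tasks;
using System.Xml;

// Hypothetical helper for illustration; not part of the article's sample project.
public static class SitemapLoader
{
    // Parses sitemap XML that is already in memory.
    public static XmlDocument FromString(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc;
    }

    // Downloads the sitemap at the given URL and parses it.
    // The caller supplies the HttpClient so that it can be reused.
    public static async Task<XmlDocument> FromUrlAsync(HttpClient client, string url)
    {
        string xml = await client.GetStringAsync(url);
        return FromString(xml);
    }
}
```

From an async method this could be called as var doc = await SitemapLoader.FromUrlAsync(client, "https://example.com/sitemap.xml"); where the URL is only a placeholder. The returned XmlDocument can then be read in the same way as one loaded from a file.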

You can also download a sample project from GitHub.

C#
using System;
using System.Xml;

namespace SitemapXMLParser
{
    class Program
    {
        static void Main(string[] args)
        {
            // Load the sitemap from a file in the working directory
            XmlDocument urldoc = new XmlDocument();
            urldoc.Load("Sitemap.xml");

            // Each <url> element describes one page on the site
            XmlNodeList xnList = urldoc.GetElementsByTagName("url");

            foreach (XmlNode node in xnList)
            {
                // The indexer returns null when a child element is missing,
                // so ?. is used to avoid a NullReferenceException
                Console.WriteLine("url " + node["loc"]?.InnerText);
                Console.WriteLine("priority " + node["priority"]?.InnerText);
                Console.WriteLine("last modified " + node["lastmod"]?.InnerText);
                Console.WriteLine("change frequency " + node["changefreq"]?.InnerText);
                Console.WriteLine(Environment.NewLine);
            }
        }
    }
}
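GetElementsByTagName("url") matches on the qualified element name, so it would miss elements written with a prefix such as <sm:url>. A namespace-aware alternative is sketched below using XmlNamespaceManager and XPath. The SitemapReader class, the ReadLocations method and the "sm" prefix are names chosen for this example only.

```csharp
using System.Collections.Generic;
using System.Xml;

// Hypothetical helper for illustration; not part of the article's sample project.
public static class SitemapReader
{
    // Namespace URI declared on the <urlset> element
    private const string SitemapNs = "http://www.sitemaps.org/schemas/sitemap/0.9";

    // Returns the text of every <loc> element found inside a <url> element.
    public static List<string> ReadLocations(XmlDocument doc)
    {
        // "sm" is an arbitrary prefix used only inside the XPath queries below
        var ns = new XmlNamespaceManager(doc.NameTable);
        ns.AddNamespace("sm", SitemapNs);

        var locations = new List<string>();
        foreach (XmlNode url in doc.SelectNodes("//sm:url", ns))
        {
            XmlNode loc = url.SelectSingleNode("sm:loc", ns);
            if (loc != null)
            {
                locations.Add(loc.InnerText.Trim());
            }
        }
        return locations;
    }
}
```

Passing the urldoc object from the example above to SitemapReader.ReadLocations would return the page URLs as a list of strings, whether or not the sitemap uses a namespace prefix.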
