C#: Lower Case All XML Tags with Regex

April 24th, 2009 by Mel

When you accept an XML document from an uncontrolled source and process it with LINQ to XML, it’s sometimes useful to convert all tags and attributes to lower case first. LINQ to XML is case-sensitive, and you can’t always rely on the program producing the XML to follow your casing standard for elements and attributes.

So here’s a quick and dirty single line of code that will accomplish just this in C# using a regular expression:

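// [^<>] already matches newlines, so no RegexOptions are needed.
// ToLowerInvariant keeps the result independent of the current culture.
string lowered = Regex.Replace(
    xml,
    @"<[^<>]+>",
    m => m.Value.ToLowerInvariant());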

And here’s that functionality wrapped up as an extension method on string, so it can be applied before the XML is parsed into an XElement:

using System.Text.RegularExpressions;

public static class XElementExt
{
    // Lower-cases everything between the angle brackets of each tag.
    public static string LowerCaseTags(this string xml)
    {
        return Regex.Replace(
            xml,
            @"<[^<>]+>",
            m => m.Value.ToLowerInvariant());
    }
}

Note: The Regex class is defined in the System.Text.RegularExpressions namespace.
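
To show where this fits with LINQ to XML, here is a minimal sketch of lower-casing a document and then querying it by lower-case names. The class name LowerCaseDemo and the method FirstChildAttribute are made up for illustration, and the element and attribute names are taken from the example further down; treat it as one possible usage rather than the only way to wire it up.

```csharp
using System;
using System.Text.RegularExpressions;
using System.Xml.Linq;

// Hypothetical demo class; not part of the original post.
public static class LowerCaseDemo
{
    // Same pattern as in the post: lower-case everything inside each tag.
    public static string LowerCaseTags(string xml) =>
        Regex.Replace(xml, @"<[^<>]+>", m => m.Value.ToLowerInvariant());

    // Parses the lower-cased XML and reads one attribute by its lower-case name.
    // Returns null if the element or attribute is missing.
    public static string FirstChildAttribute(string xml)
    {
        XElement root = XElement.Parse(LowerCaseTags(xml));
        return (string)root.Element("childitem")?.Attribute("testattribute");
    }
}
```

Without the lower-casing step, root.Element("childitem") would return null for a document whose producer wrote ChildItem, because LINQ to XML compares names case-sensitively.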

Here’s an example of the resulting effect.

Before:

<ParentNode>
   <ChildItem TestAttribute="ValueCasing" >
	This text Will not Be Harmed!
   </ChildItem>
</ParentNode>

After:

<parentnode>
   <childitem testattribute="valuecasing" >
	This text Will not Be Harmed!
   </childitem>
</parentnode>

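You’ll notice that this method converts everything inside the angle brackets of each tag to lower case, attribute values included. Attribute values therefore lose any special casing they had, which may or may not be a problem for what you’re doing. Text between tags is left alone, but anything else the pattern matches, such as a comment or a CDATA section that contains no angle brackets of its own, is lower-cased as well.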
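
If attribute values or CDATA content need to keep their casing, one possible refinement is to skip CDATA sections and comments entirely and, inside ordinary tags, lower-case only the parts that are not quoted. The sketch below is an assumption-laden variant, not the method from this post: the class XmlCasingSketch and the method LowerCaseNamesOnly are hypothetical names, and like the original it is still a regex hack that will mishandle an attribute value containing a literal > character.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical variant; names are invented for illustration.
public static class XmlCasingSketch
{
    // CDATA sections and comments are listed first so they are consumed whole;
    // the last alternative is the tag pattern from the post.
    private static readonly Regex Markup = new Regex(
        @"<!\[CDATA\[.*?\]\]>|<!--.*?-->|<[^<>]+>",
        RegexOptions.Singleline);

    // Splits a tag into quoted attribute values and unquoted runs.
    private static readonly Regex QuotedOrBare = new Regex(
        @"""[^""]*""|'[^']*'|[^""']+");

    public static string LowerCaseNamesOnly(string xml)
    {
        return Markup.Replace(xml, tag =>
        {
            // Leave CDATA, comments, and other "<!" declarations untouched.
            if (tag.Value.StartsWith("<!", StringComparison.Ordinal))
                return tag.Value;

            // Lower-case unquoted runs (tag and attribute names); keep quoted values.
            return QuotedOrBare.Replace(tag.Value, part =>
                part.Value[0] == '"' || part.Value[0] == '\''
                    ? part.Value
                    : part.Value.ToLowerInvariant());
        });
    }
}
```

Applied to the ChildItem line from the example above, this should yield <childitem testattribute="ValueCasing" >, with the attribute value intact. ToLowerInvariant is used so that names are lower-cased the same way under every culture; a culture-sensitive ToLower under a Turkish culture maps I to ı, which would change the names themselves.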


12 comments

  1. Trafz says:

    This ruins CDATA 🙁

  2. sliper dragon says:

    thanks…

  3. Very use full. Thx 😉

  4. Khawar says:

    Great work…..
    problem solved… =))

  5. ravee says:

    tahnks a toneee gr8 work

  6. Dänu says:

    Thanks mate, used this one to lower case mshtml generated html tags.

  7. Kaan says:

    I and i aren’t same in Turkish..
    So, Isparta isn’t isparta..
    İstanbul = istanbul
    Isparta = ısparta
    <meta name="keywords" content="Isparta, İstanbul…

    • Mel says:

      Hey Kaan,

      Thanks for the comment! Yes, if you’re implementing this method you need to be sure that any case-sensitive data is not stored in attributes, but is instead stored inside the element.

      Cheers!

  8. Senthil says:

    how to read website link in the XML file? using c# .net
