Posts Tagged ‘Regex’

C#: Lower Case All XML Tags with Regex

April 24th, 2009

Sometimes when accepting an XML document from an uncontrolled source using Linq to XML, it’s useful to convert all tags and attributes to lower case before processing the XML. This is because Linq to XML is case-sensitive and you can’t always rely on the program producing the XML to follow your casing standard for elements and attributes.

So here’s a quick and dirty single line of code that will accomplish just this in C# using a regular expression:

Regex.Replace(
    xml, 
    @"<[^<>]+>",
    m => { return m.Value.ToLower(); }, 
    RegexOptions.Multiline | RegexOptions.Singleline);

And here’s that functionality all nice and wrapped up inside of an extension for XElement:

public static class XElementExt
{
    public static string LowerCaseTags(string xml)
    {
        return Regex.Replace(
            xml,
            @"<[^<>]+>",
            m => { return m.Value.ToLower(); },
            RegexOptions.Multiline | RegexOptions.Singleline);
    }
}

Note: The Regex class is defined in System.Text.RegularExpressions

Here’s an example of the resulting affect.

Before:

<ParentNode>
   <ChildItem TestAttribute="ValueCasing" >
	This text Will not Be Harmed!
   </ChildItem>
</ParentNode>

After:

<parentnode>
   <childitem testattribute="valuecasing" >
	This text Will not Be Harmed!
   </childitem>
</parentnode>

You’ll notice that with this method all text within element tags is converted to lower case. This means that attribute values will lose any special casing they may have had, which may or may not be a problem for what you’re doing.