Category Archive: Regular Expressions

By Mel

C#: Lower Case All XML Tags with Regex

When you use LINQ to XML to read a document from a source you don’t control, it can be useful to convert all tags and attributes to lower case before processing the XML. LINQ to XML is case-sensitive, and you can’t always rely on the program producing the XML to follow your casing standard for elements and attributes.

So here’s a quick and dirty single line of code that will accomplish just this in C# using a regular expression:

string lowered = Regex.Replace(
    xml,
    @"<[^<>]+>",
    m => m.Value.ToLowerInvariant());

And here’s that functionality all nice and wrapped up as an extension method on string, ready to call before the text is handed to XElement.Parse:

public static class XElementExt
{
    // Lower-cases everything between '<' and '>' in the given XML text.
    public static string LowerCaseTags(this string xml)
    {
        return Regex.Replace(
            xml,
            @"<[^<>]+>",
            m => m.Value.ToLowerInvariant());
    }
}

Note: The Regex class is defined in System.Text.RegularExpressions
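
To show why the lower-casing step helps, here is a minimal sketch that lowers the tags and then queries the result with LINQ to XML. The class name LowerCaseDemo, the method ReadName, and the sample element names are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;
using System.Xml.Linq;

public static class LowerCaseDemo
{
    // Hypothetical helper: returns the text of the <name> child, whatever casing the source used.
    public static string ReadName(string xml)
    {
        // Lower-case everything inside each tag so lookups can rely on one casing.
        string lowered = Regex.Replace(xml, @"<[^<>]+>", m => m.Value.ToLowerInvariant());

        XElement root = XElement.Parse(lowered);

        // Element() is case-sensitive, so "name" only matches because the tags were lowered.
        XElement name = root.Element("name");
        return name == null ? null : name.Value;
    }
}
```

Calling LowerCaseDemo.ReadName("<Customer><NAME>Ada Lovelace</NAME></Customer>") returns the text with its casing intact, because only the tags are touched.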

Here’s an example of the resulting effect.

Before:

<ELEMENT>
    <ChildTag>
        <inner>This text Will not Be Harmed!</inner>
    </ChildTag>
</ELEMENT>

After:

<element>
    <childtag>
        <inner>This text Will not Be Harmed!</inner>
    </childtag>
</element>

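You’ll notice that with this method everything between the angle brackets of a tag is converted to lower case, not just the element and attribute names. This means that attribute values will lose any special casing they may have had, which may or may not be a problem for what you’re doing. The pattern also matches comments, CDATA sections, and DOCTYPE declarations that contain no nested angle brackets, so those are lower-cased as well.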
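
If the casing of attribute values matters, one possible variation is to lower-case only the unquoted parts of each tag. The sketch below is not part of the original one-liner: the names XmlCaseSketch and LowerCaseNames are hypothetical, and it assumes attribute values are quoted and contain no angle brackets.

```csharp
using System;
using System.Text.RegularExpressions;

public static class XmlCaseSketch
{
    // One whole tag, e.g. <Item Name="Foo">. Assumes no '<' or '>' inside attribute values.
    private static readonly Regex Tag = new Regex(@"<[^<>]+>");

    // Either a quoted attribute value or a run of characters outside quotes.
    private static readonly Regex Piece = new Regex("\"[^\"]*\"|'[^']*'|[^\"']+");

    public static string LowerCaseNames(string xml)
    {
        return Tag.Replace(xml, tag =>
            Piece.Replace(tag.Value, piece =>
                // Quoted pieces are attribute values: leave them exactly as they were.
                piece.Value[0] == '"' || piece.Value[0] == '\''
                    ? piece.Value
                    : piece.Value.ToLowerInvariant()));
    }
}
```

With this version, <Item Name="Foo Bar"> becomes <item name="Foo Bar">, and text between tags is still left alone.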

By Mel

Credit Card Regular Expression

For the project I’m currently working on, I need to verify credit card numbers entered by the user. I found a regular expression online that did almost all of it, but it lacked a few necessary validations, such as Discover cards and 13-digit Visas, so I modified it to work with almost all forms of Visa, MasterCard, American Express, and Discover cards.

It also supports white space and dashes in between blocks of numbers, as would be found on an actual credit card.

Here it is:

^(?:(?:4\d{3}|5[1-5]\d{2}|6011)[-\s]?\d{4}[-\s]?\d{4}[-\s]?\d{4}|3[47][\d\s-]{13}|4[\d\s-]{12})$

It’s not 100% perfect: the American Express and 13-digit Visa branches accept any mix of digits, spaces, and dashes of the right total length, so they can let invalid numbers through and will reject a valid number written with separators. For best results, strip out any non-digits from the input string before running it through the regular expression.
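
The strip-then-match advice can be sketched roughly as follows. This is an illustration rather than the expression above: CardNumberSketch and LooksLikeCardNumber are hypothetical names, and the digits-only pattern assumes the same prefixes and lengths as the original (13- or 16-digit Visa, 16-digit MasterCard 51-55, 16-digit Discover 6011, 15-digit American Express 34 or 37).

```csharp
using System;
using System.Text.RegularExpressions;

public static class CardNumberSketch
{
    // Digits-only pattern, applied after separators have been removed.
    private static readonly Regex DigitsOnly = new Regex(
        @"^(?:4\d{12}(?:\d{3})?|5[1-5]\d{14}|6011\d{12}|3[47]\d{13})$");

    public static bool LooksLikeCardNumber(string input)
    {
        if (input == null) return false;

        // Remove spaces, dashes, and anything else that is not 0-9.
        string digits = Regex.Replace(input, @"[^0-9]", "");

        return DigitsOnly.IsMatch(digits);
    }
}
```

Because separators are gone before the match, "3782-822463-10005" and "4111 1111 1111 1111" are treated the same as their unseparated forms. This only checks prefix and length, not whether the number is actually issued.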

Cheers!

By Mel

C#: Regular Expressions

I’ve decided to document what little knowledge I have on using Regular Expressions in C#. Nothing grand, just a list of formats, special characters and usage.

Control Characters:

Character   Matches
.           Any character except the newline (\n)
$           The end of the string
^           The beginning of the string. Also used at the start of ‘[]’ to specify “not.”
+           One or more of the preceding element
*           Zero or more of the preceding element
?           Zero or one of the preceding element
\           Escapes a special character, or introduces a special character set such as \d
( )         Groups a subexpression and captures what it matches
[ ]         A set of single characters or ranges, any one of which may match
{ }         How many times to match the preceding element, e.g. {3} or {2,5}
|           Logical OR. Allows one of several alternative expressions to match
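
A few of the control characters above in action, as a small sketch. The class and method names are made up for the example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ControlCharacterDemo
{
    // ^ and $ anchor the pattern; {3} repeats the preceding [0-9] exactly three times.
    public static bool IsThreeDigits(string s) => Regex.IsMatch(s, @"^[0-9]{3}$");

    // ? makes the "u" optional; | picks between the alternatives inside the ( ) group.
    public static bool IsColorWord(string s) => Regex.IsMatch(s, @"^(colou?r|hue)$");

    // [^...] negates the set; * allows zero or more, so the empty string matches too.
    public static bool HasNoDigits(string s) => Regex.IsMatch(s, @"^[^0-9]*$");

    // \. escapes the dot so it matches a literal period instead of any character.
    public static bool EndsWithDotTxt(string s) => Regex.IsMatch(s, @"\.txt$");
}
```

For instance, IsColorWord accepts "color", "colour", and "hue", while EndsWithDotTxt accepts "notes.txt" but not "notesxtxt".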

Special Character Sets:

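Character   Matches
\w          Any word character. Same as [A-Za-z0-9_] for ASCII text; .NET also accepts other Unicode letters and digits
\W          Any non-word character. The inverse of \w
\s          Any whitespace character. Includes [ \t\r\n\f\v]
\S          Any non-whitespace character. The inverse of \s
\d          Any digit. Same as [0-9] for ASCII text; .NET also accepts other Unicode decimal digits
\D          Any non-digit. The inverse of \d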
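
In .NET the shorthand sets are Unicode-aware by default, so they can match more than the plain ASCII ranges. The sketch below illustrates this with \s and \d, and uses RegexOptions.ECMAScript to narrow \d back to ASCII digits. The class and method names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CharacterClassDemo
{
    // \s covers line breaks as well as spaces and tabs.
    public static bool NewlineIsWhitespace() => Regex.IsMatch("\n", @"\s");

    // By default \d accepts any Unicode decimal digit, e.g. ARABIC-INDIC DIGIT THREE (U+0663).
    public static bool DefaultDigits(string s) => Regex.IsMatch(s, @"^\d+$");

    // RegexOptions.ECMAScript restricts \d (and \w, \s) to their ASCII definitions.
    public static bool AsciiDigits(string s) => Regex.IsMatch(s, @"^\d+$", RegexOptions.ECMAScript);
}
```

Here DefaultDigits("\u0663") is true while AsciiDigits("\u0663") is false; both accept "42". Writing [0-9] explicitly is another way to get ASCII-only digits.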
