C#: Regular Expressions

September 27th, 2008 by Mel

I’ve decided to document what little knowledge I have on using Regular Expressions in C#. Nothing grand, just a list of formats, special characters and usage.

Control Characters:

Character Matches
. Any character except the newline (\n)
$ The end of a string (or the position just before a final newline)
^ The beginning of a string. Also used as the first character inside ‘[]’ to specify “not.”
+ One or more of the preceding character or group
* Zero or more of the preceding character or group
? Zero or one of the preceding character or group
\ Used to escape special characters as well as signify special character sets
( ) Used to group part of an expression and capture the text it matches
[ ] Used to specify a set of single characters or ranges to match
{ } Used to specify how many times to match the preceding character or group, e.g. {3} or {2,5}
| Used as a logical OR. Matches either the expression on its left or the one on its right
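
To see a few of these control characters working together, here is a minimal sketch using the Regex class from System.Text.RegularExpressions. The class name, method names and sample patterns are made up for illustration only.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, named here only for illustration.
public static class ControlCharacterDemo
{
    // ^ and $ anchor the pattern to the whole string;
    // {3} and {4} repeat the preceding \d an exact number of times.
    public static bool IsPhoneNumber(string input)
    {
        return Regex.IsMatch(input, @"^\d{3}-\d{4}$");
    }

    // ( ) groups the alternatives separated by |;
    // ? makes the trailing "s" optional.
    public static bool IsPet(string input)
    {
        return Regex.IsMatch(input, @"^(cat|dog)s?$");
    }

    // [^,] is a negated set (anything but a comma);
    // + asks for one or more of those characters.
    public static string FirstField(string csvLine)
    {
        return Regex.Match(csvLine, @"^[^,]+").Value;
    }
}
```

Calling ControlCharacterDemo.FirstField("alpha,beta") returns "alpha", since the match stops at the first comma.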

Special Character Sets:

Character Matches
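\w Any word character. Same as [A-Za-z0-9_] with RegexOptions.ECMAScript; by default .NET also matches Unicode letters and digits
\W Any non-word character. The opposite of \w
\s Any whitespace character. Same as [ \f\n\r\t\v] with RegexOptions.ECMAScript; by default .NET also matches Unicode whitespace
\S Any non-whitespace character. The opposite of \s
\d Any digit. Same as [0-9] with RegexOptions.ECMAScript; by default .NET matches any Unicode decimal digit
\D Any non-digit. The opposite of \d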
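
Here is a rough sketch of the character sets in use. One thing worth knowing is that \w, \s and \d are Unicode-aware in .NET unless RegexOptions.ECMAScript is passed, and the last method tries to show that difference. The class and method names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, named here only for illustration.
public static class CharacterSetDemo
{
    // \s+ matches a run of one or more whitespace characters,
    // so tabs, spaces and newlines all act as separators.
    public static string[] SplitOnWhitespace(string input)
    {
        return Regex.Split(input.Trim(), @"\s+");
    }

    // \d matches one digit; Matches() returns every occurrence.
    public static int CountDigits(string input)
    {
        return Regex.Matches(input, @"\d").Count;
    }

    // With asciiOnly set, RegexOptions.ECMAScript limits \w to [A-Za-z0-9_].
    // Without it, \w also accepts Unicode letters and digits.
    public static bool IsWord(string input, bool asciiOnly)
    {
        RegexOptions options = asciiOnly ? RegexOptions.ECMAScript : RegexOptions.None;
        return Regex.IsMatch(input, @"^\w+$", options);
    }
}
```

For example, IsWord("na\u00EFve", false) is true because the accented letter counts as a word character by default, while IsWord("na\u00EFve", true) is false.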

Escaped Characters:

Character Matches
\t Horizontal Tab
\r Carriage Return
\n New Line
\b Backspace, but only inside ‘[]’. Anywhere else in a pattern, \b matches a word boundary
\x5C An ASCII character in 2-digit hexadecimal (5C is the backslash)
\u4B52 A Unicode character in 4-digit hexadecimal
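
A short sketch of the escapes in practice follows. The part I would watch out for is \b, which means a word boundary in the body of a pattern and a backspace only inside a character class. As before, the class and method names are just placeholders.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, named here only for illustration.
public static class EscapeDemo
{
    // \x5C is the hexadecimal escape for the backslash character.
    public static bool ContainsBackslash(string input)
    {
        return Regex.IsMatch(input, @"\x5C");
    }

    // \t matches a horizontal tab, handy for tab-separated lines.
    public static string[] SplitOnTab(string line)
    {
        return Regex.Split(line, @"\t");
    }

    // Outside a character class, \b is a word boundary,
    // so only whole-word occurrences are counted.
    public static int CountWholeWord(string text, string word)
    {
        return Regex.Matches(text, @"\b" + Regex.Escape(word) + @"\b").Count;
    }

    // Inside [ ], \b is the backspace character (U+0008).
    public static bool ContainsBackspace(string input)
    {
        return Regex.IsMatch(input, @"[\b]");
    }
}
```

With these, CountWholeWord("cat concat cat", "cat") returns 2, because the "cat" inside "concat" does not start at a word boundary.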

