C#: Retrieve data from webpage

So I came across a fun assignment this week that I’m sure has been done by many different people in many different programming languages. The challenge was to “scrape” a website for information automatically and save it to a file.

I accomplished this by first using a wrapper class for .NET’s own HttpWebRequest object that simplified posting to a web site and retrieving the result. I then used regular expressions to find the data I wanted, stored it in a string, and later wrote it to a file.
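
To give a feel for what such a wrapper hides, here is a minimal sketch that uses HttpWebRequest directly. It is not the wrapper class itself: the class name SimpleHttp and its two methods are made up for illustration, and the URL in the usage note is a placeholder. On newer .NET versions HttpClient is the more usual choice.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using System.Text;

// Hypothetical helper, not part of any library.
public static class SimpleHttp
{
    // Builds "baseUrl?name1=value1&name2=value2" with URL-encoded names and values.
    public static string BuildGetUrl(string baseUrl, IDictionary<string, string> parameters)
    {
        StringBuilder sb = new StringBuilder(baseUrl);
        char separator = '?';
        foreach (KeyValuePair<string, string> pair in parameters)
        {
            sb.Append(separator);
            sb.Append(Uri.EscapeDataString(pair.Key));
            sb.Append('=');
            sb.Append(Uri.EscapeDataString(pair.Value));
            separator = '&';
        }
        return sb.ToString();
    }

    // Sends a GET request and returns the response body (the page's HTML).
    public static string Get(string url)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "GET";
        using (WebResponse response = request.GetResponse())
        using (StreamReader reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Calling SimpleHttp.BuildGetUrl("http://example.com/search", parameters) with a single "query" parameter and passing the result to SimpleHttp.Get would return the raw HTML of the results page.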

I’m not going to provide the specific program I wrote as it’s still proprietary, but I will give a small example of how this can be done. The example will include: posting to a website, retrieving the results (HTML for the page), and parsing the resulting page to find what you want.

The class I used to post to the site was written by Robert May and can be found here:

Here is an example of using this class to perform a search at CraigsList under the ‘for sale’ category and retrieving the results:

// Create the post object
PostSubmitter post =
    new PostSubmitter("");

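// Add our parameters. This assumes the class exposes its parameters
// through a PostItems collection; "query" is a placeholder name, so use
// whatever parameter name the target search form expects.
post.PostItems.Add("query", "Ford Truck");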

// Specify our action type (Post | Get)
post.Type = PostSubmitter.PostTypeEnum.Get;

// Retrieve the results
string result = post.Post();
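
The parsing step can be sketched along these lines. This is only an illustration, not the program I actually wrote: the class name ListingParser, the link pattern, and the output path are all assumptions, and a real results page will need a pattern tailored to its own HTML.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

// Hypothetical helper for pulling link text out of a page of HTML.
public static class ListingParser
{
    // Assumed pattern: matches <a href="...">text</a> and names the two parts.
    private static readonly Regex LinkPattern = new Regex(
        "<a\\s+href=\"(?<url>[^\"]*)\"[^>]*>(?<title>[^<]*)</a>",
        RegexOptions.IgnoreCase);

    // Returns the trimmed text of every link found in the HTML.
    public static List<string> ExtractTitles(string html)
    {
        List<string> titles = new List<string>();
        foreach (Match match in LinkPattern.Matches(html))
        {
            titles.Add(match.Groups["title"].Value.Trim());
        }
        return titles;
    }

    // Writes one title per line to the given file.
    public static void SaveTitles(string html, string path)
    {
        File.WriteAllLines(path, ExtractTitles(html).ToArray());
    }
}
```

With the result string from above, ListingParser.SaveTitles(result, "results.txt") would write each link's text on its own line. Regular expressions are fine for quick jobs like this, but they break easily when the page layout changes.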

Green Sudoku, a C# App

So I’ve written a small Sudoku game in C#. I originally started writing it so that I could have an interface to test my Sudoku Solving Algorithm. But as I worked on it, it became more and more a nice little application worthy of its own existence.

It only has 3 puzzles hard-coded into it that I’ve been using for testing purposes, but they’re fun to play, even though they’re a little on the easy side. My plan is to update the game to automatically download puzzles from some of the major Sudoku game websites that are out there. This would allow you to play literally millions of puzzles from several different levels of difficulty.

So I post it here for your puzzle-solving pleasure! Please be patient with it as it has a few bugs I’m still working out. If you happen to find any, please let me know by commenting on this post!

This link will provide you with a direct download to a zip file that contains the game and a few files it needs to run. After downloading it just extract it anywhere you’d like and enjoy!

* (360KB)

[Image: Green Sudoku screenshot]

*Note: You need to have .NET Framework version 2.0 or later installed on your system. If you’re not sure whether you have it, you can download and run this small program from Microsoft, which will install it if it’s not present on your system.


C#: Regular Expressions

I’ve decided to document what little knowledge I have on using Regular Expressions in C#. Nothing grand, just a list of formats, special characters and usage.

Control Characters:

Character   Matches
.           Any single character except the newline (\n)
$           The end of the string (or just before a final newline)
^           The beginning of the string. As the first character inside ‘[]’ it means “not.”
+           One or more of the preceding character or group
*           Zero or more of the preceding character or group
?           Zero or one of the preceding character or group
\           Escapes special characters and introduces special character sets (such as \d)
( )         Groups a subexpression so it can be repeated or captured
[ ]         A set of single characters or ranges, any one of which may match
{ }         How many times to match the preceding character or group (such as {3} or {2,5})
|           Logical OR. Matches the expression on either side
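
Here is a small sketch that puts several of these characters to work. The class name, method names, and patterns are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical examples of the control characters listed above.
public static class ControlCharacterExamples
{
    // ^ and $ anchor the pattern to the start and end of the input;
    // {3} and {4} say how many digits to expect on each side of the dash.
    public static bool LooksLikePhoneNumber(string input)
    {
        return Regex.IsMatch(input, @"^[0-9]{3}-[0-9]{4}$");
    }

    // ( ) groups the alternatives, | chooses between them,
    // and ? makes the trailing "s" optional.
    public static bool IsColorWord(string input)
    {
        return Regex.IsMatch(input, @"^(red|green|blue)s?$");
    }

    // [^aeiou] means "any character that is not one of these";
    // + asks for one or more of them in a row.
    public static string FirstNonVowelRun(string input)
    {
        return Regex.Match(input, @"[^aeiou]+").Value;
    }

    // . matches any character except \n, * allows zero or more of them,
    // and \. is an escaped, literal period.
    public static bool EndsWithTxt(string input)
    {
        return Regex.IsMatch(input, @"^.*\.txt$");
    }
}
```

For example, LooksLikePhoneNumber("555-1234") and IsColorWord("greens") both return true, while EndsWithTxt("notestxt") returns false because the escaped period has to match a real period.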

Special Character Sets:

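Character   Matches
\w          Any word character. For ASCII text, same as [A-Za-z0-9_]
\W          Any non-word character. For ASCII text, same as [^A-Za-z0-9_]
\s          Any whitespace character. For ASCII text, same as [ \f\n\r\t\v]
\S          Any non-whitespace character. For ASCII text, same as [^ \f\n\r\t\v]
\d          Any digit. For ASCII text, same as [0-9]
\D          Any non-digit. For ASCII text, same as [^0-9]

By default .NET also matches the Unicode equivalents (for example, accented letters for \w). Passing RegexOptions.ECMAScript restricts these sets to the ASCII ranges shown.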
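
A quick sketch of these character sets in use. Again, the class and method names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical examples of the special character sets listed above.
public static class CharacterSetExamples
{
    // \d+ matches each run of one or more digits.
    public static List<string> FindNumbers(string input)
    {
        List<string> numbers = new List<string>();
        foreach (Match match in Regex.Matches(input, @"\d+"))
        {
            numbers.Add(match.Value);
        }
        return numbers;
    }

    // \w+ matches each run of word characters (letters, digits, underscore).
    public static List<string> FindWords(string input)
    {
        List<string> words = new List<string>();
        foreach (Match match in Regex.Matches(input, @"\w+"))
        {
            words.Add(match.Value);
        }
        return words;
    }

    // \s+ matches a run of whitespace (spaces, tabs, newlines);
    // each run is replaced with a single space.
    public static string CollapseWhitespace(string input)
    {
        return Regex.Replace(input, @"\s+", " ");
    }
}
```

For example, FindNumbers("2 doors, 150000 miles") returns "2" and "150000", and CollapseWhitespace turns any mix of spaces, tabs, and newlines between two words into a single space.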

