Friday 19 December 2014

DotNetZip - Zip Compression and Decompression .NET 4 and Earlier

A quick post about a useful third-party zip compression library for anyone working with .NET Framework version 4 or earlier.

I am working on a SQL Server 2012 Integration Services (SSIS) package. One of the steps in the package is to decompress a set of zip archives. Unfortunately, .NET script tasks in the SSIS package can only target .NET framework version 4 and earlier. This means that I couldn't make use of the new zip compression classes introduced in .NET 4.5 (see System.IO.Compression).

Fortunately though, there are a handful of open source .NET zip libraries available. The one I opted for is called DotNetZip. DotNetZip has an intuitive API and is working well with a large number of files (I am decompressing approximately 15,000 archives). The library is available as a NuGet package. The two snippets below show just how easy it is to compress and decompress files using zip.

To compress a file into a zip archive:
using (var zipFile = new ZipFile())
{
    // The empty string parameter ensures the file is archived to the 
    // root of the archive
    zipFile.AddFile(@"C:\Test\Data.txt", string.Empty);
    zipFile.Save(@"C:\Test\Data.zip");
}

To decompress files out of a zip archive:
using (var zipFile = new ZipFile(@"C:\Test\Data.zip"))
{
    zipFile.ExtractAll(@"C:\Test\Output");
}
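
For a folder full of archives, as in the SSIS scenario described above, the same ExtractAll call can be wrapped in a loop. The snippet below is only a minimal sketch and not the code from the actual package: the BatchExtractor class, the folder paths and the decision to silently overwrite existing files are all assumptions made for illustration. It needs the DotNetZip NuGet package (Ionic.Zip namespace).

```csharp
using System;
using System.IO;
using Ionic.Zip; // DotNetZip NuGet package

public static class BatchExtractor
{
    // Extracts every .zip file found directly in sourceFolder into its own
    // sub-folder of outputFolder and returns the number of archives processed.
    public static int ExtractAllArchives(string sourceFolder, string outputFolder)
    {
        int count = 0;

        foreach (string archivePath in Directory.GetFiles(sourceFolder, "*.zip"))
        {
            // e.g. Data.zip is extracted to <outputFolder>\Data
            string target = Path.Combine(
                outputFolder, Path.GetFileNameWithoutExtension(archivePath));

            using (ZipFile zipFile = ZipFile.Read(archivePath))
            {
                // Assumption: overwriting previously extracted files is acceptable
                zipFile.ExtractAll(target, ExtractExistingFileAction.OverwriteSilently);
            }

            count++;
        }

        return count;
    }

    public static void Main()
    {
        // Hypothetical folders; replace with your own
        int extracted = ExtractAllArchives(@"C:\Test\Archives", @"C:\Test\Output");
        Console.WriteLine("Archives extracted: " + extracted);
    }
}
```

Disposing each ZipFile before moving on to the next archive keeps only one file handle open at a time, which matters when looping over thousands of archives.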

Saturday 22 November 2014

Short-circuit Boolean Evaluation

A useful but often overlooked feature of many programming languages, including C#, is short-circuit evaluation of Boolean expressions. For example, in the following logical AND condition:

if (OperandOne && OperandTwo) 
{    
}

OperandTwo is only evaluated if OperandOne equals the Boolean value of true. If OperandOne's value is false, then the entire condition evaluates to false regardless of the value of any other operand in the condition (basic boolean algebra). Hence, there is no reason to continue evaluating the subsequent operands. This functionality is known as short-circuiting. You can confirm this language feature in C# with some basic code:

void Main()
{
    if (MethodOne() && MethodTwo())
    {
    }
}

bool MethodOne()
{
    Console.WriteLine("MethodOne: I got evaluated!");
    return false;
}

bool MethodTwo()
{
    Console.WriteLine("MethodTwo: I got evaluated!");
    return true;
}

After executing the code above, you'll see that only the Console.WriteLine call in MethodOne gets executed. Short-circuit evaluation also works for the logical OR operator. Let's change the condition in the above code snippet to look like:

if (MethodOne() || MethodTwo())
{
}

If you now execute the code, you'll see that the Console.WriteLine calls in both MethodOne and MethodTwo get executed. However, if you now change MethodOne to return true and re-execute, you'll see that C#'s short-circuit evaluation kicks in and only the Console.WriteLine in MethodOne gets executed.

Short-circuit evaluation gives you, as the programmer, a neater way to express your code. For example, without short-circuit evaluation, you wouldn't be able to construct conditions such as:

if (person != null && person.Age > minimumAge)
{
}

In this condition, a check is first made to ensure the person object isn't null before we go ahead and try to access the Age property on the object. If the person object is null, then short-circuiting kicks in and prevents the invalid access of the Age property (which would result in a NullReferenceException being thrown).
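
To try the null guard out in isolation, here is a small self-contained sketch. The Person class and the IsOlderThan helper are made up for this example; only the condition itself comes from the snippet above.

```csharp
using System;

// Hypothetical type used only for this example
public class Person
{
    public int Age { get; set; }
}

public static class NullGuardExample
{
    // person.Age is only read when the null check on the left has passed
    public static bool IsOlderThan(Person person, int minimumAge)
    {
        return person != null && person.Age > minimumAge;
    }

    public static void Main()
    {
        // No exception is thrown for the null argument
        Console.WriteLine(IsOlderThan(null, 18));                    // False
        Console.WriteLine(IsOlderThan(new Person { Age = 30 }, 18)); // True
    }
}
```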

A lesser-known nuance of C# is that short-circuit evaluation can be bypassed by using the single-character operators & and | without losing the semantics of the logical operation. (These are often thought of as the bitwise operators, but when both operands are of type bool, C# treats them as logical operators that always evaluate both operands.) For example, if we reuse our code snippet from earlier and make a minor adjustment to the condition to use the single & operator, we'll have:

void Main()
{
    if (MethodOne() & MethodTwo())
    {
    }
}

bool MethodOne()
{
    Console.WriteLine("MethodOne: I got evaluated!");
    return false;
}

bool MethodTwo()
{
    Console.WriteLine("MethodTwo: I got evaluated!");
    return true;
}

After executing the snippet above, you'll now see that the Console.WriteLine calls in both MethodOne and MethodTwo get executed, despite MethodOne (the first operand) returning false. Using a single & instead of && does not affect the semantics of the logical AND operator, so the code within the braces of the if-statement would still not get executed in this case (as the entire expression still evaluates to false). However, the key thing to note is that all operands are evaluated when bypassing short-circuit evaluation. As with the single & operator, short-circuit evaluation is also bypassed when using a single | without losing the semantics of OR.

It is generally advised not to bypass short-circuit evaluation, but I find that it is important to be aware of it in order to catch bugs. In the past, I have noticed on several occasions that subtle bugs were introduced where a developer had used a single & and one of the operands in the expression had a side effect (e.g. a method that performs an action). Perhaps they came from a language without short-circuit evaluation, where the single-character versions of && and || are used to express logical AND and OR? Take the following code snippet as an example:

if (person.HasChanges & _repository.UpdatePerson(person))
{
    ...
}

In this case, the UpdatePerson method will always be called, even when the person object states that it has no changes. This is a subtle bug which can easily be missed. Nevertheless, it is fixed by using the && operator to make use of short-circuit evaluation!
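
The difference can be made visible by counting how often the update method runs. This is a sketch under assumptions: the Person and FakeRepository classes below are hypothetical stand-ins for the real types, and the repository simply counts calls instead of touching a database.

```csharp
using System;

// Hypothetical stand-ins for the real types
public class Person
{
    public bool HasChanges { get; set; }
}

public class FakeRepository
{
    public int UpdateCalls { get; private set; }

    public bool UpdatePerson(Person person)
    {
        UpdateCalls++; // the side effect we want to avoid for unchanged persons
        return true;
    }
}

public static class SideEffectExample
{
    // Returns how many times UpdatePerson ran for a person without changes
    public static int UpdateCallsWhenUnchanged(bool useShortCircuit)
    {
        var person = new Person { HasChanges = false };
        var repository = new FakeRepository();

        bool updated = useShortCircuit
            ? person.HasChanges && repository.UpdatePerson(person)  // right side skipped
            : person.HasChanges & repository.UpdatePerson(person);  // right side always runs

        if (updated)
        {
            Console.WriteLine("Person saved.");
        }

        return repository.UpdateCalls;
    }

    public static void Main()
    {
        Console.WriteLine(UpdateCallsWhenUnchanged(false)); // 1 (single &)
        Console.WriteLine(UpdateCallsWhenUnchanged(true));  // 0 (&&)
    }
}
```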

Saturday 6 September 2014

Encouragement from Visual Studio

It's always nice to get some encouragement, in whatever form it comes in! Now you can get it from Visual Studio itself by installing this fun little add-on (developed by Phil Haack).

Sunday 24 August 2014

CsQuery - jQuery for .NET

I'm currently working on a personal-use utility application which requires some web scraping to extract data from HTML so that it can be processed locally by the application. Getting the raw HTML was straightforward enough using the HttpWebRequest and HttpWebResponse classes in the System.Net namespace.
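
For reference, fetching a page as a string can look roughly like the sketch below. It is not the code from my utility: the HtmlFetcher class and the example URL are made up, and the code assumes the response is UTF-8 text. It is written against the WebRequest and WebResponse base types; for an http or https address, WebRequest.Create hands back an HttpWebRequest, whose response is an HttpWebResponse.

```csharp
using System;
using System.IO;
using System.Net;

public static class HtmlFetcher
{
    // Downloads the resource at the given URL and returns its body as a string.
    public static string DownloadHtml(string url)
    {
        WebRequest request = WebRequest.Create(url);

        using (WebResponse response = request.GetResponse())
        using (Stream stream = response.GetResponseStream())
        using (var reader = new StreamReader(stream)) // assumes UTF-8 content
        {
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        // Hypothetical URL; replace with the page you want to scrape
        string html = DownloadHtml("http://example.com/");
        Console.WriteLine("Downloaded " + html.Length + " characters");
    }
}
```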

I then reached the point where I had some raw HTML in a string that I needed to parse. After a quick search I found CsQuery (available as a NuGet package), which is an open source jQuery port for .NET. I was able to easily extract the data I required from the HTML using the familiar jQuery-like selectors. The example code snippet below shows just how easy it is to use CsQuery.

var html = new StringBuilder();
html.Append("<html><body>");
html.Append("<h1>Hello, world!</h1>");
html.Append("<p class='intro'>This program is using CsQuery.</p>");
html.Append("<p id='author'>CsQuery is a library written by James Treworgy.</p>");
html.Append("</body></html>");

var dom = CsQuery.CQ.Create(html.ToString());

// Get the inner text of an element by element name selector
Console.WriteLine(dom["h1"].Text());

// Get the inner text of an element by class name selector
Console.WriteLine(dom[".intro"].Text());

// Get the inner text of an element by id selector
Console.WriteLine(dom["#author"].Text());

// Add a class to an element
dom["h1"].AddClass("title");
            
// Update the title text using new class in selector
dom[".title"].Text("CSQuery - a JQuery port for .NET");

// Now retrieve the new title by a class selector
Console.WriteLine(dom[".title"].Text());

// Pause console
Console.ReadLine();

The example source code is available in a C# console application project on GitHub - https://github.com/rsingh85/CsQueryExample.

Sunday 10 August 2014

C# Static Code Analysis with NDepend

I was fortunate enough to recently obtain a Pro license for NDepend. If you've never heard of NDepend, it is a powerful .NET code analysis tool which provides useful information to support you in writing better code. As I work on a large codebase, my initial worry was that NDepend may slow down my user experience with Visual Studio. However, after using it for one month, I have not come across a single moment in which NDepend slowed my IDE experience.

I'll quickly admit that I have no extensive experience of using code analysis tools, so I can't comment on how NDepend compares to other tools on the market, but I can comment on the features that I've found very useful. A quick glance at the documentation section of the NDepend website shows the powerful feature set that the tool supports. Below I briefly go through some of the features that I've found useful to date. I still feel that I've barely scratched the surface of some of the features that NDepend supports (particularly CQLinq), so the hope is to come back and update this post as and when I find something else I like.

Visual Studio Support and Integration - NDepend supports all the major versions of Visual Studio (2008, 2010, 2012 and 2013). I installed NDepend (the Pro license) on Visual Studio 2010 Professional Edition and have had no issues since installation. I have also installed a trial version of NDepend on the later Visual Studio 2013 and found that it works just as well. The installation process was quick and straightforward. I particularly like the non-intrusive nature of the tool - after installation, you get an "NDepend" menu option in Visual Studio. From this menu you can run an analysis on your projects. There is also a small circle icon which appears in the bottom right corner of Visual Studio - if you hover your mouse over it you'll get quick access to certain options (like running an analysis or viewing the dashboard) and also information on the number of NDepend code rules violated (more on Code Queries and Rules below).

NDepend Dashboard - Once an NDepend analysis completes, you get the option to view the NDepend Dashboard. The dashboard itself opens within Visual Studio as a new tab and provides a comprehensive array of information. Some of the information you get is:

- Lines of code (split by the number of lines you've written and the number of lines not written by you)
- Method complexity information
- Quantity information on assemblies, namespaces, methods, fields, source files and lines of comments
- Code coverage by tests
- Third party usage
- Information on violated code rules

Note that NDepend is also able to create an HTML report which provides all of this information and more. I imagine that saving "snapshots" of these reports would really help in assessing how the quality of your codebase changes over time.

Code Rules and Queries - Perhaps one of the main features which makes NDepend so powerful is CQLinq. CQLinq stands for Code Query LINQ which allows you to query your codebase using LINQ-based queries. NDepend comes with a large number of predefined queries which give you useful information out of the box. Using the "Queries and Rules Explorer" panel you can see the CQLinq code for each query and also define your own query. You can think of CQLinq as a way to very easily reflect on your codebase using simple LINQ-based syntax that you'll already be familiar with. CQLinq provides a very useful mechanism for you to extend NDepend based on the information you want to extract from your code base. It also opens the opportunity for you to share your CQLinq queries with other NDepend users. You can find out more about CQLinq here.

Additional Visual Studio Features - In addition to its core features, NDepend provides a number of extra useful features like code diffing, dependency analysis, visualisation of code metrics and richer code search features. From what I understand, NDepend search is essentially a user interface over CQLinq.

As mentioned above, I'm a fairly new NDepend user. I find it an intuitive tool to use and quickly found myself benefitting from the information the tool provides. I'm now particularly interested in trying out NDepend on a greenfield project and seeing how it affects and shapes the day-to-day development of a new codebase. The features mentioned above are a very small subset of NDepend's full feature set. If you're interested in trying out the tool and learning more about it, you can download it from here and read about all the supported features here. I also encourage you to go through the very useful documentation part of the site.

Saturday 19 July 2014

Empty Collections

I was reading one of my favourite Java programming books this morning and came across a simple coding pattern that I've been practising in C# since I first read this book. I feel it's a useful thing to share here. For those of you that are interested, the book is "Effective Java" by Joshua Bloch. It contains a series of bite-sized tips and recommendations that Java developers can use. Although the book focuses on the Java programming language, I feel there are lots of things it discusses that are transferable to any programming language. One of these tips is the topic of this post.

Joshua mentions that it's better to return empty collections rather than null when writing methods. The reason is that any code calling your method then doesn't need to explicitly handle a special null case. Returning an empty collection makes the null check redundant and results in much cleaner calling code. In C#, the System.Linq.Enumerable class has a useful generic method called Empty. When called with a type parameter, this method returns an empty instance of IEnumerable<T> (where T is your type parameter).

An example use:
public class House
{
    public IEnumerable<Person> CurrentOccupants { get; set; }
        
    public House()
    {
        CurrentOccupants = Enumerable.Empty<Person>();
    }
}
I also find that this method of instantiating (or returning) a collection states my intent more clearly than something like:
CurrentOccupants = new List<Person>();
The MSDN documentation for Enumerable.Empty states that it caches the returned empty sequence of type TResult - so using this consistently for the same types can give (probably negligible) performance or memory usage advantages.

This pattern also reduces the chance of null reference errors being raised in calling code. As mentioned above, the calling code then doesn't need to go through the usual boilerplate null check on the returned collection. Naturally, this leads to cleaner code, as the calling code can process the collection in the same way whether or not it contains any items.
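
The same idea applied to a method return value, as a minimal sketch: the OccupantLookup class, its GetOccupantNames method and the sample data are invented for this example. The point is only that the "nothing found" case returns Enumerable.Empty rather than null, so the caller can loop or count without a null check.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OccupantLookup
{
    // Returns the occupants registered at an address, or an empty
    // sequence (never null) when the address is unknown.
    public static IEnumerable<string> GetOccupantNames(
        IDictionary<string, List<string>> houses, string address)
    {
        List<string> names;
        if (!houses.TryGetValue(address, out names))
        {
            return Enumerable.Empty<string>();
        }

        return names;
    }

    public static void Main()
    {
        var houses = new Dictionary<string, List<string>>
        {
            { "1 High Street", new List<string> { "Alice", "Bob" } }
        };

        // The caller treats both cases identically; no null check required
        foreach (string name in GetOccupantNames(houses, "2 Low Road"))
        {
            Console.WriteLine(name); // never reached for an unknown address
        }

        Console.WriteLine(GetOccupantNames(houses, "1 High Street").Count()); // 2
        Console.WriteLine(GetOccupantNames(houses, "2 Low Road").Count());    // 0
    }
}
```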

Monday 17 March 2014

Compiling ASP.NET MVC Views

If you've ever worked in ASP.NET MVC then you may have noticed that the server-side code in your views (e.g. your Razor logic) is not compiled as part of the build process. By default, views are compiled at runtime, which means that any syntax error in your server-side view code may annoyingly only manifest after you've built, deployed and requested the view which contains the error.

Fortunately, you can instruct the compiler to include your views in the build process by using the project-level boolean property MvcBuildViews. Setting this property to true will ensure your ASP.NET MVC views are also included in the build process and any syntax errors are caught at compile time.

To set this property, open your ASP.NET MVC application's csproj file in your favourite text editor. Locate the MvcBuildViews element and set its inner value to true. If you already have the project open in Visual Studio, the IDE will automatically detect the file change and ask you to reload the project. Once reloaded, you'll have compile time safety for your views.
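
For orientation, the relevant part of the csproj file looks roughly like the fragment below. This is an illustrative excerpt only: the surrounding PropertyGroup in your own project will contain other elements, which are omitted here.

```xml
<PropertyGroup>
  <!-- Other project properties omitted -->
  <!-- Compile the views as part of the build -->
  <MvcBuildViews>true</MvcBuildViews>
</PropertyGroup>
```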

The only disadvantage that I've noticed so far is that your builds take a bit longer to complete. However, I've found this negligible compared to finding the error at runtime and having to do a patch deployment to another environment!

Tuesday 25 February 2014

Debugging and Viewing .NET Code

Quick post: a colleague of mine recently circulated a very useful Microsoft link (below) which lets you browse the .NET Framework source code in the browser. It's very useful for learning how things work under the hood.

.NET Framework Reference Source

You can also set up your instance of Visual Studio to debug into the framework - see this blog post.

There is also a useful related Visual Studio extension here.