Type Systems (C#)

Posted Friday, April 3, 2015 in Software Engineering

Introduction

Microsoft's .NET Framework contains an interesting type system. The common type system (CTS) supported by .NET's common language runtime (CLR) is implemented by C#, a popular .NET language. C# builds on the common type system to offer a wide range of capabilities for declaring variables, assigning values, and converting between types. While C#'s type system could easily fill a book, this post discusses some of its key features.

First off, a programming language's type system refers to the way the language classifies and handles data. A type can refer to something as simple as eight contiguous bits or to a complex mixture of methods, pointers, and properties. A programming language does not necessarily need a type system to operate. Low-level languages such as assembly may manipulate data directly instead of through type-defined variables. However, type systems can provide significant benefits that reduce a developer's effort. For example, imagine having to design a method that accepts a number, determines whether that number is even or odd, and returns the result. Performing such a task is difficult and error-prone without explicit knowledge of what data is being accepted and in what format. A type system makes such a method much easier to design by specifying the type of data the method can receive and by enforcing type safety.
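The even-or-odd method described above can be sketched in a few lines. This is only an illustration: the names ParityDemo and IsEven are hypothetical and chosen for this example.

```csharp
using System;

public static class ParityDemo
{
    // The parameter type 'int' states exactly what data the method accepts,
    // and the return type 'bool' states what the caller gets back.
    public static bool IsEven(int number)
    {
        return number % 2 == 0;
    }

    public static void Main()
    {
        Console.WriteLine(IsEven(4));   // True
        Console.WriteLine(IsEven(7));   // False
        // IsEven("four");              // would not compile: a string is not an int
    }
}
```

Because the parameter is declared as int, the compiler rejects any call that passes another kind of data before the program ever runs.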

Common Language Runtime

Type safety is the guarantee that a variable declared with a type will only ever hold data of that type. For example, a variable declared as a number type is protected from being assigned string data. C# enforces type safety both when source code is compiled into intermediate language (IL) and at runtime through the CLR. The compiler verifies type assignments and reports a compilation error if a typed variable is assigned unsupported data (Albahari, J. & Albahari, B., 2012).

When compiled C# code is executed, it is further compiled to native code and managed by the CLR, which performs type-safety checks, memory management, and more. For example, the compiler rejects a direct cast from a string to an integer, but casting an object reference that actually holds a string to an integer compiles and then results in an InvalidCastException at runtime (Richter, 2012). Likewise, if arithmetic in a checked context pushes a 32-bit unsigned integer past its maximum value of approximately 4.3 billion, the CLR throws an OverflowException (Richter, 2012).
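A minimal sketch of both runtime checks follows. The class and method names are hypothetical. It assumes the cast is applied to an object reference that holds a string, and that the overflowing arithmetic runs in a checked context; C# arithmetic on non-constant values is unchecked by default and would otherwise wrap around silently.

```csharp
using System;

public static class RuntimeChecksDemo
{
    // Casting an object that holds a string to int compiles,
    // but the CLR rejects it when the cast executes.
    public static bool InvalidCastIsCaught()
    {
        object boxed = "42";
        try
        {
            int number = (int)boxed;
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    // Adding 1 to the largest uint in a checked context
    // raises OverflowException instead of wrapping to 0.
    public static bool OverflowIsCaught()
    {
        uint value = uint.MaxValue;   // 4,294,967,295
        try
        {
            value = checked(value + 1u);
            return false;
        }
        catch (OverflowException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(InvalidCastIsCaught());   // True
        Console.WriteLine(OverflowIsCaught());      // True
    }
}
```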

The CLR manages two distinct categories of types: value types and reference types. The difference between them lies in how the CLR stores their data on the stack and the heap (Richter, 2012). The stack is a per-thread block of memory that is used to store local variables and method parameters, including local value type data (Albahari, J. & Albahari, B., 2012). The stack uses a last-in-first-out allocation scheme that simplifies memory management and may provide some performance improvements over heap-allocated data (Albahari, J. & Albahari, B., 2012). Instances of reference types are allocated on the heap, and variables hold references to them.
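The practical consequence of the two categories shows up when a variable is copied. The sketch below uses two hypothetical types, PointValue (a struct, so a value type) and PointReference (a class, so a reference type), to illustrate the difference.

```csharp
using System;

public struct PointValue { public int X; }       // hypothetical value type
public class PointReference { public int X; }    // hypothetical reference type

public static class CopySemanticsDemo
{
    public static int CopyValueType()
    {
        PointValue a = new PointValue { X = 1 };
        PointValue b = a;   // copies the data itself
        b.X = 99;           // changes only the copy
        return a.X;
    }

    public static int CopyReferenceType()
    {
        PointReference a = new PointReference { X = 1 };
        PointReference b = a;   // copies the reference; both name one object
        b.X = 99;               // changes the shared object
        return a.X;
    }

    public static void Main()
    {
        Console.WriteLine(CopyValueType());       // 1
        Console.WriteLine(CopyReferenceType());   // 99
    }
}
```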

Value Types

Value types, which include the built-in primitives such as integers, are simple data structures that have a known, fixed size and can be viewed in memory as a contiguous collection of bits. One of the most commonly used value types is the integer, which represents a whole number. In C#, the default integer type (int) is a signed 32-bit number stored in two's complement form: the most significant bit indicates whether the number is negative, and the remaining 31 bits contribute to its magnitude. There are several variations of the default integer type in C#, such as unsigned integers and 64-bit integers.
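The sizes and ranges of the integer variants can be inspected directly. The sketch below is illustrative only; IntegerLayoutDemo and ToBits are hypothetical names. Note that C#'s int uses two's complement, so the value -1 is stored as 32 one-bits.

```csharp
using System;

public static class IntegerLayoutDemo
{
    // Returns the binary (base 2) representation of a 32-bit integer.
    public static string ToBits(int value)
    {
        return Convert.ToString(value, 2);
    }

    public static void Main()
    {
        Console.WriteLine(sizeof(int));      // 4 (bytes, i.e. 32 bits)
        Console.WriteLine(int.MinValue);     // -2147483648
        Console.WriteLine(int.MaxValue);     // 2147483647
        Console.WriteLine(uint.MaxValue);    // 4294967295
        Console.WriteLine(long.MaxValue);    // 9223372036854775807
        Console.WriteLine(ToBits(5));        // 101
        Console.WriteLine(ToBits(-1));       // 11111111111111111111111111111111
    }
}
```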

While integers can be used for a wide variety of applications, they are unsuitable for many mathematical operations that require fractional values. Floating-point types approximate real numbers by encoding a sign, a significand, and an exponent in a fixed number of bits, so the position of the decimal point can float. For example, when implicit typing with the var keyword is used, C# infers the double type when an expression involves a floating-point operand, such as the division of an integer by 2.0. Dividing two integers, by contrast, performs integer division and produces an integer. A double is a floating-point type that is 64 bits in size.
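The distinction matters in practice: in C#, dividing two int operands performs integer division and discards the remainder, and a double results only when at least one operand is a floating-point value. A small sketch, with hypothetical class and method names:

```csharp
using System;

public static class DivisionDemo
{
    // Both operands are int, so the result is an int and the fraction is dropped.
    public static int IntegerQuotient(int a, int b)
    {
        return a / b;
    }

    // Converting one operand to double makes the whole division floating-point.
    public static double RealQuotient(int a, int b)
    {
        return (double)a / b;
    }

    public static void Main()
    {
        Console.WriteLine(IntegerQuotient(7, 2));   // 3
        Console.WriteLine(RealQuotient(7, 2));      // 3.5 (with an English or invariant culture)
        Console.WriteLine(sizeof(double));          // 8 (bytes, i.e. 64 bits)
    }
}
```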

Reference Types

Value types are useful for both simplicity and performance but lack some of the features expected in an object-oriented programming language, most notably class inheritance. C# bridges the gap between value types and reference types through a mechanism called boxing (Richter, 2012). Boxing copies a value type onto the heap and wraps it in a reference type such as System.Object. Through boxing, any value type can be converted to the System.Object base type, from which value types and reference types alike derive.
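Boxing and its reverse, unboxing, can be sketched as follows. The names BoxingDemo and Box are hypothetical. The point to notice is that the boxed object is an independent copy of the original value.

```csharp
using System;

public static class BoxingDemo
{
    // Returning an int as object forces the compiler to emit a boxing operation.
    public static object Box(int value)
    {
        return value;
    }

    public static void Main()
    {
        int number = 42;
        object boxed = Box(number);    // boxing: the value is copied onto the heap
        number = 100;                  // does not affect the boxed copy

        int unboxed = (int)boxed;      // unboxing: the value is copied back out
        Console.WriteLine(unboxed);                          // 42
        Console.WriteLine(boxed.GetType() == typeof(int));   // True
    }
}
```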

Reference types can hold dynamically sized data and enjoy the full benefits of an object-oriented programming language, with definable methods, additional properties, and class inheritance. The System.String type is one of C#'s reference types; it stores text data of variable length. Other reference types hold similarly complex data such as coordinates, image data, and data arrays. Developers are also able to define their own reference types as classes. In object-oriented programming languages like C#, a class can incorporate multiple methods and properties while also allowing inheritance. For example, a car class may inherit from a vehicle base class and gain all of its methods and properties. The car class could then implement additional methods and properties unique to the car object. The CLR would then ensure type safety such that only a car object, or an object of a class derived from car, could be assigned to a variable of the car type (Richter & Van de Bospoort, 2013).
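The car and vehicle example might look like the sketch below. The members Wheels, Doors, and Describe are hypothetical and exist only to illustrate inheritance and the assignment rules.

```csharp
using System;

public class Vehicle
{
    public int Wheels { get; set; }

    public virtual string Describe()
    {
        return "Vehicle with " + Wheels + " wheels";
    }
}

// Car inherits Wheels and Describe from Vehicle and adds its own member.
public class Car : Vehicle
{
    public int Doors { get; set; }

    public override string Describe()
    {
        return "Car with " + Doors + " doors and " + Wheels + " wheels";
    }
}

public static class InheritanceDemo
{
    public static void Main()
    {
        Car car = new Car { Wheels = 4, Doors = 2 };
        Vehicle vehicle = car;                   // allowed: every Car is a Vehicle
        Console.WriteLine(vehicle.Describe());   // Car with 2 doors and 4 wheels

        Vehicle plain = new Vehicle { Wheels = 2 };
        Console.WriteLine(plain is Car);         // False
        // Car other = plain;                    // would not compile: a Vehicle is not necessarily a Car
    }
}
```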

Implicit Typing

One of the features of C#'s compiler is the ability to determine the resulting type of a method call or arithmetic operation. For example, C# determines that dividing an integer by a double produces a double, while dividing two integers produces an integer. In cases where the compiler can determine the type from the assigned expression, the var keyword can be used to declare a local variable. The variable is still statically typed; its type is simply inferred at compile time. The implicit typing mechanism lets the developer spend less time writing out exact variable types but may reduce the understandability of the code. Developers have to use their best judgment when weighing the benefits and drawbacks of the var keyword.
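A short sketch of what the compiler infers for a few var declarations follows. StaticTypeName is a hypothetical helper written for this example; it reports the compile-time type the compiler inferred for its argument.

```csharp
using System;
using System.Collections.Generic;

public static class ImplicitTypingDemo
{
    // Hypothetical helper: T is the static type the compiler inferred for 'value'.
    public static string StaticTypeName<T>(T value)
    {
        return typeof(T).Name;
    }

    public static void Main()
    {
        var count = 10;                  // inferred as int
        var ratio = count / 4.0;         // inferred as double
        var title = "Type Systems";      // inferred as string
        var tags = new List<string>();   // inferred as List<string>

        Console.WriteLine(StaticTypeName(count));   // Int32
        Console.WriteLine(StaticTypeName(ratio));   // Double
        Console.WriteLine(StaticTypeName(title));   // String
        Console.WriteLine(StaticTypeName(tags));    // List`1
        // count = "ten";                // would not compile: count is an int
    }
}
```

The commented-out assignment shows that a var variable keeps its inferred type for its whole lifetime; var is not a dynamic type.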

Generics

The CLR provides type safety for value types and reference types for all .NET languages. C# builds on this with generics. A generic type declares one or more type parameters that accept a type argument (Brosgol, 2009). For example, System.Collections.Generic.List<T> represents a generic list for which the element type T can be specified. Once the list is declared, only items of the specified type can be added to it. A list of strings can only contain strings, a list of integers can only contain integers, and so on. Type parameters can also be declared on methods and classes to retain type safety while still allowing flexibility in parameter types.
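The following sketch shows a typed list and a generic method. FirstOrFallback is a hypothetical method invented for this example, not part of the .NET library.

```csharp
using System;
using System.Collections.Generic;

public static class GenericsDemo
{
    // Hypothetical generic method: works for any element type T
    // while keeping the list, the fallback, and the result the same type.
    public static T FirstOrFallback<T>(List<T> items, T fallback)
    {
        return items.Count > 0 ? items[0] : fallback;
    }

    public static void Main()
    {
        List<string> names = new List<string>();
        names.Add("Ada");
        // names.Add(42);   // would not compile: 42 is not a string

        List<int> numbers = new List<int> { 3, 1, 2 };

        Console.WriteLine(FirstOrFallback(names, "none"));            // Ada
        Console.WriteLine(FirstOrFallback(numbers, -1));              // 3
        Console.WriteLine(FirstOrFallback(new List<int>(), -1));      // -1
    }
}
```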

Nullable Types

Another useful application of generic types is the System.Nullable<T> type, which extends a non-nullable value type so that it can also hold a null value. For a reference type, a null value means that the variable does not refer to any object on the heap (Wills, 2014). Value type variables hold their data directly rather than referencing data on the heap and therefore cannot ordinarily be assigned a null value. System.Nullable<T> lets value types such as an integer be assigned a null value. There is a shorthand for System.Nullable<T>: appending a question mark to the type name. For example, a nullable integer can be declared as int?, which is shorthand for System.Nullable<int>.
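A brief sketch of nullable integers in use follows. The names NullableDemo and ValueOrFallback are hypothetical.

```csharp
using System;

public static class NullableDemo
{
    // Returns the wrapped value, or the fallback when no value is present.
    public static int ValueOrFallback(int? input, int fallback)
    {
        return input ?? fallback;
    }

    public static void Main()
    {
        int? age = null;                   // shorthand form
        Nullable<int> sameType = null;     // long form of the same type

        Console.WriteLine(age.HasValue);                         // False
        Console.WriteLine(sameType.HasValue);                    // False
        Console.WriteLine(ValueOrFallback(age, -1));             // -1

        age = 30;
        Console.WriteLine(age.HasValue);                         // True
        Console.WriteLine(age.Value);                            // 30
        Console.WriteLine(typeof(int?) == typeof(Nullable<int>)); // True
    }
}
```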

Conclusion

Programming languages that target .NET and its CLR have the benefit of type safety at both compile time and runtime. C# builds on the CLR with expansive reference types and useful features such as generics and nullable types. Implicit typing and the ability to define custom reference types give flexibility to C#'s type system.