Overthinking CSV With Cesil: Adopting Nullable Reference Types

Posted: 2020/07/07 | Author: kevinmontrose | Filed under: code | Tags: cesil |Comments Off

In the previous post I went over how Cesil has adopted all the cool, new, C# 8 things – except for the coolest new thing, nullable reference types. C#’s answer to the billion dollar mistake, I treated this new feature differently from all the rest – I intentionally delayed adopting it until Cesil was fairly far along in development.

I delayed because I felt it was important to learn how an established codebase would adopt nullable reference types. Unlike other features, nullable references types aren’t a feature you want to use in only a few places (though they can be adopted piecemeal, more on that below) – you want them everywhere. Unfortunately (or fortunately, I suppose) as nullable reference types force you to prove to the compiler that a reference can never be null, or explicitly check that a value is non-null before using it, an awful lot of existing code will get new compiler warnings. I put my first pass at adopting nullable reference types into a single squashed commit, which nicely illustrates how it required updates to nearly everything in Cesil.

A Refresher On Nullable Reference Types

Before getting any further into the Cesil specific bits, a quick refresher on nullable reference types. In short, when enabled, all types in C# are assumed to be non-nullable unless a ? is appended to the type name. Take for example, this code:

public string GetFoo() { /* something */ }

In older versions of C# it would be legal for this method to return an instance of a string or null since null is a legal value for all reference types. In C# 8+, if nullable reference types are enabled, the compiler will raise a warning if it cannot prove that /* something */ won’t return a non-null value. If we wanted to be able to return a null, then we’d instead use string? as it’s return type. One additional bit of new syntax, the “null forgiving” ! in suffix position, lets you declare to the compiler that a value is non-null.

Importantly, this is purely a compile time check – there’s nothing preventing a null from sneaking into a variable at runtime. This is a necessary limitation given the need to interoperate with existing .NET code, but it does mean that you have to be diligent about references that come from code that has not opted into nullable reference types or is outside your control. Accordingly, it’s perfectly legal to lie using the null forgiveness operator.

Adopting Nullable Reference Types

If you’ve got a large existing codebase, adopting nullable references type all at once is basically impossible in my opinion. I tried a couple times with Cesil and abandoned each attempt, you just drown in warnings and it’s really difficult to make progress given all the noise. The C# team anticipated this, thankfully, and has made it possible to enable or disable nullable reference times on subsets of your project with the #nullable directive. I worked through Cesil file by file, enable nullable reference types and fixing warnings until everything was converted over at which point I enabled them project wide.

The changes I had to make fell into four buckets:

Adding explicit null checks in places where I “knew” things were always non-null, but needed to prove it to the compiler
1. An example are the reflection extension methods I added, that now assert that, say, looking up a method actually succeeds
2. These also aided in debugging, since operations now fail faster and with more useful information
Wrapping “should be initialized before use” nullable reference types in a helper struct or method that asserts a non-null value is available.
1. I used a NonNull struct and Utils.NonNull method I wrote for the purpose
2. A good example of these are optional parts of (de)serialization, like Resets. A DeserializeMember won’t always have a non-null Reset, but it should never read a non-null Reset during correct operation
Explicit checks at the public interface for consumers violating (knowingly or otherwise) the nullability constraints on parameters.
1. I used a Utils.CheckArgumentNull method for this
2. You have to have these because, as noted earlier, nullable reference types are purely a compile time construct
3. Interestingly, there is a proposal for a future version of C# that would make it simpler to do this kind of check
Refactoring code so a reference can be proven to always be non-null
1. A lot of times this has minor stuff, like always fully initializing in a constructor
2. In some cases what was needed was some value to serve as a placeholder, so I introduced types like EmptyMemoryOwner

Initially most changes fell into the first two buckets, but over time more was converted into the fourth bucket. Once I started doing some light profiling, I found a lot of time was being spent in null checks which prompted even more refactoring.

At time of writing, there are a handful of places where I do use the null forgiving operator. Few enough that an explicit accounting can be given, which can illustrate some of the limitations I found with adopting nullable reference types:

In DEBUG builds of AwaitHelper
In AsyncEnumerable’s constructor
Five times for local variables in AsyncReaderBase
In DynamicRow.GetAtTyped
Four times in DynamicRowEnumerator
Four times in PassthroughRowEnumerator
In Enumerable’s constructor
Two in DEBUG builds of ReaderStateMachineToDebugString()
Four default!s in NeedsHoldRowConstructor
Four, not all defaults this time, in SimpleRowConstructor
Three in SyncReaderBase
In ReadResult’s constructor
Two in ReadResultWithComment’s constructors

That’s a total of 32 uses over ~32K lines of code, which isn’t all that bad – but any violations of safety are by definition not great. Upon further inspection, you’ll see that 29 of them are some variation of default! being assigned to a generic type – and the one in DynamicRow.GetAtTyped is around casting to a generic type (the remaining two are in DEBUG-only test code). Basically, and unfortunately, nullable references can get awkward around unconstrained generics. The problem is an unconstrained generic T could be any combination of nullable and value or reference types at compile time, but default(T) is going to produce a null for all reference types – it’s a mismatch you can’t easily work around. I’d definitely be interested in solutions to this, admittedly rare, annoyance.

Accommodating Clients In Different Nullable Reference Modes

That covers Cesil’s adoption of nullable reference types, but there’s one big Open Question around client’s adoption of them. A client using Cesil could be in several different “modes”:

Completely oblivious to nullable reference types, neither using nullable annotations themselves nor using types from libraries that do
1. All pre-C# 8 will be in this mode
2. Post-C# 8 code that has not enabled nullable reference types, and whose relevant types aren’t provided by libraries that have enabled them are also in this mode
3. At time of writing this is probably the most common mode
Oblivious themselves, but using types with Cesil that do have nullable annotations
1. Pre-C# 8 code that uses types provided by dependencies that have been updated with nullable references types is in this mode
2. Post-C# 8 code with a mix of files with nullable reference types enabled will also be in this mode
Opted into nullable reference types, but using types with Cesil from dependencies that have not enabled nullable reference types
1. Post-C# 8 code with dependencies that have not yet enabled nullable reference types will be in this mode
Opted into nullable reference types, and all types used with Cesil likewise have them enabled
1. At time of writing this is the least common mode

Cesil currently does not look for nullable annotations, instead if you want to require a reference type be non-null you must annotate with a DataMemberAttribute setting IsRequired (if you’re using the DefaultTypeDescriber), provide a Parser that never produces a null value, or a Setter that rejects null values. This is behavior is most inline with the “completely oblivious” case (mode #1 above). However, Cesil could check for nullable annotations and default to enforcing them (as always, you’d be able to change that behavior with a custom ITypeDescriber). This aligns most with mode #4, which represents the desired future of C# code. Either behavior can result in some weirdness, with Cesil either doing things the compiler would prevent a client from doing or failing to do things a client could easily do.

To illustrate, if Cesil ignores nullable annotations (as it does today) but client has enabled them (ie. they are in modes #3 or #4) this could happen:

class Example
{
  public string Foo { get; set; } // note that Foo is non-nullable
}

// ...
var cesilFoo = CesilUtils.ReadFromString(@”Foo\r\n”).Single().Foo;
cesilFoo.ToString(); // this will throw NullReferenceException

// ...
var clientExample = new Example();
clientExample.Foo = null; // this will raise a warning

Basically, Cesil could set Example.Foo to null but the client’s own code couldn’t.

However, if Cesil instead enforces nullable annotations but the client is oblivious to them (the client is in mode #2 above) then this is possible:

// Example is defined in some place that _has_ opted into nullable references

public class Example
{
  public string Foo { get; set; } // note that Foo is non-nullable
}

// ...

// Cesil will throw an exception, since Foo has no value
var cesilFoo = CesilUtils.ReadFromString(@”Foo\r\n”);

// ...

var clientExample = new Example();
clientExample.Foo = null; // this code from the client raises no warnings

In this case, Cesil cannot do something that the client can trivially do.

As I see it there are a few different options, and Cesil needs to commit to one of them. The Open Question is:

Which of these options for nullable reference treatment should Cesil adopt?
1. Ignore nullable annotations, clients should perform their own null checks
2. Enforce nullable annotations as part of the DefaultTypeDescriber, if a client needs to disable it they can provider their own ITypeDescriber
3. Provide an Option to enable/disable nullable reference type enforcement
  - This will move the logic out of ITypeDescribers, so client control via that method will no longer be available
  - If this is the route to take, what should the value for Options.(Dynamic)Default be?

As before, I’ve opened an issue to gather long form responses. Remember that, as part of the sustainable open source experiment I detailed in the first post of this series, any commentary from a Tier 2 GitHub Sponsor will be addressed in a future comment or post. Feedback from non-sponsors will receive equal consideration, but may not be directly addressed.

The next post in this series will cover how I went about developing tests for Cesil.

Kevin Montrose

Overthinking CSV With Cesil: Adopting Nullable Reference Types

A Refresher On Nullable Reference Types

Adopting Nullable Reference Types

Accommodating Clients In Different Nullable Reference Modes

Related

Archive