Overthinking CSV With Cesil: Writing Dynamic Types

Posted: 2020/06/16 | Author: kevinmontrose | Filed under: code | Tags: cesil

I covered how to write known, static types with Cesil in my previous post. As with reading, Cesil also supports writing dynamic types.

In my post on dynamic reading, I argued dynamic is still worth supporting due to how convenient it makes some common read operations. I feel the case for writing dynamic types is much weaker – it is rare to want to write heterogeneous types, and even rarer to not be able to easily map such a mixed collection to a single known type. All that said, for symmetry’s sake Cesil does have extensive support for writing dynamic types.

As with reading, writing static and dynamic types is essentially symmetric. All the same methods are provided, supporting all the same operations. The only difference is that you use Configuration.ForDynamic() rather than Configuration.For<TRow>(), and the resulting IBoundConfiguration is parameterized by dynamic rather than by a row type TRow.
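To make the symmetry concrete, here is a minimal sketch of writing two ExpandoObject rows. It assumes the Cesil NuGet package is referenced, and that the writer calls used here (CreateWriter on the bound configuration, WriteAll on the writer) are the same ones used for static types. The class name, method name, and row contents are made up for illustration.

```csharp
using System.Collections.Generic;
using System.Dynamic;
using System.IO;
using Cesil;

public static class DynamicWriteSketch
{
    // Hypothetical helper: writes two dynamic rows and returns the CSV text.
    public static string WriteRows()
    {
        dynamic first = new ExpandoObject();
        first.Name = "Widget";
        first.Count = 3;

        dynamic second = new ExpandoObject();
        second.Name = "Gadget";
        second.Count = 7;

        var rows = new List<dynamic> { first, second };

        // ForDynamic() takes the place of For<TRow>(); the result is bound to dynamic.
        IBoundConfiguration<dynamic> config = Configuration.ForDynamic();

        using (var text = new StringWriter())
        {
            // Assumption: same writer methods as in the static case.
            using (var writer = config.CreateWriter(text))
            {
                writer.WriteAll(rows);
            }

            return text.ToString();
        }
    }
}
```

Swapping Configuration.ForDynamic() for Configuration.For<SomeRowType>() and the ExpandoObjects for instances of that type should be the only change needed to go back to static writing.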

When using the DefaultTypeDescriber, performance varies considerably based on the “kind” of dynamic you are writing. Cesil special cases “well known” dynamic types for improved performance – namely the dynamic rows Cesil creates and ExpandoObject are treated specially. For other DLR aware types Cesil will use IDynamicMetaObjectProvider directly, which is considerably slower. Plain .NET types delegate to the usual EnumerateMembersToSerialize method, which implements “normal” .NET behavior.
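The three "kinds" can be told apart using nothing but standard .NET types. The sketch below is not Cesil's implementation. It only illustrates how member names might be discovered for an ExpandoObject, for some other DLR-aware type, and for a plain .NET type. Bag and MemberNames are hypothetical names invented for this example.

```csharp
using System.Collections.Generic;
using System.Dynamic;
using System.Linq;
using System.Linq.Expressions;

public static class DynamicKinds
{
    // Hypothetical DLR-aware type that is neither a Cesil row nor an ExpandoObject.
    public sealed class Bag : DynamicObject
    {
        private readonly Dictionary<string, object> _values = new Dictionary<string, object>();

        public override bool TrySetMember(SetMemberBinder binder, object value)
        {
            _values[binder.Name] = value;
            return true;
        }

        public override bool TryGetMember(GetMemberBinder binder, out object result)
            => _values.TryGetValue(binder.Name, out result);

        public override IEnumerable<string> GetDynamicMemberNames() => _values.Keys;
    }

    public static string[] MemberNames(object row)
    {
        // "Well known" case: ExpandoObject is also a dictionary, so no DLR machinery is needed.
        // This check must come first, because ExpandoObject is IDynamicMetaObjectProvider too.
        if (row is ExpandoObject expando)
        {
            return ((IDictionary<string, object>)expando).Keys.ToArray();
        }

        // General DLR case: go through the meta object, which costs more per row.
        if (row is IDynamicMetaObjectProvider provider)
        {
            DynamicMetaObject meta = provider.GetMetaObject(Expression.Constant(row));
            return meta.GetDynamicMemberNames().ToArray();
        }

        // Plain .NET type: fall back to reflection over public properties.
        return row.GetType().GetProperties().Select(p => p.Name).ToArray();
    }
}
```

Reading the values in the general DLR case also means going through binders, which is part of why that path is the slow one.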

Cesil allows customizing the members discovered, and the order they’ll be written in, by using a custom ITypeDescriber with your Options and implementing the GetCellsForDynamicRow method directly. Simple inclusion or exclusion of members can be controlled by subclassing the DefaultTypeDescriber and overriding the ShouldIncludeCell method. I’ll cover how this works in more detail in a later post that goes in depth into all of Cesil’s configuration options.

And that’s about it for dynamic serialization – there’s not a lot to cover since so much of it is “just like writing static types, but dynamic.” This post’s Open Question is, accordingly, more “tactical” than previous ones:

Is there a better interface for discovering dynamic “cells” than the IEnumerable<DynamicCellValue>-returning GetCellsForDynamicRow() method currently on ITypeDescriber?

The interface isn’t technically wrong, but it has the undesirable property that general implementations will allocate at least a little bit for each row written. An allocation-free alternative would be a marked improvement, provided it doesn’t come at the cost of flexibility or reasonable performance.
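To show where the per-row allocation comes from, here is a sketch contrasting an IEnumerable-returning method with one possible caller-supplied-buffer shape. Everything in it is hypothetical: Cell is a stand-in and not Cesil's DynamicCellValue, and FillCells is only one of several shapes an alternative could take, not something Cesil provides.

```csharp
using System;
using System.Collections.Generic;

public static class CellDiscoverySketch
{
    // Hypothetical stand-in for a discovered cell.
    public readonly struct Cell
    {
        public Cell(string name, object value)
        {
            Name = name;
            Value = value;
        }

        public string Name { get; }
        public object Value { get; }
    }

    // Shape 1: returns IEnumerable<Cell>. Because this is an iterator method,
    // each call allocates a compiler-generated enumerable/enumerator object.
    public static IEnumerable<Cell> GetCells(Dictionary<string, object> row)
    {
        foreach (var pair in row)
        {
            yield return new Cell(pair.Key, pair.Value);
        }
    }

    // Shape 2: the caller owns the buffer and can reuse it across rows.
    // Returns the number of cells the row needs; if the buffer is too small,
    // nothing is written and the caller can retry with a larger one.
    public static int FillCells(Dictionary<string, object> row, Span<Cell> destination)
    {
        if (row.Count > destination.Length)
        {
            return row.Count;
        }

        var i = 0;
        // Dictionary<TKey, TValue> enumerates with a struct enumerator, so this loop does not allocate.
        foreach (var pair in row)
        {
            destination[i++] = new Cell(pair.Key, pair.Value);
        }

        return i;
    }
}
```

The second shape trades away some flexibility: the implementation has to know its cell count up front and cannot lazily stream cells, which is exactly the kind of cost the question is weighing.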

As before, I’ve opened an issue to gather long form responses. Remember that, as part of the sustainable open source experiment I detailed in the first post of this series, any commentary from a Tier 2 GitHub Sponsor will be addressed in a future comment or post. Feedback from non-sponsors will receive equal consideration, but may not be directly addressed.

In my next post I’ll go into detail on all the configuration options Cesil supports. It’ll be a long post, as Cesil supports customizing the expected format, as well as almost every detail of describing and mapping types.

Kevin Montrose
