Overthinking CSV With Cesil: Open Source Update
Posted: 2020/09/24 Filed under: code | Tags: cesil
It’s been about 4 months since I started this series on Cesil. In that time I’ve published 12 blog posts and made numerous updates to Cesil. Having just released a new version (0.6.0), it feels like a good time to do a small retrospective on some of the less technical parts of my efforts.
First, the GitHub sponsors update – not a single one. I find this unsurprising, as I’m sure most readers do – honestly, it took some effort to not snark about the likely outcome in earlier posts. I do think this serves as a good experimental validation of my expectations though.
I’m not exactly new to OSS: I’ve got a couple of libraries with 1M+ downloads, this non-trivial blog, and some contributions back to the broader ecosystem. In other words, I’m probably a bit above average in terms of OSS footprint. But I do this for fun (like many others) – I’ve never gone out and solicited sponsorships, or otherwise tried to cultivate a following. Some have seen success with Patreons, or consulting, or sponsored screencasts – all of which I find decidedly unfun.
My big takeaway from this little sponsorship experiment is: things like GitHub Sponsors are tools you can use, but creating a sustainable open source project is ultimately a job, and if you’re coding for fun you probably aren’t going to do that job. Modulate your expectations accordingly.
Second, all the Open Questions. I’ve sprinkled nine throughout the blog series so far, and four (~44%) have seen some engagement – not a bad ratio in my opinion.
The “answered” Open Questions, all of which shipped in version 0.6.0:
- Alternatives to IEnumerable for ITypeDescriber.GetCellsForDynamicRow
  - Benjamin Hodgson suggested a Span-based approach that can reuse allocations.
- Do Cesil’s Options provide everything needed in a CSV library?
  - Frederic Morel requested multi-character value separators.
- Is there anything missing from IReader(Async) and IWriter(Async)?
  - Stefan requested that the write methods which process multiple rows return the actual number of rows written, and new overloads for WriteComment & WriteCommentAsync that take ReadOnlySpan<char> and ReadOnlyMemory<char> respectively.
- How should Cesil treat nullable reference types in client code?
  - Adam S-P supported the option which defaulted to runtime enforcement, while allowing clients to override null checking behavior. After some false starts, this approach evolved into the nullability handling that is now part of Cesil.
Remaining Open Questions at time of writing are:
- Are there any useful dynamic operations around reading that are missing from Cesil?
- Do the conversions provided by the DefaultTypeDescriber for dynamic rows and cells cover all common use cases?
- Does Cesil give adequate control over allocations?
- Are there any reasonable .NET type schemes that Cesil cannot read or write?
- What additional testing does Cesil need?
Third and finally, an aside on naming. When I first started on what became Cesil, I was expecting to do a lot of IL generation, which meant I’d probably pull in Sigil, and thus “CSV with Sigil” became Cesil in the same way “JSON with Sigil” became Jil. However, that never happened: as I got further into development, I became convinced that the future is going to look more AOT-y, more source-generator-y, and just less ILGenerator-y.
Then the second I published Cesil, folks pointed out how close it was to Cecil, a library for manipulating IL. Given the above, I’m not particularly attached to the name, but I didn’t have a good alternative and figured the two libraries were in different enough areas that it was unlikely to be an issue in practice. So naturally I was immediately proven wrong, as I went to contribute some small improvements to Coverlet… which makes extensive use of Cecil. Discussing these changes (which kept happening as Stack Overflow was also considering using Coverlet) was a real laugh riot.
So, I should really change the name of Cesil. I still don’t have any great ideas (naming is hard, after all) so I’ve opened another “Open Question” Issue to collect alternatives. The primary goal is to find something that won’t be confused with other projects, while still at least hinting at “CSV”.
And that wraps up the Open Source update. In the next post, I’ll be digging into performance and maybe giving an update on naming.