Are you measuring your social media buttons?

Posted: 2012/06/20 | Author: kevinmontrose | Filed under: pontification | 4 Comments

I’d like you to go to a Stack Overflow (or any Stack Exchange) question, and see if you spot a nominally major change. Such as this one, with a nice answer from Raymond Chen (I think it’s been more than month since I recommended his blog, seriously go read it).

Spot it?

Of course not, what we did is remove our social media sharing buttons.

Left is old, right is new. A number of other instances were also removed.

The Process

I’m a big advocate of arguing from data and while I freely admit there are many things that are hard to quantify, your gains from social media buttons aren’t one of them.

When discussing sharing buttons, be careful about establishing scope. Keeping or removing social media buttons isn’t about privacy, personal feelings on Twitter/Facebook/G+, or “what every other site does” (an argument I am particularly annoyed by); it’s about referred traffic and people. There’s a strong temptation to wander into debate, feeling, and anecdotal territory; stick to the data.

You also need to gather lots of data, at Stack Exchange we prefer to slap a pointless query parameter onto urls we’re tracking. It’s low tech, transparent, and doesn’t have any of the user facing penalties redirects have. In particular we slapped ?sfb=1, ?stw=1, and ?sgp=1 onto links shared via the appropriate sharing buttons. Determining click-throughs is just a matter of querying traffic logs with this approach. We’ve been gathering data on social media buttons in particular for about four months.

Note that we’re implicitly saying that links that are shared that nobody clicks don’t count. I don’t think this is contentious, spamming friends and followers (and circlers(?), whatever the G+ equivalent is) is a bit on the evil side. Somebody has to vindicate a share by clicking on it, otherwise we call it a waste.

This sort of nonsense isn’t worth it.

With all this preparation we can actually ask some interesting questions; for Stack Exchange we’re interested in how much traffic is coming from social media (this is an easy one), and how many sharers do so through a button.

Traffic gave us no surprises, we get almost nothing from social media. Part of this is probably domain specific, my friends really don’t care about my knowledge of the Windows Registry or the Diablo III Auction House. The sheer quantity of traffic coming from Google also drowns out any other source, it’s hard to get excited about tweaking social media buttons to bring in a few thousand extra views when tiny SEO changes can sling around hundreds of thousands. To put the difference in scale in perspective, Google Search sends 3 orders of magnitude more traffic our way then the next highest referrer which itself sends more than twice what Twitter does (and Twitter is the highest “social” referrer).

Now that I’ve established a (rather underwhelming) upper-bound for how much our share buttons were getting us, I need to look at what portion can be ascribed to the share buttons versus people just copy/pasting. This is where the slugs come in, presumably no-one is going to add a random query parameter to urls they’re copy/pasting after all.

You basically want to run a query like this:

SELECT *
FROM TrafficLogs
WHERE
-- everybody tries to set Referer so you know you got some juice from them
(RefererHost = 't.co' OR RefererHost = 'plus.url.google.com' OR RefererHost = 'www.facebook.com')
AND
-- Everybody scrapes links that are shared, don't count those they aren't people
UserAgent NOT LIKE '%bot%'
AND
-- Just the share buttons, ma'am
(Query LIKE '%stw=1%' OR Query LIKE '%sfb=1%' OR Query LIKE '%sgp=1%')

Adapt to your own data store, constrain by appropriate dates, etc. etc. But you get the idea. Exclude the final OR clause and the same query gives you all social media referred traffic to compare against.

What we found is that about 90% of everyone who does share, does so by copy/pasting. Less than 10% of users make use of the share buttons even if they’ve already set out to share. Such a low percentage of an already pretty inconsequential number doesn’t bode well for these buttons.

One final bit of investigation that’s a bit particular to Stack Exchange is figuring out the odds of a user sharing their newly created post. We did this because while we always show the question sharing links to everyone, the answer sharing links are only shown to their owner and only for a short time after creation. Same data, a little more complicated queries, and the answer comes out to ~0.4%. Four out of every thousand new posts* on Stack Overflow get shared via a social media button (the ratio seems to hold constant for the rest of Stack Exchange, but there’s a lot less data to work with so my confidence is lower on them).

Data Shows They’re No Good, Now What?

Change for change’s sake is bad, right? Benign things shouldn’t be moved around just because, but for social media buttons…

We know our users don’t like them, they’re sleezy (though we were very careful to avoid any of the tracking gotchas of +1 or Likes), and they’re cluttering some of our most important controls (six controls in a column on every question is a bit much, but to be effective [in theory] these buttons have to be prominent).

These buttons start with a lot of downsides, and the data shows that for us they don’t have much in the way of upsides. So we we could either trudge along telling ourselves “viral marketing” and “new media” (plus we’ve already got them right, why throw them away?), or admit that these buttons aren’t cutting it and remove them.

So, are you measuring the impact of your site’s social media buttons?

You’re probably not in the same space as Stack Exchange, your users may behave differently, but you might be surprised at what you find out when you crunch the numbers.

_{*By coincidence this number seems to be about the same for questions and answers, it looks like more people seeing question share buttons balances out people being less enthusiastic about questions.}

Extending Type Inferencing in C#

Posted: 2012/06/10 | Author: kevinmontrose | Filed under: code | Comments Off

A while ago I used the first Roslyn CTP to hack truthiness into C#. With the second Roslyn CTP dropping on June 5th, now pretty close to feature complete, it’s time think up some fresh language hacks.

Roslyn What Now?

For those unfamiliar, Roslyn is Microsoft’s “Compiler as a Service” for the VB.NET and C# languages. What makes Roslyn so interesting is that it exposes considerably more than “Parse” and “Compile” methods; you also have full access to powerful code transformations and extensive type information.

Roslyn is also very robust in the face of errors, making the most sense it can of malformed code and providing as complete a model as possible. This is very handy for my evil purposes.

Picking A Defect

Now I happen to really like C#, it’s a nice, fast, fairly rapidly evolving, statically typed language. You get first class functions, a garbage collector, a good type system, and all that jazz. Of course, no language is perfect so I’m going to pick one of my nits with C# and hack up a dialect that addresses it using Roslyn.

The particular defect is that you must specify return types in a method declaration, this is often pointless repetition in my opinion.

Consider this method:

static MvcHtmlString ToDateOnlySpanPretty(DateTime dt, string cssClass)
{
  return MvcHtmlString.Create(String.Format(@"<span title=""{0:u}"" class=""{1}"">{2}</span>", dt, cssClass, ToDateOnlyStringPretty(dt, DateTime.UtcNow)));
}

Is that first MvcHtmlString really necessarily? Things get worst when you start returning generic types, oftentimes I find myself writing code like:

Dictionary<string, int> CalcStatistics()
{
  var result = new Dictionary<string, int>();
  // ...
  return result;
}

Again, the leading Dictionary<string, int> is really not needed the type returned is quite apparent so we’re really just wasting key strokes there.

This pointless repetition was addressed for local variables in C# 3.0 with the var keyword. With Roslyn, we can hack up a pre-processor that allows var as a method’s return type. The above would become the following:

var CalcStatistics()
{
  var results = new Dictionary<string, int>();
  // ...
  return results;
}

The Rules

Wanton type inferencing can be a bit dangerous, it makes breaking contracts pretty easy to do for one. So I’m imposing a few constraints on where “var as a return” can be used, for the most part these are arbitrary and easily changed.

var can only be used on non-public and non-protected methods
if a method returns multiple types, a type will be chosen for which all returned types have an implicit conversion with the following preferences
- classes before interfaces
- derived types before base types
- generic types before non-generic types
- in alphabetical order
- with the exceptions that Object is always considered last and IEnumerable (and IEnumerable<T>) are considered before other interfaces
a method with empty returns will become void

The last rule may be a little odd as C# doesn’t have a notion of a “void type” really, which is something of a defect itself in my opinion. Which type is chosen for a return is well-defined so it’s predictable (always a must in a language feature) and attempts to match “what you meant”.

I made one more handy extension, which is allowing var returning methods to return anonymous types. You can sort of do this now either passing around Object or using horrible grotty hacks (note, don’t actually do that); but since there’s no name you can’t do this cleanly. Since var returning methods don’t need names, I figured I might as well address that too.

Let’s See Some Code

Actually loading a solution and parsing/compiling is simple (and boring), check out the source for how to do it.

The first interesting bit is finding all methods that are using “var” as a return type.

// eligibleMethods is a List<MethodDeclarationSyntax>
var needsInferencing =
 eligibleMethods
 .Where(
  w => (w.ReturnType is IdentifierNameSyntax) &&
  ((IdentifierNameSyntax)w.ReturnType).IsVar
 ).ToList();

This literally says “find the methods that have return type tokens which are ‘var'”. Needless to say, this would be pretty miserable to do from scratch.

Next interesting bit, we grab all the return statements in a method and get the types they return.

var types = new List<TypeInfo>();
// returns is a List<ReturnStatementSyntax>
foreach (var ret in returns)
{
  var exp = ret.Expression;

  if (exp == null)
  {
    types.Add(TypeInfo.None);
    continue;
  }

  var type = model.GetTypeInfo(exp);
  types.Add(type);
}

Note the “model.GetTypeInfo()” call there, that’s Roslyn providing us with detailed type information (which we’ll be consuming in a second) despite the fact that the code doesn’t actually compile successfully at the moment.

We move onto the magic of actually choosing a named return type. Roslyn continues to give us a wealth of type information, so all possible types are pretty easy to get (and ordering them is likewise simple).

var allPossibilities = info.SelectMany(i => i.Type.AllInterfaces).OfType<NamedTypeSymbol>().ToList();
allPossibilities.AddRange(info.Select(s => s.Type).OfType<NamedTypeSymbol>());
// info is an IEnumerable<TypeInfo>
foreach (var i in info)
{
  var @base = i.Type.BaseType;
  while (@base != null)
  {
    allPossibilities.Add(@base);
    @base = @base.BaseType;
  }
}

Rewriting the method with a new return type is a single method call, and replacing the “from source” method with our modified one is just as simple.

And that’s basically it for the non-anonymous type case.

Anonymous Types

For returning anonymous types, everything up to the “rewrite the method” step is basically the same. The trouble is, even though we know the “name” of the anonymous type we can’t use it. What we need to do instead is “hoist” the anonymous type into an actual named type and return that instead.

This is kind of complicated, but you can decompose it into the following steps:

Make a note of returned anonymous types
Rewrite methods to return a new type (that doesn’t exist yet)
Rewrite anonymous type initializers to use the new type
Create the new type declaration

Steps #1 and #2 are easy, Roslyn doesn’t care that your transformation doesn’t make sense yet.

Step #3 requires a SyntaxRewriter and a way to compare anonymous types for equality. The rewriter is fairly simple, the syntax for an anonymous type initializer and a named one vary by very little.

Comparing anonymous types is a little more complicated. By the C# spec, anonymous are considered equivalent if they have the same properties (by name and type) declared in the same order.

Roslyn gives us the information we need just fine, so I threw a helper together for it:

internal static bool AreEquivalent(TypeSymbol a, TypeSymbol b, Compilation comp)
{
  var aMembers = a.GetMembers().OfType<PropertySymbol>().ToList();
  var bMembers = b.GetMembers().OfType<PropertySymbol>().ToList();
  if (aMembers.Count != bMembers.Count) return false;
  for (var i = 0; i < aMembers.Count; i++)
  {
    var aMember = aMembers[i];
    var bMember = bMembers[i];
    if (aMember.Name != bMember.Name) return false;
    if (aMember.DeclaredAccessibility != bMember.DeclaredAccessibility) return false;
    var aType = aMember.Type;
    var bType = bMember.Type;
    var aName = aType.ToDisplayString(SymbolDisplayFormat.FullyQualifiedFormat);
    var bName = bType.ToDisplayString(SymbolDisplayFormat.FullyQualifiedFormat);
    if (aName == bName) continue;
    var conv = comp.ClassifyConversion(aType, bType);
    if (!conv.IsIdentity) return false;
  }
  return true;
}

Notice that Roslyn also gives us details about what conversions exist between two types, another thing that would be absolutely hellish to implement yourself.

The final step, adding the new type, is the biggest in terms of code although it’s not really hard to understand (I also cheat a little). Our hoisted type needs to quack enough like an anonymous to not break any code, which means it needs all the expected properties and to override Equals, GetHashCode, and ToString.

This is all contained in one large method. Rather than reproduce it here, I’ll show what it does.

Take the anonymous type:

new
{
A = "Hello",
B = 123,
C = new Dictionary<string, string>()
}

This will get a class declaration similar to

internal class __FTI88a733fde28546b8ae4f36786d8446ec {
  public string A { get; set; }
  public int B { get; set; }
  public global::System.Collections.Generic.Dictionary<string, string> C { get; set; }
  public override string ToString()
  {
    return new{A,B,C}.ToString();
  }
  public override int GetHashCode()
  {
    return new{A,B,C}.GetHashCode();
  }
  public override bool Equals(object o)
  {
    __FTI88a733fde28546b8ae4f36786d8446ec other = o as __FTI88a733fde28546b8ae4f36786d8446ec;
    if(other == null) return new{A,B,C}.Equals(o);
    return
      (A != null ? A.Equals(other.A) :
      (other.A != null ? other.A.Equals(A) : true )) &&
      B == other.B &&
      (C != null ? C.Equals(other.C) : (other.C != null ? other.C.Equals(C) : true ));
  }
}

The two big cheats here are the lack of a constructor (so it’s technically possible to modify the anonymous type, this is fixable but I don’t think it’s needed for a proof-of-concept) and using the equivalent anonymous type itself to implement the required methods.

Conclusion

I’m pretty pleased with Roslyn thus far, it’s plenty powerful. There are still a bunch of limitations and unimplemented features in the current CTP though, so be aware of them before you embark on some recreational hacking.

In terms of complaints, some bits are a little verbose (though that’s gotten better since the first CTP), documentation is still lacking, and there are some fun threading assumptions that make debugging a bit painful. I’m sure I’m Doing It Wrong™ in some places, so some of my complaints may be hogwash; I expect improvements in each subsequent release as well.

If it wasn’t apparent, the code here is all proof-of-concept stuff. Don’t use in production, don’t expect it to be bug free, etc. etc.

You can check out the whole project on Github.

Kevin Montrose