Podcast Notes: “Measuring Design” by Clearleft

A couple weeks ago, I listened to the “Measuring Design” episode of the Clearleft podcast.

When I say “listened”, what I was really doing was this:

  1. Listen for 30–90 seconds.
  2. Hear an incredibly insightful, spot-on comment that resonated in my bones.
  3. Hit pause, rewind, and take notes from what I heard and the impression(s) that came to mind.
  4. Repeat steps 1–3 for the entirety of the episode.

Fortunately, there’s a transcript of the episode so I didn’t have to transcribe the excerpts that stood out to me while listening—thank you × 1,000,000 to whoever does that!

Now I’ve come around to organizing my notes from the podcast and putting them into a blog post.

I’m going to have to try really, really hard not to just copy/paste the entire transcript of this podcast. It’s that good. Don’t miss it.

Honestly, just skip reading this post and go listen to the episode yourself.

Why are you still reading?

Ok, here are my notes from when I first listened to the podcast.


[MAITE] The thing about just conducting quantitative research, like AB testing, is that they tell you how many, but they don’t tell you why. So you might run an AB test and see that some elements perform better than others, but you don’t really know why. Then if you don’t know why it performs better how do you make that decision again for it to be successful if you don’t really know what it is?

This is a great point. One-off quantitative research like this (A/B testing is a common example I’ve seen) can easily become the “helicopter parenting” of product design: it makes all the decisions for you. Nothing is left to your judgement, which will atrophy your instincts. “Please data, just tell me what to do.”

You should be looking for insights that help you understand why to do a thing, not for decision-making that tells you what thing to do.


Chris makes the point that users live in a very complex, interconnected world, and it’s impossible to pinpoint causality for a single change with any degree of certainty, i.e. “we changed this button color and that made people click it more”. But that’s how a lot of A/B testing is practiced.

I think we live in a much more nuanced, complicated and interesting world where the sum of the parts of a website or an app all contribute to the results that you will get from the AB testing.

Given the incredible nuance of the real world, this is precisely why science demands repeatability: your conclusions have to hold across varied circumstances. Chris points this out:

If you were a scientist running a test you would repeat that test and you would repeat that experiment. And the confidence in the results would come through the repeatability and knowing that the conditions that you’ve got can be repeated and the result stays the same.

A lot of A/B testing, on the other hand, is a “one and done” endeavor. It looks like science, but the resemblance is suspect. Imagine how many A/B test conclusions wouldn’t survive a series of repeated tests, let alone review in a peer-reviewed scientific journal.
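To make the “one and done” problem concrete, here’s a toy simulation in TypeScript (every number in it is invented for illustration). It runs a batch of A/B tests where the two variants are actually identical, then counts how often a single test still crowns a “winner” at the conventional p < 0.05 bar:

```ts
// Toy simulation: "one and done" A/B tests where A and B are identical.
// All constants (sample size, conversion rate, trial count) are invented
// for illustration; they aren't from the podcast or any real test.

// Simulate how many of `visitors` convert at a given true rate.
function simulateConversions(visitors: number, rate: number): number {
  let conversions = 0;
  for (let i = 0; i < visitors; i++) {
    if (Math.random() < rate) conversions++;
  }
  return conversions;
}

// Two-proportion z-score for variants with equal sample size n.
function zScore(convA: number, convB: number, n: number): number {
  const pooled = (convA + convB) / (2 * n);
  const se = Math.sqrt((2 * pooled * (1 - pooled)) / n);
  return se === 0 ? 0 : (convA / n - convB / n) / se;
}

const TRIALS = 1000;
const VISITORS_PER_VARIANT = 2000;
const TRUE_RATE = 0.05; // identical for A and B: there is no real effect

let falseWinners = 0;
for (let t = 0; t < TRIALS; t++) {
  const a = simulateConversions(VISITORS_PER_VARIANT, TRUE_RATE);
  const b = simulateConversions(VISITORS_PER_VARIANT, TRUE_RATE);
  // |z| > 1.96 is the usual two-sided "95% confidence" threshold.
  if (Math.abs(zScore(a, b, VISITORS_PER_VARIANT)) > 1.96) falseWinners++;
}

// Expect roughly 50 of 1000: noise alone "finds" a winner ~5% of the time.
console.log(`${falseWinners} of ${TRIALS} tests declared a winner from pure noise`);
```

Run any one of those “winning” tests a second time and the odds of the result replicating from noise collapse, which is the whole point of repeatability.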


[RADHIKA] We’re so focused on metrics.

We’re either saying, you know, let’s measure everything, AB test everything, or we’re so focused on optimizing for metrics, moving things up and to the right, but those aren’t necessarily helping us build better products.

You’ve realized that fundamentally you haven’t really moved the needle despite having optimized for metrics.

Unfortunately, it’s so easy to get caught in the trap where, as employees, we end up viewing our primary product as the metrics we deliver to our bosses. After all, those metrics are what get us the things we work for (better pay, promotions, bonuses, etc.).

In that scenario, the product we actually deliver to the customer becomes a secondary consideration. Its success can't always be easily defined or immediately measured. So we optimize for quantifiable, abstract metrics to stand as representations of a great product—the numbers and graphs the boss sees in their presentation—rather than an actual great product.

Building a great slide deck with numbers that stand as proxies for a great product often gets rewarded more than building an actual product customers would call great.

[CHRIS] I think there’s often in an organization, a cultural issue around measurement. And that is that organizations have invested in a project or invested in an initiative and they’re looking for good news. And then you get analysts searching around the numbers to find something that looks positive that they can report on.

And that’s very different from using numbers to inform your decisions and to look at the opportunities in the future that you might want to be investigating. And if you just start having that culture of "find me the good news in this" then the numbers just become a fashion parade.


[CHRIS] I think in many organizations, people start looking at the numbers the tool that they’re using can give them. The law of the instrument. So if you’ve got a hammer in your hand, then very quickly the solution to everything is get a nail and start bashing it.

This was my beef with putting an explicit limit on DOM nodes: you look to the numbers the tool gives you, and those numbers shape how you think. Everything must now conform to them, without any understanding of why the numbers were established in the first place or what goals they were designed to achieve.

[CHRIS] if you’re looking to measure the experience of something, you should be looking beyond just the numbers that the tools can give you and being wider and more holistic in your view. And by doing that, you’re probably getting less precise with the numbers, but [that’s ok].

A quantifiable limit on DOM nodes is meant to help make the user experience better and faster. If you break that metric but still have an outstanding and fast experience, who the hell cares what the metric says—or what your Lighthouse score is.
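For context, the metric in question is just an element count. Here’s a sketch of the idea in TypeScript; the threshold is a number I made up for the example, not the value Lighthouse or any other tool actually uses:

```ts
// Sketch of a DOM-size audit: count the elements on the page and flag the
// page once the count crosses a threshold. The threshold here is an
// assumption for illustration, not any tool's real limit.
const ILLUSTRATIVE_LIMIT = 1500;

function auditDomSize(doc: Document): void {
  // Every element in the document: the raw number a tool reports.
  const elementCount = doc.querySelectorAll("*").length;
  if (elementCount > ILLUSTRATIVE_LIMIT) {
    // The warning says nothing about whether the page actually feels slow.
    console.warn(`DOM size: ${elementCount} elements (limit: ${ILLUSTRATIVE_LIMIT})`);
  } else {
    console.log(`DOM size: ${elementCount} elements`);
  }
}

auditDomSize(document);
```

The count is only a proxy for layout, memory, and rendering cost; a page can blow past it and still feel fast.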


[ANDY] We don’t seem comfortable with a degree of ambiguity or a degree of using language or emotion as a way to express certainty or confidence.

Design should be helping pioneer qualitative ways of contributing to an organization, because the business already knows the quantitative ones.

There’s something appealing about the empiricism of numbers, but as designers we don’t have to speak the language of business exclusively. We can teach them to speak ours: design as emotion.

[ANDY] Again, I think there’s this risk that measurement is just about mathematics. Like something that you can convert into numbers. I don’t think that’s the case.

Qualitative insights around how people feel, sentiment expressed through words that can’t be put into numbers, are still ways of measuring. It’s just a lot more complicated and harder to feed back what you’ve learned in that process of measurement, because actually it’s about insight.


Final observation:

[JEREMY] The world cannot be understood without numbers but it also can’t be understood with numbers alone.