Lenny Distilled

The best experiments surprise you

Craft → Analytical Sense

Supporting
If you're very pessimistic, you might miss the surprising result that pops out of an experiment. You might force yourself to do a bunch of experiments grudgingly, but you're like, 'You know what? I hate this. I'm doing four experiments today because I have to do it because I want to be an entrepreneur, but it sucks and everything's miserable and black.' And then you won't notice that, oh, this thing didn't work, but it didn't work in an interesting way.
Supporting
Let's just pretend we ran that experiment. What do you think it'll come back with? Or let's pretend we ran that user study. PMs who have the ability to imagine those outcomes help us be much more efficient, too, because we're like, well, if we all think that it's going to go there and that's not going to compel us to take any action, why do it at all?
Nuanced
Twyman's law, the general statement, is: any figure that looks interesting or different is usually wrong. If the result looks too good to be true, it probably is. Your normal movement in an experiment is under 1%, and you suddenly have a 10% movement? Hold the celebratory dinner.
With caveats

A P value is a statistical measure used in A/B testing to determine whether experimental results are statistically significant; the significance threshold is commonly set at 0.05 (5%).

Many people interpret one minus the P value as the probability that the treatment is better than control. That is wrong.
With caveats

The "success rate" refers to the historical percentage of A/B tests at Airbnb that showed positive results.

At Airbnb, where the success rate is only 8%, if you get a statistically significant result with a P value less than 0.05, there is a 26% chance that this is a false positive result. It's not 5%, it's 26%.
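The 26% figure follows from Bayes' rule. Here is a minimal sketch that reproduces it, assuming a one-sided significance level of 0.025 (half of the two-sided 0.05, since only positive movements count as wins) and a statistical power of 0.8 — both are illustrative assumptions, not stated in the quote:

```python
# False-discovery rate among "winning" experiments, via Bayes' rule.
# Assumed values: one-sided alpha = 0.025 and power = 0.8 are illustrative
# choices, not given in the quote; 8% is Airbnb's historical success rate.
base_rate = 0.08   # P(treatment truly works)
alpha = 0.025      # P(significant positive result | no real effect)
power = 0.8        # P(significant positive result | real effect)

false_positives = alpha * (1 - base_rate)   # truly null, but looked like a win
true_positives = power * base_rate          # real effects correctly detected
fdr = false_positives / (false_positives + true_positives)
print(f"{fdr:.0%}")  # → 26%
```

The intuition: when real wins are rare (8%), the pool of "significant" results is dominated by lucky nulls, so the chance a given win is a false positive is far higher than the 5% threshold suggests.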

The Missing Stamp

Every episode of Lenny's Podcast, distilled into the insights that matter and the quotes that make them stick.
