Philip Jama

Articles / Decision Science / Part 2

How the Bayesian Approach Improves Sample Efficiency

Prior knowledge, adaptive designs, and why you need fewer participants

Bayesian · RCT · A/B Testing · Experimentation

In online randomized controlled trials (RCTs), the Bayesian approach offers significant advantages in sample efficiency, meaning you can often reach reliable conclusions with fewer participants or observations compared to traditional frequentist methods. This efficiency stems from two core features of Bayesian statistics: the ability to incorporate prior knowledge and the flexibility of adaptive designs. This article expands on the foundations introduced in Online Experiments with a Bayesian Lens.

The primary ways Bayesian methods lead to greater sample efficiency are:

Incorporation of Prior Knowledge

Before an experiment begins, you often have existing information from previous studies, expert opinion, or related data. The Bayesian framework allows you to formally incorporate this "prior" information into your statistical model. This means the experiment doesn't start from a state of complete ignorance. By providing a head start, this prior information can reduce the amount of new data needed to reach a confident conclusion.
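One way to make this head start concrete: with a Beta prior on a conversion rate, a Beta(a, b) prior behaves like a pseudo-sample of roughly a + b earlier observations, so the posterior starts tighter than it would under a flat prior. The rates, sample sizes, and prior parameters below are illustrative, not from a real experiment.

```python
import math

def beta_std(a: float, b: float) -> float:
    """Standard deviation of a Beta(a, b) distribution."""
    return math.sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))

successes, failures = 30, 70  # hypothetical data from the current test

# Uninformative prior Beta(1, 1): the experiment starts from scratch.
flat = beta_std(1 + successes, 1 + failures)

# Informative prior Beta(60, 140): earlier tests suggested a rate near 0.3,
# encoded as roughly 200 pseudo-observations.
informed = beta_std(60 + successes, 140 + failures)

print(f'posterior std, flat prior:     {flat:.4f}')
print(f'posterior std, informed prior: {informed:.4f}')
```

The informed posterior is markedly narrower after the same 100 observations, which is exactly the "fewer participants for the same confidence" effect.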

Adaptive Designs and Early Stopping

Bayesian methods are naturally suited for adaptive trial designs. This means you can monitor the results as the data comes in and modify the experiment on the fly. A key advantage here is the ability to stop the experiment early if the evidence strongly favors one variation over another. This is in contrast to frequentist approaches, where peeking at the data and stopping early can invalidate the results. The ability to conclude an experiment as soon as a meaningful result is apparent can save significant time and resources. For example, if a new feature is clearly underperforming, you can stop the experiment and avoid exposing more users to a negative experience.
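A minimal sketch of this monitoring loop, assuming Beta-Bernoulli posteriors and a simulated stream of conversions. The true rates, stopping threshold, check interval, and visitor cap are all illustrative choices, not recommendations.

```python
import random

random.seed(7)
RATE_A, RATE_B = 0.10, 0.14  # hypothetical true conversion rates
STOP = 0.95                   # stop once P(B > A) leaves [1 - STOP, STOP]
DRAWS = 4000                  # Monte Carlo draws per check

a_s = a_f = b_s = b_f = 0
for n in range(1, 20001):
    # One simulated visitor per arm.
    if random.random() < RATE_A:
        a_s += 1
    else:
        a_f += 1
    if random.random() < RATE_B:
        b_s += 1
    else:
        b_f += 1
    # Every 500 visitors, estimate P(B > A) from the two posteriors.
    if n % 500 == 0:
        wins = sum(
            random.betavariate(1 + b_s, 1 + b_f)
            > random.betavariate(1 + a_s, 1 + a_f)
            for _ in range(DRAWS))
        p = wins / DRAWS
        if p > STOP or p < 1 - STOP:
            print(f'stopped after {n} visitors per arm, P(B > A) = {p:.3f}')
            break
else:
    print('reached the cap without a decision')
```

Because the posterior is a complete summary of the evidence at any point, checking it repeatedly like this does not invalidate the inference the way repeated significance tests would.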

Dynamic Allocation of Resources

In more advanced adaptive designs, you can dynamically allocate more participants to the better-performing variation. This "multi-armed bandit" approach can maximize the positive impact of the experiment while it's still running.
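A minimal Thompson-sampling sketch of this idea, assuming Beta-Bernoulli arms; the true rates and round count are made up for illustration.

```python
import random

random.seed(42)
true_rates = [0.10, 0.12, 0.16]       # hypothetical arms
stats = [[1, 1] for _ in true_rates]  # Beta(1, 1) prior per arm

pulls = [0] * len(true_rates)
for _ in range(5000):
    # Sample a plausible rate for each arm from its posterior,
    # then play the arm whose sample is highest.
    samples = [random.betavariate(a, b) for a, b in stats]
    arm = samples.index(max(samples))
    pulls[arm] += 1
    if random.random() < true_rates[arm]:
        stats[arm][0] += 1            # success
    else:
        stats[arm][1] += 1            # failure

print('pulls per arm:', pulls)
```

Arms that look better get sampled more often, so most traffic drifts toward the strongest variation while the experiment is still running.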

What Is a Prior in Bayesian Experiments?

A prior is a probability distribution that represents your beliefs about an unknown parameter before you've seen the data from your experiment. It's a way to quantify your initial understanding. Priors can be:

  • Informative: Based on strong evidence from past experiments or historical data.
  • Weakly informative: Expressing some general knowledge without being overly restrictive.
  • Uninformative: Designed to have minimal influence on the results, letting the data speak for itself as much as possible.
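The three prior types can be made concrete with Beta distributions over a conversion rate. The parameter choices below are illustrative; the point is how the spread of the prior encodes how much you claim to know up front.

```python
import math

def beta_mean_std(a: float, b: float):
    """Mean and standard deviation of a Beta(a, b) distribution."""
    mean = a / (a + b)
    std = math.sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))
    return mean, std

priors = {
    'informative Beta(60, 140)':      (60, 140),  # past tests: rate near 0.30
    'weakly informative Beta(2, 5)':  (2, 5),     # "probably below 0.5"
    'uninformative Beta(1, 1)':       (1, 1),     # uniform on [0, 1]
}
for name, (a, b) in priors.items():
    mean, std = beta_mean_std(a, b)
    print(f'{name}: mean={mean:.2f}, std={std:.2f}')
```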

The process involves combining this prior with the likelihood (the evidence from your current experiment's data) to produce a posterior distribution. The posterior represents your updated beliefs about the parameter, incorporating both your prior knowledge and the new evidence.

Prior vs posterior: the posterior is narrower and shifted, illustrating how data tightens the credible interval
Python source:
import math
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt

FT_BG = '#FFF1E5'
FT_CLARET = '#990F3D'
FT_OXFORD = '#0F5499'
FT_TEAL = '#0D7680'

plt.rcParams.update({
    'figure.facecolor': FT_BG,
    'axes.facecolor': FT_BG,
    'savefig.facecolor': FT_BG,
    'font.family': 'sans-serif',
    'font.sans-serif': ['Helvetica Neue', 'Arial', 'sans-serif'],
    'axes.spines.top': False,
    'axes.spines.right': False,
})

def beta_pdf(x, a, b):
    x = min(max(x, 1e-6), 1-1e-6)
    lg = math.lgamma
    logB = lg(a) + lg(b) - lg(a+b)
    logpdf = (a-1)*math.log(x) + (b-1)*math.log(1-x) - logB
    return math.exp(logpdf)

xs = [i/1000 for i in range(0, 1001)]

# Prior: Beta(3, 3) -- wide, centered at 0.5
y_prior = [beta_pdf(x, 3, 3) for x in xs]

# Posterior: Beta(28, 18) -- narrower, shifted toward 0.6
# As if we observed 25 successes out of 40 trials starting from the prior
y_post = [beta_pdf(x, 28, 18) for x in xs]

fig, ax = plt.subplots(figsize=(8, 4))
ax.fill_between(xs, y_prior, color=FT_OXFORD, alpha=0.2)
ax.fill_between(xs, y_post, color=FT_CLARET, alpha=0.25)
ax.plot(xs, y_prior, color=FT_OXFORD, linewidth=2, label='Prior')
ax.plot(xs, y_post, color=FT_CLARET, linewidth=2, label='Posterior')
ax.set_xlim(0, 1)
ax.set_ylim(0, max(max(y_prior), max(y_post)) * 1.15)
ax.set_xlabel('p', fontsize=11, color='#333333')
ax.set_ylabel('density', fontsize=11, color='#333333')
ax.legend()

fig.text(0.5, 0.97, 'Prior vs Posterior',
         ha='center', fontsize=14, fontweight='bold', color='#333333')
fig.text(0.5, 0.935, 'Beta(3,3) prior updated with 25/40 observed successes',
         ha='center', fontsize=10, color='#666666')
fig.text(0.02, 0.01, 'Source: Philip Jama via pjama.github.io',
         fontsize=8, color='#999999', ha='left')
fig.tight_layout(rect=[0, 0.03, 1, 0.92])
fig.savefig('prior_posterior.png', dpi=150, bbox_inches='tight')

print('wrote prior_posterior.png')

Other Comparative Benefits of Bayesian RCTs

Beyond sample efficiency, the Bayesian approach offers several other advantages for running online experiments:

Intuitive and Direct Probabilistic Results

Bayesian analysis provides results that are often easier for non-statisticians to understand. Instead of p-values and confidence intervals, which can be counterintuitive, a Bayesian experiment can give you a direct probability statement, such as "There is a 95% probability that variation B is better than variation A." This clarity helps stakeholders make more confident and informed decisions.
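A statement like "there is a 95% probability that B is better than A" falls straight out of the posteriors. Here is a sketch assuming Beta-Bernoulli posteriors and made-up conversion counts:

```python
import random

random.seed(0)
# Hypothetical results: A converted 120/2000 visitors, B converted 150/2000.
post_a = (1 + 120, 1 + 2000 - 120)
post_b = (1 + 150, 1 + 2000 - 150)

# Monte Carlo estimate of P(B > A): draw from each posterior and
# count how often B's draw exceeds A's.
draws = 20000
wins = sum(random.betavariate(*post_b) > random.betavariate(*post_a)
           for _ in range(draws))
print(f'P(B > A) ~ {wins / draws:.3f}')
```

The output is the probability stakeholders actually asked for, rather than a statement about hypothetical repeated experiments.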

Greater Flexibility in Experimental Design

The Bayesian framework is highly flexible and can be applied to a wide range of experimental designs. It can handle complex scenarios with multiple variations, and it excels at adaptive designs where the experiment is modified in real-time based on incoming data. This adaptability allows for more efficient and ethical testing.

Quantifying Evidence for the Null Hypothesis

A common challenge with frequentist methods is that a "non-significant" result doesn't necessarily mean there's no difference between variations; it just means you failed to find one. Bayesian methods, on the other hand, can provide evidence in favor of the null hypothesis, which can be valuable for understanding when a change has no meaningful effect.
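One common way to quantify this is a region of practical equivalence (ROPE): compute the posterior probability that the difference between variations lies within a band you consider negligible. The counts and the ROPE half-width below are illustrative.

```python
import random

random.seed(1)
# Hypothetical results: nearly identical conversion counts.
post_a = (1 + 100, 1 + 1900)
post_b = (1 + 103, 1 + 1897)
ROPE = 0.01  # differences within +/- 1 point count as "no meaningful effect"

# Monte Carlo estimate of P(|B - A| < ROPE) from the two posteriors.
draws = 20000
inside = sum(
    abs(random.betavariate(*post_b) - random.betavariate(*post_a)) < ROPE
    for _ in range(draws))
print(f'P(|B - A| < {ROPE}) ~ {inside / draws:.3f}')
```

A high value here is positive evidence that the change does nothing practically important, which a non-significant p-value alone cannot give you.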

More Efficient for Small Sample Sizes

The ability to incorporate prior knowledge makes Bayesian methods particularly powerful when dealing with limited data, such as in experiments on low-traffic pages or with rare events.

In essence, Bayesian methods provide a more dynamic and intuitive framework for online experiments, often leading to faster and more efficient learning cycles.

Sequential testing decides when to stop collecting evidence. A related but distinct problem: when to stop searching and commit to a choice, given a sequence of options you can't revisit.


Collaborate

If you're exploring related work and need hands-on help, I'm open to consulting and advisory work. Get in touch