
Jonny Longden on Strategic Experimentation That Scales
A practical expansion of Jonny Longden's viral post on moving from random A-B tests to strategy-led experimentation systems.
Jonny Longden recently shared something that caught my attention: "Many CRO/experimentation programmes are an exercise in cycling through random ideas and best practices to see if any move a metric." That single line captures a problem I see everywhere: teams run lots of tests, ship plenty of changes, and still struggle to explain what they actually learned or what it means for the business.
Jonny also added an important nuance: there is "nothing wrong with this per se" because accidental wins and unexpected insights do happen. But his real challenge is sharper: are you learning strategically, or just spinning the wheel?
The hidden cost of "random ideas" experimentation
A pipeline filled with disconnected A-B tests can look productive. Dashboards update, stakeholders see activity, and there is always another "best practice" to try. The trouble is that activity is not the same as progress.
When experimentation becomes a grab bag of ideas, you typically get:
- Local optimizations that do not compound
- Lessons that cannot be generalized beyond one page or one campaign
- A backlog dominated by opinions, not evidence
- A team that measures output (tests shipped) instead of outcomes (decisions improved)
"True experimentation begins with strategy and theory."
That line from Jonny is the pivot. Strategy-led experimentation uses tests to reduce uncertainty around the decisions that matter, not just to chase incremental lifts.
The Apollo lesson: breakthroughs come from systems, not isolated tests
Jonny’s Apollo analogy is doing a lot of work, and it is spot on. The moon missions were not a collection of random propulsion experiments. They were an orchestrated program with a clear objective, coordinated teams, shared standards, and a disciplined learning loop.
Translate that to growth and CRO: the "genius" is not any single experiment idea. It is the operating system that ensures every test contributes to a coherent map of what is true, what is changing, and what you should do next.
If you have ever felt like your org is running experiments but not moving, you likely have a tooling capability without a system capability.
From business challenge to theory: start where it hurts
Jonny referenced a real retail example: the brand wanted to appeal to a younger demographic and did not know how to achieve it. That is exactly the kind of messy, high-level goal where random testing fails.
A better approach is to turn the goal into a theory of change.
Step 1: Define the strategic question
Instead of "What should we test next?", start with:
- What must become true for younger customers to choose us?
- What beliefs or barriers are stopping them today?
- Which levers could plausibly change that?
This creates a decision-oriented problem statement, not a tactics-oriented backlog.
Step 2: Create strategic hypotheses (the big levers)
Jonny described breaking the challenge into strategic hypotheses first, such as:
- The shift happens through changes to product
- Or it happens through media and targeting
This is not semantics. Those two paths imply different teams, budgets, timelines, and risks. Experiments here are not just about conversion rate. They are about choosing where to invest.
Strategy-led experimentation is about "bigger strategic decisions and pivots," not just micro-optimizations.
Step 3: Break strategic hypotheses into functional hypotheses
Once you pick the big levers, you define what would need to be true operationally.
If "product" is the lever, functional hypotheses might include:
- Younger shoppers respond to specific styles, fits, or bundles
- Price architecture is misaligned with their willingness to pay
- The brand story and value proposition do not match their identity goals
If "media and targeting" is the lever, functional hypotheses might include:
- Current channels over-index on older audiences
- Creative does not signal relevance to younger segments
- Landing experiences do not match the promise of the ad
Now you have testable statements that can guide what to build, measure, and learn.
Step 4: Design experiments that fit the question (not the tool)
Jonny explicitly notes that these experiments are "not necessarily A-B tests" and can run across the whole business. This is a critical unlock.
Different questions require different methods:
- Message-market fit questions: concept tests, ad creative tests, qualitative interviews, on-site surveys
- Product desirability questions: limited drops, waitlists, pre-orders, concierge tests
- Channel viability questions: geo tests, incrementality tests, media mix experiments
- Experience questions: A-B tests, multivariate tests, personalization pilots
The method should match the uncertainty you are trying to reduce.
What an experimentation operating system actually includes
Jonny’s point about "careful centralised management, coordination and operating systems" is where most programs break. Many companies buy an A-B testing platform and assume they bought experimentation. They did not.
A scalable experimentation system typically includes:
1) A clear strategy and an explicit theory
Document the goal, constraints, and the logic of how change will happen. If your program cannot articulate its theory in plain English, your backlog will default to "best practices."
2) A hypothesis hierarchy
Maintain the chain from:
- Strategic hypothesis (which lever matters)
- Functional hypothesis (what must be true)
- Experiment design (how we will test it)
- Decision rule (what we will do if we see X)
This is how learning compounds. Without the hierarchy, every result is isolated.
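To make the hierarchy concrete, here is a minimal sketch of how you might represent that chain in code. The class names, the retail example entry, and the 5% threshold are illustrative assumptions, not something Jonny specifies:

```python
from dataclasses import dataclass, field

@dataclass
class Experiment:
    design: str          # how we will test it
    decision_rule: str   # what we will do if we see X

@dataclass
class FunctionalHypothesis:
    statement: str       # what must be true operationally
    experiments: list[Experiment] = field(default_factory=list)

@dataclass
class StrategicHypothesis:
    lever: str           # which big lever matters
    functional: list[FunctionalHypothesis] = field(default_factory=list)

# Illustrative entry based on the retail example in the post
product_lever = StrategicHypothesis(
    lever="The shift to younger customers happens through product",
    functional=[
        FunctionalHypothesis(
            statement="Younger shoppers respond to specific styles and bundles",
            experiments=[
                Experiment(
                    design="Limited drop of three new styles with a waitlist",
                    decision_rule="If waitlist sign-ups exceed 5% of visitors, "
                                  "fund a full product line test",
                ),
            ],
        ),
    ],
)
```

The point is not the tooling; a spreadsheet works too. What matters is that every experiment record carries a pointer back up to the functional and strategic hypotheses it serves, so no result is ever orphaned.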
3) Portfolio management, not a queue
Treat experiments like a portfolio across:
- Time horizons (quick wins vs foundational bets)
- Risk levels (incremental vs disruptive)
- Domains (product, marketing, pricing, UX, retention)
A single team should be able to explain why the current portfolio matches the business strategy.
4) Central governance with distributed execution
Centralization does not mean one team runs everything. It means:
- Shared standards for rigor and measurement
- A single source of truth for results and learnings
- Consistent prioritization criteria
- Guardrails around brand, ethics, and customer impact
Execution can and should be distributed across product, marketing, and engineering, as long as the learning system is coordinated.
5) Learning loops that end in decisions
Every experiment should answer: "So what?"
- What did we learn about customers or the market?
- What does it change about our strategy or roadmap?
- What will we stop doing because of this?
- What new hypothesis does it unlock?
If your experiment write-ups end with "stat sig: yes" and do not lead to a decision, the program is drifting.
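A decision rule can be written down before the test runs, so "stat sig: yes" is never the end of the write-up. Here is a sketch (the thresholds and the ship/revert wording are my illustrative assumptions, not from the post) that pairs a standard two-proportion z-test with an explicit rule requiring both significance and a practically relevant lift:

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal CDF (via erf)
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

def decide(p_value, lift, alpha=0.05, min_lift=0.02):
    """Decision rule agreed before launch: significance alone is not
    enough; the relative lift must also clear a minimum threshold."""
    if p_value < alpha and lift >= min_lift:
        return "ship, and update the functional hypothesis it supports"
    if p_value < alpha and lift <= -min_lift:
        return "revert, and revise the hypothesis"
    return "inconclusive: extend the test or redesign it"
```

For example, 500 conversions from 10,000 visitors in control versus 600 from 10,000 in variant gives a p-value well under 0.05 and a 20% relative lift, so the rule says ship. The same mechanics force the "so what?" question: every outcome maps to a pre-committed action, not just a dashboard number.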
The common anti-pattern: platform plus a lonely login
Jonny’s closing critique is painfully familiar: organizations think they "tick the box" because they have an A-B testing tool and "someone in the corner of a marketing team with a login for it."
That setup produces two predictable outcomes:
- Testing becomes disconnected from product and strategy
- The team is incentivized to run small, safe tests that avoid cross-functional coordination
"This is not experimentation, and you will not learn or grow with this approach."
If you want experimentation to be a growth engine, it must be treated like an organizational capability, not a marketing tactic.
A practical way to start next week
If Jonny’s post makes you suspect your program is too random, here is a simple reset:
- Pick one strategic business question (not ten). Example: "How do we become relevant to younger shoppers?"
- Write 2-3 strategic hypotheses that could plausibly solve it (product vs media is a strong starting split).
- For each, list 3-5 functional hypotheses that would need to be true.
- Design a mix of experiments that can validate or invalidate those hypotheses quickly.
- Create a single learning repository and a recurring forum where results turn into decisions.
Do that for one quarter and you will feel the difference: fewer random tests, more directional clarity, and better conversations with leadership.
This blog post expands on a viral LinkedIn post by Jonny Longden, Chief Growth Officer @ Speero | Growth Experimentation Systems & Engineering | Product & Digital Innovation Leader. View the original LinkedIn post →