Designing the experiment - Part 1

Part 1 - Designing the product change

Design the experiment

There are many good books and articles about this subject, so for now we’ll just cover the key ideas while focusing more on the hypothesis and experimentation aspects. The aim for this step is to use the knowledge and insights you’ve built up to propose a change that will improve your product in a way that can be measured. In other words, you want to predict that implementing a specific change will cause a specific impact.

Begin with the vision

This is the driving force behind your product and business. The ultimate aim for every change you make is to move your product closer to achieving your vision. Or to achieving objectives, key results, and goals that have been identified as milestones towards your vision.

What problem are you trying to solve?

The most valuable thing you can do is learn as much about your customers as possible. (You know exactly who your target market is, right?) As Steve Blank says: “get the heck out of the building and talk to your customers”. More importantly: ask them questions and listen to them!

Discover their problems and pains, their needs and desires. Understand their context, their environment and the situation they’re in when they experience all those emotions. Work out their jobs-to-be-done.

What do you know so far?

Review all the data you have collected. Data can be qualitative or quantitative. It can be observations and trends from usability testing, prototype testing, and user research. Or it can be numerical values from analytics tools, industry reports, and previous experiments. Or it can come from many other sources. The crucial point is that the data must be objective, and not subjective.

These priceless insights will help guide you towards the most valuable problems to tackle. They’ll also help you generate potential product improvements and give you new ideas to try. In fact, you’ll probably end up with more ideas than you can deal with concurrently.

Choose a starting point

Choosing which problem or idea to test first can be tough. You’ll have a higher chance of success, though, if you base your decision on objective data rather than opinion. This also helps when having to decide between ideas or mediate discussions. As Jim Barksdale, the former CEO of Netscape, once said:

“If we have data, let’s look at data. If all we have are opinions, let’s go with mine.” - Jim Barksdale

Try to keep three things in mind:

  • the vision (and the North Star metrics that measure progress towards it)
  • which ideas or problems are backed up with the most insights
  • where the riskiest assumptions or largest unknowns are

You could also consider:

  • which is the biggest customer problem?
  • which has the largest potential impact?
  • which has the largest addressable market segment?

Or you could target the one with potentially the least complex solution to implement. However, beware of making technical estimates upfront. Until you get into the reeds it’s hard to know how deep the marsh is. You should also be wary of tackling low hanging fruit. It can give you a sense of progress, like ticking items off a To Do list, but may not move you substantially closer to your strategic objective. Don’t forget to keep asking yourself whether a proposed product change has the potential to result in your business and customers achieving their desired outcomes.

Another key consideration is that it’s more important to do something and learn quickly from it, rather than agonise over what to do. Put the focus on learning, rapid experimentation and hypothesis testing to quickly find the best path to your objective.

Hypothesis, hypothesis, hypothesis

A hypothesis is a prediction or statement that we believe to be true (based on the data and insights we already have), expressed in a way that can be accurately tested. It is often written as “if we do X then Y will happen” or “Q happens because R”.

An aside:

In scientific research, scientists want to understand why things behave in the way they do. A scientist takes a specific question and studies all the existing research and knowledge about that area to form a possible explanation. She can then express this as a hypothesis and test it to see if the data supports or refutes it.

In Growth and Product Development we want to help people achieve their needs and desires, so we are more interested in changing behaviour than understanding why it happens. We want to find solutions to problems. However, we are still answering a similar question; we may just stop short of a scientific explanation once we have reached our desired outcome.

For example, “I want more people to click this button” is effectively “why aren’t more people clicking this button?”. We may do some usability testing, conclude that the text of our button (“Buy now”) is too committing, and test “Add to basket” instead. If this increases the rate of button clicks then we have achieved our outcome. We didn’t find out why it worked, just that it did work. Perhaps this is good enough, or perhaps a deeper understanding may guide us to an even better solution…

If you are at the start of the product discovery phase, with an unproven problem, you can break it down into the underlying assumptions and frame each one as a hypothesis. These can be individually tested to rigorously assess whether the assumptions are true or not and therefore whether the problem represents a valuable opportunity that is worth investing in.

If you already have enough insights and data that demonstrate a problem is worth solving (for both your customers and your business) you can express potential solutions and MVPs as hypotheses. Each hypothesis captures the effect we believe a feature will have in the form of testable outcomes that we can learn from.

If you are working on a mature product with plenty of ideas in the pipeline, good hypotheses keep you honest. They help you prioritise, and ensure the only changes you make to your product are ones that improve the performance of your key metrics.

Hypotheses enable us to conduct thorough investigations but they need to be well-constructed.

A solid hypothesis is the foundation of a high-quality experiment. It is the cornerstone of our ability to discover valuable solutions and navigate towards our vision.

A weak hypothesis merely states what you want to believe with no way of checking it. It risks you pulling the wool over your eyes. It won’t help you get the data that would give you a clearer understanding and move you towards a better solution.

“If you do not know how to ask the right question, you discover nothing.” – W. Edwards Deming

Anatomy of a hypothesis

It’s important to spend time crafting a strong hypothesis because it is crucial to performing a quality experiment and making meaningful, trustworthy conclusions.

There are 3 essential components to a solid hypothesis. It must be:

  • Falsifiable
  • Testable
  • Based on objective data and insights.

The most important of these is being falsifiable. This means it is possible for the hypothesis to be proven false. In other words, it’s plausible that we could observe examples that contradict the prediction. For example, “All tennis balls are yellow”. You may have only ever seen yellow tennis balls but it is possible that orange, red, blue or other coloured tennis balls exist. We only need to see one tennis ball that is not yellow to prove this statement is false, and reject the hypothesis.

Another falsifiable statement is “No Scotsman wears underwear beneath his kilt.” You might have to be very charming to disprove this but it is certainly possible to imagine that some Scotsmen may wear underwear. We only need to find one Scotsman who prefers less breezy nether-regions to show the statement is false.

On the other hand, “The Loch Ness monster exists” cannot be proved false. We cannot conceive of a scenario where we can unequivocally say “no Loch Ness monster exists”, so this statement is not falsifiable.

The best way to be falsifiable is to state the precise change you want to test and the specific outcome you predict it will have. For example, the statement “improving the registration process will make it easier to sign up” is not falsifiable, but if we make it more specific it becomes a better hypothesis: “reducing the number of input fields on the registration form to just ‘Email’ and ‘Password’ will increase the sign-up rate by 15%.”
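One way to keep a hypothesis specific is to write it down as a structured record with an explicit change, metric and predicted effect. The sketch below is only an illustration: the `Hypothesis` class, its field names and the sign-up rates are all hypothetical, and the check ignores sampling noise, which a real experiment would need a statistical test to handle.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    """A hypothetical record of a falsifiable product hypothesis."""
    change: str            # the precise change being tested
    metric: str            # the metric the change is predicted to move
    predicted_lift: float  # predicted relative change, e.g. 0.15 for +15%

    def is_contradicted_by(self, baseline_rate: float, observed_rate: float) -> bool:
        """Return True if the observed relative lift falls short of the prediction."""
        observed_lift = (observed_rate - baseline_rate) / baseline_rate
        return observed_lift < self.predicted_lift


signup = Hypothesis(
    change="Reduce the registration form to just 'Email' and 'Password'",
    metric="sign-up rate",
    predicted_lift=0.15,
)

# Made-up numbers: a 20% baseline sign-up rate and two possible outcomes.
fell_short = signup.is_contradicted_by(baseline_rate=0.20, observed_rate=0.21)
met_target = signup.is_contradicted_by(baseline_rate=0.20, observed_rate=0.24)
print(fell_short, met_target)
```

Because the prediction names a number, there are outcomes that would clearly contradict it. The vaguer statement “make it easier to sign up” offers no such outcome.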

It must also be practically possible to carry out a test of the hypothesis. “It is not possible to boil an egg on Mars” is falsifiable in principle (fly to Mars with a chicken and a camping stove), but not in practice. We need to be able to measure the outcome of the experiment. A strong hypothesis often appears easy to disprove!

The more your hypothesis is based on data and prior learning, the greater your chances are of improving your key metrics. We want to make an educated guess, not a leap in the dark. Robust hypotheses are based on facts, not speculation. They mean we’re more likely to test the right things, and waste less time and resources on ideas plucked from thin air.

Beware the HiPPO!

The Highest Paid Person’s Opinion can sometimes carry more weight than it deserves. When a team feels pressure to test an idea because of who suggested it, the HiPPO has been at work.

Most of the time the HiPPO is trying to be helpful. So, suggest they fill out a job story, or a hypothesis statement. Ask them what their insights are that led them to their recommendation. If it was just a gut feeling they should realise their case is a weak one.

The HiPPO shouldn’t necessarily be dismissed out of hand though. Sometimes they have years of domain knowledge and they have thought the hardest about the problem. So, explore the problem together, plumb their knowledge (separating objective data from subjective opinions), and use that to guide you to a potential experiment.

Remember: every company has a HiPPO or two. If you can’t think who it is, it could be you!

Simple is smart

When crafting a hypothesis try to get to the heart of exactly what you want to test. Identify the variables that could muddy the water. Don’t change too many things at once or it will be impossible to know exactly what it was that improved (or worsened) your product’s performance. Focus on one thing only, isolating it from other factors that could affect the experiment. For example, if you’re testing the performance of a landing page with a paid acquisition channel, then you need to consider things like:

  • Market
  • Device (mobile, tablet, desktop?)
  • The creative (will the expectation it gives people match what they see on the landing page?)
  • The campaign spend (and whether the channel uses any optimisation algorithms)
  • Demographics (does the channel allow you to set them, or target specific ones, or will it attempt to optimise who sees the creative itself?)
  • etc

The smaller and simpler the experiment, the more experiments you can run and the more things you can test. Bear in mind, though, that you could be optimising around a local maximum (like reaching the top of a small hill in the mist and not realising there is a far bigger summit close by). Changing many elements at once is a tempting way to break out of a local maximum, but sometimes the overall effect is neutral because some elements improved your key metric while others worsened it. Even if you test many elements individually and they all improve your key metric, you still need to run a single experiment containing all of them to be sure they work together.

An alternative approach is to try to identify a key change that could reveal a new direction or a step-change opportunity. The change itself is not necessarily chosen to improve your product, but to expose a new peak to climb. This is best achieved when done in conjunction with qualitative research, prototype testing, and investigating people’s jobs to be done.

Attention-seeking vs. the Novelty Effect

When designing a UI element or creative it is worth considering how noticeable you want to make it. On one hand, it should sit naturally within your product, with a prominence equal to its value. However, there is nothing worse than wondering whether nobody engaged with your feature in the experiment because they weren’t interested, or because they didn’t notice it. At this point it’s hard to avoid the IKEA effect, where you like your wonky cupboard more than other people do because you built it and put a lot of effort into it. It’s better to over-emphasise the new feature for the experiment (within reason) and, if you decide to accept the change, tone it down afterwards. This removes questions around the feature’s discoverability and helps you remain impartial.

You also need to bear in mind that returning customers could click on it because it’s new and they’re curious about it (this is known as the Novelty Effect). So it’s a tricky balance. You can compare the data from new visitors with returning visitors and look for large differences in behaviour to help you understand how much of an impact the Novelty Effect has had.
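As a rough illustration of that comparison, the sketch below computes the relative lift separately for new and returning visitors. The segment names and all the counts are made up for the example. A much larger lift among returning visitors hints at a Novelty Effect but does not prove one.

```python
def conversion_rate(conversions: int, visitors: int) -> float:
    """Fraction of visitors who converted."""
    return conversions / visitors


def relative_lift(control: tuple, variant: tuple) -> float:
    """Relative change in conversion rate of the variant versus the control."""
    control_rate = conversion_rate(*control)
    variant_rate = conversion_rate(*variant)
    return (variant_rate - control_rate) / control_rate


# Hypothetical experiment data: (conversions, visitors) per segment and arm.
results = {
    "new visitors":       {"control": (200, 4000), "variant": (220, 4000)},
    "returning visitors": {"control": (300, 5000), "variant": (420, 5000)},
}

lifts = {
    segment: relative_lift(arms["control"], arms["variant"])
    for segment, arms in results.items()
}

for segment, lift in lifts.items():
    print(f"{segment}: {lift:.0%} relative lift")
```

If the gap between the segments shrinks as the experiment runs, that is another hint that curiosity rather than lasting value drove the early clicks.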

A hypothesis can never be proven true

A common misconception is that we set out to prove a hypothesis. Unfortunately, it’s impossible to prove a hypothesis. We can find data that supports a hypothesis but we can never conclusively prove it. However, we can disprove a hypothesis, and conclusively reject it.

As an example, let’s use the tennis balls hypothesis from above: “all tennis balls are yellow”. Imagine we had ten brand new canisters of tennis balls and after opening them we see that all the tennis balls inside are yellow. This supports our hypothesis but it’s possible that if we opened even more canisters we might find a different colour. On the other hand, if we find a blue tennis ball in one of the canisters we have disproven the hypothesis. We only needed 1 observation to the contrary to show it was not true.

This may seem like a simplistic example but even accepted scientific theories can be overturned. Newton’s laws of motion and gravitation were tested many times over more than two hundred years and the experiments supported them. However, Albert Einstein showed that Newton’s laws break down under certain circumstances, such as at speeds close to the speed of light or in very strong gravitational fields, and proposed new theories: Special and General Relativity. Many experiments support relativity but it’s possible that some day it, too, could be disproved.

You can never prove a hypothesis, but you can disprove a hypothesis and reject it. This is an important concept. It’s the reason why we approach experiments in a slightly counter-intuitive, backwards manner. Instead of trying to prove that your change will have a specific effect, we try to disprove the hypothesis that your change will have no effect. This hypothesis is called the null hypothesis. If we can reject the null hypothesis, then we have evidence that your change has had an effect. We will go into this in more detail in later sections.
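As a rough preview of what rejecting the null hypothesis can look like in practice, here is a minimal sketch of a two-proportion z-test, one common way to compare two conversion rates. The choice of test, the function name, the counts and the 0.05 threshold are assumptions made for illustration, not a method prescribed by this article.

```python
import math


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Return (z, two-sided p-value) for the null hypothesis that A and B convert equally."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Under the null hypothesis both arms share a single pooled conversion rate.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up counts: control converts 200 of 4000 visitors, variant 260 of 4000.
z, p_value = two_proportion_z_test(200, 4000, 260, 4000)

ALPHA = 0.05  # assumed significance threshold; a convention, not a rule
reject_null = p_value < ALPHA
print(f"z = {z:.2f}, p = {p_value:.4f}, reject null hypothesis: {reject_null}")
```

A small p-value means data this extreme would be unlikely if the change truly had no effect, so we reject the null hypothesis. A large p-value does not prove there was no effect. It only means this experiment did not find convincing evidence of one.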