When Marketing Trumps Science

Recently I’ve run across a couple of New Zealand companies that sell therapeutic products – one a weight loss pill, the other a jet lag drink – that seem to put marketing first and let science take the back seat. This is by no means new behaviour, but I want to use them as examples to illustrate this widespread problem and suggest what can be done to combat it.

Before promoting a therapeutic product, you should first have good reason to believe that it works. This, I hope, is common sense, but it’s also enshrined in the Fair Trading Act and the Therapeutic Products Advertising Code as they prohibit unsubstantiated claims. This means you have to test the product, and do so rigorously. Rigorous clinical trials are expensive to undertake though, so they’re quite a prohibitive first step.

Instead of jumping straight into the deep end, a useful first step can be to undertake a smaller and less rigorous (and therefore less expensive) experiment. A preliminary trial like this can’t answer the question of whether or not a product actually works; for that you still need a more rigorous trial. There’s a great cost involved in doing this, but if the results of the preliminary trial are promising then you have reason to expect a more rigorous trial might give similar results, so the expense might be worth it. You may even be able to get some funding to help with a more rigorous trial on the basis that the preliminary results were positive.


This is what Tuatara Natural Products has been doing with their weight loss pill “Satisfax”. They have completed a low quality preliminary trial on their product, colloquially dubbed the “Fat Mates” trial. Although, at the time of writing, I don’t believe it has been published in a journal, it was registered retrospectively in the Australian New Zealand Clinical Trials Registry: Effect of the dietary supplement Satisfax (Registered Trademark) on weight loss in overweight volunteers. Some information on the data in the trial can be found on their website as part of an analysis by Dr Chris Frampton: Analysis of Satisfax® Clinical trial

As you can see on the trial registration page, this was an uncontrolled trial on overweight adults. The original plan was to recruit 100 volunteers with the hope that at least 60 would complete the trial. I have to say I’m a bit confused about how many people were actually in the trial: apparently recruitment was increased to 200 applicants after applications opened (a change that has not been reflected in the trial’s registration), yet about 400 people applied. One article claims there were 200 participants, a later media release from Tuatara Natural Products seems to imply 100 were recruited, and the analysis of the trial says there were only 81. Whatever the true number, 81 participants completed the full 8 weeks, and 52 of them took the recommended dose for the whole duration of the trial.

Whether 19 or 119 participants failed to complete the trial, the statistical analysis seems to ignore them, with no justification given. This is unusual – even a 19% dropout rate is significant and shouldn’t be swept under the rug. Much of the time, Tuatara also seems to ignore the 29 participants who didn’t drop out but also didn’t take the full dose for the whole duration.

The Science Media Centre posted the responses of 2 experts, Associate Professor Andrew Jull and Professor Thomas Lumley, to a press release from Tuatara Natural Products in February. It’s a good analysis of some of the weaknesses with the study, and I recommend you read it: Kiwi diet pill claims – experts respond

This trial was uncontrolled, and therefore also unblinded and unrandomised. As Professor Lumley explains, this is a problem if you want to draw strong conclusions from its results. It is of low methodological quality, but that’s okay. There is no problem with doing less rigorous trials first if they’re done in order to determine if more rigorous trials are necessary. Dr Glenn Vile, Chief Technical Officer of Tuatara Natural Products and the principal investigator for this study, wrote the following in a comment on a post on the “Fat Mates” trial by Dr John Pickering:

The Fat Mates trial was designed by clinical trial specialists to generate information about the Satisfax® capsules that would help Tuatara Natural Products plan a larger and longer double blind, cross over, placebo controlled trial.

We will use this information to proceed with the next clinical trial, but in the meantime we were so excited the weight loss achieved by most of our Fat Mates was much greater than the placebo effect seen in other weight loss clinical trials that we decided to launch the product so that anybody who is overweight can try Satisfax® for themselves.

Dr Glenn Vile

I think the first part of what I’ve quoted above describes exactly what Tuatara Natural Products should be doing. They’ve conducted their low quality trial, and intend to use its results to proceed with a larger, longer, and more rigorous clinical trial. This is the right way to proceed – they now have an indication that their product might be effective, so they should do the research to find out.

The problem is that that’s not all they’re doing. After performing only a small low-quality trial, they’ve released their product for sale online and have been making a lot of noise about it. In my opinion, they’ve been significantly overstepping the results of their clinical trial. For example, in his comment Dr Vile also said:

our initial trial has shown [Satisfax] to be extremely effective in some overweight people.

Dr Glenn Vile

In their media release on the 20th of February, they reported the average weight lost only by the 52 participants who took the full dose to completion (rounded up from 2.9 kg to “close to 3kg”) but not the average weight lost by all participants. They then reported in bold that the top 26 participants lost more weight, and the top two participants lost even more weight than that!

This cherry picking of the best results appears to have been part of Tuatara Natural Products’ marketing strategy for at least a few months now. In January, Stuff published an article on the trial highlighting the single person who lost the most while participating in it: Blenheim ‘fat mate’ loses 13.5kg in 8 weeks.

That article particularly highlights the person who lost the most weight out of all those in the trial, at 13.5 kg (confusingly, the maximum weight loss reported in the analysis of the trial’s results is 13.3 kg). However, she was one of only 2 participants who lost over 10 kg, and on average the 52 participants who took the recommended dose for the full eight weeks lost 2.9 kg. Losing 13.5 kg is very far from a representative example. I’m not surprised that they didn’t choose instead to focus on the participant who gained 1.2 kg despite taking the recommended dose for the whole duration, but that is actually much closer to the mean change in weight.

The article is, for all intents and purposes, one big testimonial in favour of Satisfax. It was an article, not an advertisement, which is important because in New Zealand it’s illegal to publish any medical advertisement that:

directly or by implication claims, indicates, or suggests that a medicine of the description, or a medical device of the kind, or the method of treatment, advertised… has beneficially affected the health of a particular person or class of persons, whether named or unnamed, and whether real or fictitious, referred to in the advertisement

Medicines Act 1981 Section 58(1)(c)(iii)

This effectively bans all health testimonials from advertisements. I think this is a good part of the law, as testimonials can be both very convincing and completely misleading; a quack’s dream. Banning them should force businesses to instead focus on the results of research on their products, but this hasn’t stopped Tuatara Natural Products from getting stories written about the most extreme testimonials they could find from people who have lost weight at the same time as they were taking Satisfax.

More recently, Tuatara Natural Products has put out a press release multiple times (at least on the 20th of February and again on the 4th of March) that I think rather oversteps the results of their small preliminary trial:

A NEW ZEALAND SOLUTION TO A GLOBAL PROBLEM

A little pill is providing an exciting answer to one of the world’s greatest and fastest growing problems: Obesity.

Press Release – Tuatara Natural Products

I simply don’t think they are at all justified in saying that their new product is “providing an exciting answer to… Obesity”. They are putting marketing ahead of science, and that’s not okay.


Another company that seems to put marketing before research is 1Above. They make a drink which they claim can help you recover faster from jet lag, and have recently been in the news for signing a sponsorship deal with the fantastically successful golfer Lydia Ko.

At the end of that article about their sponsorship deal the reporter, Richard Meadows, made some comments regarding the science behind jet lag relief products and asked some good questions of 1Above’s CEO, Stephen Smith (emphasis mine):

[1Above’s] product contains a mixture of vitamins B and C, electrolytes, and Pycnogenol, a pine bark extract.

The efficacy of flight drinks to combat the effects of jetlag is unproven.

Late last year pharmacists were warned after the Advertising Standards Authority upheld a complaint against an ad saying a homeopathic anti-jet lag pill really worked.

[1Above CEO Stephen] Smith said 1Above would not be doing clinical trials, which were highly expensive and not necessary.

“What we tend to use is testimonials from people who have used the product and swear by it.”

Smith said the key ingredient, Pycnogenol, had itself been tested in dozens of trials, including its effects on reducing jetlag.

Kiwi startup 1Above signs golf No 1 Lydia Ko

Yes, you read that correctly. The CEO of 1Above literally said that they won’t be doing clinical trials because they are “not necessary” and that they use testimonials instead.

As I said before, using testimonials to promote a therapeutic product, like a drink to help you recover faster from jet lag, can be both very convincing and completely misleading. There’s a reason why testimonials implying health benefits are illegal in New Zealand, and I hope that 1Above’s marketing will not violate this regulation.

Not all testimonials are prohibited, of course. It’s entirely acceptable to provide a testimonial from someone who thinks a company’s drink tastes great, or that its service is excellent – basically anything for which a single person’s experience can provide a useful insight. Therapeutic effects, almost without exception, do not fall into this category, which is a big part of why we need to do clinical trials in the first place. If 1Above quotes someone saying their product helped them recover faster from jet lag, they may be in danger of breaching the Medicines Act.

For example, I’d expect they probably shouldn’t use a testimonial that says this:

Testimonial on the 1Above website, collected 2015/03/05

On their website, 1Above currently does refer to research on one of the ingredients in their product, “pycnogenol”. Professor Lumley recently wrote a post on his other blog, Biased and Inefficient, about these studies and how they are used by 1Above: Clinically Proven Ingredients

I recently contacted 1Above to ask about some discrepancies I found between the abstract of the study they cited for showing pycnogenol reduced the duration of jet lag and their description of it on their website:

I was interested to see the claim your company made that Pycnogenol® has been shown to support circulation and reduce the length and severity of jet lag.

I have found the study “Jet-lag: prevention with Pycnogenol. Preliminary report: evaluation in healthy individuals and in hypertensive patients” that is mentioned on your website as the source for this claim, but I am only able to access the abstract of this preliminary report. Unfortunately, as far as I can tell, the study protocol didn’t involve blinding of participants or researchers.

The participants in the study took 50 mg Pycnogenol 3 times per day, but I haven’t been able to find out how much is contained in your products. Is this information available anywhere on your website? I notice the study also says the participants took this regimen for 7 days, starting 2 days prior to departure. Is this comparable with how your product is intended to be used?

I also noticed some differences in the description of the study and its results between the abstract and your website; I would be grateful if you could explain to me the source of these differences.

The abstract states the control group took, on average, 39.3 hours to recover and the experimental group took, on average, 18.2 hours to recover. However your website reports these as 40 and 17 hours respectively.

Also, your website states that the study involved 133 passengers (it’s not clear from the description on your website if they all took Pycnogenol or if some of them were in the control group) who reported the time it took them to recover from jetlag. However, the study’s abstract states that the first experiment, which is the only one that involved reporting the time taken to recover from jetlag, involved only 68 participants – 30 in the control group and 38 in the experimental group.

I would be grateful if you could explain these differences to me, and if you could send me any other relevant scientific information that supports this claim.

To their credit, since receiving my message they did update their website to fix the discrepancies in the reported number of participants and times taken to recover from jet lag, and their CEO replied to thank me for pointing these discrepancies out.

However, they didn’t respond to my other questions about the amount of pycnogenol in their products, or about the study’s regimen of taking pycnogenol for 7 consecutive days, starting 2 days before the flight, which is inconsistent with how 1Above recommends their products be used.

This is just one more company basing their marketing on preliminary trials instead of using them as the basis for research that could actually answer the question of whether or not a product is useful. Worse than Tuatara Natural Products, they even go so far as to consider clinical trials “not necessary” and apparently intend to rely on testimonials instead. It would be much more appropriate for them to spend some of their $2.4 million annualised income on researching their product rather than paying for a sporting celebrity to endorse them.


I try to make my rants constructive, so I want to end this article with the question “What can we do about this?”. If you have any suggestions, I’d love to hear them in the comments section.

I think the most important thing that anyone can do to address this problem is to ask for evidence. If you see a claim made about a product that you think you might buy, then get in touch with the company selling it to let them know you’re considering buying it and to ask for evidence. If they don’t have a good enough answer, then let them know that’s why you won’t be buying their product. If they give you evidence to back up their claim, then great!

The UK organisation Sense about Science has created a website for just this purpose: www.askforevidence.org

Asking for evidence doesn’t have to be a big deal, involving a formal letter or anything like that. When you see a weight loss product advertised on a one day deal site, a copper bracelet that apparently offers pain relief advertised on a store counter, or a jet lag cure promoted on Twitter, make your first response be to politely ask for evidence.

This isn’t a problem that’s going away any time soon. As consumers, we deserve to be able to make informed decisions about the products we buy, and when companies put marketing before research it becomes harder to make these informed choices. But if we work together then we can encourage companies like Tuatara Natural Products and 1Above to improve their behaviour and attitudes toward marketing and research.

Let’s turn “what’s the evidence?” into a frequently asked question for all companies that sell therapeutic products.

Why Testimonials Aren’t Enough

The business of “natural health” rests heavily on the use of testimonials. They are used in advertisements by people selling therapeutic products and services, and you’ll hear them as anecdotes from people that you know telling you what worked for them. Intuitively, it makes sense to trust in this sort of experience, but unfortunately testimonials and personal experience are not good ways of evaluating a treatment option.

I don’t expect you to take my word for this. Maybe you were told by a doctor that you’d need an operation, then you had reiki therapy and after that your doctor said the problem was no longer there. Perhaps your first child had terrible teething troubles, but on your second child you used a Baltic amber teething necklace and they didn’t have the same problems, but you swear if you forget to put it on them they become agitated. Or maybe you’ve been spraying a colloidal silver solution onto the back of your throat whenever you feel a cold coming on and you haven’t been sick in years. Who am I to doubt or deny your experience?

These are all testimonials that I have heard personally, not from advertisements but from individual people relating their own experiences to me. But still, I remain unconvinced that reiki is any more than an exotic twist on faith healing (that is just as ineffective), that Baltic amber teething necklaces are anything but expensive yet inert jewellery, and that colloidal silver is much good for anything other than causing argyria.

In this series of blog posts, I intend to explain to you why I don’t consider anecdotes like these to be useful in drawing any conclusions about therapeutic interventions. But first, I’d like to point out that I am not trying to be dismissive of personal experience. I don’t think anecdotes are all lies, or anything of that nature, and personal experience can certainly be useful in drawing all sorts of conclusions in everyday life. The only conclusion I am arguing for here is that anecdotes are not useful for evaluating the efficacy of therapeutic interventions.


In searching for any truth, we have to be very careful not to jump to conclusions. There will always be a vast number of potential explanations for any observation, and if we really care about the truth then we can’t just pick the explanation that we like the most, or even the one that we think is most likely. Some possible explanations can be ruled out right from the start, if they’re impossible to test, but the explanations that can be tested are known as hypotheses. If we want to determine whether or not one particular hypothesis is correct, we should design and carry out a test that will rule out every other potential cause of our observation.

Note that this method of testing does not prove anything. Instead, it focuses on ruling out everything else, until only one idea is left standing. The key to designing a good test of an intervention is to make sure anything you observe is as unlikely as possible to be due to anything other than the intervention. This means that, in order to design a good test of an intervention, it is important to have a good understanding of what these other potential causes are.


After This, Therefore Because of This

There’s a formal logical fallacy that’s usually known by its Latin name post hoc ergo propter hoc, which translates to “After this, therefore because of this”. The fallacy is of the form:

  1. A happened, then B happened
  2. Therefore A caused B

Of course, the reason why this is a logical fallacy is that it’s entirely possible that something other than A was the cause of B. This doesn’t mean that the conclusion is false, but it does mean that it is not necessarily true.

Anecdotes take the same form as the above example: “I tried treatment X and I got better”. Although experiences like this can result in strong beliefs, the fact that the improvement happened after the treatment does not mean the treatment necessarily helped at all. Instead, the improvement could have been due to a few different things.

Self-Limiting Conditions

Many common health conditions are self-limiting. This means that, left to their own devices, they will almost always go away in time. The common cold is an example of a self-limiting illness: unless you are seriously immunocompromised, if you catch a cold you will be fine again after a few days. Other self-limiting conditions include the flu, teething, colic, and acne. Pretty much everything that isn’t a chronic illness and won’t kill you is self-limiting.

Regression to the Mean

Even when nothing external seems to be changing, your health is not constant. Instead, it fluctuates over time around a baseline level of health that itself changes over longer amounts of time. This baseline is basically your average health over a certain period of time; the mean. The tendency for your wellbeing to return to this mean after a fluctuation is known as regression to the mean.

This is a picture of 300 random data points generated in Microsoft Excel. Starting with 0, I added a random number between -0.5 and 0.5 to the running total 310 times, and then took a 10 point running average to smooth the resulting curve.

Regression to the mean

As you can see, even though the changes are all random, trends do form and the data oscillate around a particular mean. Especially over longer periods of time, the data will tend to return to that mean.

I’ve indicated the 2 most prominent downward trends with arrows. As you might imagine, such low points in a person’s health could motivate them to try a therapeutic intervention in order to reverse the trend. After the intervention, they’ll likely start to feel better, but as this graph shows, such variations can happen randomly, and it can be very hard to say whether an improvement was caused by anything in particular or was just the result of regression to the mean.

For example, I get frequent headaches. However, the frequency and intensity of those headaches varies from day to day, just due to random chance. I’d be more likely to decide to seek a therapeutic intervention on a particularly bad day. However, considering that my wellbeing is fluctuating around a mean value I’d expect my headaches to return to their “normal” level, unless of course something has changed to make them worse on average. If I take an intervention and then the next day my headaches are better, how can I know whether it’s due to the intervention or regression to the mean?
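
If you’d like to generate a similar picture yourself, here’s a rough Python equivalent of what I did in Excel. It’s just a sketch – the seed is arbitrary, so your trends will land in different places, but the overall wandering-around-a-mean behaviour is the same:

```python
import random
import statistics

random.seed(42)  # arbitrary; different seeds give different (but similar) walks

# Start at 0 and add a random number between -0.5 and 0.5, 310 times.
walk = [0.0]
for _ in range(310):
    walk.append(walk[-1] + random.uniform(-0.5, 0.5))

# Smooth with a 10-point running average, leaving roughly 300 points.
smoothed = [sum(walk[i:i + 10]) / 10 for i in range(len(walk) - 9)]

print(f"mean {statistics.mean(smoothed):+.2f}, "
      f"min {min(smoothed):+.2f}, max {max(smoothed):+.2f}")
```
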

Spontaneous Remission

Even with illnesses that are not self-limiting, spontaneous remission with no obvious cause does happen occasionally. I’m not familiar with the data on this, so I won’t go into it in too much depth, but it is worth knowing that even some serious illnesses can get better on their own, whether an intervention has recently been used or not.


As you may have noticed, these things all have a common theme. They describe ways in which health can improve on its own, which makes it difficult to tell whether a particular improvement is due to an intervention or would have happened anyway. Ideally, to tell the difference, we’d travel back in time, try again without the intervention, and see what happened in that case, but unfortunately that’s not an option. The next best method is to have what is known as a control: someone with the same problem who doesn’t get the treatment.

However, as I discussed earlier, health fluctuates on its own. If the person receiving the intervention improves and the person acting as the control stays the same or gets worse, we still can’t be too sure that the intervention was helping. Variations between different people can also make outcomes difficult to interpret. Just as random fluctuations tend to return to the mean over longer periods of time, testing more people will smooth over these random variations. The more people we include in both the treatment group and the control group, the better, as having more observations will help us to tell whether any effect we observe is due to random variation or due to the intervention itself.
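
As a quick illustration of that last point, here’s a small Python sketch. The numbers are entirely hypothetical – the “outcomes” are pure noise, with no treatment effect at all – but notice how much the average of a small group can wander compared with a large one:

```python
import random
import statistics

random.seed(0)

def mean_of_sample(n):
    """Average 'improvement' of n people whose outcomes are pure noise."""
    return statistics.mean(random.gauss(0, 1) for _ in range(n))

# How much does the group average vary across 1000 repeated trials of size n?
for n in (5, 50, 500):
    spread = statistics.stdev(mean_of_sample(n) for _ in range(1000))
    print(f"n = {n:3d}: group averages vary by about ±{spread:.3f}")
```

With only 5 people, a completely useless treatment can easily look like it caused a meaningful change; with 500, random variation has far less room to masquerade as an effect.
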

Having a control group and a large sample size are 2 aspects of a good test of a therapeutic intervention, but that’s not all there is to it. In my next post, I’ll discuss some other potential confounding factors, and how we can modify our test in order to account for them.

Why is Replication so Important?

One of the most important principles of the scientific method is reproducibility. A valid result should be able to be replicated independently, whereas an invalid result (originally achieved due to some error or perhaps just chance) will not be able to be consistently reproduced.

This is a concept that I didn’t fully understand for a long time. I had reasoned that, say, doubling the sample size of an experiment should be just as good a way of confirming its results as performing the same experiment a second time with a different sample of the same size. This seemed intuitive to me, but eventually I came to understand why it is not the case.

The reason has to do with researcher degrees of freedom. In an original experiment, the experimenters have the freedom to make certain choices. Some choices may be made beforehand, whereas others are made after the study has been started. Not all of these choices are made consciously, and even the conscious ones may be shaped by unconscious bias, but the fact that choices are made at all after the experiment has begun affects the reliability of the results.

In contrast, when a well-designed experiment is being replicated, the choices have all been made beforehand, as the replication follows exactly the same protocols as the original study. This includes making all the same measurements and performing the same analysis. This reduces the researcher degrees of freedom for the replication experiment, so if the same results can be reproduced that’s a good indication that the original results were accurate, whereas if they can’t then it likely means they were due to some combination of bias and chance.


In some ways this can be pretty intuitive. For example, if the experimenters were to carry out a variety of statistical analyses on their data and select the one that was most favourable to their hypothesis, then replicating the experiment with that same method of analysis selected beforehand will reduce the bias from the initial decision.

Another great use for replication is in confirming the results of subgroup comparisons. For example, if I were studying the effect of a new drug on reducing blood pressure, I might perform comparisons between several subgroups and find it to be particularly effective in, say, people with type 1 diabetes. However, the more comparisons are made, the more likely it is that some of them will appear significant purely by chance. If I need to be 95% confident that my result is not due to chance for it to be statistically significant, then I can expect 1 in 20 comparisons to appear significant by chance alone. There’s a great xkcd strip that demonstrates how this can lead to unreliable results (remember to read the strip’s alt text while you’re there) – xkcd: Significant
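
To see how easily this happens, here’s a small simulation in Python. Everything here is hypothetical: a “drug” that does nothing at all, tested across 20 subgroups using a simple permutation test:

```python
import random
import statistics

random.seed(1)

def fake_subgroup_trial(n=50):
    """Treatment and control outcomes drawn from the SAME distribution –
    i.e. the drug truly has no effect in this subgroup."""
    treatment = [random.gauss(0, 10) for _ in range(n)]
    control = [random.gauss(0, 10) for _ in range(n)]
    return treatment, control

def p_value(treatment, control, permutations=2000):
    """Two-sided permutation test for a difference in group means."""
    observed = abs(statistics.mean(treatment) - statistics.mean(control))
    pooled = treatment + control
    n = len(treatment)
    hits = 0
    for _ in range(permutations):
        random.shuffle(pooled)
        diff = abs(statistics.mean(pooled[:n]) - statistics.mean(pooled[n:]))
        if diff >= observed:
            hits += 1
    return hits / permutations

# Test the useless drug in 20 subgroups; any "significant" result is a false positive.
false_positives = [g for g in range(20) if p_value(*fake_subgroup_trial()) < 0.05]
print(f"Subgroups with p < 0.05: {false_positives}")  # on average, about 1 of the 20
```

A replication that tested only the “significant” subgroup, with that comparison chosen in advance, would almost certainly fail to find the effect again.
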


I find a scenario known as the “Monty Hall problem” (or the “Three door problem”) to be an illustrative analogy of the importance of researcher degrees of freedom, especially in showing how unintuitive this importance can be. The problem goes something like this:

Imagine you’re a contestant in a game show. In front of you there are 3 doors, and you have to pick one of them. You have been told that behind 1 of these doors there is a car, but behind each of the other 2 doors there is a goat. You get to keep whatever is behind the door you open and, of course, you want to win the car.

After you have made your initial choice, the host of the game show opens one of the 2 doors that you didn’t choose and shows you that there is a goat behind it. Now, after seeing this you are given the opportunity to change your choice.

Intuitively it feels as though changing your choice would not affect your chances of winning. After all, you know that the door you picked has a 1 in 3 chance of having the car behind it, and if you’d picked the remaining door first you’d have the same chance.

However, changing your choice at this point will double your odds of picking the car.


I find the easiest way to understand this is to run through each possibility in order to show the outcomes.

When you first pick a door, there are 2 possibilities – either you’ve picked the door with the car, or you’ve picked one of the doors with a goat. In 1 of every 3 attempts you will pick the correct door first and, if you don’t change your decision, you’ll get the car. In the remaining 2 of every 3 attempts you will pick an incorrect door first and get a goat. So, the chance of picking the car if you don’t change your decision is 1/3.

What if you do change your choice, though? In 1 of every 3 attempts you will have picked the car originally, so when you change your decision you will lose. But in the other 2 of every 3 attempts, the first door you picked has a goat behind it. In those cases, the host will open the other door that has a goat behind it, so the only remaining door is the one with the car. This means that if you change your choice you are twice as likely to win, because your chance of winning is 2/3.
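
If the arithmetic doesn’t convince you, a quick simulation might. This is just a sketch in Python (the setup is mine, but the rules are exactly as described above):

```python
import random

def monty_hall(switch, trials=100_000):
    """Play the game many times and return the fraction of cars won."""
    wins = 0
    for _ in range(trials):
        doors = [0, 1, 2]
        car = random.choice(doors)
        pick = random.choice(doors)
        # The host opens a door that is neither your pick nor the car.
        opened = random.choice([d for d in doors if d != pick and d != car])
        if switch:
            # Switch to the one remaining closed door.
            pick = next(d for d in doors if d != pick and d != opened)
        wins += (pick == car)
    return wins / trials

print(f"Stay:   {monty_hall(switch=False):.3f}")  # ~0.333
print(f"Switch: {monty_hall(switch=True):.3f}")   # ~0.667
```
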

Researcher degrees of freedom carry this same kind of power: decisions made once some or all of the data are known can have a real effect on the reliability of the result.


In order to show the power of replication, let’s re-imagine the Monty Hall problem. This time, you know there are 3 doors, and that behind each of them is either a car or a goat, and the same objects aren’t always behind the same doors. However, you don’t know if there are 2 goats and 1 car or if there are 2 cars and 1 goat.

Now, let’s imagine you want to test the hypothesis that there are 2 cars and 1 goat behind the doors. In order to test this, you pick a door that you think has a car behind it. Once you’ve made that choice, however, you somehow find out that one of the other doors has a goat behind it, and knowing this makes you (consciously or unconsciously, it doesn’t matter) change your decision to the remaining door.

In order to determine the probability that you’d pick a car, you’d need to repeat this many times. Using this approach, you’d pick a car in about 2 out of every 3 attempts. This could lead you to conclude that there must be 2 cars behind the doors, instead of just 1. However, we know that this is not the case! We would be able to show this by replicating the experiment.

In this case, an experiment to replicate these results would commit to its choices in advance – for example, by picking the same doors the original experiment had eventually settled on. Because the replication has reduced the researcher degrees of freedom by making those choices beforehand, it would find a car only about 1 in 3 times, and the result of the original experiment would be found not to be reproducible.
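
Here’s that re-imagined version in code – again just an illustrative sketch. The “original experiment” switches doors after peeking at where a goat is, while the “replication” commits to its door before seeing anything:

```python
import random

DOORS = [0, 1, 2]

def original_experiment(trials=100_000):
    """One car, two goats – but the door is switched after a goat is revealed."""
    wins = 0
    for _ in range(trials):
        car = random.choice(DOORS)
        pick = random.choice(DOORS)
        goat = random.choice([d for d in DOORS if d != pick and d != car])
        pick = next(d for d in DOORS if d != pick and d != goat)  # decided after the peek
        wins += (pick == car)
    return wins / trials

def replication(trials=100_000):
    """Same doors, same car – but the choice is locked in beforehand."""
    wins = 0
    for _ in range(trials):
        car = random.choice(DOORS)
        pick = random.choice(DOORS)  # committed in advance; no peeking
        wins += (pick == car)
    return wins / trials

print(f"Original:    car found {original_experiment():.3f} of the time")  # ~0.667
print(f"Replication: car found {replication():.3f} of the time")          # ~0.333
```

The only difference between the two functions is when the choice is made, yet the “original” makes it look as though there are twice as many cars as there really are.
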


Of course, this analogy is exaggerated so as to make my point very obvious. In reality, the biases involved are much more subtle, but they are still there. The important thing to realise is that it is possible, even with the absolute best of intentions, to come to an unreliable result due to unconscious bias. In fact, every single decision can be made honestly and seem justified at the time, but the accumulated effect of many such choices will have an effect on the results. It’s because of this that replication is so important in science.

Common choices that can affect the reliability of results by being made after the experiment has started include when to stop the experiment, how to analyse the data, and which subgroup comparisons to carry out.

The explanation that really helped me realise this was by Steve Novella in episode 373 of the SGU podcast (the relevant segment starts at 12:04), discussing replications of the Psi Research done by Daryl Bem. He’s written a post about the original research on his own blog, Neurologica Blog: Bem’s Psi Research, and a post on the replications on Science-Based Medicine: The Power of Replication – Bem’s Psi Research.

In a nutshell, Bem’s experiments were well-designed (essentially they carried out some classic psychological experiments in reverse order) and the results were statistically significant and seemed to imply that the subjects exhibited precognition: the ability to predict supposedly random future events. However, when some of his experiments were replicated with all of the decisions made beforehand the results showed no ability better than what would be expected by chance.


Sometimes, good science gives strange and unexpected results. In some cases, such as in Bem’s psi research, the results could even be called extraordinary. However, false positives can and do occur for a multitude of reasons, so in cases like this it’s important to remember that “extraordinary claims require extraordinary evidence”. In the face of such claims the correct course of action is neither to jump on the bandwagon nor to discard the results as false but to be on the lookout for quality replication. We live in an honest universe, and with time truth will out.