Sir Iain Chalmers: for his critical contribution to EBM
Presenting the 2009 HealthWatch Award, Nick Ross said, “Iain Chalmers has saved more people’s lives than anyone else I can think of.” Iain Chalmers, editor of the James Lind Library, has for the last 30 years championed the need for health professionals and patients to have access to unbiased evidence on which to base clinical decisions. His talk on the development of fair tests of treatments in health care was illustrated with examples going as far back as 1550 BCE, all taken from the James Lind Library’s extensive and publicly searchable online archives.
Iain Chalmers has devoted the last 30 years to efforts to help ensure that health professionals and patients have access to unbiased evidence on which to base their treatment decisions, most famously through his work as one of the co-founders of the Cochrane Collaboration. He was at the first ever HealthWatch meeting back in 1992, and since then he’s been a friend and valued critic, prepared to make a pointed comment if he ever believes HealthWatch has failed to apply to itself the standards it expects of others. “It’s important to be even-handed, for us all to be judged by the same rules. If we depart from this, then we’ll be open to the accusation of double standards,” he said by way of introduction to his talk at the 2009 HealthWatch AGM. Chalmers, who received a knighthood in 2000 for services to healthcare, applies his passion for fairness now as editor of the James Lind Library, created to help people understand fair tests of treatments in health care.
The article below is based on the talk given by Iain Chalmers at the HealthWatch AGM 2009, and is a fuller version than the one which appeared in the print version of the newsletter, which was edited for reasons of space.
The subject of my talk is explaining fair tests of treatments in health care, and in this we have much unfinished business. I’d like to begin by introducing some of my special heroes in the field. Margaret McCartney, the Glasgow GP who writes a health column for the Financial Times every week, was a worthy recipient of last year's HealthWatch Award. I’d include the writers of some of my top books: Smart Health Choices by Judy and Les Irwig. Judy Irwig is a mother, Les Irwig a professor of clinical epidemiology, and their book explains clearly and authoritatively how not to be bamboozled by what you read in the media about health. Know Your Chances, in which American doctors Steve Woloshin and Lisa Schwartz explain how to interpret health statistics, is special because an early draft was itself subjected to a randomised controlled trial to see if it actually increased its readers’ knowledge of the subject. Another young British doctor, Ben Goldacre (HealthWatch’s 2006 Award winner), has shaken things up for science knowledge in this country with his Guardian column “Bad Science”. He writes, “Evidence-based medicine, the ultimate applied science… has saved millions of lives, but there has never once been a single exhibit on the subject in London’s Science Museum.”
However his fellow Guardian writer and HealthWatch Award winner, Polly Toynbee, did not make my hero list. As I pointed out in the April 2004 HealthWatch Newsletter, she once wrote that randomised clinical trials should be abandoned. “It may be a little less accurate scientifically,” she had written, “but if patients are allowed to choose which treatment they want and every detail of their condition, lifestyle, character and circumstances is fed into the trial data, I doubt if the results would be seriously distorted,” making clear her reluctance to agree to be a “guinea pig”. She completely failed to confront the fact that you often get very different results depending on the design of the trial.
A few journalists—among them Nick Ross—understand evidence based medicine, and are prepared to battle against the stereotypes. On the 2nd April 2001 Nick was amongst fifty people who met to consider how to get the public to appreciate randomised controlled trials. There’s a problem with the name: it has so many negatives associated with it. “Randomised” suggests haphazard. “Controlled” implies controlling. “Trials” has legal connotations. It was Nick Ross who suggested, “why not call them fair tests?”
James Lind, a pioneer of fair tests, was a naval surgeon in the 18th Century and a member of the Society of Naval Surgeons (whose members went on to found the Medical Society of London). Like many who favour quantifying outcomes, he was something of an outsider. It’s harder to ask the question, “Is the Emperor wearing clothes?” when you’re a member of the Emperor’s establishment. It’s a problem that remains with us today. No matter how fair the test itself, the interpretation of science continues to be distorted by those who have a vested interest in the results, other than the well-being of patients.
The James Lind Library was launched by the Library of the Royal College of Physicians of Edinburgh in 2003. Its online archive of records, from 1550 BCE to the present, illustrates how fair tests developed. These records make clear that many of the principles of fair tests that we still use today go back hundreds, even thousands of years.
Conceptualising fair tests of treatments
In an extract of a letter written in 1364, the Italian poet Francesco Petrarca wrote, “I solemnly affirm and believe, if a hundred or a thousand men of the same age, same temperament and habits, together with the same surroundings, were attacked at the same time by the same disease, that if one half followed the prescriptions of the doctors of the variety of those practising at the present day, and that the other half took no medicine but relied on Nature’s instincts, I have no doubt as to which half would escape.”
Treatments with dramatic effects
Even earlier, we have a surgical papyrus dated from around 1550 BCE which has been translated to reveal an explanation of how to reduce a dislocated mandible. It describes exactly what we do today, yet it was written more than 3,000 years ago. You don’t need carefully controlled trials to prove a treatment which is so clearly effective.
Recognising the need for controls
In the 10th Century CE, the Baghdad doctor Abu Bakr Muhammad ibn Zakariyya al-Razi (Rhazes) wrote on his experience of treating meningeal inflammation, noting the characteristic symptoms of photophobia, neck stiffness and headache. He wrote, “So when you see these symptoms, then proceed with bloodletting. For I once saved one group [of patients] by it, while I intentionally neglected [to bleed] another group. By doing that, I wished to reach a conclusion.” If this sounds rather barbaric, remember it’s the way of thinking that’s important: he realised that he needed an untreated group in order to make an inference about the effects of his treatments.
The James Lind Library records a 16th Century example of a within-patient prospective controlled trial. “A kitchen boy fell into a cauldron of almost boiling oil…” wrote the French royal surgeon Ambroise Paré in 1575. “I went to ask an apothecary for the refrigerant medicines that one was accustomed to apply to burns. A good old village woman, hearing that I was speaking of this burn, advised me to apply for the first dressing raw onions crushed with a little salt… I was agreeable to trying the experiment and, truly, the next day, the places where the onions had been had no blisters or pustules, and where they had not been, all was blistered.”
During 18th Century naval campaigns more sailors were being killed by scurvy than by the fighting. One of several recommended treatments at the time was vitriol (sulphuric acid), which was favoured by the Royal College of Physicians of London. In one of the earliest known reports of a clinical trial, the naval surgeon James Lind wrote in 1753, “…I took twelve patients in the scurvy… Their cases were as similar as I could have them. They all in general had putrid gums, the spots and lassitude, with weakness of their knees. They lay together in one place, being a proper apartment for the sick in the fore-hold; and had one diet common to all.” Lind allocated two sailors with scurvy to each of: “a quart of cider a day; twenty-five gutts of elixir vitriol three times a day; two spoonfuls of vinegar three times a day; a course of sea water… half a pint each day; two oranges and one lemon every day; the bigness of a nutmeg three times a day.” The most sudden and visible effects were seen amongst the seamen taking the fruit.
“Blinding” assessment of outcomes
The report of the homeopathic salt trials in Nuremberg in 1835 contains a detailed description of a randomized double-blind experiment in which participants were given either a homeopathic salt solution or pure distilled snow water. The details of which numbered bottles had contained which liquid were kept sealed until the end of the experiment. The experiences of the participants in the two groups were indistinguishable. One should bear in mind that homeopathic care in the late 18th and early 19th centuries was almost certainly safer than the bleeding, purging and use of heavy metals by orthodox practitioners.
Recognising the “law of large numbers” and the “limits of oscillation”
The idea of using numerical data to justify conclusions about treatments goes back at least three centuries. Pioneering work on how to apply inferential statistics to therapeutic data in order to make critical judgments on the value of therapies was published in Paris in 1840 by Louis-Dominique-Jules Gavarret. According to his beautifully written Principes Généraux de Statistique Medicale, “Average mortality, as provided by statistics, is never the exact and strict translation of the influence of the test medication but approaches it all the more as the number of observations increases. To be able to decide in favour of one treatment method over another, it is not enough for the method to yield better results: the difference found must also exceed a certain limit, the extent of which is a function of the number of observations.” Hence, the need to estimate what he calls “the limits of oscillation” (confidence intervals).
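Gavarret's point, that an observed difference only counts once it exceeds a limit set by the number of observations, can be illustrated with a short Python sketch. It uses a modern normal-approximation interval for the difference between two mortality rates, not Gavarret's own formula, and the patient numbers are invented purely for illustration.

```python
import math


def difference_with_limits(deaths_a, n_a, deaths_b, n_b, z=1.96):
    """Difference in mortality (A minus B) with approximate 95% limits.

    Uses the normal approximation to the binomial; z=1.96 is the
    conventional modern choice and is an assumption of this sketch.
    """
    p_a = deaths_a / n_a
    p_b = deaths_b / n_b
    diff = p_a - p_b
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    return diff, diff - z * se, diff + z * se


# Hypothetical data: the same 20% vs 10% mortality, seen in 50 or 1,000 patients per group.
small = difference_with_limits(10, 50, 5, 50)
large = difference_with_limits(200, 1000, 100, 1000)

for label, (diff, low, high) in (("50 per group", small), ("1,000 per group", large)):
    print(f"{label}: difference {diff:.1%}, limits {low:.1%} to {high:.1%}")
```

With 50 patients per group the limits straddle zero, so the difference could be chance; with 1,000 per group the same difference lies clearly outside them.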
Confidence in results can be increased by examining the results of multiple trials. A key paper in the history of meta-analysis is Karl Pearson’s 1904 report in the British Medical Journal on “certain enteric fever inoculation statistics” which looked at correlations between typhoid and mortality and the inoculation status of soldiers serving in various parts of the British Empire.
In the early 20th Century important advances in study design were implemented in the USA in a programme of research to assess serum treatments for pneumonia. The trials in the programme demonstrated many of the important features of fair tests, involving large numbers of patients, allocation to treatment or control groups using an unbiased process (alternation), an assessment of the likelihood that observed differences could be explained by chance, and meta-analysis of the results of similar studies.
Recognising reporting bias
The English philosopher and statesman Francis Bacon, in his 1620 “New Instrument for the Sciences” commented, “It is a proper and perpetual error in Human Understanding, to be rather moved and stirred up by affirmatives than by negatives…” This is still as true today, and it can kill. Dr Cowley and his colleagues wrote in 1993 how, in an unpublished study done 13 years before, nine patients had died among the 49 assigned to an anti-arrhythmic drug (lorcainide) compared with only one patient among a similar number given placebos. “We thought that the increased death rate that occurred in the drug group was an effect of chance… The development of the drug was abandoned for commercial reasons, and this study was therefore never published; it is now a good example of ‘publication bias’. The results described here…might have provided an early warning of trouble ahead.” In his 1995 book Deadly Medicine, the American author Thomas J Moore estimated that at the peak of their use in the late 1980s, these widely-used anti-arrhythmic drugs killed as many Americans every year as were killed during the whole of the Vietnam war.
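How surprising would nine deaths against one be if the drug made no difference? A rough check is sketched below using a one-sided Fisher exact test written with the standard library only. The placebo group size of 49 is an assumption (the talk says only "a similar number"), and the function name is illustrative.

```python
from math import comb


def fisher_one_sided(deaths_drug, n_drug, deaths_placebo, n_placebo):
    """Probability of at least `deaths_drug` deaths landing in the drug group
    if the total deaths were spread between the groups purely by chance."""
    total = n_drug + n_placebo
    deaths = deaths_drug + deaths_placebo
    most_possible = min(deaths, n_drug)
    favourable = sum(
        comb(deaths, k) * comb(total - deaths, n_drug - k)
        for k in range(deaths_drug, most_possible + 1)
    )
    return favourable / comb(total, n_drug)


# 9 deaths among 49 on lorcainide; 1 death among an ASSUMED 49 on placebo.
p_value = fisher_one_sided(9, 49, 1, 49)
print(f"One-sided probability under chance alone: {p_value:.4f}")
```

Under these assumptions the probability is below one in a hundred. A small single study can still mislead, which is exactly why unpublished results like these need to be available for pooling with others.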
Recognising the need for a cumulative science
In 1884 Lord Rayleigh, professor of physics in Cambridge and President of the British Association for the Advancement of Science, said, “If, as is sometimes supposed, science consisted in nothing but the laborious accumulation of facts, it would soon come to a standstill, crushed, as it were, under its own weight… The work which deserves, but I am afraid does not always receive, the most credit is that in which discovery and explanation go hand in hand, in which not only are new facts presented, but their relation to old ones is pointed out.”
In 1965 the English epidemiologist and statistician, Austin Bradford Hill, framed the four questions to which readers want answers when reading reports of research: Why did you start? What did you do? What answer did you get? And, what does it mean anyway?
An example that lives up to Bradford Hill’s expectations is the CRASH research into the effects of systemic corticosteroids in acute traumatic brain injury. The research was started because practice varied and a systematic review of existing studies (some of which had never been published) revealed important uncertainty about whether systemic steroids did more good than harm. To address this important uncertainty a large publicly-funded, multi-centre randomized trial, called the CRASH trial, was organised. The results, which were published in the Lancet in 2004, revealed that this treatment had been killing people since it was first used nearly 40 years previously.
The report of the CRASH trial is exemplary because it referred to current uncertainty about the effects of a treatment, manifested in a systematic review of all the existing evidence, and in variations in clinical practice; it noted that the trial was registered and the protocol published prospectively; it set the new results in the context of an updated systematic review of all the existing evidence; and it provided readers with all the evidence needed for action to prevent thousands of iatrogenic deaths.
In summary, science is cumulative, so researchers must cumulate scientifically, using methods and materials to reduce biases and the play of chance. Because researchers still do not do this routinely, people continue to suffer and die unnecessarily.
Editor, James Lind Library
- The James Lind Library http://www.jameslindlibrary.org/
- Margaret McCartney writing for the Financial Times http://blogs.ft.com/healthblog/
- Smart Health Choices by Les and Judy Irwig was published November 2007 by Hammersmith Press Ltd, paperback £12.99.
- Know Your Chances by Steve Woloshin was published November 2008 by California Press, paperback £11.95.
- Bad Science by Ben Goldacre was published in paperback edition April 2009 by HarperPerennial at £8.99.
- Chalmers I, HealthWatch Newsletter, issue 53, April 2004.
- The James Lind Library’s archives can be browsed on http://www.jameslindlibrary.org/trial_records/published.html
- This and the following texts can be accessed on the James Lind Library website by browsing the records, listed in chronological order.
- Cowley AJ, Skene A, Stainer, Hampton JR (1993). The effect of lorcainide on arrhythmias and survival in patients with acute myocardial infarction. International Journal of Cardiology 40:161-166.
- Moore TJ (1995). Deadly Medicine. New York: Simon and Schuster
- The Lancet, Volume 364, Issue 9442, Pages 1321 - 1328, 9 October 2004.
Tim Harford on behalf of the BBC More or Less team for their clear, honest and entertaining way of educating the public about the meaning of numbers
Junk, jigsaws and zombies: misleading stats in the news
The 2012 HealthWatch Award went to Tim Harford and the team behind BBC Radio 4’s “More or Less” programme. Tim received his award at the October AGM, and gave an entertaining presentation to HealthWatch members and patrons on the subject of misleading medical statistics. The article below is prepared from his presentation.
I’d like to begin by setting you a little test. Imagine you’re a doctor discussing a type of cancer screening with a patient. You see that test A increases the patient’s 5-year survival rate from 68 to 99%. Put your hands up if you think that would be a benefit? [audience hands were, hesitantly, raised] Test B, however, will reduce deaths from 2 per 1,000 to 1.6 per 1,000. Most people faced with comparing test A with test B would opt for test A.
In fact, only test B unambiguously saves lives: to be precise it saves 0.4 lives per 1,000 people. But what about the huge survival rate benefit of test A? To explain how this is, imagine a cancer that always strikes at the age of 60, but that shows no symptoms until age 68 … then kills at age 70. A screening test that accurately diagnoses the cancer in 62-year-olds would give them a 5-year survival rate of 100%. Yet they would still die at 70 if there was no treatment.
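The arithmetic behind this thought experiment can be sketched in a few lines of Python. The ages are the ones in the example; the function name and the all-or-nothing survival measure are simplifications introduced here for illustration.

```python
# Lead-time bias: earlier diagnosis lengthens "survival after diagnosis"
# without changing the age at which the patient dies.

AGE_AT_DEATH = 70          # from the example: the cancer kills at 70
AGE_SYMPTOMS = 68          # diagnosed when symptoms appear (no screening)
AGE_SCREEN_DETECTED = 62   # diagnosed by the screening test


def survives_five_years(age_at_diagnosis: int, age_at_death: int) -> bool:
    """True if the patient is still alive five years after diagnosis."""
    return age_at_death - age_at_diagnosis >= 5


unscreened = survives_five_years(AGE_SYMPTOMS, AGE_AT_DEATH)
screened = survives_five_years(AGE_SCREEN_DETECTED, AGE_AT_DEATH)

print("5-year survival without screening:", "100%" if unscreened else "0%")
print("5-year survival with screening:   ", "100%" if screened else "0%")
print("Age at death in both cases:", AGE_AT_DEATH)
```

The survival statistic jumps from 0% to 100% purely because the clock starts earlier; nobody lives a day longer.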
US psychologists recently put this test to 400 doctors. Eighty-two per cent thought that test “A” saved lives—which it didn’t. Only 60 per cent thought that test “B” saved lives, and fewer than one-third thought the benefit was large or very large—which is intriguing, because of the few people on course to die from cancer, the test saves 20 per cent of them. In short, the doctors simply did not understand the statistics on cancer screening.
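The figures for test B can be worked through as follows. This is a minimal sketch: "number needed to screen" is a standard term but is not used in the talk, and the function name is illustrative.

```python
def risk_reduction(deaths_without: float, deaths_with: float, per: int = 1000):
    """Return (absolute reduction per `per` people, relative reduction,
    number of people screened to prevent one death)."""
    absolute = deaths_without - deaths_with
    relative = absolute / deaths_without
    number_needed = per / absolute
    return absolute, relative, number_needed


# Test B: deaths fall from 2 per 1,000 to 1.6 per 1,000.
absolute, relative, number_needed = risk_reduction(2.0, 1.6)

print(f"Lives saved per 1,000 screened: {absolute:.1f}")
print(f"Relative reduction in deaths:   {relative:.0%}")
print(f"People screened per life saved: {number_needed:.0f}")
```

The same result can be described as "0.4 lives per 1,000" or as "a 20 per cent cut in deaths"; both are true, and they leave very different impressions.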
In the course of this talk I’d like to share with you some of the things I’ve learnt about statistics in the news.
Mistakes are not always difficult to spot
An advertisement for “U-switch” internet service claims that 49% of British broadband customers are getting below-average broadband speed. Think about it … it’s like saying, “49% of NHS patients are getting below-average treatment”. Of course they are, and the rest are getting average or above-average.
It’s saying nothing.
It’s easy to be wowed by big numbers
In 1997 Gordon Brown pledged to spend £300 million on pre-school provision over the following five years. Now, you need to peer beneath the numbers here. The need for pre-school provision affects about 1 million children every year. That figure boils down to about £1.15 per child per week. What exactly is going to be provided for £1 a week? So, don’t be impressed by the big numbers that aren’t.
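Peering beneath a big number usually means dividing it by time and by the people it has to cover. A minimal sketch using only the round figures quoted above:

```python
TOTAL_PLEDGE = 300_000_000      # pounds, over the whole period
YEARS = 5
CHILDREN_PER_YEAR = 1_000_000   # "about 1 million children every year"
WEEKS_PER_YEAR = 52

per_child_per_year = TOTAL_PLEDGE / YEARS / CHILDREN_PER_YEAR
per_child_per_week = per_child_per_year / WEEKS_PER_YEAR

print(f"Per child per year: £{per_child_per_year:.2f}")
print(f"Per child per week: £{per_child_per_week:.2f}")
```

With these inputs the pledge comes to £60 per child per year, a little over £1 a week.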
Be wary of averages
We have reports of economic upturns—talking about average inflation, average incomes. Remember, on average a rainbow is white, yet it’s the colours that are important. The average is not the only thing you want to know to get the true picture.
A famous example of the misuse or misunderstanding of averages was when the financial crisis broke in August 2007, and the chief financial officer of Goldman Sachs commented that 25-standard deviation events had occurred several days in a row. We asked a professor of finance to calculate for us the likelihood of that actually occurring. He worked out that you might expect to see a 2-standard deviation event once every 4-5 days. A 3-standard deviation event happens only every 3-4 years. A 4-standard deviation event once every 126 years. There could only have been one 5-standard deviation event since the last ice age. A 25-standard deviation event would be expected to happen only once every x years, where x is a number with 67 zeros.
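A calculation of this kind can be sketched in Python. The assumptions here are mine, not the talk's: daily moves follow a normal distribution, only moves in one direction are counted, and there are 250 trading days in a year. The output need not match every quoted figure, since those depend on conventions the talk does not spell out.

```python
import math

TRADING_DAYS_PER_YEAR = 250  # assumption, not stated in the talk


def tail_probability(k: float) -> float:
    """P(Z >= k) for a standard normal variable Z (one-sided tail).

    erfc is used because 1 - cdf(k) would round to zero for large k.
    """
    return 0.5 * math.erfc(k / math.sqrt(2))


def expected_wait_days(k: float) -> float:
    """Average number of trading days between k-standard-deviation days."""
    return 1.0 / tail_probability(k)


for k in (2, 3, 4, 5, 25):
    days = expected_wait_days(k)
    years = days / TRADING_DAYS_PER_YEAR
    print(f"{k:>2}-sigma day: about one in {days:.3g} trading days ({years:.3g} years)")
```

Under these assumptions the 4-standard deviation wait comes out at roughly 126 years, and the 25-standard deviation wait is a number of years far longer than the age of the universe. Several such days in a row says more about the model than about luck.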
Watch out for shifting definitions
A recent US report said that one in five students self-harm. Alarming to consider the possibility that a fifth of young people might hurt themselves by burning, slashing wrists, and attempting suicide. But is it true? Read the study behind the news and we find that 8,300 students were surveyed. Of those 3,000 chose to respond—could there be a bias here towards young people who already have an interest in self-harming? But it is the definitions that interested me. They included things like tugging at your own hair, scratching yourself. No doubt smashing your head onto your keyboard when reading an idiotic news story would have been included. According to the same study, only 49 of the responders reported causing themselves serious harm. That is 0.5% of the original sample. The story was not exactly false, but not really true either.
Sometimes you get statistics that have no merit whatsoever
We call them junk stats. We came across this one on the internet—four million US women are battered to death by their husbands or boyfriends. Four million? Common sense tells you that can’t possibly be true. We often find that the more serious the claim and the worthier the issue, the stupider the stats.
The dangers of junk stats
But it’s not only important issues that generate junk stats. We noticed an advertisement for a product that made lips 25% fuller. What does that mean? And a product that is 25% more “berrylicious” … we assume these have been cleared by the Advertising Standards Authority, but on what basis they proved the claims I can’t imagine.
Every year there is a Blue Monday. It’s a concept created to generate publicity for a charity that supports people with depression. A PR firm invented an equation that involves how many days you’ve just had off work and the length of time before your next holiday, and comes up with the most depressing day of the year. Is that OK? I’m inclined to think it makes a mockery of statistics, maths, and journalism. Someone with real evidence can be chucked in the same bucket as this kind of nonsense. Journalists using this kind of material don’t give their readers the tools they need to interpret stories properly.
There’s something we call “zombie stats”
These are the figures that can be shot to pieces over and over again and yet they still keep coming back. Here’s one. Public sector spend could be cut by 20% by reducing waste. Well, the biggest factor in public sector spending is salaries, not waste. So how can cutting waste reduce it by 20%? The figure came from a procurement consultancy that claimed to be able to get better deals, such as cheaper mobile ’phone bills. Of course they wouldn’t be able to make savings like that in all areas of spending, yet the government seemed happy to repeat the claim without any evidence. Ben Goldacre shredded the claim, we at More or Less shredded it, but like that zombie it just won’t lie down and die.
Recognise patterns that can fool us
In the cancer screening example that I began with, you could have known that I was going to try to fool you. But you don’t always see it coming. To take speed cameras, for instance. If the government try to put cameras at accident black spots, will they have an effect on the number of accidents that occur?
Some accident black spots are indeed dangerous places. But in some cases a cluster of accidents is just bad luck. Bad luck doesn’t last. Put a speed camera there and, chances are, the run of bad luck ends, the accidents are reduced, and it seems to be down to the camera. I’m not saying that speed cameras don’t help, but they might help less than we thought.
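This effect, usually called regression to the mean, can be demonstrated with a small simulation. Every number below is invented for illustration: 1,000 sites that are all equally dangerous, with no cameras installed anywhere.

```python
import random

rng = random.Random(2012)  # fixed seed so the sketch is repeatable

N_SITES = 1000        # hypothetical road sites, all with the SAME true risk
OPPORTUNITIES = 100   # chances for an accident per site per year (illustrative)
P_ACCIDENT = 0.05     # so every site averages 5 accidents a year
N_BLACK_SPOTS = 50    # the worst 5% in year 1 get labelled "black spots"


def accidents_in_a_year() -> int:
    """Accident count for one site in one year, driven by chance alone."""
    return sum(rng.random() < P_ACCIDENT for _ in range(OPPORTUNITIES))


year1 = [accidents_in_a_year() for _ in range(N_SITES)]
year2 = [accidents_in_a_year() for _ in range(N_SITES)]

# Pick the sites with the most accidents in year 1.
black_spots = sorted(range(N_SITES), key=lambda s: year1[s], reverse=True)[:N_BLACK_SPOTS]

mean_year1 = sum(year1[s] for s in black_spots) / N_BLACK_SPOTS
mean_year2 = sum(year2[s] for s in black_spots) / N_BLACK_SPOTS

print(f"Black spots, year 1: {mean_year1:.1f} accidents per site")
print(f"Same sites, year 2:  {mean_year2:.1f} accidents per site (no cameras added)")
```

The "black spots" improve sharply in year 2 even though nothing was done to them. Any camera installed at such sites would be credited with a fall that was going to happen anyway.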
What do we in the media need to do?
Sometimes we need expert help. We in the media can always find an expert who is generous enough to explain if we ask. And we need to know when we need to ask.
We need to get better at explaining risk. To take another example from recent news, we hear that if you eat a bacon sandwich every day, you have your risk of cancer increased by 20%. The questions we need to be answering here are, what kind of cancer are we talking about? How likely are you to get it anyway? In this case, we’re talking about cancer of the bowel. Under normal circumstances, 4 people in 100 get bowel cancer. If you eat a bacon sandwich every day, the risk increases to 5 in 100. Explaining risk in this way is helpful.
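Turning a relative risk into an absolute one takes a single multiplication. A minimal sketch using the figures above (the function name is illustrative):

```python
def absolute_risk(baseline_per_100: float, relative_increase: float) -> float:
    """Risk per 100 people after applying a relative increase to the baseline."""
    return baseline_per_100 * (1 + relative_increase)


baseline = 4.0                               # 4 in 100 get bowel cancer anyway
with_bacon = absolute_risk(baseline, 0.20)   # a 20% relative increase
extra_cases = with_bacon - baseline

print(f"Baseline risk:       {baseline:.1f} in 100")
print(f"With daily bacon:    {with_bacon:.1f} in 100")
print(f"Extra cases per 100: {extra_cases:.1f}")
```

The result is about 4.8 in 100, which rounds to the 5 in 100 quoted: roughly one extra case for every hundred daily bacon eaters.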
The BBC website’s top story at the moment is that old people can prolong their useful life not by going for more walks, but by doing jigsaw puzzles. Whether this results from a study or a systematic review, we don’t know. You need context in order to judge this kind of story. And there’s the recession. A deficit of £150 billion—it’s a meaningless big figure for most people. Until you calculate that it adds up to a bill of £2,500 per person per year. Is that higher or lower than the deficit for other European countries? Greece for example? A good party trick during party political conference season is to take all the numbers that appear in the media reports and divide them by the population of the country and see what you get.
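The party trick is easy to automate. The sketch below assumes a round UK population of 60 million, which is not a figure given in the talk.

```python
def per_head(total: float, population: float) -> float:
    """Share of a national total falling on each person."""
    return total / population


DEFICIT = 150_000_000_000    # pounds per year, as quoted
UK_POPULATION = 60_000_000   # assumed round figure for illustration

per_person = per_head(DEFICIT, UK_POPULATION)
print(f"Deficit per person per year: £{per_person:,.0f}")
```

Running the same division with another country's deficit and population gives the like-for-like comparison the headline figure cannot.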
And are the sources sound? Is it true? What exactly is being said? What is being compared? If we’re talking of treatments, were they tested on animals or people or in a petri dish? Get a sense of scale and context. If we can do that, we can use maths to tell stories that people can understand.
Broadcaster, author and journalist
Reference http://www.guardian.co.uk/commentisfree/2011/jun/24/bad-science-local-govermentsavings-ben-goldacre
More about Tim Harford and “More or Less”
Tim Harford is an author, columnist for the Financial Times and presenter of BBC Radio 4’s “More or Less”. The Royal Statistical Society has commended More or Less for excellence in journalism in 2010, 2011 and 2012; and the programme has won an award from Mensa. As a senior columnist for the Financial Times, Tim’s long-running “Undercover Economist” column reveals the economic ideas behind everyday experiences, while a new column, “Since You Asked”, offers a sceptical look at the news of the week. His first book, “The Undercover Economist” has sold one million copies worldwide in almost 30 languages. His writing has been published by the leading magazines and newspapers on both sides of the Atlantic. Tim won the Bastiat Prize for economic journalism in 2006 and has been named one of the UK’s top 20 tweeters by The Independent.
In BBC Radio 4’s “More or Less” programme, Tim Harford and his team investigate numbers in the news and try to make sense of the statistics which surround us. The half-hour programme is broadcast at 16:00 on Friday afternoons and repeated at 20:00 on Sundays on Radio 4.
For more about “More or Less” see the programme's page on the BBC website.
For Tim’s articles and blog, see his website.
For outstanding leadership in the pursuit of medical truth
What’s wrong with the medical literature and what can we do about it?
This is an enormous honour. HealthWatch is a charity of which I’ve been fully aware for a very long time. As Peter Wilmshurst said to me earlier, now I’ve joined the “awkward squad”!
Many of my heroes are big parts of HealthWatch. Everything I say tonight is absolutely on the shoulders of giants, many of whom are in this room, and who have been unafraid to upset people and to challenge vested interests in the cause of health and scientific endeavour. HealthWatch shares many noble aims with the BMJ. We had a strategy session at the BMJ just yesterday and amongst our ground rules the very first was that all statements will be based on evidence, and another one that came up was, there shall be no sacred cows. I genuinely think that we should have no sacred cows.
The BMJ has for a long time been a campaigning journal, and that is something we are continuing. I’d like to talk about two campaigns in particular tonight: one is open access to clinical trial data, and the other is “too much medicine”, which again fits very well with HealthWatch’s aims. This is one of many areas in which we find that the evidence base is distorted, and it ends up pushing us to overtreat and overdiagnose: a conspiracy of enthusiasm for treatment on which I think we as medics constantly have to rein ourselves back. And the other thing I’m going to speak about tonight is the theme of patient partnership, trying to bring patients into everything we do.
With these campaigns we feel we can bring to bear quite a unique mix of science and journalism: we have the original research, we have our commentaries, and we have investigative journalism, which has allowed us to dig that bit more and, we hope, to pull down some of these sacred cows that otherwise might go unchallenged.
So to my theme. We have a problem. We want to practise evidence based medicine, but the evidence on which our decisions are based is flawed. It is incomplete and it is of poor quality and based on hidden data. There is a huge loss of trust. There have been too many examples of bad practice, bad faith and out-and-out misconduct, especially but by no means only on the part of industry, over the past 30 years or so. Journals must accept a good chunk of the blame, as must the medical profession, and the research establishment. So we’ve got growing evidence of the problem of misleading, misreported, incomplete data. We’ve got hidden trial data, and the tendency that this distortion has to lead us to overdiagnose and overtreat, which is how these two campaigns find themselves coming together.
There was a review in 2010 in the journal Trials,1 done by Germany’s IQWiG group (the German equivalent of the UK’s NICE), and they looked at a huge number of conditions, and found underreporting and misreporting of trial data across the entire array of medicine, and there was no method that was spared. The problem was very much associated with pharmaceutical company trials, but also non-pharmaceutical company trials, and overwhelmingly they found evidence of benefits being over-stated and harms being understated. In addition we now know that only about half of clinical trials end up getting published2 and the US legislation, the 2007 FDA Amendments Act that everyone felt was such a great thing, and is a great thing, is being widely ignored, in the sense that the summary data that is supposed to be published within a year of the study finishing, is not being published within that time frame.3 So we’ve got things in place which should be helping, but they’re not.
I’ll give you two examples, both of which have become “poster children” of this problem.
The ’flu pandemic of 2008-9 gave Roche’s drug Tamiflu an enormous boost. Initially a Cochrane review found it was effective in reducing the complications of influenza, such as pneumonia.4 The Cochrane team were asked to update their review. They thought it would be a simple job, just looking at the most recent trials and most likely coming to the same conclusion. But a Japanese paediatrician had noticed that the data5 on which their review was based were entirely industry-funded: of 10 trials that had been summarised in an industry-funded systematic review, only two had been published in peer-reviewed journals, in JAMA and the Lancet; the other eight were apparently either unpublished or published only in abstract form. Now these Cochrane reviewers, being the people they are, asked the industry-funded reviewer if they could see the data, and were told, “You’ll have to go to the authors of the original trials.” Those authors in turn sent them to the drug company, who said “We can’t give you this.” At which point they came to the BMJ and Channel 4, and we had some articles about this.6 Interestingly, while the European Medicines Agency has approved the label claim that Tamiflu reduces the complications of influenza, Roche’s US website says, with a footnote pointing out that the advice is for American audiences only, that it is not proven to reduce complications.6 So you’ve got the same evidence base but the authorising bodies have come to completely different conclusions.
But what is deeply shocking is that this drug, on which governments are spending billions of pounds, is apparently not yet proved to be any better than paracetamol and may have adverse effects. Moreover, the evidence base on which these decisions are being made is entirely in the hands of the manufacturers of the drug.
Thanks to the intervention of Ben Goldacre’s book7 and the huge interest that caused, and building on the work that Iain Chalmers has been undertaking, we are making progress. The AllTrials campaign,8 the BMJ’s “open correspondence” page,9 which has been publishing all the letters between the Cochrane Collaboration and Roche, and likewise similar activities of the Cochrane Collaboration with GSK, WHO, and so on, are slowly making an impact. GSK have distanced themselves from the pack and have made some very important concessions.10 Now Roche, four years after their original promise to make their data available, have done so.11 We hope that we will soon know the truth about Tamiflu. It may be marvelous, we will have to wait to see – the Cochrane Collaboration are planning to publish their revised review in a few months’ time.
The next case is the antidepressant Reboxetine, which used to be quite widely used. Germany’s IQWiG was going to re-license it, in order to keep it on the list of drugs that are reimbursed by health insurance companies, and they asked the drug company Pfizer for the data, and the drug company said, “No.” So they said, if you don’t, we can’t keep your drug on the list. It turned out, when they finally gave up the data, that about 65% of it had never been published. When they took the unpublished data and considered all the data together they found that a drug which was previously considered effective and safe, actually had no benefit.12
The difficulty we have is that we just don’t know how much of the current evidence base is similarly flawed.
But it is not just pharma that is at fault. If you have a melanoma you may end up having your lymph nodes dissected out. The procedure, a sentinel node biopsy, is invasive and can result in complications. A study in the New England Journal of Medicine published five-year results of the Multicenter Selective Lymphadenectomy Trial (MSLT-I) and these data suggested any survival advantage was marginal.13 They were supposed to publish 7-year data, then 10-year data, but the dates for those publications have passed and the authors are not coming up with the goods. We published an article saying, this is a problem, people are having this procedure and the data are not available.14
So it’s not just industry who are doing this. It is, however, true to say that industry is the largest funder of pharmaceutical trials, and the funder with what we can only call an irreducible conflict of interest. They are there to make money for the stakeholder and, ideally, also to help patients, but that is their conflict of interest which they will never escape.
I think it’s quite important to say that things are changing. Ideally we hope to get patient groups saying, “We’re not going to support treatments without the data.”
There is a long list of initiatives, but the trouble is, many of them are partial solutions, and there’s a sense in which industry in particular has been saying, it’s all solved. And when you look closely at the problem, it’s not solved. One drug company, AbbVie, which produces a drug called Humira for arthritis, is suing the European Medicines Agency to stop them making the summary data around their drug available, and the European Federation of Pharmaceutical Industries and Associations, of which AbbVie is a member, is supporting AbbVie in this.15 AbbVie’s lawyer has made clear that the company considers even the data on adverse events to be commercially confidential. So, words on one hand, and inaction on the other. There’s no doubt that patient confidentiality is one issue, there are many others, but we can get through them. In fact a lot of the time, if you look at individual examples of information that people are saying they can’t share, it’s actually already being shared, so it’s not as much of a problem as it’s being made out to be.
I expect many people in this room are on statins. The current guidance is to prescribe statins if your 10-year risk of cardiovascular disease is more than 20%. It scoops up a lot of people on these drugs, at enormous cost to the health service and with an enormous revenue for the industry. If that’s what the evidence says, that’s fine. Last year the Cholesterol Treatment Trialists (CTT) Collaboration published in the Lancet an enormous meta-analysis of individual patient data, which concluded that people at even lower risk could benefit from statins,16 and that led to the suggestion that maybe everyone over the age of 50 should be on these drugs. Now a paper in this week’s BMJ17 questions that. A new analysis of the data doesn’t find an overall mortality benefit, it also says that none of the trials in the meta-analysis really looked at harms – and statins can cause diabetes, myopathy, and all sorts of minor effects along with some serious harms. They also point out that all the trials in the Lancet meta-analysis are funded by the producer of the statins. Usefully the BMJ paper also lists what low risk patients need to know, and top of the list is the need for lifestyle change. This, in a society dominated by pharmaceutical-funded research, is something that tends to be underplayed and obscured.
Which leads me on to “too much medicine”. A long-standing interest of the BMJ is overdiagnosis and overtreatment – why does it happen? The reasons are very complex: more and more sensitive diagnostic technology, finding that little thing that would otherwise go unnoticed; increased patient expectations; the belief that more medicine is better medicine. And then some less worthy reasons: personal financial gain; doctors paid for doing more, so-called “fee for service”; commercial gain by drug manufacturers and medical device companies; a change in diagnostic criteria so more and more people are labeled as being at risk of or actually having a disease; and conflicted guideline panels, so that those diagnostic criteria are being decided on by people with conflicts of interest.
So it’s no longer just diabetes and hypertension and dementia but we’ve now got pre-diabetes, pre-hypertension, pre-dementia – for which good evidence is lacking but increasing proportions of the population are encouraged to have monitoring and preventive treatment, all of which has side effects, and all of which has costs.
The poster child for overdiagnosis is perhaps breast cancer screening. We’re just beginning to understand why this is a real risk for women, and Michael Baum has been a great advocate for a much clearer understanding of those risks. Thyroid cancer is an example that has received less publicity. It’s one of those cases where you detect something and, of course, it’s very hard to leave something there once you’ve found it, but after surgical removal come all the other problems of treatment - thyroid monitoring, radiation to the neck, so it’s not just one procedure, it’s a lifetime of taking drugs, and being a patient in a way that you wouldn’t otherwise have been.
The BMJ has had overdiagnosis in its sights for some years, and a few years ago we commissioned a news piece to go out on April 1st – a small touch whose meaning some of our readers not in the UK might have missed – it was about the discovery of a new disease called motivational deficiency disorder and it was said to affect one in five Australians, characterized by a severe loss of motivation, so much so, that some of them lost the motivation to breathe.18 And the great news was that a new drug had been developed, by a Professor Lethargos. One of the recipients of this drug had been so confined by this condition that he hadn’t got off his sofa for two years, but thanks to the treatment is now an investment banker in Sydney. I think most people understood it was a spoof piece. Except for the editor of one New Zealand newspaper who was taken in, and wrote the story up, and when he found out he’d been spoofed he was very upset and wrote me a very rude e-mail – he said, “Credibility is hard-won, Dr Godlee, and you have damaged yours and mine.” Fortunately most of our readers got the joke, and we got some marvelous rapid responses, including one from someone who said, “We discovered this disease two years ago but couldn’t be bothered to write it up.”
Amongst “the thousand natural shocks that flesh is heir to” (a quote from Hamlet) we can daily add new ones, and you can be sure that there is money being made on the way. Cosmetic medicine, having made as much money as possible from women’s faces and breasts, is heading south in more ways than one. We published an article in the BMJ critical of the growing practice of cosmetic genitoplasty19 by which women have surgery to their labia to make them conform to these new expectations of perfection – so called “designer vaginas”. The day after it was published I received an e-mail.
“What is your problem with women Ms Godlee? This article’s hysterical claims that a lack of testing and rigor in these procedures could result in permanent genital damage are nothing more than misogynist propaganda. Tell you what, the women of the world can keep their hands off your genitals if you keep your hands off theirs. Best wishes, Laurence Shandy, Feminist.”
I replied:
“Dear Laurence Shandy, Thank you for this. I have no strong opinions on the matter, except that I hope to get through life without surgery to my genitals and think it appropriate for a medical journal to point out the potential dangers of surgery and the alternatives, at a time when reliable evidence is currently lacking. All best wishes, Dr Fiona Godlee, Editor in chief.”
“Dear Dr Godlee, I feel I owe you an apology. I am sorry your genitals were dragged into this debate. Best wishes, Laurence Shandy, gentleman.”
The list of things we want is long. The meat of it is captured this week by another member of this august body, Richard Lehman in his journal review blog in the BMJ. He is often critical of industry and of the types of studies being published. Here is his very brief solution:
“All phase 3 trials to be designed and conducted independently of manufacturers, using the best available comparator. Research priorities to be determined by patients (James Lind Alliance). Value-based pricing. All data available from all trials, with meta-data: IPD [individual patient data] level for qualified independent centres. Big increase in comparative effectiveness research, much more research into non-pharmacological treatments.”20
In conclusion, the evidence base is clearly flawed. Research is a human activity - we can’t expect perfection, but we are so far from perfection that there’s a great deal more to do.
I think what I and others have confronted increasingly, and are trying to come to terms with, is that there are so many systematic forces at work, pushing us in the wrong direction, and with vast amounts of money at their disposal, that we have a big fight on our hands. There are huge implications in terms of cost and waste and error, and human life.
The system is broken. I have reached a firm conclusion, and I am not alone in concluding, that there is a need to extricate medicine and research from industry. It will be a challenge.
1 McGauran N, Wieseler B, Kreis J, Schüler Y-B, Kölsch H, Kaiser T. Reporting bias in medical research - a narrative review. Trials 2010, 11:37 (13 April 2010). http://www.trialsjournal.com/content/11/1/37
2 Ross JS, Mulvey GK, Hines EM, Nissen SE, Krumholz HM. Trial publication after registration in clinicaltrials.gov: a cross-sectional analysis. PLoS Med 2009;6:e1000144. http://www.plosmedicine.org/article/info%3Adoi%2F10.1371%2Fjournal.pmed.1000144
3 Ross JS, Tse T, Zarin DA, Xu H, Zhou L, Krumholz HM. Publication of NIH funded trials registered in ClinicalTrials.gov: cross sectional analysis. BMJ. 2012;344:d7292. http://www.bmj.com/content/344/bmj.d7292
4 Jefferson TO, Demicheli V, Di Pietrantonj C, Jones M, Rivetti D. Neuraminidase inhibitors for preventing and treating influenza in healthy adults. Cochrane Database Syst Rev 2006;3:CD001265.
5 Kaiser L, Wat C, Mills T, Mahoney P, Ward P, Hayden F. Impact of oseltamivir treatment on influenza-related lower respiratory tract complications and hospitalizations. Arch Intern Med 2003;163:1667-72. http://archinte.jamanetwork.com/article.aspx?articleid=215903
6 Doshi P. Neuraminidase inhibitors—the story behind the Cochrane review. BMJ 2009;339:b5164. http://www.bmj.com/content/339/bmj.b5164
7 Goldacre B. Bad Pharma. 4th Estate. London, 2012.
8 AllTrials http://www.alltrials.net/
9 Tamiflu correspondence with Roche. http://www.bmj.com/tamiflu/roche
10 Kmietowicz Z. GSK backs campaign for disclosure of trial data. BMJ 2013;346:f819. http://www.bmj.com/content/346/bmj.f819
11 Cohen D. Roche offers researchers access to all Tamiflu trials. BMJ 2013;346:f2157. http://www.bmj.com/content/346/bmj.f2157
12 Wieseler B, McGauran N, Kaiser T. Finding studies on reboxetine: a tale of hide and seek. BMJ 2010;341:c4942. http://www.bmj.com/content/341/bmj.c4942
13 Morton DL, Thompson JF, Cochran AJ, Mozzillo N, Elashoff R, Essner R, et al. Sentinel-node biopsy or nodal observation in melanoma. N Engl J Med 2006;355:1307-17. http://www.nejm.org/doi/full/10.1056/NEJMoa060992
14 Torjesen I. Sentinel node biopsy for melanoma: unnecessary treatment? BMJ 2013;346:e8645. http://www.bmj.com/content/346/bmj.e8645
15 Kmietowicz Z. Drug firms take legal steps to prevent European regulator releasing data. BMJ 2013;346:f1636. http://www.bmj.com/content/346/bmj.f1636
16 Cholesterol Treatment Trialists’ (CTT) Collaborators. The effects of lowering LDL cholesterol with statin therapy in people at low risk of vascular disease: meta-analysis of individual data from 27 randomised trials. Lancet 2012;380(9841):581-590. http://www.thelancet.com/journals/lancet/article/PIIS0140-6736%2812%2960367-5/fulltext
17 Abramson JD, Rosenberg HG, Jewell N, Wright JM. Should people at low risk of cardiovascular disease take a statin? BMJ 2013;347:f6123. http://www.bmj.com/content/347/bmj.f6123
18 Moynihan R. Scientists find new disease: motivational deficiency disorder. BMJ 2006;332:745.2 http://www.bmj.com/content/332/7544/745.2
19 Godlee F. Promoting cosmetic surgery. BMJ 2012;345:e7535. http://www.bmj.com/content/345/bmj.e7535
20 Richard Lehman’s journal review—28 October 2013. BMJ. http://blogs.bmj.com/bmj/2013/10/28/richard-lehmans-journal-review-28-october-2013/
Complementary medicine: the good the bad and the ugly
Honouring me with this year's HealthWatch Award seems a doubly courageous act, began Edzard Ernst, as he accepted the 2005 HealthWatch Award. It is courageous of HealthWatch as I am a researcher of the very subject this organisation often criticises, and it is courageous of me to accept the award as it is unlikely to result in praise from the proponents of complementary medicine (CM).
Courageous or not, it is definitely timely to put the spotlight on CM. Patients love it, the media and many people in power promote it, yet few people seem to understand it. In the following discussion I will try to highlight some of those aspects of CM which, I feel, are currently plagued by confusion, lack of transparency and sometimes even wilful deceit. Using the headings 'good, bad and ugly' inevitably requires a degree of simplification. In reality things are rarely black or white but different shades of gray.
It has always puzzled me how anyone could be for or against something like a medical intervention. Does it make sense to be in favour of appendectomy or anticoagulants? I don't think so! Why then do people hold emotional views on CM? It seems to me that, when it comes to healthcare, likes and dislikes should matter far less than evidence. Healthcare is not a fashion where one might legitimately have this or that opinion, nor should it be confused with religion in which one either believes or doesn't. Medical treatments either demonstrably and reproducibly work or they don't. Therefore reliable evidence on what is effective and safe must always be "good" - to view a trial of spiritual healing or homeopathy, for example, which fails to show that the tested intervention works (e.g. is better than placebo) as "negative" seems ludicrous to me.
Examples include the recent (first ever) trial of shark cartilage for cancer. Its results showed that it has no beneficial effects. Surely this must be good news all around. Sharks will not die needlessly, cancer patients will not attach false hopes to a bogus treatment, money can be saved for effective treatments. The only people who could possibly perceive this finding as "negative" are those involved in peddling bogus cancer cures and swindling desperate patients and their families of their savings.
Whenever we demonstrate that CM does work, the situation could quickly reverse. Examples of this scenario can also be found easily. Compelling evidence now suggests that real acupuncture is better than sham acupuncture for a range of pain-related syndromes, e.g. back pain. If the findings are based on good science, it must be good news: it could help millions who suffer from back pain, particularly as conventional medicine is not very successful in dealing with this problem. Similarly, there are now several systematic reviews of rigorous clinical trials demonstrating that certain herbal medicines are efficacious for certain indications (see table 1, below). Making more general use of these options could benefit many patients - provided that the risks of these remedies do not outweigh the benefit.
It follows, I think, that finding the truth (arguably this is what science should be about) is always a good thing in medicine. As long as the results are reliable, they can only further our knowledge and will eventually improve healthcare. It also follows, I hope, that the incessant criticism directed at the work of my unit by enthusiasts of CM is based on a profound misunderstanding: we may have shown that this or that form of CM is not effective or not safe, but I fail to see that this was in any way negative for those who, in medicine, matter most: our patients.
In CM, many researchers seem to use science to prove that what they believe is correct. This is not what I was taught. Science is not for proving but for testing. The former approach not only reveals an unprofessional attitude, it is also prone to seriously mislead us all. Emotions and strong beliefs can lead to bias, and bias leads to bad science.
Sadly, poor science is rife in CM. Here I could cite hundreds of examples. A recent study of anthroposophy may suffice. Its aim was "to compare anthroposophic treatment to conventional treatment". Patients elected to consult either an anthroposophic or a conventional doctor. The results showed more favourable outcomes for the former approach and the authors concluded that "anthroposophic treatment... is safe and at least as effective as conventional treatment". Because of numerous sources of bias and confounding, many other conclusions are just as likely. The type of patients who elect to see an anthroposophic doctor may differ in many ways from patients who consult a conventional physician.
This example highlights much of what can be (and frequently is) wrong with CM research. It typifies how the aims of a study can be mismatched with the methodology and how the results may not justify the conclusions. If I had to name the characteristic that I find most disturbing in published CM research it would be this frequent inconsistency. Wishful thinking is of course only human, but the regularity of this incongruence in CM is nevertheless most remarkable.
What follows is, I believe, more than obvious: not only is good science good but bad science is bad. It is not bad because some 'out-of-touch' scientists in the 'ivory towers' think so. It is bad because it leads to wrong decisions in healthcare. Ultimately this will be detrimental to those who we should care for most: our patients.
The bad is bad enough, but the ugly is worse. I define ugly here as directly or indirectly preventing (future) patients from receiving the best available healthcare. I could lament many aspects of CM that fall into this category: dishonesty, neglect of medical ethics, exploitation of vulnerable patients and political interventions are themes that come to mind (see table 2, below).
The over-riding principle in all this is, I think, the application or promotion of one standard for conventional medicine and another for CM. Double standards are typified, I fear, in the new and increasingly popular movement (its proponents would probably say 'philosophy') of 'integrated medicine'. Its two basic tenets are that integrated medicine cares for the individual as a whole rather than looking at a diagnostic label; and integrated medicine uses "the best of both worlds"6. Both claims look superficially convincing and plausible. On closer inspection they are, however, neither. Caring for the whole individual has always been and will always be a hallmark of any good medicine. It is thus not legitimate to adopt it as a main characteristic that differentiates 'integrated medicine' from conventional healthcare - on the contrary, conventional healthcare professionals who work towards optimising patient care must feel insulted by it. Using "the best of both worlds" (i.e. CM and mainstream healthcare) sounds fine until one realises how crucially it hinges on the definition of "best". In modern healthcare, this term can only describe those treatments that demonstrably and reproducibly do more good than harm. But this is precisely what evidence based medicine (EBM) is all about. Either 'integrated medicine' is synonymous with EBM (in which case the term would be redundant) or it applies a different standard for the term "best". Considering what 'integrative medicine' in the UK currently promotes (see table 3), one has to conclude that the latter applies. This discloses integrative medicine as an elaborate smoke screen for adopting unproven treatments into routine healthcare. In the long run, this strategy will therefore turn out to be detrimental to everybody, including patients and even CM itself.
I am convinced that CM has much to offer. In the past 12 years, we have identified numerous CM interventions that generate more good than harm. Many more therapies need scientific testing and some of them will turn out to be useful. The only way to find out is to conduct rigorous research. Poor science will inevitably mislead us. And double standards are detrimental for everyone. In a nutshell, good science is good, bad science is bad and increasing the risk of patients not receiving the best available healthcare is ugly.
E Ernst, MD, PhD, FRCP, FRCPEd
Professor of Complementary Medicine
Peninsula Medical School, Universities of Exeter & Plymouth
Table 1: Systematic reviews suggesting efficacy of herbal medicines
Andrographis: Upper respiratory tract infection
Cranberry: Urinary tract infection
Devil's claw: Osteoarthritis, back pain
Ginkgo: Intermittent claudication, dementia
Ginger: Morning sickness
Hawthorn: Chronic heart failure
Horse chestnut: Chronic venous insufficiency
Kava: Anxiety, menopausal symptoms
Nettle: Benign prostatic hyperplasia
Peppermint: Abdominal pain, non ulcer dyspepsia, IBS
Saw palmetto: Benign prostatic hyperplasia
St John's wort: Depression
Yohimbe: Erectile dysfunction
Data extracted from reference 3
Table 2: Preventing patients from receiving the best available healthcare
Administering unsafe treatments
- Asian herbal mixtures are sometimes contaminated with toxic heavy metals
- Upper spinal manipulation has been repeatedly linked to arterial dissection followed by stroke
Using invalid diagnostic techniques
- Iridology has been frequently tested and not found to be reliable.
- 'Live blood analysis' is used without evidence that it is valid
Not using CM that has been shown to do more good than harm
- Saw palmetto is effective and safe for BPH, but in the UK it is hardly used
- St John's wort is effective for depression, but in the UK it remains under-used
Misleading consumers through irresponsible advice
- Millions of web sites, hundreds of books, weekly columns in the print media, and even a UK government-sponsored patient-guide all fail to provide responsible advice
- The scarce research funds by the DoH were not used for studying efficacy and safety as recommended by the 'Lord's Report'. Despite the lack of reliable data, the 'Smallwood Inquiry 2005' recommended that large sums of money could be saved if more homoeopathy was used in the NHS.
Unethical behaviour in clinical practice
- A survey showed that the majority of UK chiropractors fail to adhere to their own ethical code (e.g. regarding informed consent).
- A 'Dr Foster' study demonstrated that many CM practitioners fail to comply with 5 very basic "best practice criteria".
Unethical behaviour in research
- Despite the wide-spread use of CM, funds for researching issues such as safety and efficacy of CM remain largely unavailable
Table 3. Selected statements from a recent (government-sponsored) patient guide (a)
Each statement from the guide (b) is followed by a comment on the evidence (c).
Statement: ...the risk of a stroke [after upper spinal manipulation] is between 1 and 3 in 1 million manipulations.
Comment: There are many published estimates that suggest much higher incidence figures. However, due to extreme under-reporting, the risk remains undefined.
Statement: Acupuncture is being increasingly used for people trying to overcome addictions...
Comment: A Cochrane review fails to demonstrate efficacy of acupuncture for this indication.
Statement: Craniosacral therapists treat a wide range of conditions from acute to chronic health problems...
Comment: There is no trial evidence at all to suggest that craniosacral therapy is effective.
Statement: Healing is used for a wide range of... conditions. Research has shown benefit in many areas, including healing of wounds, ... migraine or irritable bowel syndrome...
Comment: The best evidence available to date fails to demonstrate effects beyond a placebo response.
Statement: Homoeopathy is most often used to treat chronic conditions such as asthma
Comment: A Cochrane review fails to demonstrate efficacy of homoeopathy for asthma.
(a) Its aim was to "give [you] enough information to help you choose a complementary therapy that is right for you"
(b) The guide does not contain anything else by way of evidence on effectiveness (but was commissioned by the DoH to provide such evidence)
(c) Evidence extracted from reference 3
1. Loprinzi CL, Levitt R, Barton DL, Sloan JA, Atherton PJ, Smith DJ et al. Evaluation of shark cartilage in patients with advanced cancer. Cancer 2005; 104: 176-82.
2. Manheimer E, White A, Berman B, Forys K, Ernst E. Meta-analysis: acupuncture for low back pain. Ann Intern Med 2005; 142: 651-63.
3. Ernst E, Boddy K, Pittler MH, Wider B. The desk top guide to complementary and alternative medicine. 2nd Edition. Edinburgh; Mosby. 2006.
4. Ernst E, Canter PH. Investigator bias and false positive findings in medical research. TRENDS in Pharmacological Sci 2003; 24: 219-21.
5. Hamre HJ, Fischer M, Heger M, Riley D, Haidvogl M, Baars E et al. Anthroposophic vs. conventional therapy of acute respiratory and ear infections: a prospective outcomes study. Wien Klin Wochenschr 2005; 117: 256-68.
6. Rees L, Weil A. Integrated medicine. BMJ 2001; 322: 119-20.
7. Ernst E. Disentangling integrative medicine. Mayo Clin Proc 2004; 79: 565-6.
8. Calman K. The profession of medicine. BMJ 1994; 309: 1140-3.
9. The Prince of Wales's Foundation for Integrated Health: Complementary Healthcare: a guide for patients. 2005.
10. Smallwood C (led by). The Role of Complementary and Alternative Medicine in the NHS. An investigation into the potential contribution of mainstream complementary therapies to healthcare in the UK, 2005.