This is the third talk in a series that began with Relationship Based Medicine, continued with Beware of Doctors Bearing Gifts and concludes with this talk, which could be called History of a Medical Psychosis, Medical Neoliberalism, Evident versus Evidence Based Medicine, A Lutheran Moment, or Does Objectivity Come from using Chance to Control Bias or Bias to Control Chance?
It is the most important talk I have ever given.
The first lecture was delivered to clinicians in New York with a Q and A afterwards.
The second was delivered to the public in Lethbridge, Alberta, thanks to Jennifer Williams and Dan Johnson but owing to tech difficulties at the venue (See In Memory of Dexter Johnson), it was difficult to record the Q and A with the public. Suffice to say though, between the technical difficulties, the lecture and the Q and A, we were all there for the better part of 3 hours and the discussion was great.
This third lecture was delivered to Aaron Kesselheim’s PORTAL group – Program on Regulation, Therapeutics and Law. There are two versions. The History of a Medical Psychosis was recorded by Bill James the day before in case of glitches – same day as Putin and Biden gave speeches. The second was recorded by Aaron – Faulty Evidence and Moral Hazard.
There are slight differences between them. The text and slides below add some detail to both talks but the tone of voice and gestures in the talks likely convey things not in the text.
Slide 1: Faulty Evidence and Moral Hazard
Welcome to a very conservative talk – based on a belief in the medical model and in evaluating the drugs we use thoroughly.
Slide 2: These quotes are a precis of key points in the deposition of Ian Hudson, Chief Safety Officer of GlaxoSmithKline (GSK) in 2000 in the Tobin v SmithKline trial.
Forty-eight hours after starting Paxil, Don Schell shot his wife, daughter and granddaughter and then himself. Hudson is being asked – Can SSRIs cause Suicide?
The jury dismissed Hudson’s Evidence Based Medicine view in favor of Evident Based Medicine and in this Civil trial found GSK guilty of negligence that resulted in the death of this family.
Hudson’s view, however, remains ensconced at the top of Britain’s drugs regulator, of which he was later the CEO – as well as FDA, EMA, TGA, Health Canada, WHO, and Boston institutions like Harvard, MRCT, and Vivli. Joe Biden and the Pope’s advisers will also endorse and tell their bosses to say – Yes RCTs are the Way the Truth and the Light.
Slide 3: Hudson’s views originate 70 years earlier in the work of a strange man – Ronnie Fisher.
Here you see Fisher smoking a pipe. He dismissed the later link between smoking and lung cancer, saying personality types predisposed to both cancer and smoking. Evidence was not Fisher’s strong point.
He had nothing to do with medicine and never ran an RCT. Controlled trials and randomization were there before Fisher and were no big deal but for no clear reason his book the Design of Experiments transformed what came next.
Fisher ran a thought experiment to characterize expert knowledge. He mentioned randomization as a means to control for any trivial unknown unknowns. Randomization later became semi-mystical.
Fisher’s expert knew parachutes worked so if we set up two groups, one with parachutes and the other not, we might randomize in case there was someone with webbed feet who might behave differently when falling. Otherwise, we would expect those wearing parachutes to live and those not to die – unless a chance strong wind lands a person in snow covered trees.
If randomization eliminated webbing as a factor, the only thing that could get in the way of an expert being right was chance and this could be assigned a statistically significant value. If 1 in 20 of those without parachutes lived we wouldn’t say the expert didn’t know what he was talking about. Fisher was characterizing expertise rather than characterizing an exploration of the unknown.
Randomization can’t control for ignorance.
Slide 4: Fisher’s expert is a Robin Hood who 19 times out of 20 can split a prior arrow lodged in the Bull.
Slide 5: But the trials done to license drugs especially antidepressants look more like this. A mismatch on this scale indicates medical RCTs are nothing like what Fisher had in mind.
Slide 6: The first RCT in medicine was a trial of streptomycin for tuberculosis. Tony Hill used randomization as a method of fair allocation – he was not managing mystical confounders. Hill helped put the effects of smoking on the map. He had no time for Fisher. He also knew doctors were not experts. His trial was not a demonstration of expertise.
Hill’s RCT found out less about streptomycin than a prior non-randomized trial in the Mayo Clinic, which showed it can cause deafness and tolerance develops rapidly.
Slide 7: Twenty years later, here is Tony Hill taking stock of controlled trials. In this 1965 lecture, he mentions that it is interesting that the people who are most heavily now promoting controlled trials are pharmaceutical companies.
Hill didn’t think trials had to be randomized. He thought double-blinds could get in the way of doctors evaluating a drug. He was a believer in Evident Based rather than Evidence Based Medicine.
Hill said we needed RCTs around 1950 to work out if anything worked. By 1960 he figured we had lots of things that worked – none of which had been brought on the market through an RCT – and he thought the need was to find out which drug worked best. This is not something RCTs can do – there is no such thing as a best drug. RCTs have instead become a way for companies to get weaker drugs on the market.
He said that RCTs produce average effects which are not much good in telling a doctor what to do for the patient in front of them.
All drugs do 3000 + things – one of which might be useful for treatment purposes. In focusing on one element, by default, Hill is saying RCTs are not a good way to evaluate a drug. All RCTs generate ignorance. But we can bring good out of this harm if we remain on top of what we are doing. Hill never saw RCTs replacing clinical judgement.
Slide 8: This 1960 RCT run by Louis Lasagna makes Hill’s point well. Thalidomide has therapeutic efficacy as a sleeping pill but the trial missed the SSRI-like sexual dysfunction, suicidality, agitation, nausea and peripheral neuropathy it causes.
Two years later, Lasagna was responsible for incorporating RCTs in the 1962 Food and Drugs Act Amendments – in order to minimise the chance of another thalidomide. By doing this, more than anyone else, Lasagna was the man who got us using RCTs.
This trial would have licensed thalidomide today. The 1938 Act had no requirement for RCTs.
Slide 9: Many claim RCTs demonstrate cause and effect in a way no other study design can.
The 1950s was a golden age of new drugs that gave us the best antihypertensives, hypoglycemics, antibiotics and psychotropic drugs we have ever had without RCT input into any discoveries.
Imipramine was the first antidepressant. It and other antidepressants beat SSRIs in later RCTs. It can treat melancholia – SSRIs can’t. Melancholia comes with a high risk of suicide.
Imipramine was launched in 1958. At a meeting in 1959, European experts made clear that while it was a wonderful treatment imipramine made some people suicidal. Stop the drug and it clears. Re-introduce and it comes back. This was Evident Based Medicine showing this drug can cause suicide.
Like Fisher, let’s do a thought RCT of imipramine versus placebo in melancholia. The red dots here are suicides or suicide attempts.
Even though it can cause suicide, we would expect it to reduce the number of suicides because it treats this high risk condition. If you didn’t know better, this RCT would look like evidence antidepressants do not cause suicide.
Slide 10: Here is the data on the trials in mild depression that brought the SSRIs to market – mild depression because SSRIs are no use in melancholia. You see an increase of suicidal events compared to placebo in people at little or no risk of suicide.
Slide 11: This is what the data for imipramine look like in the same mild depressions. This is not a thought experiment – it was used as a comparator in SSRI trials. Now it too causes suicides.
RCTs can give us diametrically opposite answers. This is because these are not Drug Trials. They are Treatment Trials and if the condition and treatment produce superficially similar effects, randomized trials cause confounding rather than solve it. This is true for most medical conditions and their treatments.
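The reversal between Slides 9 and 11 can be sketched with simple arithmetic. Every number below is invented purely for illustration and comes from no trial. The sketch assumes the illness carries a baseline risk of a suicidal event, that treatment removes some fraction of that illness-driven risk, and that the drug adds a small risk of its own.

```python
def risk_on_drug(illness_risk, fraction_relieved, drug_risk):
    """Risk of a suicidal event on active treatment.

    illness_risk:      baseline risk from the condition (what placebo patients face)
    fraction_relieved: share of the illness-driven risk removed by treatment
    drug_risk:         extra risk caused by the drug itself
    All three are hypothetical inputs, not trial data.
    """
    return illness_risk * (1 - fraction_relieved) + drug_risk


def relative_risk(illness_risk, fraction_relieved, drug_risk):
    """Drug-arm risk divided by placebo-arm risk."""
    return risk_on_drug(illness_risk, fraction_relieved, drug_risk) / illness_risk


# Same drug, same drug-caused risk (1%), two different patient populations.
melancholia = relative_risk(illness_risk=0.10, fraction_relieved=0.5, drug_risk=0.01)
mild = relative_risk(illness_risk=0.005, fraction_relieved=0.5, drug_risk=0.01)

print(f"melancholia:     RR = {melancholia:.2f}")  # below 1, drug looks protective
print(f"mild depression: RR = {mild:.2f}")         # above 1, drug looks harmful
```

Under these assumptions the drug causes exactly the same harm in both groups. Only the background risk of the condition changes, and that alone is enough to move the trial result from one side of 1.0 to the other.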
People evaluating drugs in traditional clinical trials, before RCTs, knew this. When a patient becomes suicidal in a trial you have to use your judgement to work out what is happening but in RCTs clinicians are not supposed to use their judgment. RCTs are more objective than our judgments – supposedly.
Slide 12: Here is what a Drug Trial looks like. In healthy volunteer studies in the 1980s, companies found SSRIs cause volunteers to become suicidal, dependent and sexually dysfunctional. We heard nothing about these problems when the drugs launched in part because Drug Trials enabled companies to engineer Treatment Trials to hide these problems.
Slide 13: If you break a limb and get recruited to an RCT randomly applying casts to one limb – not necessarily the broken one – the trial will show random application beats placebo. Practicing Evidence Based Medicine rather than Evident Based Medicine here would clearly be crazy.
Slide 14: Here is a James Webb telescope image. James Webb is marvellously bringing out the infinite individuality of stars.
In addition to randomization, Fisher put a premium on Statistical Significance. By 1980 every leading medical statistician was saying we need to get rid of statistical significance in favor of Confidence Intervals.
Confidence Intervals had been introduced by Gauss around 1810. Because of measurement error, the telescopes in use often failed to establish whether there were one or two stars in a location. Measurement errors should distribute normally and so constructing confidence intervals could help us distinguish individual stars.
We have moved a long way forward in this respect with the James Webb telescope you see here.
Slide 15: Confidence intervals rushed into medicine in the mid-1980s. All the authorities on the right – many linked to Boston – argued they were much more appropriate than significance testing. They are appropriate for measurement error but are they any more a cure for ignorance than statistical significance?
Slide 16: Confidence intervals we are told allow us to estimate the size of an effect and the precision with which it is known. We have much more precise details on the likelihood of the Red Drug here killing you than we have for the Yellow Drug. The best estimate of the lethal effect for the Yellow Drug however is greater. The standard view is that if we increase the size of the Yellow Drug Trial we will have greater precision and know better what the risks are. As we shall see, this is wrong.
As things stand, if you are asked to take one of these drugs, should you be guided by precision or effect size? Ian Hudson, FDA and WHO say the only dangerous drug here is the Red One. This is because more than 95% of the data, more than 19 out of 20 lie to the right of the line through 1.0 – confidence intervals have defaulted into statistical significance.
I would take the Red rather than the Yellow one. This is not measurement error and we don’t know what confidence intervals represent when they are not representing measurement error.
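To make the Red and Yellow comparison concrete, here is a minimal sketch of how a risk ratio and its approximate 95% confidence interval are commonly computed (the log, or Katz, method). The event counts are hypothetical, chosen only to reproduce the shape of the slide: a large Red trial with a narrow interval wholly above 1.0, and a small Yellow trial with a larger point estimate whose interval crosses 1.0.

```python
import math


def risk_ratio_ci(events_drug, n_drug, events_placebo, n_placebo, z=1.96):
    """Risk ratio with an approximate 95% interval (log / Katz method)."""
    rr = (events_drug / n_drug) / (events_placebo / n_placebo)
    # Standard error of log(RR)
    se_log = math.sqrt(
        1 / events_drug - 1 / n_drug + 1 / events_placebo - 1 / n_placebo
    )
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


# Hypothetical counts: (events on drug, n on drug, events on placebo, n on placebo)
red = risk_ratio_ci(90, 3000, 60, 3000)   # big trial, modest effect
yellow = risk_ratio_ci(6, 200, 2, 200)    # small trial, larger effect

for name, (rr, lower, upper) in (("Red", red), ("Yellow", yellow)):
    verdict = "significant" if lower > 1 else "not significant"
    print(f"{name}: RR={rr:.2f}, 95% CI {lower:.2f} to {upper:.2f} ({verdict})")
```

With these made-up counts the Stop-Go reading flags only the Red Drug, even though the best estimate of harm for the Yellow Drug is twice as large.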
Slide 17: Faced with claims Prozac causes suicide, Lilly analysed their clinical trials and claimed there is no evidence their drug causes suicide. Confidence Intervals are being spun here as indicating we don’t know Prozac causes suicide as nothing is statistically significant. This is Ian Hudson thinking – at odds with all statistical expertise. It’s wrong. The consistency across young and old, depression and eating disorders strongly suggests in real life there is an excess of suicidal events.
Slide 18: There is an intriguing mystery behind these figures. Here you see a representation of suicidal events that happened in the trials that brought Prozac, Paxil and Zoloft to market around 1990. You’ll note there are events under the word screening here. There is a 2 week washout period before a trial starts where people are whipped off their prior drugs before being put on the new treatment or placebo. This is a highly dangerous phase where people are in withdrawal and very likely to go on to a suicide attempt.
Slide 19: And here you see the moves companies made to avoid having a confidence interval excess of suicidal events on treatment. Companies only moved the events – not the people.
These moves were justified on the basis that people in the run in phase were not on active treatment – which is equivalent to being on placebo – but they often were withdrawing from active treatment which is highly dangerous. Some who stopped treatment at the end of the active phase of the trial committed suicide and were designated placebo too. Some on placebo, put on active treatment in the follow up period, committed suicide and were designated as placebo suicides on an intention to treat basis.
There are two articles from 2006 that bring out this point Did Regulators Fail and The Antidepressant Tale: Figures Signifying Nothing. The Antidepressant Tale gives other examples of confidence interval abuse.
After all these maneuvers, there was still an excess of suicidal events on these SSRIs but the confidence interval was no longer entirely to the right of 1.0. Confidence intervals have degenerated into statistical significance tests because regulators need a Stop-Go mechanism and statistical significance provides this. But doctors don’t need an external Stop-Go mechanism to replace their clinical judgement, so why do they go along with this?
Slide 20: Nobody noticed these maneuvers around 1990, but fourteen years later, in a crisis about children becoming suicidal on antidepressants, questions began to be asked. GSK and Pfizer responded:
‘GSK did not intentionally submit any erroneous or misleading information to FDA. The suicide data submitted to FDA explicitly identified when events occurred during the placebo run-in period. FDA had all this information right from the beginning.’
“Pfizer’s 1990 report to FDA plainly shows … that 3 placebo attempts as having occurred during single blind placebo phases… FDA has neither criticized these data or the report as inappropriate, nor required additional analyses”.
These maneuvers breach FDA regulations and FDA staff noted this in memos at the time. But not only did FDA ignore these breaches of regulations, senior figures, like Tom Laughren, put their name to articles that embraced these breaches of regulation – in one case in the cause of showing it was not unethical to have placebo controls in RCTs, as those on placebo were not at any greater risk than those on treatment.
There was much back and forth between FDA and companies in 1990. Was it criminal? Perhaps. I prefer the idea of strategic ignorance.
What I think we are seeing are events circling around a major crisis in knowledge production. This is not something you can expect FDA to take a lead on – they are not political actors, they are bureaucrats. Companies create knowledge or were creating the appearances of knowledge at this point, but doctors are surely primarily responsible for the creation of medical knowledge and doctors were missing in action around 1991– other than as spokespeople for companies.
Slide 21: The Sacred Mantra is that randomization controls for all possible confounders in all possible universes. The reality is randomization introduces confounders into clinical trials.
The images for the next 3 slides come from a GSK paper prepared in 2006 for submission to FDA. The small print is hard to read – the bold at the bottom gives you the key details.
The data for suicidal events on Paxil in Major Depressive Disorder trials in this first slide show it causes suicidal events. Even Ian Hudson would have to agree and these data were available at the time of the Tobin trials. But randomization is about to come to GSK’s rescue.
Slide 22: Faced with a problem like this, had GSK consulted me I’d have said do a trial in Intermittent Brief Depressive Disorders (IBDD). They might have said but there are trials of SSRIs in IBDD and they don’t work. I’d have said do one. They did and it had to be terminated early, Paxil did so poorly. I’d have said do another. Why – the figures for Paxil still look bad in this group?
Slide 23: But when you add the IBDD data to the MDD data, all of a sudden the figures say Paxil protects against suicidal events.
This scenario can happen every time a condition we are treating is heterogenous – that is dementia, diabetes, Parkinson’s disease, breast cancer, back pain, hypertension – pretty well everything in medicine. In these cases randomization will act to hide effects good and bad and leave us able to use a problem a drug causes to hide a problem a drug causes.
Slide 24: Graphically this is what it looks like. The Red Drug here is the MDD curve alone – more than 95% of the data are to the right of the 1.0 line. The traditional wisdom is that adding some more events to the Red Drug above should give us a more precise version of the same estimate.
In fact when you add a few more people, about 3%, we have shifted the curve to the opposite side of the 1.0 line. It’s a far more precise confidence interval but this is a precision that speaks to our ignorance rather than to better knowledge. No medical statistics book ever hints at this possibility.
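One way a flip like this can arise is sketched below. The counts are invented for illustration and are not GSK's figures. The sketch assumes a large trial pool in a low-risk condition (labelled MDD), a small pool in a high-risk condition (labelled IBDD), and a simple pooling that adds raw counts together.

```python
def risk_ratio(events_drug, n_drug, events_placebo, n_placebo):
    """Drug-arm event rate divided by placebo-arm event rate."""
    return (events_drug / n_drug) / (events_placebo / n_placebo)


# Hypothetical counts: (events on drug, n on drug, events on placebo, n on placebo)
mdd = (11, 3000, 1, 2000)   # low-risk condition, more patients on drug than placebo
ibdd = (32, 150, 32, 160)   # high-risk condition, arms roughly equal in size

# Naive pooling: add the raw counts from both conditions together.
pooled = tuple(a + b for a, b in zip(mdd, ibdd))

print(f"MDD alone:  RR = {risk_ratio(*mdd):.2f}")     # above 1
print(f"IBDD alone: RR = {risk_ratio(*ibdd):.2f}")    # above 1
print(f"Pooled:     RR = {risk_ratio(*pooled):.2f}")  # below 1
```

In this made-up example the drug looks worse than placebo in each condition taken separately, yet the pooled figure suggests it protects. The high event rate in the small group swamps the signal from the large one, which is the sense in which a problem a drug causes can be used to hide a problem a drug causes.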
We could add 40 suicidal events to the paroxetine IBDD arm before Ian Hudson would have to admit paroxetine causes a problem – on the basis that the results are now statistically significant.
IBDD patients could be admitted to MDD trials – we have no way to distinguish them. Some patients become IBDD by virtue of a poor response to an SSRI.
Randomization in heterogenous conditions will hide effects drugs cause. It allows us to use an adverse effect a drug causes to hide the same adverse effect that drug causes. Confidence intervals do not help us work out what is going on in these cases.
Nor do they help in heterogenous drug responses. Let’s take 20 Aarons who are all sedated by a Red Drug and 20 Davids all stimulated by it. The best estimate in the confidence interval in this case will lie on the 1.0 line, showing the drug has no effect. A method to distinguish between one and two stars should not produce an answer that there are no stars here. Algorithmic judgements cannot substitute for a human judgement.
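The Aarons and Davids example can be worked through directly. This sketch assumes a made-up alertness score in which sedation is a change of -1 and stimulation a change of +1. Because it works with differences rather than ratios, the no-effect line is 0 rather than 1.0.

```python
import math
import statistics

# Hypothetical change in an alertness score after taking the Red Drug.
aarons = [-1.0] * 20   # all sedated
davids = [+1.0] * 20   # all stimulated


def mean_ci(values, z=1.96):
    """Mean with an approximate 95% confidence interval (normal approximation)."""
    mean = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(len(values))
    return mean, mean - z * se, mean + z * se


mean, lower, upper = mean_ci(aarons + davids)

print(f"Aarons mean change: {statistics.mean(aarons):+.1f}")
print(f"Davids mean change: {statistics.mean(davids):+.1f}")
print(f"All 40 together:    {mean:+.1f} (95% CI {lower:+.2f} to {upper:+.2f})")
```

Every one of the 40 people is affected by the drug, yet the pooled best estimate sits exactly on the no-effect line.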
Slide 25: Here is another problem with Confidence Intervals. Young men take Finasteride to restore a thick head of hair. We could count hairs and build confidence intervals around before and after hair follicle numbers.
Finasteride also causes suicide and permanent sexual dysfunction and like most drugs has 3,500 other effects. Building confidence intervals for hair numbers before and after is one thing. Applying them to suicidality or sexual function, which were not measured in the trial, is another. When Merck then claim that the science does not support a link between finasteride and suicide because not all the data lie to the right of the 1.0 line, that isn’t managing measurement error. It’s a confidence trick – one that happens all the time.
Slide 26: There are more dead bodies on antidepressants in trials than on placebo, yet the RCTs as Ian Hudson told you show the drugs work. This is because most RCTs have a surrogate outcome. For antidepressants it’s the Hamilton Rating Scale for Depression.
Fifteen years after its creation, Max Hamilton commented on his scale:
It may be that we are witnessing a change as revolutionary as was the introduction of standardization and mass production in manufacture. Both have their positive and negative sides
Hamilton saw this scale as a checklist of things to ask about in an interview – a mixed blessing.
Slide 27: Checklists are now viewed as more scientific than David Healy in a clinic asking you about your family. They will produce standardized but possibly disastrous interviews.
For instance, on this scale, there is a suicide item. Suicidality can stem from the illness or the drug. This needs a judgement call. If caused by the drug you should rate a Zero. If caused by the illness you might rate 3 or 4. If you just check yes for suicidality, the default is to the illness. Ditto for sex, and for sleep.
In the case of sleep, the illness can produce too much sleep or not enough sleep and each of the medicines can inhibit sleep or heavily sedate. There are 3 sleep questions. A scientific interview has a multitude of options requiring judgement calls.
In the 1980s, we brought problems to doctors needing help to get on with the lives we wanted to live. Since then, for drug companies, rating scales, sometimes left in the waiting room, ensure you do an interview that produces figures for which a company drug might seem an answer. Your interview will help you to help your patient to live the life Pfizer want him to live. Do that and you are no longer practicing medicine.
Slide 28: Many think RCTs are fine if only they were done by angels.
Study 329 was conducted in the very best university centres in North America. It has an authorship line to die for, starting with Marty Keller and including a Canadian Liberal Party Senator – Stan Kutcher. It was published in the Journal with the highest impact factor in child psychiatry. The article claims Paxil works wonderfully well and is safe for depressed teens.
What I am about to tell you applies to all industry trials across medicine.
Slide 29: Three years earlier, in 1998, GSK concluded Paxil didn’t work in Study 329 and was not safe. That could not be published so they were going to pick out the good bits of the data and publish them. The good bits formed the Keller et al 2001 paper.
This 1998 internal SKB document led New York’s Attorney General to file a fraud action against GSK. As part of the resolution of this, GSK agreed to make their Paxil trial data public. A decade later, GSK resolved a Dept of Justice action, which also involved Study 329, for $3 billion.
Slide 30: These actions gave a team of us an incentive to Restore Study 329 and we now had more raw data from this study than FDA or other regulators had seen for this or any company study.
Slide 31: In contrast to Keller, we found the 8-week acute phase showed no difference between Paxil and placebo. We found the same for the 6 month continuation phase – never published till we published it 18 years after the trial ended.
Slide 32: Keller noted 6 emotionally labile events in the trial, some of which might have been suicidality, 4 on paroxetine. But in our hands a fifth of the children on Paxil had a behavioral event mostly suicidality – 18 out of 93 children.
Suicide is not what I want to focus on. It’s the ability of company studies to hide adverse events. Our paper lists 10 ways to hide things. Coding – as in calling suicidality emotional lability, is top of this list – this is the first act of authorship but no reviewer or journal pays any heed to it.
Slide 33: In a Pfizer trial, at the same time, a man on active drug got agitated, poured gasoline/petrol on himself and set fire to it intending to kill himself but he only died from his burns 5 days later. Pfizer coded him as death by burns. Once the coding is done, the paper is all but written.
There is some chance FDA found out about this man because if you have to go to hospital or you die companies had to file a report outlining what happened and did so for this man.
Slide 34: But in Study 329, FDA know nothing about a 15 year old boy, 2 weeks after being put on Paxil, who was out on the street waving a gun, threatening to kill people. He was brought to hospital by the police. There was no report to tell FDA what happened. Thirty years ago companies found a way to legally avoid filing these reports. Companies are still using this trick in trials published this year in all major journals and regulators either don’t spot or are not bothered to close a very obvious loophole. In Study 329, 4 children vanished through this loophole.
Slide 35: The sentences on the right are the 3 sentences with which this article ends – the message is companies have created an impression that RCT articles are like tablets of stone brought down from the mountain top, commanding doctors to prescribe and us to take. But when we have access to RCT data, this raises questions – as science should – rather than issues commands.
In addition to Coding, Grouping is also an act of authorship. If you have 500 events in 93 children on Paxil, rather than list them all, cardiac events are usually grouped in a Cardiac group etc. Behavioral events are usually grouped in a Psychiatric group. GSK grouped all behavioral events under Neurological. This groups emotional lability with headaches and dizziness, which are very common. Grouped this way the behavior problems disappear. Grouped as Psychiatric, the problem is immediately clear.
The Restoring Study 329 article took over a year to get published. What was fascinating was the BMJ did not contest the data but they were very exercised by the act of interpretation. They appeared to assume that the data had spoken and GSK faithfully transmitted what they had heard. They found it hard to grasp that GSK used a coding dictionary that even FDA had never heard of.
Any scientific analysis inevitably involves an act of authorship or interpretation. But BMJ found it hard to let us author the behavioral events out of the neurological group into a Psychiatry group. There is no such thing as data without an interpretation. Ideally the interpretation should command consensus but for BMJ this appeared to mean that we should adopt what GSK had done without question.
Slide 36: Everyone knows Prozac was approved for children who are depressed but not that Paxil was too. A year after the Keller paper came out, this is part of an FDA approvable letter for Paxil.
It says GSK have told FDA Study 329 is negative. FDA agree it’s negative – in fact all 3 trials are negative – but FDA will still approve Paxil for kids. FDA also agree with GSK’s suggestion not to mention the negative trials in the label of the drug. Why would FDA agree to this?
Before answering that, let me note FDA also viewed the Prozac trials in teens as negative.
Slide 37: This slide from Erick Turner’s 2008 article shows published adult ‘trials’ on various antidepressants, almost all indicating the drugs work well and are safe. Look at the sertraline column – 3 from the right. It shows two studies – the minimum needed for approval.
Slide 38: Another slide shows the trials as FDA viewed them. 46% of these trials are negative. Many published as positive were negative to add to the unpublished negative trials. Look at the sertraline column – only one positive study.
Why do FDA say nothing about this? Well if FDA said trials are negative – the companies might get sued for fraud or fined – as happened for Study 329.
Slide 39: Here you see the PTSD page of a 30 page document listing Zoloft articles in progress. These papers aim at capturing markets not at informing us on how to use Zoloft safely.
Pfizer did 4 Zoloft PTSD trials. All negative. FDA approved it on the basis of 2 trials with a minimal benefit for women. These good bits plucked out are what’s being published. You see under Status on the right two articles are complete and will be sent to the very best journals. On the left you see TBD – to be determined – when Pfizer decide which names would sell most Zoloft.
You saw a 24 person authorship line for Study 329 but the real author is not there. Across medicine studies of on-patent drugs are ghostwritten.
In the case of children’s antidepressant trials the entire literature was written by ghosts and there is a complete mismatch between the published claims and the data – the greatest mismatch in all of science. On the basis of published claims the use of these drugs is escalating rapidly in teenagers with predictably bad results.
Slide 40: Fifty years ago, Britain joined the EU and ran into trouble. Cadbury’s chocolate, their favorite chocolate, they were told, could not be called chocolate. It didn’t have the right quota of cocoa solids. British consternation over chocolate led to Brexit some decades later.
What FDA do is in their name – they regulate Food and Drugs. Faced with butter or chocolate or drugs, companies must meet an assay standard – so much cocoa solids, animal fats, or so many points on a Depression rating scale in 2 trials. Meet that and FDA let you use the words chocolate, butter, or antidepressant. It’s not FDA’s job to decide if this is good butter, or if chocolate is good for you, or to police the medical literature.
Slide 41: Since 1990, however, regulators increasingly say they approve drugs on the back of a supposed positive Benefit-Risk ratio. This is Ian Hudson thinking. If there are no proven adverse effects and just a benefit then of course there is a positive Benefit-Risk ratio.
The medical act of bringing good out of the use of a poison is incompatible with all this.
We would all agree there is a positive benefit-risk ratio for parachute approval in terms of lives saved versus lives lost – even though some men might have difficulties making love in the weeks afterwards, owing to harness effects. If things aren’t clear enough for us all to endorse, regulators are de facto getting us to live the lives companies want us to live when they make Benefit-Risk claims.
Unlike parachutes, SSRI RCTs have more dead bodies on SSRIs than placebo. In addition, the commonest effect of an SSRI is to cause genital numbness in close to everyone who takes one within 30 minutes of a first tablet. Almost everyone will have the way they make love changed while on an SSRI and they may later find themselves unable to make love ever again, either because they can’t stop or because the drugs can wipe out sexual function for ever. This may be far more important to a person than any mood benefit.
But the focus on the mood effect, means the sexual effect was missed entirely in the trials regulators scrutinized both because that’s how trials work but also with a little extra gaming from companies.
Some years ago treating a man with OCD, I tried an SSRI – the first line treatment and then more heavy duty drugs when the SSRI didn’t work. All made him worse. One day he came in much better – he had stopped all his drugs but he was cured by going back to smoking. He had also googled nicotine and OCD and found studies showing nicotine and related drugs can help OCD.
When I say the Art of Medicine lies in Bringing Good out of the Use of a Poison, people hiss at me but everyone would likely agree this man was bringing good out of the use of a poison. SSRIs however are prescription-only because we expect them to be more dangerous than over the counter alcohol and nicotine.
The important thing is that this man (perhaps with input from me) is the only person in a position to make a meaningful Benefit Risk call. I can’t see what role FDA could have in this. Benefit-Risk calls are an individual matter. Making the claims FDA now make puts them in a role of getting people to live the life Pfizer want them to live.
Am I making all claims on the basis of Citizen Research more than Expert input? No – among the articles this man found about nicotine and OCD was one whose significance passed him by. One of the authors was Arvid Carlsson, who created SSRIs and won a Nobel Prize for Medicine.
But when you have Skin in the Game, Motivation can be worth just as much as Expertise.
Slide 42: As a result of Ian Hudson’s views, as I wrote 25 years ago, everyone who participates in a company trial today puts all the rest of us in a state of Legal Jeopardy. We should boycott trials, until this changes. See Clinical Trials and Legal Jeopardy.
Slide 43: That article was 25 years ago, this is 25 days ago and argues that everyone entering a trial now is deceived by consent forms that promise coverage for injuries, unaware that there are no injuries on modern treatment, or no injuries that can be admitted. See The Coverage of Medical Injuries in Company Trial Informed Consent Forms.
Slide 44: However, since 2010, the US Supreme Court in the Matrixx case made it clear that Ian Hudson’s views do not apply to investors wanting to make up their mind about the Benefits and Risks of investing. We who are investing our lives in these treatments still do not have such rights.
Slide 45: The beating Tell Tale Heart of this talk came with the publication of this article 33 years ago this month, in which 3 Boston clinicians claimed fluoxetine caused 6 people to become suicidal. Analyzing the cases closely and following traditional clinical approaches for determining causality, this article nailed beyond doubt that fluoxetine could cause some people to become suicidal.
Lots of other groups reported similar findings. I published 2 cases of men, who were challenged, dechallenged and rechallenged with an SSRI. There was no other way to explain what happened to them except that fluoxetine had caused it. This was Evident Based Medicine.
Slide 46: Almost the same week as my article came out, BMJ published an article in which Lilly claimed an analysis of their clinical trials showed no evidence fluoxetine made people suicidal. The cases being reported, therefore, were sad but anecdotal – and the plural of anecdote is not data. Depression was the problem not fluoxetine. Clinical trials are the science of cause and effect. Doctors, the public, media, and politicians were being asked – are you going to believe the science or the anecdotes?
This was a knowledge creation moment that likely had input from all companies and perhaps FDA. This article created Evidence Based Medicine and just as with RCTs 30 years earlier, the people most commonly exhorting doctors to practice EBM today are Pharma companies.
In fact, the original phrase is the plural of anecdote is data – otherwise Google wouldn’t work.
The idea the disease is responsible for suicide attempts and suicides in healthy volunteers is hard to believe but companies can wheel out experts to say just that.
My key point is that the Teicher paper is the science – the Lilly data is an artefact. My challenge to you is which are you going to believe the Science or the Artefact?
The Science of Medicine lies in making hard judgement calls. The made by algorithm approach, combined with inappropriate statistics, creates artefacts not science.
You’ve seen earlier how Lilly cooked the books. When you get the trial data, the Evident Based Medicine and Evidence Based Medicine approaches here can be reconciled – as you might expect with real science.
But even if there were an incompatibility, there isn’t a problem. Resolving discrepancies is how we do science.
This points to a deep problem with Lilly’s argument. They are not in the business of being scientific – resolving discrepant observations. Lilly’s argument is a religious one – a dogmatic one – they forbid us to believe the evidence of our own senses.
This is papal infallibility riding again.
Peter Drucker, the doyen of marketing, gave us a secular update – the goal of marketing is not to increase the sales of Prozac, it’s to own the market. This was the moment Pharma took ownership of the market.
This ownership allows companies to dictate what the risks, the benefits and the trade-offs of drugs are. Allows them to force us to live the lives they want us to live rather than engage with the risky and unprofitable business of producing products that will help us to live the lives we want to live. Following this Artefact is profoundly alienating.
Slide 47: This faces us with a what is science question? The usual histories start with the foundation of The Royal Society in 1660, which established the ground rules for Science. Science would deal with matters that could be Settled by Data. Participants could be Xtian, Hindu, Jew, Muslim, or Atheist, but participants were called on to leave these badges at the door and make a consensus based judgement call about the best way to explain the experimental outcome in front of them.
The histories of science emphasize the word Data. Settled is the more important word. Statistics played no part in this science. The experiments were events and didn’t need the descriptions statistics can provide. Science was emphatically not about replacing judgment calls with a statistical artefact. It only became so 33 years ago.
Slide 48: This account of our history overlooks an earlier event. In 1618, Walter Raleigh was executed – for being too close to those pesky Europeans. Raleigh was convicted on the basis of things said about him by people who did not come into court to be cross-examined.
Legal systems worldwide recognized the injustice of this and introduced Rules of Evidence. Hearsay could not be used as evidence. Jurors – a group of 12 people, Xtians, Hindus, Muslims, Atheists and Jews, can only base a verdict on material put in front of them that can be examined and cross-examined. The process of forcing 12 people with very different biases to come to a Verdict about what is in front of them is the essence of science.
Verdicts and diagnoses are provisional – the view that best fits the current facts. This might appear to contrast with the objectivity of science, but scientific views are similarly provisional. Scientists attempt to overturn verdicts with new data.
Let’s say I gave Aaron fluoxetine 33 years ago and he became suicidal. I could examine and cross-examine him, run labs and scans, raise the dose, stop the drug, add an antidote, check with colleagues has anyone else seen anything like this or can they explain it in any other way. Aaron is the data – all of the data. He is the apparatus in which the experiment is taking place.
If Aaron and I conclude fluoxetine made him suicidal and report this to FDA, the first thing FDA does is to remove his name. No-one can now examine or cross-examine him and come to a scientific view about whether there is a link or not. His injury has been made Hearsay – indeed misinformation.
If you are later injured in the same way and see tens of thousands of reports of suicidality on SSRIs on FDA’s adverse event reporting system, you cannot bring this into court because no-one can be brought into court. It’s Hearsay not Evidence.
Company RCTs are equally hearsay and should not be let into Court as evidence. Accessing the data in this case means accessing people – like Aaron or me – and we cannot do that with the people in company trials, who often don’t exist. Except rarely, the authors on the articles have seen none of these people and cannot speak to what happened either.
In contrast, if Aaron and I report his case in the New England Journal or the American Journal of Psychiatry as a Case Report, with our names on it, we can both be brought into Court.
Slide 49: By 1983 the view was emerging that RCTs offered the scientific and sophisticated way to establish if a drug had adverse effects as this quote by Rossi et al indicates:
Spontaneous reporting is “the least sophisticated and scientifically rigorous . . . method of detecting new adverse drug reactions.”
A mid-career Lasagna, the man who more than anyone introduced RCTs, responded:
This may be true in the dictionary sense of sophisticated meaning ‘adulterated’ . . . but I submit spontaneous reporting is more ‘worldly-wise, knowing, subtle and intellectually appealing’ than grandiose, expensive RCTs.
Slide 50: Here you have an older Louis Lasagna saying:
In contrast to my role in the 1950s which was trying to convince people to do controlled trials, now I find myself telling people that it’s not the only way to truth.
Evidence Based Medicine has become synonymous with RCTs even though such trials invariably fail to tell the physician what he or she wants to know which is, which drug is best for Mr Jones or Ms Smith – not what happens to a non-existent average person.
Slide 51: Here is James Webb again to remind you that confidence intervals were a step on the way to revealing the individuality of stars. In medicine, statistical approaches operate against individuality.
Using Chance to control Bias does not foster clinical science, especially when we allow a mindless algorithm to replace clinical judgement. Clinical medicine, like law and the first 300 years of science, uses Bias to Control Chance, and both medicine and law need to assert the validity of this approach.
Slide 52: Using Bias to control Chance rather than some algorithmic method of controlling Chance is critical when numbers enter the frame. This is our only defense against medical neo-liberalism.
Around 1980 Pharma began treating healthy people. They discovered that numbers for our peak flow rates, bone densities, blood pressure, lipids, or sugar provided opportunities to sell drugs. Up to 1980, we brought our problems to healthcare – seeking help to live the lives we wanted to live. After that health services began to give us problems and the amount of medicines consumed rose dramatically. We began treating numbers rather than people.
Remaining on top of data like this is difficult. Just after weighing scales for people were introduced in the 1860s, we got the first descriptions of anorexia nervosa. In the 1920s, weighing scales in drug stores came with norms for our ideal weight given our height and sex and eating disorders mushroomed. When scales migrated into our homes in the 1960s eating disorders became epidemic – in the countries that had weighing scales. Measurements can make both us and our doctors neurotic.
Slide 53: There is an extra element to the equation. The service industries emerged in the 1950s. Through to 1980, no-one viewed health as a service industry – doctors were professionals who exercised judgement the way a Judge might. But service industries have managers and health got managers. With this, the exercise of clinical discretion, the jewel in the crown of Health Care, became a problem for those who manage services.
The idea of bringing good out of the use of a poison does not compute for managers, insurers, politicians or increasingly the public.
Before 1980, clinicians mobilized the resources of the organization they worked for to handle the risks your condition posed to you. Now instead you can palpably feel the clinicians you meet are managing the risks you pose to the organization we work for.
Slide 54: Managers manage what they can measure. For them figures have a sheen of scientific gold. We are re-running the King Midas story – this gold coating is incompatible with Human Care and Life.
This governance by numbers is the essence of the neoliberalism that began in Chile and Britain – treat the money supply numbers or inflation numbers regardless of what is happening in a country. Medicine is the best place to see this and its deleterious effects in action – aggravated by the fact that bowing down before a golden algorithmic idol inhibits anyone from leading us out of this desert in which we now wander.
Slide 55: When the pilot here reports problems, safety systems pay heed because they know she won’t fly if they don’t because of the consequences for her.
Jane Fraser is the CEO of Citibank. Since the financial crisis, bankers have an Early Warning System. Who knows if it helps? The financial crisis was linked to a moral hazard. Bankers were outsourcing risk, knowing that if things crashed you and I would suffer but they would continue to collect their bonuses. This made it hard for them to do the right or brave thing.
If the doctor on the left reports a problem, no-one pays any heed. She too outsources risk, putting pills that, like mortgages, look too good to be true in our mouths. This is morally hazardous. Like a mortgage, if a drug looks too good to be true it probably is. If we blow up, she continues to be well paid. There is no incentive for her to do the right thing.
Slide 56: This moral hazard is leading to a pharmaceutical crisis that maps onto the financial crisis of 15 years ago. Here is a recent New York Times image of Life Expectancy in the US. You’ll see it began dropping in 1980, when we began treating numbers rather than people and converted health into a service industry. This Fall cannot be attributed to COVID. My view is that it is most likely linked to polypharmacy. The UK has similar falling Life Expectancy data – again pre-COVID.
Slide 57: Drugs like guns are techniques – amoral. The morality of their use lies in us. If we stop thinking about what we are doing when we use them, we are highly likely to be diminished.
Like Guns, Drugs create an arms race. The country with the best Medical Techniques and Guns wins wars and both armament and medical developments have been driven forward by military needs – to keep men able to fight in the case of drugs.
There is a difference between Guns and Drugs. The chemicals in drugs are always risky. The information that transforms those chemicals into medicines has become increasingly dangerous. At the moment, the Drugs Race is not a better Chemical Race – it’s about creating more effective propaganda. The best propaganda is invisible – in this case it masquerades as science. The greatest concentration of fake literature on earth now centers on the reports of RCTs on the Drugs our doctors give us.
With both Guns and Drugs there is a limit to effectiveness. In the case of the Atom Bomb it is so effective that it cannot be used. It is the same with Drugs, if you are on more than 3, the effectiveness of each falls off as you add more meds into the mix.
To get the most effectiveness you need to be on 3 or fewer. As of 2016, over 40% of over 45s in the United States were on 3 or more drugs every day of the year – this figure includes the people who never come to see doctors. Over 40% of over 65s are on 5 or more drugs every day of the week. Knowing what is happening to teenagers, this can only increase.
We know that reducing medication burdens can increase life expectancy, reduce hospitalizations, and improve quality of life.
Slide 58: Reducing a medication burden is not easy – as this image from the movie The Hurt Locker illustrates. Many of these drugs explode on attempting to withdraw them. This is the primary medical task of our age and there will never be any RCTs to help us out. The best evidence will likely lie in clinical experience of tackling similar situations. Great if I have a walkie-talkie to clinical colleagues but my key partner in this is you – you bring cues from missing doses of some of these drugs, and your sense of what they are doing that I can only access through you. And of course you ultimately dictate which risks we take.
In the 1940s and 1950s, RCTs had a role when we didn’t know if things worked. From the 1960s we had so many good drugs that worked – brought on the market without an RCT in sight – a new role beckoned for RCTs – to work out what worked best. RCTs cannot do this and besides it did not suit company interests. Companies instead created Randomized Controlled Assays which among other things allow weaker and weaker drugs on the market.
The pressing medical need now is to get people off the meds they are on and RCTs and what is called EBM have little or no role to play in helping us with this.
Slide 59: If a doctor tries to modestly reduce medication burdens or recognize that in some cases a treatment might have become a problem, current public health systems will not accommodate her. In the US, it is current culture that will mobilize against this. The doctor will be told this would be a good private practice offer that people can choose, but the public health system expectation is that people want and should get more diagnoses and drugs.
This is because getting treatment to save our lives was once a privilege of wealth, and public health systems want everyone to be able to access treatment. They cannot now see that these good intentions are killing people. Now we have to be wealthy to get off medicines to save our lives.
Canada now leads the world in MAiD – Medical Assistance in Dying. In places like Belgium and Holland young women are getting MAiD because they have drug induced treatment resistant depression. While there must be concerns when young women in their 20s get MAiD for treatment resistant depression – an antidepressant induced illness – I’m not quibbling about the morality of MAiD – any good doctor will almost certainly have cases where MAiD is the caring thing to do.
What I am quibbling about is the morality of a system that encourages us to have any service we want, including MAiD, but denies us the option of having fewer services. Denies us a Greener, more sustainable HealthCare. At the moment, not even Green parties have got a handle on this.
Slide 60: This lady comes from an Arthurian Legend. Arthur has been out-fought by a Black Knight who spares his life if he can answer a riddle – What do Women Most Desire? He has a year to find the answer. He and his court hunt desperately for it. The day he is due to die, Arthur and his troop meet this woman who tells him that she has the answer to the riddle but one of his knights must become her husband. Gawain jumps down and offers himself up. Arthur answers the riddle, and a furious Black Knight lets him go.
Slide 61: Gawain gets married. Everyone at the Court is unhappy for him.
Slide 62: In the bedchamber Gawain can’t bear to look at her. She takes control and asks him – do you want me to look like this by night with you and the way I was by day in court, or like this by day in court and the way I was by night with you? He has no idea and says – whatever you want. This is the right answer.
The answer to both riddles is she, like us, wants to control her own life. There may be a disease that needs treating – but she doesn’t want us to tell her how to live life, or want her negative emotions eliminated with a pill. She may be doing better at living life than you or I.
The evidence based medicine we now practice creates a False We – a non-existent average person – a fairy tale.
Rather than paying heed to the non-existent average person who comes out of clinical trials, when we relearn that we can learn much more from the person right in front of us, she and others who come to see us will seem more interesting and as they sense that we will be more attractive to them – easier to work with.
A relationship based medicine is the only validly scientific form of clinical practice. If you can’t build up a relationship with people because you and they see a different doctor every time, a relationship in which you are looking closely at and listening attentively to them – perhaps even detecting if there is a change in their smell, you are not doing science. The person in front of you is the apparatus in which the experiment is taking place. The computer screen is not.
Both science and morality depend on collaboration. Collaboration creates a virtuous circle – an Us – that leaves us all better placed to live the life we want to live. It creates Social Capital.
Redesignate Company Trials as Assays
Government of the People by the People has been replaced by governance.
If it is not to perish entirely from the earth…
We need to do…
Footnotes
This may be the most important lecture I have ever given – it’s the longest at least. It has been heavily shaped by Dee Mangin, Peter and Julie Wood and everyone linked to RxISK – Bill James, Johanna Ryan, Peter Selley, Sarah Tilley, Mary Hennessey, Annemarie Kelly and many others who have worked behind the scenes but don’t want to be named and others whose comments on posts are often more illuminating than the posts themselves.
It has been shaped over a 25-year period by Andy Vickery, Cindy Hall, Skip Murgatroyd and Michael Baum, who in the legal cases they involved me in brought me face to face with the many issues covered here.
It has been shaped by Jon Jureidini, Melissa Raven, Joanna Le Noury, and Elia Abi-Jaoude, who along with Mickey Nardo and Catalin Tufanaru, both now dead, were the team behind the Restoration of Study 329 – see the final article at Restoring Study 329.
It would not be possible to leave Peter Goetzsche out of the frame and an intense struggle to restore the Prozac trials in adolescents – along with the bravery of Ralph Edwards in publishing this paper. See Flat as Kansas.
Finally to complete a set of Peters, Peter Doshi has been one of the most remarkable people working on all these issues extraordinarily effectively.
There have been any number of fabulous media people like Shelley Jofre and Andy Bell who brought key issues to light, along with Ariane Denoyel and others who have grappled with the issues outlined here.
More recently, Dan Johnson, along with Yoko Motohama and Vincent Schmitt, who have lost teenage sons to the drugs mentioned here, triggered the series of lectures noted above, of which this is the third. Jon Thompson and his colleagues in the math department in the University of New Brunswick, along with Peter Selley and colleagues in the Devon and Exeter Medical Society, allowed me to dress rehearse and improve the talk.
I have stolen ideas from lots of people such as Steve Lanes – too many to acknowledge. As Steve’s example shows, some of the best help has come from people working in industry.
The Q and A after this talk in Boston reveals a tendency we all have to say things would be fine if industry just weren’t involved in trials. This is not my view. Industry don’t help but they are primarily exploiting medical failures to get to grips with the faultlines in RCTs – and a medical willingness to accept a simplistic solution to the problem of objectivity rather than engage with others in establishing what is objective or at least the best provisional version of objectivity.