Progress in evidence-based medicine: a small reply to Djulbegovic and Guyatt

Thanks to Richard Lehman’s blog this week, I’ve been reading a fascinating review of the advances made by EBM written by two of the co-founders of the movement. Both Benjamin Djulbegovic and Gordon Guyatt have made serious contributions to the growth of EBM, as they charmingly acknowledge in the declaration of interests section of their review paper:

Both authors have devoted significant aspects of their careers to the development of evidence-based practice.

(all the blockquotes in this post are taken from this new paper, unless otherwise stated)


The article as a whole has lots of juicy stuff for those of us interested in the history of EBM, as well as its prospects. Here, I thought that I’d try to pull out a few of the points most interesting to a philosophical audience. In no particular order, then:


1. Is EBM a revolution or a cumulation?

In the new paper, Djulbegovic and Guyatt write that they think that medicine in general is cumulative – that is, new research findings build on older ones, in the way that e.g. Henri Poincaré suggested when he wrote that “Le savant doit ordonner ; on fait la science avec des faits comme une maison avec des pierres” (The scientist must put things in order; science is built with facts, as a house is with stones):

The view that “science is cumulative, and scientists should cumulate scientifically” reflects the second EBM principle: health claims should be based on systematic reviews that summarise the best available evidence.

But, for those of us who know Guyatt’s earlier work well, this is something of a surprise. As part of the group that wrote the 1992 paper that was largely responsible for bringing EBM to the mass market, Guyatt has previously endorsed a strongly revolutionary view of the role of EBM in medicine, which seemed to work along broadly Kuhnian lines. For instance, the 1992 paper began with the slogan “A new paradigm for medical practice is emerging.” I think that further clarification is required here, because – as the raft of examples in which standard practices have been overturned by new evidence shows – a strongly cumulative model is not a good fit for many cases in medicine.


2. What’s the philosophy of EBM?

In their piece, Djulbegovic and Guyatt discuss – fairly briefly – the “philosophical underpinnings of EBM”. They write that:

On the surface, EBM proposes a specific association between medical evidence, theory, and practice. EBM does not, however, offer a new scientific theory of medical knowledge, but instead has progressed as a coherent heuristic structure for optimising the practice of medicine, which explicitly and conscientiously attends to the nature of medical evidence.

Heuristic here is the really important word, I think. Their fallible rules are simple – just two, and easy enough to state:

The basis for the first EBM epistemological principle is that not all evidence is created equal, and that the practice of medicine should be based on the best available evidence. The second principle endorses the philosophical view that the pursuit of truth is best accomplished by evaluating the totality of the evidence, and not selecting evidence that favours a particular claim.

They are careful also to note that “controlled clinical observations provide more trustworthy evidence” than various other methods of gathering evidence. Nothing so surprising here if you’ve followed the literature a bit before – although see the next section. More worrying to me is the slogan “the totality of the evidence”, as I discuss below.


3. The hierarchy is canon!

I was surprised to see the old-fashioned ‘evidence pyramid’ (really a triangle) endorsed here, as shown in the image that I’ve re-used from their article. I’ve had several arguments at conferences and elsewhere where it has been suggested that the pyramid was never part of ‘real’ EBM practice, but was purely a crude simplification of evidence appraisal designed to engage students. Well, hopefully I’ll have a snappier response in future.

Less snarkily, Djulbegovic and Guyatt seem to be nervous about (non-GRADE) modifications of the hierarchy:

by 2002, 106 systems to rate the quality of medical research evidence were available. When investigators applied some of these quality instruments to a set of studies, the result was extensive disagreement, with ratings ranging, for the same studies, from excellent to poor. One evaluation of these systems concluded that none was particularly useful for research or the practice of medicine, that their continued use will not reduce errors in making recommendations or improve communication between guideline developers and guideline users, and thus they will not help people make well informed decisions.

(sidenote – there are some interesting recent modifications of the trad pyramid that come from within GRADE)

And perhaps they are right about this. The most recent update of GRADE engages with a staggering number of factors that affect how a piece of evidence is appraised, including:

all elements related to the credibility of bodies of evidence: study design, risk of bias (study strengths and limitations), precision, consistency (variability in results between studies), directness (applicability), publication bias, magnitude of effect, and dose-response gradients.

One point to make here – I don’t think I’ve ever seen an EBM pyramid with more than one axis for ranking evidence. There’s always some single best-to-worst ranking that can be made between kinds of evidence. I should declare my competing interest at this point: the EBM+ group are interested in thinking about ranking evidence along multiple axes, to give a quite different picture of the strengths and weaknesses of different kinds of research findings for making decisions in medicine.
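To make the contrast concrete, here is a toy sketch of my own – the study types, axes, and scores are all invented for illustration and are not drawn from GRADE or the EBM+ literature. The point is structural: a single-axis pyramid induces a total order over kinds of evidence, while scoring the same kinds along several axes only yields a partial order, in which some pairs are simply incomparable:

```python
# Toy illustration: single-axis vs multi-axis evidence ranking.
# All names, axes, and scores here are invented for illustration only.

# Single-axis pyramid: one best-to-worst total order.
pyramid = ["systematic review", "RCT", "cohort study",
           "case series", "expert opinion"]
rank = {kind: i for i, kind in enumerate(pyramid)}

def better_single(a, b):
    """On a single axis, any two kinds of evidence are comparable."""
    return rank[a] < rank[b]

# Multi-axis scoring: each kind gets a score per axis (higher = stronger).
scores = {
    "RCT":          {"internal_validity": 5, "external_validity": 2},
    "cohort study": {"internal_validity": 3, "external_validity": 4},
}

def dominates(a, b):
    """a dominates b only if it is at least as strong on every axis
    and strictly stronger on at least one: a partial order."""
    sa, sb = scores[a], scores[b]
    return (all(sa[ax] >= sb[ax] for ax in sa)
            and any(sa[ax] > sb[ax] for ax in sa))

# On the single axis, RCTs simply beat cohort studies...
print(better_single("RCT", "cohort study"))   # True
# ...but on two axes neither dominates the other: they are incomparable.
print(dominates("RCT", "cohort study"))       # False
print(dominates("cohort study", "RCT"))       # False
```

The incomparable pairs are exactly where the single pyramid forces a choice that the multi-axis picture says is not forced – which is, as I read it, the EBM+ complaint in miniature.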


4. Total evidence or best evidence?

Note that the EBM+ group have previously invoked the principle of total evidence (2013: 747): “What we really need is to use the totality of evidence available to us”.


I think that Djulbegovic and Guyatt clearly mean to endorse the totality of relevant evidence – just as we meant in our paper. Yet “all the relevant evidence” is not at all the same thing as Carnap’s principle of total evidence. Broadly, the principle of total evidence suggests that we should “use all the available evidence when estimating a probability”. It is true that my shoes today are bright orange. An observation of the shoes counts as evidence, which we might then use to make inferences – that, say, I am not qualified to dress myself without adult supervision. But it is very hard indeed to see how this evidence might usefully contribute to making medical decisions. And that’s the problem with endorsing total evidence (in the strong, Carnap-ish sense) in EBM: when do you stop?


Happily, Djulbegovic and Guyatt soon clarify what they mean:

…the second EBM principle: health claims should be based on systematic reviews that summarise the best available evidence.

Okay, so we’re talking about best available evidence, which sounds much more suitable for making good decisions in medicine. Yet (to make the obvious point) total evidence and best available evidence are different. How do you know that you are really dealing with the best available evidence – and best by what standard? Djulbegovic and Guyatt suggest that the answer depends on the context in which you ask the question. So, broadly, for clinicians guidelines will usually provide the best evidence, whereas if you’re working more directly with the primary research literature then suitably clever use of informatics will help you strip out the noise in evidence searches. Those are both sensible suggestions that take seriously the desperate need for constraints when picking things to read. But neither is anything close to the totality of evidence – and I’d suggest that they stop using this phrase.


5. Where does best evidence come from?

The question of how to find your best evidence if you do need to read the research literature directly was the really surprising part for me, largely because Djulbegovic and Guyatt provide (with citations) some startling estimates of the number of papers that different workers might be expected to read in order to do their work properly:

Using EBM information processing and filtering, they reported that, on average, clinicians need to be aware of only about 20 new articles per year (99·96% noise reduction) to keep up to date, and, to stay up to date in their area of expertise, authors of evidence-based topics need be aware of only five to 50 new articles per year. Similarly, practising oncologists need be cognisant of only 1–2% of published evidence that is valid and relevant to their practice.

That seems very low to me – but I haven’t yet read the papers that Djulbegovic and Guyatt cite, so I’ll withhold comment. However, I think that there are lots of reasons to worry about informatics techniques being used in this way without very great care being taken – and Cathy O’Neil is better than me at telling you why you should think similarly.
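For what it’s worth, the arithmetic behind the quoted figure is easy to check. A 99·96% noise reduction means keeping 0·04% of some base of articles, so if 20 articles a year is the signal, the implied base is 20 / 0.0004 = 50,000 articles a year. That implied base is my back-of-envelope inference from the quoted percentage, not a figure taken from Djulbegovic and Guyatt’s citations:

```python
# Back-of-envelope check of the quoted "20 articles = 99.96% noise
# reduction" figure. The implied base is an inference of mine, not a
# number given by Djulbegovic and Guyatt.
signal = 20                    # articles a clinician reads per year
reduction = 0.9996             # quoted noise reduction
kept_fraction = 1 - reduction  # fraction of the literature kept
implied_base = signal / kept_fraction
print(f"implied base: {implied_base:,.0f} articles/year")  # 50,000
```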


6. Critical thinking or rule following?

EBM has become essential for the training of young clinicians by stressing critical thinking and the importance of statistical reasoning and continuous evaluation of medical practice

I worry deeply here about whether EBM – particularly since it has emphasised the use of guidelines as tools for clinical practice – has really stressed the critical part. Conscientious, certainly – make sure you know the relevant evidence base before you perform an intervention on a patient. Explicit – yes, absolutely. But what are the critical skills that Djulbegovic and Guyatt think EBM has brought to the training of medical students?


7. What’s the truth here?

It wouldn’t be a philosophical blogpost if I didn’t do at least a bit of handwaving about truth. Djulbegovic and Guyatt write:

Evidence is, however, necessary but not sufficient for effective decision making, which has to address the consequences of importance to the decision maker within the given environment and context. Thus, the third epistemological principle of EBM is that clinical decision making requires consideration of patients’ values and preferences.

A completely open question: what if there isn’t a single truth of the matter for a particular intervention? What if – for example – individual patient preferences completely over-ride all other considerations? Although Djulbegovic and Guyatt take seriously the role of patient preferences when it comes to the application of evidence in EBM, I do worry that the model here is one where individual preferences only ever make a post-hoc difference to evidence appraisal. My evidence (very brief, at the end of this 1800-odd word post) is several claims of the following form:

In response, EBM, from its inception, developed schemas for the assessment of the quality of evidence, reflecting the first EBM epistemological principle: the higher the quality of evidence, the closer to the truth are estimates of diagnostic test properties, prognosis, and the effects of health interventions.

It seems to me that the truth Djulbegovic and Guyatt discuss here is one about the efficacy of treatments, rather than one grounded in effectiveness for the care of individuals. And it’s not obvious to me that one should over-ride the other without explicit justification.



Brendan Clarke

Lecturer in the History and Philosophy of Medicine

Science and Technology Studies (STS) Department, University College London (UCL).
