There has been considerable debate amongst health-promotion
professionals and researchers over what constitutes evidence of effectiveness
in health promotion. One of the chief disagreements has arisen over the
question of the research methodology that should be used to measure
effectiveness. Oakley and colleagues1
have argued that the gold standard measure for effectiveness is a randomised controlled trial (RCT).
However, others have said that a form of experiment designed for research into
medical interventions is an inappropriate and unreliable tool for measuring the
effectiveness of behavioural interventions.2
Part of the problem lies in controlling the sheer number of
variables that can influence the outcome of a behavioural intervention. RCTs of
medical treatments are designed to eliminate these so-called confounders, as
far as possible. They are designed to eliminate researcher bias, to minimise
social-desirability bias (the tendency of trial subjects to report favourable
outcomes to ‘please’ the researcher) and to ensure that any placebo effect (the
expectation that the treatment will be of benefit, which often does produce
real beneficial effects) is evenly distributed between the control and intervention
arms, so that it can be taken into account. Double-blinded, placebo-controlled
trials were devised to address these problems by ensuring that neither
researcher nor subject knows which participants are receiving the experimental
treatment.
In trials that aim to change people’s minds or lifestyles,
it may be very difficult to eliminate subject characteristics that influence
the outcome, and even more difficult to eliminate researcher and social-desirability
bias. Although behavioural trials can be ‘placebo controlled’ by pitting the
trial intervention against standard of care, a less comprehensive intervention
or a waiting list, they cannot be blinded. And although trials of
behavioural interventions make efforts to standardise and script the
intervention rigorously, so that every practitioner delivers it identically, in
practice practitioners’ skills at delivery do vary.
Studies of counselling and psychotherapy have found, for
instance, that only a tiny part of the variance between different therapeutic
outcomes is due to the particular method or theoretical orientation that the
counsellor uses.3
Individual personal characteristics of the counsellor and characteristics of
the counselling relationship, such as a shared goal, are much more important.
In addition, subject populations need to be researched in
advance so that the intervention delivered actually meets the population’s
needs. Teaching condom-use negotiation, for instance, may be a waste of
resources if delivered in a population where 95% of transmissions occur via shared
needles, or if taught to married women who have no power to insist on condom
use.
Researchers and practitioners may be unwilling to draw
inferences from study results, especially if those results are
counterintuitive, and may persist in advocating familiar solutions that lack
evidential back-up. This is partly because of scepticism about RCTs of
behavioural interventions, partly because research results are not well enough
disseminated and partly because it is intrinsically more difficult to extract
evidence of effectiveness from some prevention methods than others.
Over-reliance on evidence-based prevention, on the other
hand, may have the opposite effect. Commissioners and providers may become
rigidly prescriptive and only fund a narrow range of best-evidence
interventions, rather than extrapolating from what evidence exists in order to
construct a plausible, but varied, best-practice prevention strategy. Prevention
programmes should have a degree of operational research and evaluation built
into them, not least because the results of behavioural interventions are
always going to vary. This variation owes much more to local conditions than to
the interventions themselves.
Funders may, therefore, choose to commission only the
approaches which are validated by RCTs or the effectiveness reviews cited
below. Lucas and Scott
have argued that the approaches cited as most effective by such reviews should
only be seen as the minimum contents of a package of prevention measures.4
Many hold the view that evaluating public health
interventions requires a broad definition of evidence of effectiveness and a
wide searching of the evidence base. This is, for example, the position of NICE
(the National Institute for Health and Clinical Excellence) in the UK.
NICE is primarily known for its evaluation of biomedical
treatments: it makes recommendations on whether the NHS should routinely provide
a treatment after it has been licensed. In order to do this, NICE evaluates whether
the treatment offers sufficient added value over the current
standard-of-care treatment, in terms of both cost and lives saved or improved.
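The arithmetic behind such appraisals can be illustrated with the incremental cost-effectiveness ratio (ICER), conventionally expressed as the cost per quality-adjusted life-year (QALY) gained. The sketch below is illustrative only: the costs and QALY figures are hypothetical, and it is not drawn from NICE’s actual appraisal models.

```python
# A minimal, hypothetical sketch of the incremental cost-effectiveness
# ratio (ICER): the extra cost of a new treatment divided by the extra
# quality-adjusted life-years (QALYs) it delivers versus standard of care.
# All figures below are invented for illustration.

def icer(cost_new: float, cost_old: float,
         qalys_new: float, qalys_old: float) -> float:
    """Incremental cost per QALY gained by the new treatment."""
    return (cost_new - cost_old) / (qalys_new - qalys_old)

# Hypothetical appraisal: the new treatment costs £12,000 more per patient
# and yields 0.5 extra QALYs, i.e. £24,000 per QALY gained, within the
# £20,000 to £30,000 per QALY range NICE has conventionally used as a
# cost-effectiveness threshold.
print(icer(cost_new=30_000, cost_old=18_000, qalys_new=2.0, qalys_old=1.5))
```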
However, NICE is also charged with developing broader
guidelines for clinical practice and has, since 2005, had an explicit
public health remit to develop guidance on interventions designed to improve
public health – which includes preventing HIV. It was only the second agency in
the world, after the US Centers for Disease Control and Prevention, to be given
such a task.
In 2010, Mike Kelly and colleagues from the Centre for
Public Health Excellence, part of NICE, wrote a paper summarising why the
evaluation of public health interventions needs not only to draw on a
wide range of sources of evidence, but also to redefine what ‘evidence’
consists of in the case of a public health intervention.5
Traditionally, the strength of evidence for the
effectiveness of interventions has been rated hierarchically, with randomised
controlled trials at the top, followed by cohort studies, then cross-sectional
studies, and case reports at the bottom.
Kelly and colleagues maintain that the ‘hierarchy of
evidence’ approach does not work so well for public health. This is partly
because there may be a lengthy chain of causation between the intervention and
its intended effect, and because so many other variables may intervene that it
may be difficult to prove that an intervention caused a specific result, or
even to design a trial capable of demonstrating it.
Also, whereas clinical interventions aimed at improving a
single outcome are usually focused on individuals, public health interventions
operate on two levels, influencing the health of both individuals and
communities. Public health interventions may start as an intervention in
society (e.g. the legalisation of homosexuality) and produce better health
outcomes in individuals (e.g. through increased HIV testing), while others
intended as individual interventions (e.g. safer-sex counselling) may have
societal effects (e.g. greater acceptance of sexual diversity). Diseases and
health inequalities happen both to individuals and within societies, and it is
important for public health practitioners to be clear about whether a given
intervention is intended primarily as an agent of individual or social change,
as this will influence what kind of outcome evidence (e.g. change in lifespan
versus change in behaviour) is relevant.
Whereas clinical sciences operate deductively, using
empirical evidence to disprove or refine hypotheses and only calling a body of
hypotheses a ‘theory’ if there is the highest level of evidence to support a
body of observations as causally connected, social sciences more often operate
inductively, developing theories (of behaviour, economics, social development,
etc.) from existing observations and then testing them against further
observations.
Kelly and colleagues further point out that: a) a great deal of the
literature on social theory is contained in textbooks and ‘grey literature’
that conventional internet literature searches miss; and b)
the theoretical rigour and internal logical structure of a piece of HIV
programming are a valid part of the evidence base, especially for types of
intervention for which it is difficult to devise conventional tests of
efficacy.
This means that, although refining our knowledge of what
works in HIV-prevention programmes is essential to avoid wasteful and outdated
approaches, lack of evidence of effectiveness should not be taken as evidence
of lack of effectiveness, especially since it is intrinsically more difficult
to gather evidence on the effectiveness of some interventions (such as
mass-media campaigns) than it is on others (such as counselling). While resource allocation must be judged
according to the capacity of different elements of a prevention programme to
exert the best possible effect on new infections, it may prove difficult to
isolate the relative contribution of different components of a prevention
strategy.
Meta-analyses are reviews that pool data from all the
previous studies meeting predefined quality requirements. They can help us to
judge which interventions will be most effective. However, these reviews are
only as powerful as the original studies whose findings they analyse.
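To make the pooling step concrete, here is a minimal sketch of fixed-effect, inverse-variance pooling, one standard way in which meta-analyses combine study results. The effect sizes and standard errors are hypothetical, and real meta-analyses involve considerably more (heterogeneity testing, random-effects models, quality weighting).

```python
# A minimal sketch of fixed-effect, inverse-variance pooling, the basic
# arithmetic behind many meta-analyses. Each study's effect estimate is
# weighted by the inverse of its variance, so imprecise studies contribute
# little, and the pooled estimate can be no more precise than its inputs.
# The study figures below are hypothetical.
import math

def pool_fixed_effect(effects, std_errors):
    """Pool per-study effect sizes (e.g. log odds ratios) and their
    standard errors; returns (pooled effect, pooled standard error)."""
    weights = [1 / se ** 2 for se in std_errors]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    pooled_se = 1 / math.sqrt(sum(weights))
    return pooled, pooled_se

# Three hypothetical condom-promotion studies (log odds ratios of condom use):
est, se = pool_fixed_effect([0.40, 0.15, 0.55], [0.20, 0.10, 0.30])
print(f"pooled log OR = {est:.2f}, 95% CI +/- {1.96 * se:.2f}")
```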
While many studies use outcome measures like condom use and
unprotected sex acts, fewer use change in partner numbers or the incidence of
sexually transmitted infections (STIs). Still fewer use the ultimate endpoint
of HIV-prevention interventions – change in HIV incidence. This is largely
because, even in a high-risk population, incidence, at a few per cent a year,
is usually too small for anything other than a huge trial to be sufficiently
powered to produce a statistically significant result. See ‘Measuring Effectiveness’
for more on outcome measures.
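To illustrate the power problem, the sketch below uses the standard normal-approximation formula for comparing two proportions to estimate how many participants a trial would need in order to detect a change in annual HIV incidence. The incidence figures are hypothetical.

```python
# A minimal sketch of the standard sample-size formula for comparing two
# proportions (normal approximation, two-sided test). The incidence figures
# below are hypothetical, chosen only to illustrate the scale of trial needed.
import math
from statistics import NormalDist

def n_per_arm(p1: float, p2: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Participants per arm needed to detect a fall in annual HIV incidence
    from p1 to p2 at the given significance level and power."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # e.g. 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # e.g. 0.84 for 80% power
    p_bar = (p1 + p2) / 2
    num = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
           + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(num / (p1 - p2) ** 2)

# Detecting a 30% reduction in incidence, from 3.0% to 2.1% a year:
print(n_per_arm(0.03, 0.021))  # roughly 4,800 participants in EACH arm
```

That is nearly 10,000 participants in total for a single year of follow-up, which is why most trials fall back on intermediate outcome measures.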
Currently available effectiveness reviews have another
weakness. They do not tell commissioners or providers what sort of agency is
best suited to carrying out particular types of intervention. However, a number
of meta-analyses have found that prevention programmes conducted at clinics
often have a statistical edge.
Any intervention package should be underpinned by a basic understanding of
theories of behaviour change; meta-analyses have found that a defined
theoretical base is associated with efficacy. For example, a community
mobilisation approach proposed by a local health-promotion agency needs to be
considered as an example of a social-diffusion intervention. What does social-diffusion
theory in general tell us about the likely nature of the agents best placed to
bring about change in a community or group? See ‘Theoretical Models of Behaviour Change’, below, for more on this.