As noted above, the efficacy of Truvada seen in the iPrEx trial was 44%, according to an ‘intent-to-treat’ (ITT) analysis. In an ITT
analysis efficacy is measured as the difference in primary endpoints (in this
case, HIV infections) between the two arms of the trial, as originally
randomised. Individuals who drop out of the trial, who don’t follow the
intervention as planned, or who don’t take all the prescribed doses of a drug
are still included in the analysis, in the group to which they were originally
assigned.
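As a rough illustration of the ITT calculation, the sketch below expresses efficacy as the relative reduction in infection risk between the two arms as randomised. The relative-risk convention, the `Arm` type and the counts are assumptions chosen for illustration; they are not iPrEx data, and a published trial analysis may use time-to-event methods rather than this simple proportion.

```python
from dataclasses import dataclass


@dataclass
class Arm:
    randomised: int  # everyone assigned to the arm, including dropouts and non-adherers
    infections: int  # primary endpoints (HIV infections) observed in the arm


def itt_efficacy(active: Arm, placebo: Arm) -> float:
    """Relative risk reduction, counting every participant in the arm as randomised."""
    risk_active = active.infections / active.randomised
    risk_placebo = placebo.infections / placebo.randomised
    return 1.0 - risk_active / risk_placebo


# Hypothetical counts, chosen only so the arithmetic is easy to follow.
active_arm = Arm(randomised=1000, infections=28)
placebo_arm = Arm(randomised=1000, infections=50)

print(f"ITT efficacy: {itt_efficacy(active_arm, placebo_arm):.0%}")
```

Note that the denominators are the numbers randomised, not the numbers who completed the trial or took their pills; that is what makes this an ITT figure.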
In drug
trials, an ITT analysis is
recognised as the least biased way of reporting results as it takes account of
factors that may influence effectiveness in the real world. For example, a
person may stop taking a drug because they believe that it is not working or
because its side effects are so unpleasant. If only the results of those people
who chose to continue the treatment were included, the results would be biased.
However, in some trials behavioural factors such as adherence may be so dominant that an
ITT analysis can hide evidence of useful efficacy in certain populations.
Put another way, the product might have been found to be efficacious if behavioural
factors could have been better controlled. This is a problem of statistical power.
The
incidence problem is an example of this: if participating in the trial produces
a large increase in condom use and/or reduction in partners then this may
compromise the power of an ITT
analysis to produce a clear result. However, if the results are stratified by
condom use then it may be possible to tease out something nearer the ‘real’
efficacy.
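The effect of lower-than-expected incidence on statistical power can be sketched numerically. The example below assumes a simple risk-ratio analysis with a log-scale Wald confidence interval, which is a textbook approximation and not necessarily the method used in any of the trials discussed. The function name and all counts are hypothetical.

```python
import math


def efficacy_with_ci(inf_active, n_active, inf_placebo, n_placebo, z=1.96):
    """Return (efficacy, lower, upper) using a risk ratio and a log-scale Wald interval."""
    rr = (inf_active / n_active) / (inf_placebo / n_placebo)
    # Standard error of log(RR) for two independent binomial arms.
    se = math.sqrt(1 / inf_active - 1 / n_active + 1 / inf_placebo - 1 / n_placebo)
    rr_low = math.exp(math.log(rr) - z * se)
    rr_high = math.exp(math.log(rr) + z * se)
    # Efficacy is 1 - RR, so the upper RR bound gives the lower efficacy bound.
    return 1 - rr, 1 - rr_high, 1 - rr_low


# Hypothetical scenario 1: incidence as anticipated when the trial was designed.
as_planned = efficacy_with_ci(28, 1000, 50, 1000)
# Hypothetical scenario 2: same underlying efficacy, but trial participation has
# halved the number of infections in both arms (for example through more condom use).
reduced_incidence = efficacy_with_ci(14, 1000, 25, 1000)

for label, (eff, low, high) in [("as planned", as_planned),
                                ("reduced incidence", reduced_incidence)]:
    print(f"{label}: efficacy {eff:.0%}, 95% CI {low:.0%} to {high:.0%}")
```

With these made-up numbers the point estimate is identical in both scenarios, but the interval in the reduced-incidence scenario stretches below zero, so the same product would no longer show a statistically clear benefit.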
Another
case in which post-randomisation behavioural differences will make a large
difference is where two risk behaviours are closely associated with each other,
for instance if there is a strong link between poor adherence and increased
sexual-risk behaviour. In such a case, even if adherence is quite good overall,
those who are poorly adherent may be disproportionately likely to be infected
not only because of poor adherence, but because they take more risks. In such a
situation, a case could be made for stratifying the results both by adherence
and by risk behaviour.1
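A small worked example may make the case for double stratification clearer. The counts below are invented for a single hypothetical active arm in which poor adherers are also more likely to report high-risk behaviour; the stratum labels and the helper function are likewise assumptions, not taken from any trial.

```python
# (adherence, risk behaviour) -> (participants, infections); hypothetical counts.
strata = {
    ("good", "low"): (600, 6),
    ("good", "high"): (200, 6),
    ("poor", "low"): (50, 1),
    ("poor", "high"): (150, 9),
}


def infection_rate(adherence, risk=None):
    """Infection rate for one adherence group, optionally within one risk stratum."""
    cells = [counts for (a, r), counts in strata.items()
             if a == adherence and (risk is None or r == risk)]
    participants = sum(n for n, _ in cells)
    infections = sum(events for _, events in cells)
    return infections / participants


# Stratifying by adherence alone mixes the effect of adherence with that of risk.
crude_ratio = infection_rate("poor") / infection_rate("good")

# Stratifying by risk as well compares like with like.
low_risk_ratio = infection_rate("poor", "low") / infection_rate("good", "low")
high_risk_ratio = infection_rate("poor", "high") / infection_rate("good", "high")

print(f"poor vs good adherence, all participants: {crude_ratio:.2f}x")
print(f"poor vs good adherence, low risk only:    {low_risk_ratio:.2f}x")
print(f"poor vs good adherence, high risk only:   {high_risk_ratio:.2f}x")
```

In this invented data set, poor adherence is associated with roughly a doubling of infections within each risk stratum, yet the unstratified comparison suggests more than a tripling, because poor adherers are concentrated in the high-risk group.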
The above considerations are purely about factors that may block the ability of a
trial to produce a statistically meaningful result. The difference between an
‘on treatment’ (OT) analysis and an ITT analysis may also be a guide to the difference between
the ideal efficacy achievable with the product and its likely effectiveness in
real-world settings, but the two differences are not the same thing.
Assuming
that a reasonably accurate measure of adherence can be devised, an ‘on
treatment’ analysis that takes adherence into account can tell us about the
potential efficacy of the
intervention. ‘Efficacy’ refers to how well the intervention works in a
scientific trial, or when people use it as advised.
On
the other hand, an intent-to-treat analysis can tell us more about the effectiveness of the intervention.
‘Effectiveness’ refers to how well it actually works in a given population,
given actual levels of use.
But ITT
trial results may still be crucially different from real-world experience. In
some cases, an ITT analysis may report low efficacy (or a non-significant
result) due to behaviours that are more
likely in the ‘real world’ such as poor adherence, but low efficacy may also be
due to behaviours that are less
likely to occur in the real world – such as the high levels of condom use seen
in trials.
Both efficacy and effectiveness are
valuable measurements, especially if the behavioural factors that impair
effectiveness can be changed for the better.
With
regard to the adherence problem, if self-report is unreliable, how do we measure adherence?
It has
become apparent, during the HIV biomedical-prevention trials, that self-reports
of adherence (and possibly of risk behaviour too) are extremely unreliable. For
instance, in the trial of the microbicide Carraguard,2
97% of trial participants reported using the product as instructed, but
a more objective measure of adherence – a dye in the applicators
that responded to vaginal mucus – showed that applicators were used only 43% of
the time. Similarly, in the iPrEx trial, 94% of participants reported taking
their pills as indicated, but in a subset of trial participants whose drug
levels were measured, only 51% of non-infected participants had measurable drug
levels in their blood and tissues.
Trial
participants tend to give different answers to questions about their behaviour
according to how the question is phrased. For instance, in the MIRA trial of a
diaphragm as a possible HIV-prevention method,3
when asked about their behaviour in the previous week, 72% of participants said
they had maintained 100% condom use. However, when asked to allocate themselves
to one of three categories of condom use (always, frequently/sometimes, or
rarely/never using condoms), only 42% reported ‘always’ using condoms.
Other,
apparently more objective, measurements of adherence may also be unreliable.
For instance, pill counts, whether made directly or inferred from the number of
pharmacy refills, assume that every pill dispensed and not returned was taken, and
that participants are not throwing away unused ones. It appears that a
combination of pill counts and self-reports drastically overestimated true
adherence in the iPrEx trial.
Similarly,
electronic devices, such as the MEMS cap, that register whenever a medicine
bottle is opened, assume that when a bottle is opened, a pill is taken. MEMS also
assumes that people only remove one dose at a time. MEMS will overestimate
adherence if people are not taking pills they remove from the bottle and will
underestimate it if they are removing multiple doses to put into pill boxes.
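The two failure modes of cap-based monitoring can be shown with a toy calculation. The sketch assumes a once-daily regimen over 28 days and scores adherence as the share of days with at least one event; the scoring rule, function name and both participant patterns are hypothetical simplifications, not a description of how MEMS data are actually analysed.

```python
DAYS = 28


def share_of_days(events_per_day):
    """Fraction of days on which at least one event (opening or dose) occurred."""
    return sum(1 for n in events_per_day if n > 0) / len(events_per_day)


# Hypothetical participant A: opens the bottle every day but swallows a pill
# on only 20 of the 28 days.
a_openings = [1] * DAYS
a_doses_taken = [1] * 20 + [0] * 8

# Hypothetical participant B: opens the bottle once a week to fill a pill box,
# then takes one dose every day.
b_openings = [1 if day % 7 == 0 else 0 for day in range(DAYS)]
b_doses_taken = [1] * DAYS

for name, openings, doses in [("A", a_openings, a_doses_taken),
                              ("B", b_openings, b_doses_taken)]:
    print(f"participant {name}: cap-based estimate {share_of_days(openings):.0%}, "
          f"doses actually taken {share_of_days(doses):.0%}")
```

Participant A looks fully adherent to the cap while missing more than a quarter of doses, whereas participant B looks almost entirely non-adherent while taking every dose.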
Several
direct biological methods of measuring adherence and risk behaviour have
therefore been tried in different prevention trials.
For
sexual-risk behaviour, markers of exposure to semen have been used in several
microbicide sub-studies: these collect vaginal-fluid samples and test them for
semen markers such as prostate-specific antigen (PSA), semenogelin (Sg) and
Y-chromosome DNA (Yc DNA). However, these markers
only stay in the genital tract for 48 to 72 hours and their absence does not
mean a condom was used.
In the Carraguard microbicide trial, the device
used to apply the microbicide in the vagina contained a dye that was sensitive
to vaginal mucus. Participants were asked to return all applicators and the dye
was revealed by laboratory processing. Although this method provided a useful
indication of adherence, it could not provide information on the number of sex
acts for which microbicide was not applied, the timing of use, or the amount of
product inserted.
In PrEP
trials, drug-level assays as used in iPrEx provide a good indication as to
which participants have been taking ARVs and even some indication of the timing
of the most recent dose. However, they provide no indication of the
relationship between dosing and sexual exposure.