If no baseline survey is needed, the investigators are in the comfortable position of letting the implementers work according to their schedule, and of using the time to develop study procedures in a subset of the trial population or in external pilot clusters.
A disadvantage of CRTs is that, for ethical reasons, participants often (but not always [19]) may need to be told that they are part of a trial, possibly altering their behaviour and responses to questions [20-22]. This may be a considerable problem, especially in trials that are neither blinded nor use an objective outcome measure.
Several meta-analyses have shown that such trials produce estimates that are severely affected by responder and observer bias [20, 21]. These trials are particularly problematic for public health decision making, since randomised trials carry a large weight in that process, while it is the informed-consent procedure usually required in a randomised trial that may contribute to the bias [21]. Not all may be lost for unblinded trials with a subjective outcome in situations where the purpose of the outcome assessment can be hidden from the study participants, for example by presenting it as a general health survey.
In this context, the unit of treatment allocation may be important. If an unblinded intervention evaluated using a subjective outcome (e.g. self-reported diarrhoea) is allocated at household level, as in several trials of point-of-use water treatment, participants can readily link the outcome surveys to the intervention, and such trials have reported apparent health effects. If allocation is done at community level (e.g. whole villages), participants may be less likely to make this link. In contrast, several CRTs on community-level sanitation (an unblinded intervention with the same outcome, self-reported diarrhoea symptoms) showed no effect at all [23-25]. In both the point-of-use water treatment and the sanitation trials, compliance with the intervention was very poor.
For both interventions, a true effect would have been biologically implausible given such poor compliance. The absence of an observed effect in the sanitation trials may therefore be regarded not only as evidence for the absence of a true effect but, in contrast to the water treatment trials, also as evidence for a lack of responder bias, possibly because participants did not link the health surveys to the intervention or did not expect any benefits from giving false information [23-25].
Allocation is done by the investigator or by the implementer. The EPOC definition of non-randomised trials requires that the investigator controls allocation [9]. In the definition used here, allocation is not random and it does not matter who allocates: an implementer may decide to deliver an intervention in ten villages, and an evaluator may choose ten suitable control villages for comparison [26, 27].
With notable exceptions [28], participants may not need to be explicitly told that they are part of a trial. Trial procedures may be more easily camouflaged as general demographic and health surveys than in a CRT, which may reduce responder bias. NCTs need to demonstrate that intervention and control arms are comparable. Unlike in CRTs, imbalances cannot be assumed to be due to chance until proven otherwise, and such proof is usually impossible. Most often, baseline characteristics are used to adjust for imbalances.
Baseline variables may include (1) demographic and socio-economic characteristics and other covariates potentially associated with outcome and intervention, and (2) a baseline measure of the outcome of interest. These two need to be clearly distinguished, as it can be argued that adjusting for the latter is likely to be more effective than adjusting for the former. In a sense, baseline variables are a predictor of the baseline measure of the study outcome, which in turn is a predictor of the outcome at follow-up.
Hence, the baseline measure of the study outcome can be regarded as more proximate to the potential outcome than other baseline variables. Investigating trends of the study outcome from baseline to follow-up is a fairly transparent way of exploring whether baseline imbalances may have affected the effect estimate, as the trends in the outcome in the different study arms can be openly discussed. If there is no baseline measure of the study outcome, then one can only compare other baseline variables (e.g. socio-economic characteristics) between arms and adjust for them in multivariable models.
Such models, however, usually represent a black box with an unknown amount of residual confounding [30]. It can be argued that the only way to make an NCT convincing is to obtain a precise baseline measurement of the study outcome and use it in the final analysis [31]. No amount of multivariable adjustment or matching on other variables, even if done with great care [27, 32], can replace the value of a precise baseline measure of the study outcome.
Statistical methods to account for the baseline measure are imperfect and continue to be debated [33, 34]. Methods include the analysis of covariance or lagged-regression method [33, 34], the analysis of change scores [4, 33, 34], and the exploration of the interaction between treatment allocation and time point [35].
In the analysis-of-covariance method, regression models are used that include the baseline measure as just another explanatory variable. The analysis of change scores is based on a between-arm comparison of the difference between the outcome at follow-up and the outcome at baseline, measured in the same individual or, in a cluster-level analysis [36], in the same cluster [34, 37].
The interaction approach is required if different individuals are measured at baseline and follow-up, and the effect is calculated as the interaction term between treatment allocation and time point (e.g. in a regression model including both survey rounds). The effect estimates produced by the change-score and interaction approaches are sometimes referred to as difference-in-differences (DID) [35].
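As a concrete illustration, a change-score/DID estimate can be computed directly from the four arm-level summary measures. This is a minimal sketch in plain Python; the prevalence figures are invented for illustration and are not taken from any of the cited trials.

```python
# Difference-in-differences (DID) sketch for a cluster trial with one
# baseline and one follow-up measure per arm. The numbers are hypothetical.

def did(base_int, follow_int, base_ctl, follow_ctl):
    """DID = change in intervention arm minus change in control arm."""
    return (follow_int - base_int) - (follow_ctl - base_ctl)

# Hypothetical diarrhoea prevalences (proportions):
estimate = did(base_int=0.30, follow_int=0.20,   # intervention: -10 points
               base_ctl=0.28, follow_ctl=0.25)   # control: -3 points
print(round(estimate, 2))  # -0.07, i.e. a 7-percentage-point net reduction
```

The same quantity is what the interaction term between arm and time point estimates in a regression on both survey rounds.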
All three methods work well if baseline imbalances are relatively small, but become problematic if imbalances are large [4, 34, 37], which is fair enough, as in this case the trial arms are probably not comparable to start with. The regression approach works well if baseline and follow-up measures are highly correlated, which is often the case for continuous variables such as child anthropometrics or blood pressure.
The regression approach is problematic for binary outcomes. Binary outcomes measured at two different time points in the same individual or cluster (e.g. diarrhoea prevalence at baseline and at follow-up) often show only a low or moderate correlation. Adjusting for a baseline measure showing only a low or moderate correlation with the follow-up measure leads to regression dilution bias, and to failure of the regression model to adequately adjust for any baseline imbalance [4].
The change-score approach may be preferable in this situation [33]. It is important to maximise between-arm comparability at the design stage and not to rely solely on statistical methods to achieve balance, since the three methods mentioned above each rely on a number of assumptions. Various matching methods can be applied to achieve comparability [31], including matching on publicly available census data [27, 32].
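To illustrate the idea of matching at the design stage, the sketch below pairs each intervention cluster with the still-unmatched control cluster closest on a single baseline variable. This is a hypothetical greedy nearest-neighbour sketch, not the method of the cited studies; cluster names and prevalences are made up.

```python
# Greedy nearest-neighbour matching of control to intervention clusters
# on one baseline variable (here: baseline prevalence). Hypothetical data.

def match_by_baseline(intervention, control):
    """Return (intervention, control) pairs matched on baseline prevalence."""
    pool = dict(control)  # name -> baseline prevalence, still unmatched
    pairs = []
    # Process intervention clusters in a deterministic order (by prevalence).
    for name, prev in sorted(intervention.items(), key=lambda kv: kv[1]):
        best = min(pool, key=lambda c: abs(pool[c] - prev))
        pairs.append((name, best))
        del pool[best]  # match without replacement
    return pairs

intervention = {"village_A": 0.31, "village_B": 0.18}
control = {"village_X": 0.33, "village_Y": 0.17, "village_Z": 0.25}
print(match_by_baseline(intervention, control))
# [('village_B', 'village_Y'), ('village_A', 'village_X')]
```

Greedy matching is order-dependent; optimal (e.g. distance-minimising) matching is preferable with many clusters, but the principle is the same.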
The most promising approach may be to match intervention and control clusters according to the baseline measure of the outcome of interest, which, however, may not yet be available at the time of recruitment. A related design is the controlled interrupted time series (CITS): across a number of intervention and control clusters, many repeated measurements of the outcome of interest are taken before and after the intervention.
Usually, CITS require the use of regularly collected routine data, which are often only available at the level of large administrative units. The analysis focuses on whether a certain change in the outcome has taken place after the intervention in the intervention clusters but not in the control clusters.
To include intervention and control clusters in the same model, they need to be reasonably comparable. CITS have the advantage that the requirement of including at least 4-6 clusters per arm [1, 13] may be relaxed, because a fixed effect for cluster intercepts can be included to control for time-invariant differences between clusters. It may then not be necessary to model random variation in the intervention effect across clusters.
The key feature of a CBA as defined here is that intervention and control arms are not compared statistically. The control arm only serves to give an idea of what the trend in the intervention arm might have been in the absence of an intervention.
Whether or not the intervention is allocated at random is of no relevance for this definition. It contrasts with the EPOC definition, under which a CBA is a trial where before and after measures are taken and where allocation is non-random and outside the control of the investigator [9], equivalent to the MRC definition of a natural experiment [14]. The design and interpretation of CBA studies have been described elsewhere [5], often disregarding the issue of cluster-level allocation.
In CBAs, the study outcomes can be compared statistically between different points in time before and after the intervention, but only separately for intervention and control clusters, not between them. The comparison between intervention and control arms can only be done informally, without statistical methods, e.g. by visually comparing trends. Not being able to calculate a confidence interval or p-value for the between-arm comparison is unsatisfying, and usually excludes such studies from formal meta-analyses. In some small CRTs or NCTs with, for example, 4 or 5 clusters per arm, a statistical between-arm comparison is theoretically possible but may have low power.
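Within-arm comparisons across time points of the kind described here can use standard two-sample tests. Below is a sketch of a two-proportion z-test in plain Python (normal approximation); the case counts are invented and do not come from any cited study.

```python
import math

# Within-arm before/after comparison for a CBA: a two-sample z-test for
# proportions, run separately in each arm (never between arms).

def two_proportion_z(x1, n1, x2, n2):
    """Return (z, two-sided p) for H0: p1 == p2, normal approximation."""
    p1, p2 = x1 / n1, x2 / n2
    p = (x1 + x2) / (n1 + n2)                      # pooled proportion
    se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal CDF via the error function.
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# Hypothetical intervention arm: 120/400 cases at baseline, 80/400 at follow-up.
z, p = two_proportion_z(120, 400, 80, 400)
print(f"z = {z:.2f}, p = {p:.4f}")
```

The same test run in the control arm indicates whether the outcome was stable there, which is exactly the informal trend comparison the text describes.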
In one trial in India with just 5 clusters per arm, the investigators chose to analyse the data as a CBA, with the direct comparison serving only as a secondary analysis to enable future meta-analyses [42]. The advantages of this approach are unclear and require further study. Trend interpretation is critical in CBAs. Similar considerations apply to NCTs, but in CBAs one cannot even use statistical analysis to compare trends across arms. An often-cited requirement in NCTs and CBAs is the parallel-trend assumption: that intervention and control arms would have shown the same trend from baseline to follow-up in the absence of an intervention.
However, especially small-scale CBAs can often only be meaningfully interpreted if no trend at all is observed in the control arm, i.e. if the outcome in the control arm remains stable from baseline to follow-up. A moderate or even large change in the control arm from baseline to follow-up may more often indicate a methodological problem in the study procedures than a true change.
Scenario A most strongly suggests an intervention effect, as prevalence in the control arm is similar to that in the intervention arm at baseline and remains constant at follow-up. In Scenario B, intervention and control arms start at similar prevalence values that then decrease in parallel, suggesting the absence of an independent intervention effect. This scenario is often encountered in situations of rapid economic development, the health benefits of which overshadow public health interventions [43, 44].
In Scenario C, prevalence is very different at baseline, suggesting that the two arms are not comparable. Caution is warranted in interpreting the DID estimate as an intervention effect.
This applies even more to Scenario D, where there is no change in the intervention arm and a prevalence increase in the control arm. A lack of baseline comparability or poor data quality may well be the cause of the observed trends. In CBAs and NCTs, considerable skill and sometimes a bit of luck are required to identify control clusters with comparable outcome levels at baseline.
Figure: Trend interpretation in non-randomised cluster trials (NCT) and before-and-after trials with control group (CBA): (a) good balance, no trend in control arm; (b) good balance, strong trend in control arm; (c) poor balance, no trend in control arm; (d) poor balance, strong trend in control arm; (e) poor balance, erratic trends; (f) poor balance, opposing trends.
Strong and erratic temporal trends in the outcome measure (Scenario E) can be natural (e.g. seasonal epidemics) or may reflect data-quality problems. A trend in the opposite direction (Scenario F) raises the possibility of regression to the mean [45]: the intervention clusters may have been chosen because prevalence was temporarily high prior to the intervention, indicating a need for an intervention. The control area may have been excluded from the intervention because of temporarily favourable indicators.
The absence of regression-to-the-mean effects is best demonstrated by including measurements at several time points both before and after the intervention. These examples demonstrate the many obstacles that trend analyses face in identifying true intervention effects in the context of NCTs and especially CBAs: the price of non-random allocation.
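The regression-to-the-mean mechanism is easy to demonstrate by simulation: if clusters are selected because their measured baseline prevalence is high, their follow-up measurement tends to be lower even when nothing has truly changed. The sketch below uses invented numbers and assumes independent survey error at the two time points.

```python
import random

# Regression to the mean: clusters "selected for intervention" because their
# measured baseline prevalence is high show a lower follow-up measurement
# with no intervention at all. All quantities are hypothetical.

random.seed(1)

true_prev = [random.uniform(0.15, 0.35) for _ in range(200)]  # stable truth

def noise():
    return random.gauss(0, 0.05)  # independent survey error per measurement

baseline = [p + noise() for p in true_prev]
followup = [p + noise() for p in true_prev]  # truth unchanged, new error

# Select the 50 clusters with the worst (highest) measured baseline prevalence.
worst = sorted(range(200), key=lambda i: baseline[i], reverse=True)[:50]

base_mean = sum(baseline[i] for i in worst) / 50
follow_mean = sum(followup[i] for i in worst) / 50
print(f"selected baseline mean:  {base_mean:.3f}")
print(f"selected follow-up mean: {follow_mean:.3f}")  # lower in expectation
```

Selection inflates the baseline measurement of the selected clusters by favouring positive survey error, which does not recur at follow-up; the apparent "improvement" is pure artefact.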
They highlight the importance of rigorous study procedures to ensure comparability of baseline and follow-up surveys. They also show the value of comparing trends between trial arms, which allows a fairly transparent discussion about the merits and limitations of a particular study.
Consider an NCT in which no baseline data are available for any of Scenarios A to F, and multivariable regression analysis is used to address confounding. Even if, as recommended [4], investigators carefully adjusted for confounders, reported their methods thoroughly and were conscious and critical of the assumptions they made, the analysis would still be a black box.
Neither those who argue in favour of a true effect, nor those arguing that all is due to confounding, would have much in their hands to support their views. A before-and-after study (BA) is a CBA without a control group: one or several measures of the outcome of interest are taken at baseline and follow-up, and compared. The absence of a control arm makes it difficult to support the assumption of an absence of a strong secular trend.
A typical scenario for a BA is the evaluation of a mass media campaign that targets a whole population, leaving no one to serve as a control. Temporal variability is even more problematic in BAs than in CBAs and NCTs, since in the absence of a control arm we do not know how variable the outcome would have been without an intervention.
However, some outcomes are naturally stable and are potentially suitable for BAs, but even here confounding is possible. One method to increase the validity of a BA is to take several measures of the outcome of interest at baseline and follow-up, ideally to demonstrate a reasonable stability of the outcome of interest pre-intervention and, if the intervention is successful, at a different level post-intervention.
If many before and after measurements are available, interrupted time series (ITS) analysis may allow obtaining statistically robust estimates [47, 48].
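A minimal version of such an analysis fits a level change at the intervention point on top of a common time trend, by ordinary least squares. The sketch below is plain Python with a hand-rolled 3x3 solver; the data series is fabricated to follow the model exactly, so the fit recovers the generating parameters.

```python
# Minimal interrupted time series (ITS) sketch: fit y = a + b*t + c*post by
# ordinary least squares, i.e. a common slope b plus a level change c at the
# intervention point. The series below is fabricated for illustration.

def solve3(m, v):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with pivoting."""
    a = [row[:] + [v[i]] for i, row in enumerate(m)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        a[col] = [x / a[col][col] for x in a[col]]
        for r in range(3):
            if r != col:
                f = a[r][col]
                a[r] = [x - f * y for x, y in zip(a[r], a[col])]
    return [a[r][3] for r in range(3)]

def its_fit(y, change_point):
    """Return (intercept, slope, level_change) for y = a + b*t + c*post."""
    X = [(1.0, float(t), 1.0 if t >= change_point else 0.0)
         for t in range(len(y))]
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(3)]
    return solve3(xtx, xty)

# Fabricated series: slope -0.5 per period, a drop of 4 units from t = 6 on.
y = [20.0 - 0.5 * t - (4.0 if t >= 6 else 0.0) for t in range(12)]
a, b, c = its_fit(y, change_point=6)
print(round(a, 2), round(b, 2), round(c, 2))  # 20.0 -0.5 -4.0
```

Real ITS analyses additionally handle autocorrelation and seasonality; this sketch shows only the structural idea of an intercept shift at the intervention.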
In their simplest form, ITS studies assume a common slope over time before and after an intervention and explore changes in the intercept at follow-up. A further way to improve BAs lies in combining them with an adopter versus non-adopter comparison.
Adopters are usually defined as those complying with the intervention or at least having been exposed to it. Post-intervention adopter versus non-adopter comparisons are often used to evaluate mass media campaigns.
Post-intervention adopter versus non-adopter comparisons carry a high risk of confounding, as it is not clear how similar outcomes in adopters and non-adopters would have been without an intervention. If, however, the same individuals are surveyed before and after the intervention, then trends in the outcome can be assessed separately for adopters and non-adopters, allowing an estimate of the risk of confounding. This design then resembles a CBA study, but is methodologically different as there is no pre-planned or otherwise well-defined group of control clusters.
Evaluations of public health interventions at community level need to fulfil at least one of the following three criteria: (1) randomisation of a sufficiently large number of clusters to allow statistical between-arm comparison (RCT); (2) if this is not possible, a precise baseline measure of the outcome of interest to assess baseline comparability and to study trends from baseline to follow-up in the absence of an intervention (NCT, CBA); (3) if there is no control group, multiple measures of the outcome of interest at baseline and after the intervention (BA).
With each step along the causal chain from intervention coverage to behaviour change to disease, the effect size of public health relevance decreases: a reduction in diarrhoea of 2 percentage points would be an important public health goal, while we would not be content with a 2-percentage-point difference in intervention coverage or exclusive breastfeeding (EBF). Similar arguments can be made for the evaluation of biomedical interventions such as micronutrient supplements, where high compliance is needed to achieve a large change in serum micronutrient levels, which may lead to a moderate change in subclinical disease and a small change in morbidity or mortality [50].