Box 1. Reverse Causation and Residual Confounding in Epidemiology Studies
Box 1. Reverse causation describes the situation when an outcome is a cause of a putative risk factor, rather than the converse. As an example, poor mental health may lead to social isolation, which may further worsen mental health: here, the outcome “mental health” is both a cause and a consequence of the risk factor “social isolation.” An observational regression analysis of mental health and social isolation could, therefore, yield an upwardly biased estimate of the causal effect of social isolation on mental health by not accounting for the reciprocal nature of this relationship. Confounding describes the situation when an outcome and a putative risk factor have a common cause. As an example, educational attainment is a common cause of a (lesser) likelihood to smoke and of (higher) household income. Thus, an observational regression analysis of household income on number of cigarettes smoked daily may provide a biased estimate of the causal effect of smoking, resulting from confounding by education level. Residual confounding occurs when a regression analysis seeks to statistically adjust for the effects of a confounder, but where the adjustment is imperfect: this can occur when the confounder is measured with error or if the nature, for example, linearity, of the confounding relationships assumed by the statistical model does not match the true form of these relationships. Returning to the example of cigarette smoking and household income, then if years of education is included as an index of educational attainment in a regression analysis of household income on number of cigarettes smoked daily, this might only partially account for the true confounding effect of education, leaving some (residual) confounding unaccounted for. IV methods aim to decrease the bias from reverse causation, unmeasured confounding, and residual confounding, by leveraging information from a new variable (the “IV”) that is a discrete cause of the putative risk factor. For example, consider the price of cigarettes as an IV for the number of cigarettes smoked daily. With reference to Figure 1, Step 1 would estimate the reduction in the number of cigarettes smoked daily in a study population before vs. after an increase in the price of cigarettes, whereas step 2 would estimate the shift in household income associated with this change in the number of cigarettes smoked daily.