regression
describes the mathematical relationship between an outcome and one or more other variables. Regression can adjust for several confounders and/or intermediate variables at the same time.
ordinary least squares regression.
Each dot represents 1 person. The straight line is the regression line, which comes from an equation. The software fits the line in such a way that the sum of the squared differences between the fitted line and the observations is as small as possible.
coefficient meaning
The coefficient (in this case 0.80) describes the slope of the regression line. The coefficient has a concrete interpretation → for every additional centimeter in height, people are expected to be 800 grams heavier on average.
Coefficients have a meaning → it is not just ‘positive’ or ‘negative’. Coefficients are expressed in units of the outcome per unit of the explanatory variable (here: kilograms per centimeter).
The regression equation can be used to predict the outcome → the prediction is the average for people with the same characteristics.
OLS regression in software
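A minimal sketch of a least-squares fit, using only the Python standard library. The height and weight values are made up for illustration, and the function names (`fit_ols`, `predict`, `sum_squared_residuals`) are hypothetical helpers, not the output of any particular statistics package.

```python
# Hypothetical data (made up): height in cm, weight in kg, one value per person.
heights = [160, 165, 170, 175, 180, 185, 190]
weights = [58.0, 61.5, 66.0, 70.5, 73.0, 78.5, 82.0]


def fit_ols(x, y):
    """Return (intercept, slope) of the line that minimises the sum of squared residuals."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept, slope, x_new):
    """Predicted (average) outcome for people with explanatory value x_new."""
    return intercept + slope * x_new


def sum_squared_residuals(intercept, slope, x, y):
    """The quantity that OLS makes as small as possible."""
    return sum((yi - predict(intercept, slope, xi)) ** 2 for xi, yi in zip(x, y))


intercept, slope = fit_ols(heights, weights)
print(f"slope: {slope:.3f} kg per cm")
print(f"predicted weight at 180 cm: {predict(intercept, slope, 180):.1f} kg")
```

The slope is read in kilograms per centimeter, and the prediction is the expected average weight for people of that height. Any other line through the same data gives a larger sum of squared residuals.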
moderation
It is possible that the effect of height on weight is different for men than for women. This can be analyzed by adding an interaction term to the regression analysis.
In this case the interaction term represents the additional height effect for women. The value of the interaction term is 0 for men; for women it equals their height. The coefficient of the interaction term means that each extra centimeter of height adds 67 grams less weight for women than it does for men.
Moderation is not clearly part of the DAG concept → it is not about bias. Moderation is about distinguishing between subgroups instead of taking the average. The coefficient of the interaction term represents the additional effect in a subgroup compared to the other subgroup. The interpretation of the interaction term is easier when you fill out the regression equations of the subgroups.
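Writing out the subgroup equations can be sketched as follows. The height coefficient (0.80) and the interaction coefficient (-0.067) follow the values mentioned in these notes, assuming the same 0.80 also applies in the model with the interaction term; the intercept and the main effect for women are hypothetical placeholders, because the notes do not give them.

```python
INTERCEPT = -70.0        # hypothetical placeholder, not from the notes
B_HEIGHT = 0.80          # kg per cm of height (value used in the notes)
B_WOMAN = 5.0            # hypothetical main effect of being a woman
B_INTERACTION = -0.067   # kg per cm: additional height effect for women


def predicted_weight(height_cm, is_woman):
    """Full regression equation with an interaction term."""
    woman = 1 if is_woman else 0
    interaction = woman * height_cm  # 0 for men, own height for women
    return (INTERCEPT + B_HEIGHT * height_cm
            + B_WOMAN * woman + B_INTERACTION * interaction)


def subgroup_equation(is_woman):
    """Return (intercept, slope) of the equation for one subgroup."""
    woman = 1 if is_woman else 0
    return INTERCEPT + B_WOMAN * woman, B_HEIGHT + B_INTERACTION * woman


men_intercept, men_slope = subgroup_equation(is_woman=False)
women_intercept, women_slope = subgroup_equation(is_woman=True)
print(f"men:   weight = {men_intercept:.1f} + {men_slope:.3f} * height")
print(f"women: weight = {women_intercept:.1f} + {women_slope:.3f} * height")
```

The difference between the two slopes is exactly the interaction coefficient, which is why the subgroup equations are easier to read than the combined equation.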
Wheelan’s warnings about regression analyses
non-linearity: the relationship may be quadratic or logarithmic instead of linear
explanatory variables that cannot be distinguished (if all the men in the data are old and all the women are young, you can no longer distinguish the effect of age from the effect of sex)
results are only valid in similar populations
reverse causality: the exposure does not have an effect on the outcome, but the outcome has an effect on the exposure (this should be visible in a DAG)
unresolved confounding
normal causal inference question
“what is the effect of X on Y?”
mediation question / mediation analysis
“why does X have this effect, is it because of M?” and “if we do something about M, would that reduce the effect of X on Y?”
In a mediation analysis, causal paths may have to be blocked.
mediator, intermediate variable
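a variable that lies on a causal path between the exposure and the outcome → the exposure affects the mediator, which in turn affects the outcome
Adjusting for an intermediate variable gives an estimate of the partial/direct causal effect. Not adjusting for an intermediate variable gives an estimate of the full (total) effect.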
how to adjust for a path with a collider
Adjust for the collider AND 1 other variable on the path. This will block the backdoor path again.
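The blocking rule can be sketched in a few lines of code. This is a simplified illustration: the path X ← A → C ← B → Y and its variable names are hypothetical, the helper functions are not from any DAG library, and the sketch ignores the fact that adjusting for a descendant of a collider also opens the path.

```python
def is_collider(arrows, i):
    """Interior node i is a collider if both neighbouring arrows point into it."""
    return arrows[i - 1] == "->" and arrows[i] == "<-"


def path_is_blocked(nodes, arrows, adjusted):
    """Check one path; arrows[i] is the arrow between nodes[i] and nodes[i + 1]."""
    for i in range(1, len(nodes) - 1):
        node = nodes[i]
        if is_collider(arrows, i):
            if node not in adjusted:
                return True  # an unadjusted collider blocks the path
        elif node in adjusted:
            return True      # an adjusted non-collider blocks the path
    return False


# Hypothetical backdoor path: X <- A -> C <- B -> Y, where C is the collider.
nodes = ["X", "A", "C", "B", "Y"]
arrows = ["<-", "->", "<-", "->"]

print(path_is_blocked(nodes, arrows, adjusted=set()))       # no adjustment
print(path_is_blocked(nodes, arrows, adjusted={"C"}))       # collider only
print(path_is_blocked(nodes, arrows, adjusted={"C", "A"}))  # collider and 1 other variable
```

Without adjustment the collider keeps the path closed. Adjusting for the collider alone opens it, and adjusting for one more variable on the path closes it again.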
p-value
The probability of finding this association (or a stronger one) in a sample if the real association is 0 → if the null hypothesis is true, what is the probability of finding this association? The result is statistically significant if the p-value is below a certain threshold (the level of significance, α) → usually 0.05.
The p-value being the probability of finding the results (or more extreme) if the null hypothesis is true is not the same as the p-value being the probability that the null hypothesis is true, given the results.
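As a rough sketch, a two-sided p-value can be computed from an estimate and its standard error. This assumes a normal approximation (a z-test); regression software typically uses a t-distribution instead, and the function name and the example numbers are hypothetical.

```python
from statistics import NormalDist


def two_sided_p_value(estimate, standard_error):
    """P(result at least this far from 0 | the real association is 0), normal approximation."""
    z = estimate / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical example: a coefficient of 0.80 with a standard error of 0.30.
p = two_sided_p_value(0.80, 0.30)
print(f"p = {p:.4f}, significant at alpha = 0.05: {p < 0.05}")
```

The calculation starts from the assumption that the null hypothesis is true, so the result cannot be read as the probability that the null hypothesis is true.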
The probability that a significant result indicates a true effect depends on: the significance level (α), the power, and the prior probability that the hypothesis is true.
what the p-value / significance is not
The significance level is set by the researcher → it does not have to be 0.05. A lower significance level results in fewer significant results → there are fewer false-positive results, but there are more false-negative results.
Power is the chance of finding a significant result in a sample if the effect is real. Power is affected by the sample size (the larger the sample size, the higher the power), the real association in the population (if the association in the population is stronger, the power is higher) and the variability in the population (if the outcomes are more or less the same, the power is higher and if the outcomes are very different, the power is lower). The association and variability in the population are unknown.
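The three influences on power can be illustrated with a small calculation. This sketch assumes the simplest possible setting, a two-sided z-test of a mean with a known standard deviation; the function name and all input values are hypothetical "what if" numbers, since the real association and variability are unknown in practice.

```python
from math import sqrt
from statistics import NormalDist


def power_two_sided_z(effect, sd, n, alpha=0.05):
    """Chance of a significant result if the real effect equals `effect`."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = effect / (sd / sqrt(n))  # real effect expressed in standard errors
    return (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)


baseline = power_two_sided_z(effect=0.5, sd=1.0, n=20)
larger_sample = power_two_sided_z(effect=0.5, sd=1.0, n=100)
stronger_effect = power_two_sided_z(effect=0.8, sd=1.0, n=20)
more_variability = power_two_sided_z(effect=0.5, sd=2.0, n=20)

print(f"baseline:         {baseline:.2f}")
print(f"larger sample:    {larger_sample:.2f}")
print(f"stronger effect:  {stronger_effect:.2f}")
print(f"more variability: {more_variability:.2f}")
```

A larger sample or a stronger effect raises the power, and more variability lowers it. With a real effect of 0 the function returns α, the false-positive rate.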
The usefulness of p-values is limited, even when they are used correctly. The p-value is not a measure of precision → it is only about the difference with 0. Strictly speaking, the null hypothesis is always wrong. The difference with the null is usually not interesting. Whether ‘it works’ (e.g. a drug) depends on the strength of the effect.
The prior probability is unknown. However, the plausibility of the hypothesis can be assessed with subject knowledge. Neither the power nor the prior probability can be quantified. This means that you can’t say exactly how likely it is that a significant result is based on a true association → it’s probably less than 95%.
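A small "what if" calculation shows why the answer is usually below 95%. The prior probabilities and power used here are hypothetical inputs chosen for illustration; in a real study they are unknown. The function name is also hypothetical.

```python
def prob_true_given_significant(prior, power, alpha=0.05):
    """Share of significant results that reflect a true association."""
    true_positives = prior * power          # true associations that come out significant
    false_positives = (1 - prior) * alpha   # null associations that come out significant
    return true_positives / (true_positives + false_positives)


# Hypothetical scenarios: an implausible and a fairly plausible hypothesis.
unlikely_hypothesis = prob_true_given_significant(prior=0.10, power=0.80)
plausible_hypothesis = prob_true_given_significant(prior=0.50, power=0.80)

print(f"prior 10%, power 80%: {unlikely_hypothesis:.2f}")
print(f"prior 50%, power 80%: {plausible_hypothesis:.2f}")
```

Under these assumed values both scenarios stay below 0.95, and the less plausible the hypothesis or the lower the power, the further the result drops.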
published results
Mathematically, you can show that significant associations are likely to be overestimates. If a study finds a lower estimate than the actual value, the result is less likely to be significant; if it finds a higher estimate than the actual value, the result is more likely to be significant. Published results are therefore also more likely to be overestimates → studies with statistically significant results are more likely to get published. Focusing on statistical significance takes attention away from the size of an effect.
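This selection effect can be made visible with a simulation. The sketch below assumes many hypothetical studies of the same true effect, each producing a normally distributed estimate with a known standard error; the true effect, the standard error, the number of studies, and the function name are all made-up illustration values.

```python
import random
from statistics import NormalDist, mean


def simulate_significance_filter(true_effect=0.2, standard_error=0.2,
                                 n_studies=20000, alpha=0.05, seed=1):
    """Return (mean of all estimates, mean of significant estimates, share significant)."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Each study's estimate scatters around the true effect.
    estimates = [rng.gauss(true_effect, standard_error) for _ in range(n_studies)]
    # Keep only the studies whose estimate is statistically significant.
    significant = [e for e in estimates if abs(e / standard_error) > z_crit]
    return mean(estimates), mean(significant), len(significant) / n_studies


mean_all, mean_significant, share_significant = simulate_significance_filter()
print(f"true effect:                   0.20")
print(f"mean of all estimates:         {mean_all:.2f}")
print(f"mean of significant estimates: {mean_significant:.2f}")
print(f"share of studies significant:  {share_significant:.2f}")
```

Across all simulated studies the estimates average out close to the true effect, while the subset that reaches significance averages clearly higher.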
Problems with the p-value as stated by the American Statistical Association
testing
Testing gives a dichotomous result → yes (there is enough evidence) or no (there is not enough evidence). If the result is significant there is evidence for the association and if the result is not significant there is no evidence for the association. However, absence of evidence is not evidence of absence → the fact that the result is not significant doesn’t mean that there is not a true effect. This yes/no answer, however, is not really interesting.