Controlled Trial / Experiment
A study in which investigators apply a treatment to a group of subjects and compare their outcomes with those of a control group that receives no treatment or a placebo.
Method of Comparison
The fundamental principle of establishing causation in statistical studies. It involves comparing the outcomes of two (or more) groups of subjects: the treatment group (which receives the intervention being tested) and the control group (which serves as the baseline, often receiving a placebo or standard care).
Treatment Group
The group of subjects in a study that receives the intervention being tested (e.g., a new drug, a specific diet, an educational program, or the Salk vaccine). The results for this group are compared to the control group to measure the effect of the intervention.
Control Group
The group of subjects in a study that is used as a baseline for comparison. They are treated identically to the treatment group in every way, except they receive a placebo (or standard treatment) instead of the actual intervention being tested.
Random Assignment
The use of an impersonal chance procedure (like flipping a coin or drawing names) to divide subjects into treatment and control groups. Its purpose is to ensure, on average, that the groups are balanced and equivalent with respect to all confounding factors, both known and unknown.
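A minimal Python sketch of a chance procedure for random assignment. The subject IDs, the function name randomly_assign, and the seed are hypothetical and used only for illustration; a real trial would follow its own randomization protocol.

```python
import random

def randomly_assign(subjects, seed=None):
    """Split subjects into two groups of (nearly) equal size by chance alone."""
    rng = random.Random(seed)   # the seed only makes this sketch reproducible
    shuffled = list(subjects)   # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)       # the impersonal chance step
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

subjects = [f"subject_{i}" for i in range(1, 11)]  # hypothetical subject IDs
treatment, control = randomly_assign(subjects, seed=0)
print("Treatment:", treatment)
print("Control:  ", control)
```

Because the shuffle ignores every characteristic of the subjects, neither the investigator nor the subjects can influence who ends up in which group.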
Double-Blind Study
A controlled experiment in which neither the subjects nor the researchers/diagnosticians evaluating the outcome know who is in the treatment group and who is in the control group. Blinding the subjects keeps the placebo effect the same in both groups, and blinding the evaluators prevents diagnostic bias.
If evaluators are not blinded, knowing which group a subject is in can lead to diagnostic bias.
Confounding Variable
A variable that is associated with both the treatment being studied and the response/outcome. Because the treatment and the confounder are mixed up, it is impossible to determine whether the observed effect is due to the treatment or the confounder. This is the main reason why observational studies can only establish association, not causation.
Randomized Controlled Trial/Experiment (RCT)
Gold standard of study design, where investigators use an impartial chance procedure to assign subjects to treatment or control groups.
Placebo
Neutral, inactive treatment given to the control group in an experiment, designed to resemble the actual treatment. It’s used to blind subjects and control for the placebo effect, which is the psychological tendency for subjects to show an effect simply because they believe they are receiving a treatment.
Eligible/Target Population
Entire group of individuals that a study intends to describe or draw conclusions about
Sample Population
The smaller group of subjects drawn from the eligible/target population, on which the study is actually performed.
Historical Controls
Historical controls refer to a type of study design where the control group is not selected and run concurrently with the treatment group, but rather consists of subjects from a previous study or patients whose outcomes are known from the past.
This design is generally considered weaker than a Randomized Controlled Trial because the risk of confounding is high: differences between the treatment group and the historical control group (such as changes in medical care, diagnostic standards, or population characteristics over time) can easily bias the results.
Contemporaneous Controls
Contemporaneous controls refer to subjects in an experiment who are treated exactly the same as the treatment group, except for the intervention being studied, and are followed over the same period of time. They are essential for a good Randomized Controlled Trial (RCT) because they avoid the bias inherent in Historical Controls (like changes in the population, environment, or medical care).
Response
Response is the formal term for the outcome that is measured in a study or experiment.
In statistics, the response variable (or dependent variable) is the characteristic that the investigator is interested in measuring or comparing to see if it changes when a factor (the treatment or explanatory variable) is applied.
For example, if a study tests a new fertilizer, the treatment is the fertilizer, and the response might be the plant’s height or the size of the yield.
Q: Difference between placebo and control group?
The key distinction is that the control group is a set of participants, while the placebo is a type of treatment they may receive.
In short, the Control Group is the set of subjects who receive the Placebo, or sometimes standard care or no treatment at all. The control group provides the baseline needed to judge whether an observed effect is due to the actual treatment and not to other factors.
Q: What problem(s) arise when the treatment and control groups are NOT created using random assignment?
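The primary problem that arises when treatment and control groups are NOT created using random assignment is confounding, typically introduced through selection bias: the way subjects are chosen for each group makes the groups differ systematically in ways other than the treatment.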
Confounding and Selection Bias
Example: If a researcher non-randomly assigns healthier, younger participants to the “Treatment” group and older, less healthy participants to the “Control” group, and the treatment group has better outcomes, the researcher cannot distinguish whether the improvement was due to the treatment or the participants’ naturally better health/age.
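The example above can be sketched as a small simulation. Everything here is a made-up assumption for illustration: the outcome depends only on age, the "treatment" has no effect at all, and the ages, noise level, and group sizes are arbitrary.

```python
import random
from statistics import mean

rng = random.Random(1)  # fixed seed so the sketch is reproducible

# Hypothetical subjects: 2000 people aged 20 to 80.
ages = [rng.uniform(20, 80) for _ in range(2000)]

def outcome(age):
    # Health score falls with age; the treatment itself contributes nothing.
    return 100 - age + rng.gauss(0, 5)

def mean_difference(treated_ages, control_ages):
    treated = [outcome(a) for a in treated_ages]
    control = [outcome(a) for a in control_ages]
    return mean(treated) - mean(control)

# Non-random assignment: the youngest half goes to the treatment group.
by_age = sorted(ages)
biased_diff = mean_difference(by_age[:1000], by_age[1000:])

# Random assignment: a shuffle decides who is treated.
shuffled = ages[:]
rng.shuffle(shuffled)
random_diff = mean_difference(shuffled[:1000], shuffled[1000:])

print(f"Non-random assignment, treatment minus control: {biased_diff:.1f}")
print(f"Random assignment, treatment minus control:     {random_diff:.1f}")
```

Under non-random assignment the useless treatment appears to help a great deal, purely because its group is younger. Under random assignment the difference stays close to zero, which is the correct answer in this setup.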
Q: What are the defining characteristics of a well-designed randomized controlled trial (RCT)?
Q: Why is it important for the treatment and control groups to be comparable, and how does randomization achieve this?
Importance of Comparability: Comparability is vital because it ensures that the only systematic difference between the groups is the treatment itself. If the groups are not comparable, any difference in outcome could be due to a confounding variable (e.g., age, health, lifestyle) instead of the treatment, making it impossible to establish causation.
Role of Randomization: Randomization (random assignment) achieves comparability by acting as a fair chance mechanism. It ensures that, on average, all known and unknown confounding variables are distributed roughly equally between the treatment and control groups, which removes selection bias and leaves only chance differences between the groups.
Observational Study
Definition: A study where the researcher observes and measures subjects and variables without intervention or manipulation of treatment.
Key Distinction: The investigator does not assign treatments; subjects self-select into groups (e.g., people who choose to exercise vs. those who don’t).
Main Limitation: It cannot establish causation (cause-and-effect) due to the high risk of confounding variables (lurking factors that differ between the groups). It can only show association or correlation.
When Used: When a Randomized Controlled Trial (RCT) is unethical (e.g., studying harmful exposures) or impractical (e.g., studying a rare trait or long-term phenomenon).
Q: Difference between a Controlled Experiment and an Observational Study?
In a Controlled Experiment, the investigator actively intervenes by randomly assigning subjects to treatment and control groups. This control allows for establishing causation (cause-and-effect).
In an Observational Study, the investigator is passive, simply observing subjects who have self-selected into groups. This can only show association (correlation), not causation, due to the risk of confounding variables.
Correlation / Association
Association (or correlation) describes a relationship between two or more variables, meaning that certain values of one variable tend to occur with certain values of another.
Definition: Variables are said to be associated if knowing the value of one variable gives you information about the likely value of the other.
The Crucial Limit: Finding an association does not prove causation. Just because two things are related doesn’t mean one causes the other. The relationship might be due to a confounding variable.
Example: There is an association between carrying a lighter and getting lung cancer, but carrying a lighter doesn’t cause cancer; smoking is the confounding variable that causes both.
Q: Difference between association and causation?
The distinction between association and causation is based on whether one variable is proven to cause the other.
Association (Correlation): This means two variables tend to occur together or change together. An association can be shown by both observational studies and controlled experiments. However, association does not imply causation; the relationship might be due to a third, lurking variable (a confounder).
Causation (Cause-and-Effect): This means a change in one variable is directly responsible for a change in the other. Causation can only be strongly established by a well-designed Randomized Controlled Trial (RCT), where randomization balances out other potential factors.
Stratification / Cross-Tabulation
Definition: A statistical technique used primarily in observational studies (and sometimes in experiments) to divide the sample into smaller, homogeneous sub-groups called strata based on a potential confounding variable.
Purpose: To see if an association observed in the overall data holds true within each stratum. This helps to control for (or adjust for) the confounding variable.
How it Works: The data is broken down into a table (a cross-tabulation) where the effect of the treatment is examined for each level of the confounding variable (e.g., comparing groups by treatment status separately for young subjects and old subjects).
Outcome: If the association disappears after stratification, it suggests the original association was spurious (fake) and entirely due to the confounding variable. If the association persists within the strata, it strengthens the argument for a real link.
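A short Python sketch of stratification, reusing the lighter and lung cancer example from the Correlation / Association card. All counts are hypothetical and chosen only to make the arithmetic easy to follow; the helper cancer_rate is an illustrative name, not a library function.

```python
# (is_smoker, has_lighter) -> (cancer cases, total people); hypothetical counts
counts = {
    (True, True): (80, 800),
    (True, False): (20, 200),
    (False, True): (1, 100),
    (False, False): (9, 900),
}

def cancer_rate(lighter, smoker=None):
    """Cancer rate for lighter carriers or non-carriers.

    With smoker=None the strata are pooled (the overall table);
    otherwise only the chosen smoking stratum is used.
    """
    cells = [
        cell
        for (is_smoker, has_lighter), cell in counts.items()
        if has_lighter == lighter and (smoker is None or is_smoker == smoker)
    ]
    cases = sum(c for c, _ in cells)
    total = sum(n for _, n in cells)
    return cases / total

print("Overall:     lighter", cancer_rate(True), "vs none", cancer_rate(False))
for smoker in (True, False):
    label = "Smokers:    " if smoker else "Non-smokers:"
    print(label, "lighter", cancer_rate(True, smoker),
          "vs none", cancer_rate(False, smoker))
```

In the pooled table, lighter carriers have a much higher cancer rate than non-carriers. Within each smoking stratum the two rates are identical, so with these counts the overall association is entirely explained by smoking.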
Simpson’s Paradox
Definition: A phenomenon where an association or trend that appears in several different groups of data (strata) reverses or disappears when the groups are combined.
Cause: It is caused by a powerful, unaccounted-for confounding variable that defines the sub-groups, combined with the groups being compared (e.g., treated vs. untreated) making up very different shares of each sub-group.
Significance: It demonstrates the danger of combining data from incomparable groups in observational studies. When the data is stratified (broken down), the true relationship is often revealed.
Key Idea: The association you see in the overall (combined) table can point to the wrong conclusion; the association seen in the smaller, stratified tables is usually the correct one.
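A worked illustration of Simpson's Paradox in Python. The counts are hypothetical and were picked so that the reversal is easy to see: treatment A is given mostly to severe cases and treatment B mostly to mild cases, so severity acts as the confounding variable.

```python
# stratum -> treatment -> (recovered, total); hypothetical counts
data = {
    "mild":   {"A": (90, 100),   "B": (800, 1000)},
    "severe": {"A": (500, 1000), "B": (40, 100)},
}

def recovery_rate(treatment, strata):
    """Recovery rate for one treatment, pooled over the given strata."""
    recovered = sum(data[s][treatment][0] for s in strata)
    total = sum(data[s][treatment][1] for s in strata)
    return recovered / total

# Within each stratum, compare A and B directly.
for stratum in data:
    a = recovery_rate("A", [stratum])
    b = recovery_rate("B", [stratum])
    print(f"{stratum:>7}: A = {a:.1%}, B = {b:.1%}")

# Combined table: pool both strata.
overall_a = recovery_rate("A", list(data))
overall_b = recovery_rate("B", list(data))
print(f"overall: A = {overall_a:.1%}, B = {overall_b:.1%}")
```

With these counts, A has the higher recovery rate among mild cases and among severe cases, yet B looks better in the combined table because most of B's patients were mild cases to begin with.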