Categorical data analysis Flashcards

(12 cards)

1
Q

What is a proportion in categorical data?

A

The number with a characteristic divided by the total number of individuals (p = x/n).

2
Q

Which distribution describes probability for proportions?

A

The binomial distribution, which can be approximated by a normal distribution when n is large.

3
Q

What is the formula for the standard error of a proportion?

A

SE = √(p(1 − p)/n)

4
Q

What is a 95% confidence interval for a proportion?

A

p ± 1.96 × SE
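
As a worked illustration of the standard error and the 95% interval, here is a minimal Python sketch. The function name `proportion_ci` and the counts (30 out of 100) are hypothetical, chosen only for the example, and the interval relies on the normal approximation, so it assumes a reasonably large n.

```python
import math

def proportion_ci(x, n, z=1.96):
    """Return (p, se, lower, upper) for x individuals with the characteristic out of n.

    Uses the normal approximation; z=1.96 gives a 95% interval.
    """
    p = x / n                          # sample proportion
    se = math.sqrt(p * (1 - p) / n)    # standard error of the proportion
    return p, se, p - z * se, p + z * se

# Hypothetical data: 30 of 100 individuals have the characteristic.
p, se, lower, upper = proportion_ci(30, 100)
print(f"p = {p:.3f}, SE = {se:.4f}, 95% CI = ({lower:.3f}, {upper:.3f})")
# prints: p = 0.300, SE = 0.0458, 95% CI = (0.210, 0.390)
```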

5
Q

What does a contingency table do?

A

Summarizes data for two or more categorical variables and forms the basis for Chi-squared testing.

6
Q

How is the expected frequency (E) in a contingency table calculated?

A

E = (row total × column total) / grand total
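
The calculation can be sketched in Python as follows. The function names and the 2×2 table of counts are hypothetical; the second function applies the usual Pearson statistic, the sum of (O − E)²/E over all cells, without a continuity correction.

```python
def expected_frequencies(table):
    """Expected count for each cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def chi_squared_statistic(table):
    """Sum of (observed - expected)^2 / expected over every cell."""
    expected = expected_frequencies(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )

# Hypothetical 2x2 table of observed counts.
observed = [[20, 30],
            [30, 20]]
expected = expected_frequencies(observed)
statistic = chi_squared_statistic(observed)
print(expected)   # [[25.0, 25.0], [25.0, 25.0]]
print(statistic)  # 4.0
```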

7
Q

When should you use a Chi-squared test vs Fisher’s exact test?

A

Chi-squared: large samples (all expected frequencies ≥ 5). Fisher’s: small samples (any expected frequency < 5).
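
To make the rule of thumb concrete, the sketch below checks the smallest expected frequency of a 2×2 table and, when it falls below 5, computes a Fisher's exact p-value from the hypergeometric distribution. The function names and counts are hypothetical, and the two-sided p-value uses one common definition (summing every table that is no more probable than the observed one); other definitions exist.

```python
from math import comb

def smallest_expected(a, b, c, d):
    """Smallest expected frequency in the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    rows, cols = (a + b, c + d), (a + c, b + d)
    return min(r * col / n for r in rows for col in cols)

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(k):
        # Hypergeometric probability of k in the top-left cell, margins fixed.
        return comb(row1, k) * comb(row2, col1 - k) / comb(n, col1)

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance so floating-point ties count as "as extreme".
    return sum(prob(k) for k in range(lowest, highest + 1)
               if prob(k) <= p_observed * (1 + 1e-9))

# Hypothetical small table: every expected frequency is 2, so Fisher's applies.
table = (3, 1, 1, 3)
use_fisher = smallest_expected(*table) < 5
p_value = fisher_exact_two_sided(*table)
print(use_fisher, round(p_value, 4))  # True 0.4857
```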

8
Q

What is the difference between odds and risk?

A

Odds: the number with the outcome divided by the number without it, p/(1 − p). Risk: the number with the outcome divided by the total number of individuals, p.

9
Q

What does an odds ratio represent?

A

The odds of an outcome in one group divided by the odds in another group. OR = 1 means no difference; OR > 1 means higher odds in the first group; OR < 1 means lower odds.
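
A short Python sketch of risk, odds, and the odds ratio for two groups. The helper names and the counts (20 of 100 exposed, 10 of 100 unexposed) are hypothetical and serve only to show the arithmetic.

```python
def risk(events, total):
    """Risk: individuals with the outcome divided by all individuals."""
    return events / total

def odds(events, total):
    """Odds: individuals with the outcome divided by those without it."""
    return events / (total - events)

# Hypothetical counts: outcome in 20 of 100 exposed and 10 of 100 unexposed.
exposed_events, exposed_total = 20, 100
unexposed_events, unexposed_total = 10, 100

exposed_risk = risk(exposed_events, exposed_total)    # 20/100
exposed_odds = odds(exposed_events, exposed_total)    # 20/80
unexposed_odds = odds(unexposed_events, unexposed_total)  # 10/90
odds_ratio = exposed_odds / unexposed_odds

print(f"risk = {exposed_risk:.2f}, odds = {exposed_odds:.2f}")  # risk = 0.20, odds = 0.25
print(f"odds ratio = {odds_ratio:.2f}")                         # odds ratio = 2.25
```

Note that the odds (0.25) exceed the risk (0.20) for the same group, because the denominator excludes the individuals who had the outcome.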

10
Q

What does correlation measure?

A

The degree of association between two paired variables.

11
Q

What is the Pearson correlation coefficient (r)?

A

Measures how closely data points lie to a straight line (strength and direction of the linear relationship), ranging from −1 to +1.

12
Q

When should you use Spearman’s rank correlation instead of Pearson’s?

A

When the data are ordinal or not normally distributed, or when the relationship is monotonic but not linear.
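
The difference can be illustrated with a small Python sketch, in which Spearman's coefficient is computed as Pearson's r applied to the ranks of the data. The function names and the data (y = x³) are hypothetical, picked to give a relationship that is perfectly monotonic but not linear.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for paired values."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman_rho(xs, ys):
    """Spearman's rank correlation: Pearson's r computed on the ranks."""
    return pearson_r(ranks(xs), ranks(ys))

# Hypothetical data: monotonic but curved relationship.
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]
r = pearson_r(xs, ys)
rho = spearman_rho(xs, ys)
print(f"Pearson r = {r:.3f}, Spearman rho = {rho:.3f}")
# prints: Pearson r = 0.943, Spearman rho = 1.000
```

Spearman's coefficient reaches 1 because the ordering of the points is preserved exactly, while Pearson's r falls short of 1 because the points do not lie on a straight line.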
