Module 1 Flashcards

(36 cards)

1
Q

Statistics

A

the study of methods to describe and measure aspects of nature from samples and populations

2
Q

Estimation

A

the process of inferring an unknown quantity of a target population using sample data

3
Q

Population parameter

A

all quantities that describe populations; mostly denoted by Greek letters

4
Q

sample statistic

A

(estimate) a related quantity calculated from a sample, used to estimate the population parameter

5
Q

Sampling unit

A

the unit/individual/subject/replicate that we select to make up our sample

6
Q

Sampling error

A

the chance difference between an estimate and the population parameter being estimated caused by sampling

7
Q

Precision

A

how close repeated estimates are to one another: less sampling error = higher precision; larger sample size = less sampling error = higher precision
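A minimal simulation sketch of the link between sample size and precision. The population (10,000 made-up values) and the helper name `spread_of_sample_means` are assumptions for illustration only.

```python
import random
import statistics

def spread_of_sample_means(population, n, n_samples=2000, seed=1):
    """Standard deviation of many sample means, each from a random sample of size n."""
    rng = random.Random(seed)
    means = [statistics.mean(rng.sample(population, n)) for _ in range(n_samples)]
    return statistics.stdev(means)

# Hypothetical population: 10,000 values drawn with mean 50 and SD 10.
pop_rng = random.Random(0)
population = [pop_rng.gauss(50, 10) for _ in range(10_000)]

small_n = spread_of_sample_means(population, n=5)
large_n = spread_of_sample_means(population, n=100)
print(f"spread of sample means, n=5:   {small_n:.2f}")
print(f"spread of sample means, n=100: {large_n:.2f}")
```

The estimates from the larger samples should scatter much less around the population mean, which is what higher precision means here.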

8
Q

Accuracy

A

when the average of all estimates that we might obtain is centered on the true population value

9
Q

Bias

A

a systematic discrepancy between estimates and the true value: if the sampling process favours some outcomes over others, it may systematically under- or overestimate the population parameter
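A rough sketch contrasting an accurate sampling process with a biased one. The population values and the "only the larger half can be sampled" rule are made up for illustration.

```python
import random
import statistics

rng = random.Random(42)
# Hypothetical population of 5,000 body lengths (arbitrary units).
population = [rng.gauss(100, 15) for _ in range(5_000)]
true_mean = statistics.mean(population)

def average_estimate(draw_sample, n_repeats=1000):
    """Average of the sample means over many repeated samples."""
    return statistics.mean([statistics.mean(draw_sample()) for _ in range(n_repeats)])

# Random sampling: every individual has an equal chance of selection.
random_avg = average_estimate(lambda: rng.sample(population, 20))

# Biased sampling: only the larger half of the population can be selected.
larger_half = sorted(population)[len(population) // 2:]
biased_avg = average_estimate(lambda: rng.sample(larger_half, 20))

print(f"true mean:             {true_mean:.1f}")
print(f"random-sample average: {random_avg:.1f}")
print(f"biased-sample average: {biased_avg:.1f}")
```

The random-sample average should sit close to the true mean (accurate), while the biased process overestimates it no matter how many samples are averaged.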

10
Q

Random sample

A

sample from a population that fulfills two criteria: 1. every individual has an equal chance of being in the sample 2. units are selected independently (choosing one won't affect the others in the study). It gives a snapshot of the entire population.

11
Q

Independent Observations

A

the selection of any member shouldn't affect the chance of any other member being selected (also one of the requirements for random sampling)

12
Q

Sample of convenience

A

a collection of individuals that are easily available to the researcher (the most accessible to them)
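A small sketch contrasting a random sample with a sample of convenience. The ID list is a hypothetical sampling frame, and "first 10 on the list" stands in for whoever is easiest to reach.

```python
import random

rng = random.Random(7)
# Hypothetical sampling frame: ID numbers for 500 individuals.
population_ids = list(range(1, 501))

# Simple random sample (without replacement): every individual has the
# same chance of ending up in the sample.
random_sample = rng.sample(population_ids, 10)

# Sample of convenience: just the first 10 individuals on the list.
convenience_sample = population_ids[:10]

print(sorted(random_sample))
print(convenience_sample)
```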

13
Q

volunteer bias

A

a type of bias in human studies: volunteers may differ systematically from the population they are meant to represent

14
Q

Variables

A

characteristics that differ among individuals or other sampling units

15
Q

Descriptive statistics

A

quantities that describe the sample
1. frequency distribution
2. measures of location (mean, median, mode)
3. measures of scale (range, variance, standard deviation)
4. measures of shape

16
Q

Standard error

A

the standard deviation of the sampling distribution of an estimate. It reflects the precision of the estimate: low standard error = high precision = less uncertainty about the target parameter
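A short sketch of estimating the standard error of a mean from one sample, using the standard formula SE = s / sqrt(n). The formula is not stated on the card, and the eight measurements are made up.

```python
import math
import statistics

# Hypothetical sample of 8 measurements (made-up numbers for illustration).
sample = [4.1, 5.0, 4.7, 5.3, 4.9, 5.6, 4.4, 5.2]

n = len(sample)
sample_sd = statistics.stdev(sample)       # sample standard deviation (n - 1 denominator)
standard_error = sample_sd / math.sqrt(n)  # estimated standard error of the mean

print(f"mean = {statistics.mean(sample):.2f}, SE = {standard_error:.3f}")
```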

17
Q

Sampling distribution

A

probability distribution of values for an estimate obtained from sampling the population

18
Q

Frequency distribution

A

describes the number of times each value of a variable occurs in a sample; can use absolute (counts) or relative (proportions) values
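A minimal sketch of absolute and relative frequencies for a discrete variable. The sibling counts are made-up data.

```python
from collections import Counter

# Hypothetical sample: number of siblings reported by 10 people.
siblings = [0, 1, 1, 2, 0, 3, 1, 2, 1, 0]

absolute = Counter(siblings)  # absolute frequency: count per value
relative = {value: count / len(siblings) for value, count in absolute.items()}

for value in sorted(absolute):
    print(f"{value} siblings: count = {absolute[value]}, proportion = {relative[value]:.1f}")
```

The relative frequencies always sum to 1, which makes samples of different sizes easier to compare.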

19
Q

Population

A

the set of all subjects relevant to the scientific hypothesis under examination

20
Q

Census

A

a collection of data where the entire population is examined

21
Q

Non random sampling

A

the sample is picked based on a certain characteristic, so individuals do not all have an equal chance of being selected

22
Q

Experimental study

A

researchers assign or impose conditions (treatments) randomly rather than relying on comparisons of existing conditions. Random assignment reduces bias and supports causal conclusions, so experiments provide a higher standard of evidence.
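A minimal sketch of random assignment to two groups. The subject labels and the even 5/5 split are assumptions for illustration.

```python
import random

rng = random.Random(3)
# Hypothetical subjects; the labels are placeholders.
subjects = [f"subject_{i}" for i in range(1, 11)]

shuffled = subjects[:]    # copy so the original order is left untouched
rng.shuffle(shuffled)     # chance alone decides the order
treatment = shuffled[:5]  # first half receives the treatment
control = shuffled[5:]    # second half is the control group

print("treatment:", treatment)
print("control:  ", control)
```

Because chance decides who lands in which group, existing differences among subjects are not systematically tied to the treatment.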

23
Q

Observational study

A

researchers record variables and compare existing groups without assigning conditions, e.g. record variables measuring patient health and compare groups A and B

24
Q

Is an experimental or an observational study better?

A

experimental studies are better: they provide a higher standard of evidence because researchers assign or impose conditions randomly rather than relying on comparisons of existing conditions

25
What are the types of variables?
Quantitative (numerical)
Qualitative (categorical)
26
Quantitative variables
Continuous (measured; can take any value in a range): rate, area, height
Discrete (counted; indivisible): number of siblings, counts
Interval: zero is an arbitrary reference point and doesn't mean "nothing" (temperature in °C, altitude, longitude); mostly continuous
Ratio: zero = nothing (having zero offspring means you have no offspring)
27
Qualitative variables
Nominal (no order): colour of shirt; yes/no
Ordinal (ordered): letter grade; pass/fail
28
Explanatory variable (independent)
responsible for the change in the response variable
29
Response variable (dependent)
focus of the study; could be:
frequency: the count of observations in each bin
proportion: the fraction of the total observations in each bin
density: the proportion of the total observations per unit of bin width (area under the curve)
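A small sketch computing frequency, proportion, and density for each bin. The heights, the 10 cm bins, and the helper name `histogram_rows` are assumptions for illustration.

```python
def histogram_rows(values, bin_edges):
    """Return (frequency, proportion, density) for each bin [left, right)."""
    n = len(values)
    rows = []
    for left, right in zip(bin_edges[:-1], bin_edges[1:]):
        frequency = sum(left <= v < right for v in values)  # count in the bin
        proportion = frequency / n                          # share of all observations
        density = proportion / (right - left)               # share per unit of bin width
        rows.append((frequency, proportion, density))
    return rows

# Hypothetical heights in cm, binned in 10 cm steps.
heights = [152, 158, 161, 163, 167, 168, 171, 174, 176, 183]
bin_edges = [150, 160, 170, 180, 190]
rows = histogram_rows(heights, bin_edges)

for (left, right), (freq, prop, dens) in zip(zip(bin_edges[:-1], bin_edges[1:]), rows):
    print(f"[{left}, {right}): frequency={freq}, proportion={prop:.1f}, density={dens:.2f}")
```

With density on the vertical axis, each bar's area (density times bin width) equals the bin's proportion, so the total area is 1.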
30
How do we illustrate frequency distributions?
for categorical or discrete data = bar graph; for continuous data = histogram
31
differences among distributions in histograms could be due to:
1. different location or central tendency
2. different spread or scale (measured using the standard deviation)
3. different shape or skew
32
what are the measures of location in histograms?
central tendency:
1. mean: arithmetic average
2. median: middle of the data
3. mode: most commonly occurring observation
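A quick sketch of the three measures of location using Python's `statistics` module. The seven values are made up.

```python
import statistics

# Hypothetical sample (made-up values).
sample = [2, 3, 3, 4, 5, 7, 11]

mean = statistics.mean(sample)      # arithmetic average -> 5
median = statistics.median(sample)  # middle value of the sorted data -> 4
mode = statistics.mode(sample)      # most common observation -> 3

print(mean, median, mode)
```

Here the single large value (11) pulls the mean above the median, while the median and mode are unaffected by it.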
33
measure of scale?
range, variance, standard deviation (best, since it has the same units as the original variable)
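A short sketch of the three measures of scale. The sample is made up, and the sample (n - 1) versions of variance and standard deviation are assumed.

```python
import statistics

# Hypothetical sample (made-up values).
sample = [2, 3, 3, 4, 5, 7, 11]

value_range = max(sample) - min(sample)  # largest minus smallest value
variance = statistics.variance(sample)   # sample variance, in squared units
std_dev = statistics.stdev(sample)       # square root of the variance, in original units

print(f"range = {value_range}, variance = {variance:.2f}, SD = {std_dev:.2f}")
```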
34
Residuals
The difference between an observation and the mean. A sufficiently large sample from a normal distribution will have residuals that are normally distributed and centered on zero.
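A minimal sketch of residuals as deviations from the sample mean, using made-up values.

```python
import statistics

# Hypothetical sample (made-up values).
sample = [2, 3, 3, 4, 5, 7, 11]

mean = statistics.mean(sample)
residuals = [x - mean for x in sample]  # observation minus the mean

print(residuals)
print(sum(residuals))  # deviations from the mean cancel out
```

Residuals from the sample mean always sum to zero, which is why their distribution is centered on zero.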
35
measure of shape
measures the asymmetry of the distribution: left skewed (long tail to the left) or right skewed (long tail to the right)
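A rough sketch of one common way to put a number on skew, the moment-based skewness coefficient (third central moment divided by the second central moment to the power 1.5). The card does not name a formula, so this choice and the helper name `skewness` are assumptions.

```python
import statistics

def skewness(values):
    """Moment-based skewness: positive = right skewed, negative = left skewed."""
    n = len(values)
    mean = statistics.mean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical samples: one with a long right tail, and its mirror image.
right_skewed = [2, 3, 3, 4, 5, 7, 11]
left_skewed = [-v for v in right_skewed]

print(f"right skewed: {skewness(right_skewed):.2f}")
print(f"left skewed:  {skewness(left_skewed):.2f}")
```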
36
confidence interval
range of values that is likely to contain the true population parameter. When the estimate is normally distributed, an approximate 95% interval is mean ± 2 × standard error. A 95% interval is a range of values, calculated from sample data, that would contain the true population parameter in about 95 out of 100 samples if the sampling process were repeated.
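A simulation sketch of the "95 out of 100 samples" idea, using the mean ± 2 × standard error rule. The population parameters, the sample size of 30, and the helper name `approx_95_ci` are assumptions for illustration.

```python
import math
import random
import statistics

def approx_95_ci(sample):
    """Return (lower, upper) as the sample mean minus/plus 2 standard errors."""
    mean = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    return mean - 2 * se, mean + 2 * se

rng = random.Random(11)
TRUE_MEAN, TRUE_SD = 20.0, 4.0  # hypothetical population parameters
n_repeats = 2000

hits = 0
for _ in range(n_repeats):
    # Draw a fresh random sample of 30 and build its interval.
    sample = [rng.gauss(TRUE_MEAN, TRUE_SD) for _ in range(30)]
    lower, upper = approx_95_ci(sample)
    hits += lower <= TRUE_MEAN <= upper  # True counts as 1

coverage = hits / n_repeats
print(f"intervals containing the true mean: {coverage:.1%}")
```

The printed share should land near 95%. Any single interval either contains the true mean or it does not; the 95% describes the procedure over repeated sampling.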