what does the SPINE of statistics stand for?
Standard error, Parameters, Interval estimates, Null hypothesis testing, Estimation
general linear model
outcome = b0 + b1(predictor) + error
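A minimal sketch of fitting this equation in R, assuming simulated data. The variable names and the true values (b0 = 2, b1 = 0.5) are made up for illustration.

```r
# Simulated data: true b0 = 2 and b1 = 0.5 are illustrative assumptions
set.seed(42)
predictor <- rnorm(100)
outcome <- 2 + 0.5 * predictor + rnorm(100)

# Fit outcome = b0 + b1(predictor) + error
model <- lm(outcome ~ predictor)
b <- coef(model)       # b[1] is the estimate of b0, b[2] the estimate of b1
error <- resid(model)  # estimated error for each score

# The three parts of the equation add back up to the observed outcome
reconstructed <- unname(b[1] + b[2] * predictor + error)
```

The estimates in `b` will be close to, but not exactly, the values used to simulate the data.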
chi-squared test
chisq.test(data$variable_1, data$variable_2, correct = FALSE)
for categorical or count data
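A small sketch of the test on a table of counts rather than two raw columns. The counts and the group/outcome labels are hypothetical.

```r
# Hypothetical 2 x 2 table of counts (rows = group, columns = outcome)
counts <- matrix(c(30, 10, 20, 40), nrow = 2,
                 dimnames = list(group = c("a", "b"),
                                 outcome = c("yes", "no")))

# correct = FALSE switches off the continuity correction, as in the call above
result <- chisq.test(counts, correct = FALSE)

result$expected   # expected counts if group and outcome were unrelated
result$statistic  # X-squared = 16.67 on 1 df for these counts
result$p.value
```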
Spearman correlation
data %>% correlation::correlation(., method = "spearman")
continuous or ordinal data (rank-based, so no normality assumption)
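A sketch using base R's `cor()` in place of the `correlation` package, so that it runs without extra installs. The data are simulated; it shows that Spearman's coefficient is the Pearson correlation of the ranks.

```r
# Simulated monotonic but non-linear relationship (illustrative only)
set.seed(1)
x <- rnorm(50)
y <- exp(x) + rnorm(50, sd = 0.1)

# Spearman's rho from base R
rho <- cor(x, y, method = "spearman")

# Same value by hand: rank both variables, then take Pearson's r
rho_by_hand <- cor(rank(x), rank(y))
```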
what do the parts of the GLM stand for?
b0 = estimated value of the outcome when predictor = 0 (the intercept)
b1 = difference in means when the linear model compares two categorical groups
bn = parameter estimate for predictor n: direction/strength of effect, or difference in means
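A sketch of the two-group case, assuming simulated scores and hypothetical group labels: b0 equals the mean of the baseline group and b1 equals the difference between the two group means.

```r
# Two hypothetical groups of 20 simulated scores each
set.seed(7)
group <- factor(rep(c("control", "treatment"), each = 20))
score <- c(rnorm(20, mean = 10), rnorm(20, mean = 12))

model <- lm(score ~ group)
b0 <- unname(coef(model)[1])  # estimate when predictor = 0 (control group)
b1 <- unname(coef(model)[2])  # change in the outcome for the treatment group

means <- tapply(score, group, mean)
mean_difference <- means[["treatment"]] - means[["control"]]
```

Here `b0` matches the control mean and `b1` matches `mean_difference`, because R codes the first factor level as 0 by default.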
least squares estimation
standard error
central limit theorem
majority of scores around mean
normal distribution
mean ± 1.96 SD contains 95% of the data
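The 1.96 value can be checked from the standard normal distribution in base R:

```r
# z value that leaves 2.5% in each tail of a standard normal
z <- qnorm(0.975)                        # about 1.96

# Proportion of a normal distribution within 1.96 SD of the mean
coverage <- pnorm(1.96) - pnorm(-1.96)   # about 0.95
```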
confidence intervals
express estimates as intervals that are likely to contain the population value
95% of intervals constructed this way contain the true population parameter
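A rough simulation sketch of the 95% figure as a long-run property. The population mean, SD, sample size, and number of repetitions are all arbitrary assumptions.

```r
# Draw many samples from a population whose mean we know (illustrative values)
set.seed(123)
true_mean <- 50

contains_true <- replicate(2000, {
  s <- rnorm(30, mean = true_mean, sd = 10)
  ci <- t.test(s)$conf.int              # 95% confidence interval for the mean
  ci[1] <= true_mean && true_mean <= ci[2]
})

# Proportion of intervals that captured the true mean; should sit near 0.95
coverage <- mean(contains_true)
```

Any single interval either contains the true value or it does not; the 95% describes the procedure across repeated samples.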
interpreting parameter estimates
raw effect size is the beta estimate
standardised effect size: fit the model to data converted to z-scores (expressed in standardised units)
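A sketch contrasting the raw and standardised estimates, assuming simulated data with made-up means and SDs. `scale()` converts each variable to z-scores.

```r
# Simulated predictor and outcome on arbitrary raw scales
set.seed(3)
x <- rnorm(80, mean = 100, sd = 15)
y <- 5 + 0.2 * x + rnorm(80, sd = 4)

# Raw effect size: b for the data in their original units
raw_b <- coef(lm(y ~ x))[["x"]]

# Standardised effect size: same model fitted to z-scores
zx <- as.numeric(scale(x))
zy <- as.numeric(scale(y))
std_b <- coef(lm(zy ~ zx))[["zx"]]
```

With a single predictor, `std_b` equals Pearson's r between `x` and `y`.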
long run probability: parameters represent effects
relationships between variables
differences in means
long run probability: parameters reflect hypotheses
H0: b = 0, or b1 = b2
H1: b ≠ 0, or b1 ≠ b2
long run probability: test statistic
t = b / SE(b)
can work out how likely the observed value of t is if the null is true
value of t on the x axis and probability on the y axis
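A sketch of computing the test statistic by hand from a fitted model and looking up its probability under the null. The data are simulated and the effect of 0.4 is an arbitrary choice.

```r
# Simulated data with a modest true effect (illustrative)
set.seed(11)
x <- rnorm(40)
y <- 0.4 * x + rnorm(40)

model <- lm(y ~ x)
est <- summary(model)$coefficients

b    <- est["x", "Estimate"]
se_b <- est["x", "Std. Error"]

# Test statistic: estimate divided by its standard error
t_value <- b / se_b

# Two-tailed probability of a t at least this extreme if the null is true
p_value <- 2 * pt(abs(t_value), df = df.residual(model), lower.tail = FALSE)
```

Both values match the `t value` and `Pr(>|t|)` columns that `summary()` reports.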
type 1 error
reject null when it is true
believe in effects that don't exist
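A simulation sketch of the type 1 error rate, assuming two groups drawn from the same population so the null is true by construction. The group size and number of repetitions are arbitrary.

```r
# Both groups come from the same population, so any "effect" is a false positive
set.seed(2024)
p_values <- replicate(5000, {
  a <- rnorm(20)
  b <- rnorm(20)
  t.test(a, b)$p.value
})

# Long-run proportion of experiments that wrongly reject the null at alpha = .05
type1_rate <- mean(p_values < 0.05)
```

The rate should land close to 0.05, the chosen alpha level.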
type 2 error
fail to reject the null when it is false
statistical power
probability that a test avoids a type 2 error (1 - β)
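A sketch using base R's `power.t.test()` for a two-group comparison. The sample size (64 per group) and effect (0.5 SD) are assumed values chosen for illustration.

```r
# Power of a two-sample, two-sided t-test with 64 per group and d = 0.5
result <- power.t.test(n = 64, delta = 0.5, sd = 1, sig.level = 0.05)

power <- result$power  # probability of detecting the effect, roughly 0.80
beta  <- 1 - power     # probability of a type 2 error
```

Leaving `n` out and supplying `power` instead makes the same function return the sample size needed.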
problems with null hypothesis testing
problem with long run probability
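p is the long-run relative frequency of a test statistic at least as large as the observed one, relative to all test statistics from an infinite number of identical experiments with exactly the same a priori sample size
for a single study, the type 1 error rate is either 0 or 1 (an error was made or it was not)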
comparing sums of squares
- only compare the totals when based on same number of scores
illusory truth effect
repetition increases perceived truthfulness
equally true for plausible and implausible statements
SSt
- total variability between the mean and the scores
- SSt = SSm + SSr
- each SS has an associated df
- dfT = N - p (p = number of parameters, N = number of independent pieces of information)
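A sketch of the sums of squares for a simple linear model, assuming simulated data. For SSt the only parameter estimated is the mean, so p = 1 in this example.

```r
# Simulated data (illustrative values)
set.seed(5)
x <- rnorm(30)
y <- 1 + 2 * x + rnorm(30)
model <- lm(y ~ x)

ss_t <- sum((y - mean(y))^2)              # total: scores vs the mean
ss_r <- sum(resid(model)^2)               # residual: scores vs the model
ss_m <- sum((fitted(model) - mean(y))^2)  # model: model vs the mean

# Degrees of freedom for SSt: N - p, with p = 1 (the mean)
df_t <- length(y) - 1
```

`ss_m + ss_r` reproduces `ss_t`, which is the decomposition stated above.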
SSr
SSm
mean squared error