data definition
facts, quantities, or items of information about a person or object
data can be _ or _
quantitative (numerical) or qualitative (categorical)
how do we collect data in the sciences?
by conducting experiments
data helps us understand our
study system
ways to get/use data
measure, observe, generate
how to test hypotheses/expected outcomes of an experiment
use statistics with collected data
when do scientists use stats
To design studies and select sample sizes
To support or negate hypotheses
To understand error, uncertainty, and outliers in data
To interpret and summarize data
To make money??
4 top skills of data science
data processing
statistics
data visualization
presentation
summary statistics
presentation of data in easy-to-understand, easy-to-digest snippets
better than long lists of numbers, which are not informative or easily interpretable
mean
average of data
how to calculate mean
add the values together and divide by the sample size
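A minimal Python sketch of this calculation. The function name `mean` and the sample numbers are made up for illustration.

```python
def mean(values):
    """Arithmetic mean: the sum of the values divided by the sample size."""
    if not values:
        raise ValueError("mean needs at least one value")
    return sum(values) / len(values)

heights_cm = [150, 160, 170, 180]  # hypothetical measurements
print(mean(heights_cm))  # 165.0
```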
median
middle number of data set
how to find median
place the numbers in order and find the middle value (with an even count, average the two middle values)
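A minimal Python sketch of finding the median. The function name `median` and the sample numbers are made up for illustration. Averaging the two middle values for an even count is the usual convention, assumed here.

```python
def median(values):
    """Middle value of the sorted data."""
    ordered = sorted(values)  # place the numbers in order
    n = len(ordered)
    if n == 0:
        raise ValueError("median needs at least one value")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]  # odd count: a single middle value
    # even count: average the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([7, 1, 5]))     # 5
print(median([7, 1, 5, 3]))  # 4.0
```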
range
how spread out the data is
how to find range
largest - smallest value
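A minimal Python sketch of the range. The name `data_range` is made up (chosen to avoid clashing with Python's built-in `range`), and the sample numbers are illustrative.

```python
def data_range(values):
    """Range: largest value minus smallest value."""
    if not values:
        raise ValueError("range needs at least one value")
    return max(values) - min(values)

print(data_range([4, 9, 1, 7]))  # 8
```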
a good way to visualize the spread of data
use a box plot
box plots show
minimum
first quartile
median
mean (optional; not every box plot marks it)
third quartile
maximum
outliers
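A rough Python sketch of the numbers a box plot displays, using only the standard library. Two things here are assumptions, not from these notes: outliers are flagged with the common 1.5 x IQR rule, and quartiles use the "inclusive" method of `statistics.quantiles` (other conventions give slightly different quartiles). The name `box_plot_summary` and the data are made up.

```python
from statistics import mean, quantiles

def box_plot_summary(values, fence=1.5):
    """Return the values a box plot typically displays."""
    ordered = sorted(values)
    # Quartile conventions differ; "inclusive" is one option.
    q1, med, q3 = quantiles(ordered, n=4, method="inclusive")
    iqr = q3 - q1  # interquartile range: the height of the box
    low, high = q1 - fence * iqr, q3 + fence * iqr
    return {
        "minimum": ordered[0],
        "first_quartile": q1,
        "median": med,
        "mean": mean(ordered),
        "third_quartile": q3,
        "maximum": ordered[-1],
        # assumed rule: points beyond the fences count as outliers
        "outliers": [v for v in ordered if v < low or v > high],
    }

summary = box_plot_summary([1, 2, 3, 4, 5, 6, 7, 8, 30])
print(summary["median"])    # 5.0
print(summary["outliers"])  # [30]
```

Note how the single large value pulls the mean above the median, which is part of what a box plot makes visible.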
first step after asking a research question
generate null and alternative hypotheses
how to decide which hypothesis to reject
use more complex statistical analysis (like p-values)
p value definition
the probability of seeing differences between datasets at least as large as those observed if only natural variation were at work (that is, if the null hypothesis were true)
p represents
probability
p value will always be between
0-1 (0%-100%)
higher p value means
the observed differences are more consistent with natural variation alone
which hypothesis is favored by a higher p value
null (we fail to reject it)
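One way to see where a p value comes from is a permutation test, sketched below in Python. These notes do not name a specific test, so this choice is an assumption made for illustration. The idea: if only natural variation is at work, the group labels are arbitrary, so we try every relabeling and count how often the difference in means is at least as large as the one observed. The function name and the data are made up.

```python
from itertools import combinations

def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test on the difference in means."""
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)

    extreme = 0  # relabelings with a difference >= the observed one
    count = 0    # all relabelings tried
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
        count += 1
    return extreme / count  # always between 0 and 1

control = [1, 2, 3, 4]        # hypothetical measurements
treated = [11, 12, 13, 14]
print(round(permutation_p_value(control, treated), 3))  # 0.029
print(permutation_p_value([5, 5, 5], [5, 5, 5]))        # 1.0
```

The low value for the first pair says such a large difference would rarely arise from relabeling alone; the value of 1.0 for identical groups says the data give no reason to reject the null. Trying every relabeling is only practical for small samples.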