Why is sampling important in data analysis?
Sampling lets analysts draw inferences about a whole population from a manageable subset, without collecting data on every member.
What is data exploration?
Examining and summarizing data to understand its characteristics, spot patterns, and detect anomalies.
Name two common sampling methods.
Random sampling, stratified sampling
A ______ sample gives every member of the population an equal chance of being selected.
Random (simple random)
A ______ sample ensures all subgroups are proportionally represented.
Stratified
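The two sampling methods above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function names, the fixed seed, and the 80/20 "north"/"south" population are all hypothetical, and the stratified version assumes proportional allocation (the same fraction drawn from every subgroup).

```python
import random
from collections import defaultdict

def simple_random_sample(population, k, seed=0):
    """Draw k members without replacement; each member is equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, k)

def stratified_sample(population, key, fraction, seed=0):
    """Draw the same fraction from each subgroup (stratum) defined by key."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for member in population:
        strata[key(member)].append(member)
    sample = []
    for group in strata.values():
        # Round to a whole number of members, keeping at least one per stratum.
        n = max(1, round(len(group) * fraction))
        sample.extend(rng.sample(group, n))
    return sample

# Hypothetical population: 80 members in "north", 20 in "south".
population = [("north", i) for i in range(80)] + [("south", i) for i in range(20)]

random_pick = simple_random_sample(population, 10)
stratified_pick = stratified_sample(population, key=lambda m: m[0], fraction=0.1)
```

In this sketch the stratified sample always contains 8 "north" and 2 "south" members, mirroring the 80/20 split, whereas the simple random sample only matches that split on average.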
What is one risk of using a non-representative sample?
It can lead to biased conclusions that don’t reflect the population accurately.
Name one technique for exploring data visually.
Examples: histograms, scatter plots, box plots, bar charts.
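As a rough illustration of what a histogram shows, the sketch below counts values per bin and prints each bin as a row of marks. It is a text-only stand-in for a real plotting library; the function name `text_histogram`, the bin width, and the `ages` data are hypothetical.

```python
from collections import Counter

def text_histogram(values, bin_width):
    """Count values per bin and render each bin as a row of '#' marks."""
    # Map each value to the lower edge of its bin, then count per bin.
    counts = Counter((v // bin_width) * bin_width for v in values)
    lines = []
    for start in sorted(counts):  # bins with no values are simply omitted
        label = f"{start}-{start + bin_width - 1}"
        lines.append(f"{label:>7} | {'#' * counts[start]}")
    return lines

# Hypothetical data: ages of ten survey respondents.
ages = [21, 25, 27, 31, 34, 35, 38, 42, 47, 63]
for line in text_histogram(ages, bin_width=10):
    print(line)
```

The row lengths make the shape of the distribution visible at a glance, and the isolated 60-69 bin stands out from the rest.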
Which of these is NOT typically part of data exploration?
A) Summarizing variables
B) Identifying outliers
C) Running a full regression model
D) Checking distributions
C) Running a full regression model
Looking for unusual values that may indicate errors or interesting cases is called ______.
Outlier detection / anomaly detection
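One common way to flag unusual values is the 1.5 x IQR rule, sketched below. The rule is an assumption chosen for illustration, not the only option; the function name `iqr_outliers` and the `readings` data are hypothetical.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR below Q1 or above Q3."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical sensor readings with one suspicious value.
readings = [12, 13, 12, 14, 13, 15, 14, 13, 98]
flagged = iqr_outliers(readings)
```

A flagged value is a candidate for review, not automatically an error: it may be a data-entry mistake or a genuinely interesting case.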
Why is data exploration iterative?
Initial exploration may reveal new questions, missing data, or interesting patterns that need deeper investigation.
Give an example of a summary statistic used in data exploration.
Examples: mean, median, mode, standard deviation, counts.
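The summary statistics listed above can be computed with Python's standard `statistics` module. This is a minimal sketch on a small hypothetical list of values.

```python
import statistics

# Hypothetical data for illustration.
values = [4, 8, 8, 5, 10]

summary = {
    "count": len(values),
    "mean": statistics.mean(values),
    "median": statistics.median(values),
    "mode": statistics.mode(values),
    "stdev": statistics.stdev(values),  # sample standard deviation (n - 1)
}
```

Comparing the mean with the median is a quick first check for skew: a large gap between them suggests a few extreme values are pulling the mean.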
How do sampling and exploration help in decision-making?
They provide a manageable, representative view of data and highlight trends, patterns, or issues before deeper analysis.