Stats Flashcards

(27 cards)

1
Q

Define mean.

A

The arithmetic average of a set of numbers, calculated by dividing their sum by their count.

2
Q

What function calculates the mean in R?

A

The function is mean(). It computes the average of numeric values.
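
As a minimal sketch, the snippet below compares the manual calculation (sum divided by count) with mean(). The scores vector is made up for illustration.

```r
# Hypothetical exam scores, invented for this example
scores <- c(70, 80, 90, 100)

# Manual calculation: sum divided by count
manual_mean <- sum(scores) / length(scores)

# Built-in function
builtin_mean <- mean(scores)

print(manual_mean)   # 85
print(builtin_mean)  # 85

# mean() returns NA if any value is missing, unless na.rm = TRUE
with_missing <- c(scores, NA)
print(mean(with_missing))                # NA
print(mean(with_missing, na.rm = TRUE))  # 85
```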

3
Q

True or false: Median is always the same as the mean.

A

FALSE

The median is the middle value of the sorted data. It can differ from the mean, especially when the data are skewed or contain outliers.
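
A small illustration, using made-up income figures, of how one extreme value pulls the mean away from the median:

```r
# Hypothetical incomes in thousands; the last value is deliberately extreme
incomes <- c(30, 32, 35, 38, 400)

income_mean <- mean(incomes)      # sum 535 divided by 5 values
income_median <- median(incomes)  # middle value of the sorted data

print(income_mean)    # 107
print(income_median)  # 35
```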

4
Q

Fill in the blank: The standard deviation measures _______ in a dataset.

A

Variability or dispersion
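
The sketch below uses sd() on two invented vectors with the same mean but different spread. It also computes the sample standard deviation by hand, which divides by n - 1.

```r
# Two hypothetical datasets with the same mean (10) but different spread
tight <- c(9, 10, 11)
wide <- c(0, 10, 20)

sd_tight <- sd(tight)
sd_wide <- sd(wide)

print(sd_tight)  # 1
print(sd_wide)   # 10

# Manual sample standard deviation: square root of the
# sum of squared deviations divided by (n - 1)
manual_sd <- sqrt(sum((wide - mean(wide))^2) / (length(wide) - 1))
print(manual_sd)  # 10
```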

5
Q

What does the summary() function do in R?

A

It returns a summary suited to the object it is given. For a numeric vector, that is the minimum, first quartile, median, mean, third quartile, and maximum.
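
For example, on a small made-up numeric vector (a sketch; other object types such as data frames or fitted models produce different summaries):

```r
# Hypothetical values with one large observation
x <- c(1, 2, 3, 4, 100)

s <- summary(x)
print(s)  # Min., 1st Qu., Median, Mean, 3rd Qu., Max.

# Individual entries can be pulled out by name
print(s[["Median"]])  # 3
print(s[["Mean"]])    # 22
```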

6
Q

Define outlier.

A

A data point that significantly differs from other observations in a dataset.

7
Q

What is the purpose of the boxplot() function?

A

To visualize the distribution of data and identify outliers.
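
A minimal sketch with invented values. With plot = FALSE, boxplot() skips drawing and returns the statistics it would plot, including the points it flags as outliers (by default, those more than 1.5 times the box length beyond the box).

```r
# Hypothetical measurements; 95 is far from the rest
values <- c(12, 14, 15, 15, 16, 17, 18, 95)

# Compute the box plot statistics without drawing
bp <- boxplot(values, plot = FALSE)

print(bp$stats)  # lower whisker, lower hinge, median, upper hinge, upper whisker
print(bp$out)    # 95

# To draw the plot itself, call: boxplot(values)
```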

8
Q

True or false: A p-value less than 0.05 indicates statistical significance.

A

TRUE

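This holds by convention, with the significance level set at 0.05. A p-value below that threshold means the observed data would be unlikely if the null hypothesis were true, so the null hypothesis is rejected.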

9
Q

Fill in the blank: A normal distribution is also known as a _______ distribution.

A

Gaussian

10
Q

What does the cor() function compute?

A

It calculates the correlation coefficient between two variables (Pearson's by default).
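
A short sketch using made-up vectors with perfectly linear relationships, so the expected coefficients are +1 and -1 (up to floating-point rounding):

```r
x <- c(1, 2, 3, 4, 5)
y_up <- 2 * x + 1      # increases with x
y_down <- -3 * x + 10  # decreases with x

r_up <- cor(x, y_up)
r_down <- cor(x, y_down)

print(r_up)    # 1
print(r_down)  # -1

# The method argument selects a rank-based coefficient instead
r_spearman <- cor(x, y_up, method = "spearman")
print(r_spearman)  # 1
```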

11
Q

Define regression analysis.

A

A statistical method for modeling the relationship between a dependent variable and one or more independent variables.

12
Q

What is the purpose of the lm() function in R?

A

To fit linear models for regression analysis.
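
As an illustration, the sketch below fits a simple linear regression on mtcars, a dataset that ships with R. The choice of mpg as the dependent variable and wt as the predictor is an assumption made for this example.

```r
# Model fuel economy (mpg) as a linear function of weight (wt)
fit <- lm(mpg ~ wt, data = mtcars)

# Estimated intercept and slope
coefs <- coef(fit)
print(coefs)

# Predict mpg for a hypothetical car with wt = 3 (3000 lbs)
predicted <- predict(fit, newdata = data.frame(wt = 3))
print(predicted)
```

Heavier cars in this dataset have lower fuel economy, so the slope for wt is negative.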

13
Q

True or false: Histograms display frequency distributions of continuous data.

A

TRUE

They show how many observations fall into each of a series of consecutive intervals (bins).
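
A minimal sketch with invented heights. Passing plot = FALSE makes hist() return the bin counts instead of drawing, which shows what the bars would represent.

```r
# Hypothetical heights in centimetres
heights <- c(150, 152, 158, 161, 163, 167, 171, 178)

# Three bins: 150 to 160, 160 to 170, 170 to 180
h <- hist(heights, breaks = c(150, 160, 170, 180), plot = FALSE)

print(h$breaks)  # 150 160 170 180
print(h$counts)  # 3 3 2

# To draw the histogram, call: hist(heights, breaks = c(150, 160, 170, 180))
```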

14
Q

Fill in the blank: The t-test compares means between _______ groups.

A

Two
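
A sketch of a two-sample comparison with t.test(), using made-up measurements for two groups. By default, t.test() runs Welch's version, which does not assume equal variances.

```r
# Hypothetical measurements for two independent groups
group_a <- c(5.1, 4.9, 5.6, 5.8, 6.0, 5.3)
group_b <- c(6.8, 7.1, 6.5, 7.4, 6.9, 7.2)

result <- t.test(group_a, group_b)
print(result)

# Compare the p-value with the conventional 0.05 threshold
is_significant <- result$p.value < 0.05
print(is_significant)  # TRUE for these clearly separated groups
```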

15
Q

What does the ggplot2 package do?

A

It provides a system for creating complex graphics based on the Grammar of Graphics.

16
Q

Define confidence interval.

A

A range of values, computed from sample data, that is likely to contain the population parameter at a stated confidence level (for example, 95%).
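
One common case is a 95% confidence interval for a mean based on the t distribution. The sketch below, on invented data, gets it from t.test() and then rebuilds it by hand as mean plus or minus a t quantile times the standard error.

```r
# Hypothetical sample
sample_values <- c(4.8, 5.1, 5.0, 4.9, 5.3, 5.2, 4.7, 5.0)

# Interval reported by t.test()
ci <- t.test(sample_values, conf.level = 0.95)$conf.int
print(ci)

# Same interval computed manually
n <- length(sample_values)
m <- mean(sample_values)
se <- sd(sample_values) / sqrt(n)      # standard error of the mean
margin <- qt(0.975, df = n - 1) * se   # 0.975 leaves 2.5% in each tail
manual_ci <- c(m - margin, m + margin)
print(manual_ci)
```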

17
Q

What is the purpose of the shapiro.test() function?

A

To test the normality of a dataset using the Shapiro-Wilk test.
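
A sketch of how the test behaves. To keep it deterministic, it uses idealized quantiles from a normal and an exponential distribution instead of real or random data. The null hypothesis is that the data come from a normal distribution, so a small p-value is evidence against normality.

```r
# Evenly spaced probabilities, turned into idealized samples
p <- ppoints(50)
normal_like <- qnorm(p)  # quantiles of a standard normal distribution
skewed <- qexp(p)        # quantiles of an exponential distribution

normal_result <- shapiro.test(normal_like)
skewed_result <- shapiro.test(skewed)

print(normal_result$p.value)  # large: no evidence against normality
print(skewed_result$p.value)  # small: normality is rejected
```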

18
Q

True or false: ANOVA is used to compare means across multiple groups.

A

TRUE

It stands for Analysis of Variance.
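
As an illustration, a one-way ANOVA can be run with aov(). This sketch uses PlantGrowth, a dataset that ships with R and records plant weights for a control group and two treatment groups.

```r
# Three groups: ctrl, trt1, trt2
print(levels(PlantGrowth$group))

# One-way ANOVA: does mean weight differ across the groups?
fit <- aov(weight ~ group, data = PlantGrowth)
anova_table <- summary(fit)
print(anova_table)

# Extract the p-value for the group effect
p_value <- anova_table[[1]][["Pr(>F)"]][1]
print(p_value)
```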

19
Q

Fill in the blank: Data frames in R are similar to _______ in Excel.

20
Q

What is the dplyr package used for?

A

For data manipulation and transformation in R.

21
Q

Define variable.

A

A characteristic or attribute that can take on different values.

22
Q

What does the tidyverse include?

A

A collection of R packages designed for data science, including ggplot2 and dplyr.

23
Q

True or false: Factor variables are used for categorical data in R.

A

TRUE

They help in statistical modeling and plotting.
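
A minimal sketch with made-up labels, showing how factor() stores categories as levels with integer codes underneath:

```r
# Hypothetical categorical data
sizes <- c("small", "large", "medium", "small", "large", "small")

# Set the level order explicitly; the default would be alphabetical
size_factor <- factor(sizes, levels = c("small", "medium", "large"))

print(levels(size_factor))      # "small" "medium" "large"
print(as.integer(size_factor))  # 1 3 2 1 3 1

# Count observations per category
size_counts <- table(size_factor)
print(size_counts)  # small 3, medium 1, large 2
```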

24
Q

Fill in the blank: The plot() function creates a _______ of data points.

25
Q

What is the R-squared value?

A

The proportion of the variance in the dependent variable that is explained by the independent variable(s) in a regression model.

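
A sketch of where R-squared appears in R, using the built-in mtcars dataset (the model mpg ~ wt is chosen only for illustration). It also recomputes the value as 1 minus the ratio of the residual sum of squares to the total sum of squares.

```r
fit <- lm(mpg ~ wt, data = mtcars)

# R-squared as reported by summary()
r2 <- summary(fit)$r.squared
print(r2)

# Manual calculation
ss_res <- sum(residuals(fit)^2)                   # unexplained variation
ss_tot <- sum((mtcars$mpg - mean(mtcars$mpg))^2)  # total variation
manual_r2 <- 1 - ss_res / ss_tot
print(manual_r2)
```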
26
Q

Define sample size.

A

The number of observations or data points collected in a study.

27
Q

What does the na.omit() function do?

A

It removes missing values (NA) from a dataset. For a data frame, it drops every row that contains at least one NA.
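
A short sketch on made-up data, showing na.omit() applied to a vector and to a data frame:

```r
# Vector with missing values
x <- c(1, NA, 3, NA, 5)
clean_x <- na.omit(x)
print(as.numeric(clean_x))  # 1 3 5

# Data frame: any row containing an NA is dropped
df <- data.frame(id = 1:4, score = c(90, NA, 75, 88))
clean_df <- na.omit(df)
print(clean_df)        # rows with id 1, 3, and 4 remain
print(nrow(clean_df))  # 3
```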