05_Exploratory_Data_Analysis Flashcards

(6 cards)

1
Q

Front

A

Back

2
Q

How do you compute grouped summary statistics quickly?

A

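Use dplyr::group_by() followed by summarise(); across() applies the same set of summary functions to several columns at once.

Code:
library(dplyr)
# Mean and SD of mpg, disp, and hp within each cylinder count
mtcars %>% group_by(cyl) %>% summarise(across(c(mpg, disp, hp), list(mean = mean, sd = sd)))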
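
For a cross-check without dplyr, a minimal base R sketch follows. It swaps in stats::aggregate() in place of the dplyr pipeline and computes only group means, so treat it as an illustration, not a replacement for the full mean/sd table. The name grp_means is just a label chosen for this example.

```r
# Base R alternative (not the dplyr approach from the card):
# group means of mpg, disp, and hp by cylinder count.
grp_means <- aggregate(cbind(mpg, disp, hp) ~ cyl, data = mtcars, FUN = mean)
print(grp_means)

# Spot-check one cell against a direct computation.
check <- all.equal(grp_means$mpg[grp_means$cyl == 4],
                   mean(mtcars$mpg[mtcars$cyl == 4]))
```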

3
Q

How do you make a scatter plot with ggplot2?

A

Map two numeric variables to x and y in aes(), then add geom_point().

Code:
library(ggplot2)
ggplot(mtcars, aes(wt, mpg)) + geom_point() + labs(x = 'Weight', y = 'MPG')

4
Q

How do you plot a histogram and a boxplot?

A

Use geom_histogram() to show the shape of one variable's distribution; use geom_boxplot() to compare spread and flag outliers across groups.

Code:
library(ggplot2)
ggplot(mtcars, aes(mpg)) + geom_histogram(bins = 20)
ggplot(mtcars, aes(factor(cyl), mpg)) + geom_boxplot()
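
To see which points a boxplot would mark as outliers, the usual 1.5 × IQR rule can be sketched in base R. This is a rough approximation: iqr_outliers is a hypothetical helper written for this card, and its quartiles come from quantile() with default settings, so results may differ slightly from a plotting function that computes hinges another way.

```r
# Hypothetical helper: values lying more than coef * IQR beyond the quartiles.
iqr_outliers <- function(x, coef = 1.5) {
  q <- quantile(x, c(0.25, 0.75), names = FALSE)  # lower and upper quartile
  fence <- coef * (q[2] - q[1])                   # whisker reach
  x[x < q[1] - fence | x > q[2] + fence]
}

# Outliers in mpg overall, then separately within each cylinder group.
overall  <- iqr_outliers(mtcars$mpg)
by_group <- tapply(mtcars$mpg, mtcars$cyl, iqr_outliers)
```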

5
Q

How do you facet plots by a category?

A

Add facet_wrap(~ var) to draw one panel per level of var.

Code:
library(ggplot2)
ggplot(iris, aes(Sepal.Length, Petal.Length)) + geom_point() + facet_wrap(~ Species)

6
Q

How do you compute and test correlations?

A

Use cor() for the coefficient and cor.test() for inference (test statistic, p-value, and confidence interval).

Code:
cor(mtcars$mpg, mtcars$wt, method = 'pearson')  # 'pearson' is the default
cor.test(mtcars$mpg, mtcars$wt)
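
cor.test() returns a list-like object, so the individual results can be pulled out by name. The sketch below assumes the default Pearson test; the variable names (res, r_hat, p_val, ci, rho, res_s) are chosen only for illustration. It also shows the Spearman variant as a rank-based option when the relationship may be monotonic but not linear.

```r
# Store the test result so individual pieces can be extracted by name.
res <- cor.test(mtcars$mpg, mtcars$wt)   # Pearson by default

r_hat <- unname(res$estimate)  # sample correlation coefficient
p_val <- res$p.value           # p-value for H0: true correlation is 0
ci    <- res$conf.int          # confidence interval (95% by default)

# The estimate should agree with cor() on the same columns.
same <- all.equal(r_hat, cor(mtcars$mpg, mtcars$wt))

# Rank-based alternative; exact = FALSE avoids the ties warning.
rho   <- cor(mtcars$mpg, mtcars$wt, method = "spearman")
res_s <- cor.test(mtcars$mpg, mtcars$wt, method = "spearman", exact = FALSE)
```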
