Lab6 - data manipulation Flashcards

(16 cards)

1
Q

What is data manipulation?

A

Cleaning, transforming, and organising data so that it is ready for analysis.

2
Q

Why manipulate data?

A

To make raw data suitable for analytics tasks.

3
Q

Give examples of manipulation processes.

A

Filter, create variables, summarise, impute, reorder observations.

4
Q

Give an example of narrowing observations.

A

Select only the most recent year's products, or subset rows by a condition.
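
A minimal base R sketch of both kinds of narrowing. The `products` table, its column names, and its values are made up for illustration.

```r
# Hypothetical product table; names and values are invented for this example.
products <- data.frame(
  name  = c("A", "B", "C", "D"),
  year  = c(2021, 2023, 2023, 2022),
  price = c(10, 25, 8, 40)
)

# Keep only the rows from the most recent year present in the data.
recent <- products[products$year == max(products$year), ]

# Subset by an arbitrary condition: products that cost more than 20.
expensive <- subset(products, price > 20)
```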

5
Q

Give an example of variable creation.

A

Compute BMI from weight and height.
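
As a sketch, the new variable can be added as a column in base R. The `people` table is hypothetical, and the example assumes weight in kilograms and height in metres, so that BMI = weight / height^2.

```r
# Hypothetical measurements; units (kg and m) are an assumption of this sketch.
people <- data.frame(
  weight_kg = c(70, 85, 54),
  height_m  = c(1.75, 1.80, 1.60)
)

# Create the new variable: BMI = weight (kg) / height (m) squared.
people$bmi <- people$weight_kg / people$height_m^2
```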

6
Q

Give an example of summarising.

A

Calculate counts, means, or group summaries.
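
The three kinds of summary can be sketched in base R as follows. The `sales` table and its values are invented for illustration.

```r
# Hypothetical sales data, invented for this example.
sales <- data.frame(
  region = c("north", "south", "north", "south", "north"),
  amount = c(100, 80, 120, 60, 110)
)

# Counts: number of rows per region.
counts <- table(sales$region)

# Mean: average amount over all rows.
overall_mean <- mean(sales$amount)

# Group summary: mean amount within each region.
by_region <- aggregate(amount ~ region, data = sales, FUN = mean)
```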

7
Q

Give an example of imputation.

A

Add or update values to fill missing data.
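
One common way to fill missing values is mean imputation, sketched below in base R. The card does not name a specific method, so the choice of the mean here is an assumption, and the `scores` vector is made up.

```r
# Hypothetical scores with two missing values.
scores <- c(12, NA, 15, 9, NA)

# Mean of the observed (non-missing) values.
observed_mean <- mean(scores, na.rm = TRUE)

# Replace each NA with that mean; observed values are left unchanged.
imputed <- ifelse(is.na(scores), observed_mean, scores)
```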

8
Q

What is the tbl class (tidyverse)?

A

A tibble: a tidyverse variant of the data frame with more consistent behaviour and cleaner printing.

9
Q

How do tibbles print?

A

Only the first 10 rows are shown, and columns that do not fit on screen are omitted for legibility.

10
Q

How do tibbles behave differently from data.frames?

A
  1. Easier to work with list columns
  2. Doesn't change column names (read.csv() does)
  3. Evaluates columns sequentially, so a later column can refer to an earlier one
  4. Displays more useful information when printed
  5. Subsetting consistently returns a tbl
  6. Extraction with $ requires complete column names (no partial matching)
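
Several of these behaviours can be checked with a short sketch. It assumes the tibble package is installed, and all object and column names are made up.

```r
library(tibble)

# Sequential evaluation: y refers to x, which was defined just before it.
tb <- tibble(x = 1:3, y = x * 2)

# Column names are kept as given; data.frame() rewrites "my col" to "my.col".
tb_names <- names(tibble(`my col` = 1:2))
df_names <- names(data.frame(`my col` = 1:2))

# Subsetting one column with [ still returns a tibble,
# whereas a data.frame drops to a plain vector.
df <- data.frame(x = 1:3, y = c(2, 4, 6))
tb_sub <- tb[, "x"]
df_sub <- df[, "x"]

# $ needs the complete column name on a tibble; a data.frame matches partially.
tb_val <- tibble(value = 1:3)
df_val <- data.frame(value = 1:3)
partial_tb <- suppressWarnings(tb_val$val)  # NULL (tibble also warns)
partial_df <- df_val$val                    # the full "value" column
```
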
11
Q

What does cbind() do?

A

Binds objects by column, i.e. side by side (matrices -> matrix, data frames -> data frame); the row counts must match.
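
A small base R sketch of column-binding, with made-up objects, including what happens when the row counts differ.

```r
# Two 2-row matrices bound side by side give a 2 x 4 matrix.
m <- cbind(matrix(1:4, nrow = 2), matrix(5:8, nrow = 2))

# Two 3-row data frames bound side by side give one data frame.
ids     <- data.frame(id = 1:3)
results <- data.frame(score = c(90, 75, 60))
df <- cbind(ids, results)

# Data frames with different row counts (3 vs 2) cannot be column-bound.
bad <- try(cbind(ids, data.frame(z = 1:2)), silent = TRUE)
```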

12
Q

What does rbind() do?

A

Binds objects by row, i.e. stacked (matrices -> matrix, data frames -> data frame); the column counts must match and, for data frames, so must the column names.
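
A small base R sketch of row-binding, with made-up objects, including what happens when the column names differ.

```r
# A 2 x 2 matrix stacked on a 1 x 2 matrix gives a 3 x 2 matrix.
m <- rbind(matrix(1:4, nrow = 2), matrix(5:6, nrow = 1))

# Two data frames with identical column names stack into one.
first_batch  <- data.frame(id = 1:2, score = c(90, 75))
second_batch <- data.frame(id = 3:4, score = c(60, 88))
all_rows <- rbind(first_batch, second_batch)

# A data frame whose column names differ ("mark" vs "score") cannot be stacked.
bad <- try(rbind(first_batch, data.frame(id = 5, mark = 70)), silent = TRUE)
```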

13
Q

Which join merges carrier names into flights?

A
left_join(flights, airlines, by = "carrier")
14
Q

What does left_join keep?

A

All rows of the left table; values from matching rows of the right table are added, and left rows with no match get NA.
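
The behaviour can be seen on two tiny stand-in tables. This sketch assumes the dplyr package is installed; `flights_small` and `airlines_small` are made up and are not the real flights and airlines data.

```r
library(dplyr)

# Hypothetical left table: one row per flight; carrier "ZZ" has no match.
flights_small <- data.frame(
  flight  = c(1, 2, 3),
  carrier = c("AA", "UA", "ZZ")
)

# Hypothetical right table: carrier "DL" is not used by any flight.
airlines_small <- data.frame(
  carrier = c("AA", "UA", "DL"),
  name    = c("American", "United", "Delta")
)

# Every flight is kept; name is NA for "ZZ", and "DL" does not appear.
joined <- left_join(flights_small, airlines_small, by = "carrier")
```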

15
Q

What must match for rbind()?

A

The same number of columns and the same column names across data frames.

16
Q

What must match for cbind()?

A

Same number of rows across objects.