Data Analytics & Visualisation Flashcards

(51 cards)

1
Q

What is the PPDAC process?

A

A framework for doing data analysis
- Problem
- Plan (What to measure and how)
- Data (collect, manage, process)
- Analysis (reorganise, run analysis, identify patterns)
- Conclusion (Interpret the results)

2
Q

What is structured data?

A
  • Structured data: data organised in rows and columns (e.g. tables)
  • Unstructured data: data without a fixed row-and-column layout, such as text or audio
3
Q

What are the two general goals and learning paradigms?

A
  • Description of data and knowledge discovery
  • Prediction
4
Q

What is supervised learning?

A
  • Inputs and the correct outputs are given –> the model learns the mapping from inputs to outputs
5
Q

What is unsupervised learning?

A
  • No given output –> model tries to uncover structure on its own
6
Q

What is the objective for interpretative / analytical data?

A

Understand, summarise, segment & support decisions

7
Q

What are typical sources of interpretative / analytical data?

A
  • CRM data
  • Surveys
  • Admin data
8
Q

What do the variables for interpretative / analytical data usually look like?

A
  • Constructed to be meaningful and stable
9
Q

What are typical results when using interpretative / analytical data?

A
  • Interpretable summaries
  • Insights that can be communicated and acted upon
10
Q

What is the objective for Opportunistic / (pure) predictive data?

A

Predict outcomes

11
Q

What are typical sources of Opportunistic / (pure) predictive data?

A
  • Web logs
  • App telemetry
  • Network traffic
12
Q

What do the variables for Opportunistic / (pure) predictive data usually look like?

A
  • Many; often proxies, individual meaning may be weak or unstable
13
Q

What are typical results when using Opportunistic / (pure) predictive data?

A
  • Models assessed by predictive accuracy
14
Q

What is dimensionality reduction?

A

Combining many starting variables into a smaller number of factors (in the extreme case, a single final factor)

15
Q

What are the two command types in R (as used in RStudio)?

A
  • Expressions
  • Assignments
16
Q

What are expressions?

A

Something that R can evaluate to produce a value
- E.g. 3+5

17
Q

What are Assignments?

A

Assignments store the result of an expression in a variable
- E.g. x <- expression

18
Q

How are expressions separated?

A
  • New line or ;
19
Q

How can you group expressions?

A
  • Using parentheses (), which control the order of evaluation, e.g. (3 + 5) * 2
20
Q

What are the assignment operators? (How is a value assigned?)

A

Either = or <-
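
The cards on expressions, assignments, separators and grouping can be tried out together. The following is a minimal sketch; the variable names (x, y, a, b, z) are made up for illustration.

```r
# An expression is evaluated and its value is printed.
3 + 5

# An assignment stores the value of an expression under a name.
x <- 3 + 5
y = x * 2          # `=` also assigns at the top level

# Two commands on one line, separated by `;`.
a <- 1; b <- 2

# Parentheses control the order of evaluation inside one expression.
z <- (a + b) * y
print(z)
```

Most R style guides prefer `<-` for assignment and reserve `=` for naming function arguments, but both work here.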

21
Q

How can we have multiple expressions treated as one unit?
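
The card gives no answer. One standard way in R, shown here as an assumption about what the card is after, is to wrap the expressions in curly braces {}. The names total, first and second are illustrative only.

```r
# Curly braces turn several expressions into one compound expression.
# The value of the block is the value of its last expression.
total <- {
  first <- 10
  second <- 32
  first + second
}
print(total)
```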

22
Q

What is a vector?

A
  • Ordered collection of elements of the same type
23
Q

What are the basic vector types?

A
  • Numeric (double) –> floating point possible
  • Numeric (integer) –> Whole numbers only
  • Character
  • Logical
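
A short sketch of the four basic vector types, using typeof() to check how each one is stored. The variable names are illustrative.

```r
dbl <- c(1.5, 2, 3.25)       # numeric (double)
int <- c(1L, 2L, 3L)         # numeric (integer); the L suffix marks integers
chr <- c("a", "b", "c")      # character
lgl <- c(TRUE, FALSE, TRUE)  # logical

types <- c(typeof(dbl), typeof(int), typeof(chr), typeof(lgl))
print(types)
```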
24
Q

How can you compute the variance of a vector?
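
The card gives no answer. A likely candidate is base R's var(), which computes the sample variance. The sketch below compares it with a manual calculation on made-up data.

```r
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

# var() returns the sample variance (divides by n - 1, not n).
v <- var(x)

# The same quantity computed by hand, for comparison.
manual <- sum((x - mean(x))^2) / (length(x) - 1)

print(c(v, manual))
```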

25
What are the different types of null and special values, and what do they mean?
- NULL: the object is actually empty (it holds no value at all)
- NA: a value is not available, e.g. because of missing input
- NaN: "not a number"; the result cannot be computed, e.g. 0/0
- Inf: infinity, e.g. the result of 1/0 or of a number too large to represent
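
A minimal sketch of how these values arise and how to test for them. The variable names are illustrative.

```r
empty <- NULL
n_empty <- length(empty)          # NULL has length 0

x <- c(1, NA, 3)
plain_mean <- mean(x)             # NA: one missing value makes the mean unknown
clean_mean <- mean(x, na.rm = TRUE)  # drop missing values first

nan_value <- 0 / 0                # NaN: undefined computation
inf_value <- 1 / 0                # Inf: division of a positive number by zero

print(c(is.null(empty), is.na(plain_mean), is.nan(nan_value), is.infinite(inf_value)))
```

Note that is.na() also returns TRUE for NaN, so is.nan() is the test to use when the two need to be told apart.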
26
What types of operators are there in R?
- Arithmetic (e.g. *, +, /)
- Comparison (e.g. >, <=, ==)
- Logical (e.g. !, &, |, and the function xor())
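
The three operator groups can be tried on two made-up numbers, a and b:

```r
a <- 7; b <- 2

arith <- c(a + b, a * b, a / b)       # arithmetic
comp  <- c(a > b, a <= b, a == b)     # comparison, returns logical values
logic <- c(!TRUE, TRUE & FALSE, TRUE | FALSE, xor(TRUE, TRUE))  # logical

print(arith)
print(comp)
print(logic)
```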
27
What is a function?
- A collection of commands to manipulate objects (e.g. make calculations or define new objects)
28
How is a function call built?
- The function name followed by a list of arguments in parentheses (), separated by commas
29
How can I find out the list of arguments needed for a function?
args()
30
How can we make sure to print only a certain number of digits after the decimal point?
Add the argument digits = 2 to the function call, e.g. round(x, digits = 2)
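
A small sketch combining the last three cards: a function call with comma-separated arguments, args() to inspect the arguments, and a digits argument. The choice of sd() and round() is illustrative; the card does not say which function digits = 2 belongs to.

```r
# args() prints the argument list of a function.
args(sd)

# formals() returns the same information as a named list.
sd_args <- names(formals(sd))
print(sd_args)

# A call: function name, then arguments in parentheses, separated by commas.
rounded <- round(3.14159, digits = 2)
print(rounded)
```

Here round() controls the number of decimal places; in print() and signif() the digits argument means significant digits instead.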
31
What are R objects and what are they classified by?
Anything that can be stored in memory and given a name. Objects are classified by:
- Mode (storage type)
- Class (how R treats it)
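
The difference between mode and class shows up when the two disagree. A minimal sketch with made-up objects:

```r
x  <- c(1.5, 2.5)
f  <- factor(c("low", "high", "low"))
df <- data.frame(a = 1:2)

modes   <- c(mode(x), mode(f), mode(df))     # how the data is stored
classes <- c(class(x), class(f), class(df))  # how R treats the object

print(modes)
print(classes)
```

A factor is stored as numbers but treated as categories, and a data frame is stored as a list but treated as a table.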
32
What objects are there?
- Vector
- Matrix
- Factor
- Data frame
- Function
- List
33
What is a Vector?
A one-dimensional collection of values of the same type, like a single column in Excel (e.g. c(1,2,3))
34
What is a Matrix?
- A 2D table of values of the same type
- Multiple columns and rows
35
What is a Factor?
- A vector of categories
36
What is a dataframe?
A table of columns, where each column can have its own type; comparable to a real Excel sheet
36
What is a list?
- A container that can hold anything
37
What is a function?
- Object that performs computation
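
The object types from the cards above, created side by side. All names and values are made up for illustration.

```r
v  <- c(1, 2, 3)                          # vector: one dimension, one type
m  <- matrix(1:6, nrow = 2)               # matrix: 2 rows x 3 columns, one type
f  <- factor(c("red", "blue", "red"))     # factor: a vector of categories
df <- data.frame(id = 1:3,                # data frame: columns of different types
                 colour = c("red", "blue", "red"))
l  <- list(v, m, "text", TRUE)            # list: can hold anything
square <- function(x) x^2                 # function: performs a computation

print(dim(m))
print(levels(f))
print(square(4))
```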
38
How are vectors created?
c(1,2,3)
39
How can you generate a sequence as a numeric vector?
- From integers: 1:2
- From a function:
--> seq(1, 10, by = 2)
--> rep(1, 10)
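
A sketch of the three approaches, assuming seq() is the function meant for stepped sequences. The step size of 2 is an arbitrary choice.

```r
ints     <- 1:5                  # consecutive integers
stepped  <- seq(1, 10, by = 2)   # from 1 to at most 10 in steps of 2
repeated <- rep(1, 10)           # the value 1 repeated ten times

print(ints)
print(stepped)
print(repeated)
```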
39
How can you generate a sequence from character and logical vectors?
- letters[1:5]
- paste0("c", 1:10) (no separator) or paste("c", 10) (inserts a space, giving "c 10")
- rep("female",2)
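
The same calls on short inputs, so the results are easy to read. The logical example with rep() is an added illustration, not taken from the card.

```r
first_letters <- letters[1:5]          # built-in lowercase alphabet
no_space   <- paste0("c", 1:3)         # "c1" "c2" "c3"
with_space <- paste("c", 1:3)          # "c 1" "c 2" "c 3" (default separator is a space)
labels     <- rep("female", 2)
flags      <- rep(c(TRUE, FALSE), times = 2)

print(no_space)
print(with_space)
```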
39
What happens when you mix types within a vector?
- R coerces all elements to the most flexible type (usually character)
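
A minimal sketch of this coercion with made-up values:

```r
mixed   <- c(1, "a", TRUE)     # a character is present, so everything becomes character
num_lgl <- c(1, TRUE, FALSE)   # no character: logicals become numbers (TRUE = 1, FALSE = 0)

print(mixed)
print(typeof(mixed))
print(num_lgl)
```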
40
How can you generate a random number?
runif(1) for a single number; runif(6) returns six random numbers drawn uniformly between 0 and 1
40
How can we get the length of a vector?
length()
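
A sketch combining runif() and length(). The set.seed() call and the seed value 42 are additions for reproducibility, not part of the cards.

```r
set.seed(42)          # fix the random number generator so results repeat
r <- runif(6)         # six draws from the uniform distribution on [0, 1]
n <- length(r)        # number of elements in the vector

print(r)
print(n)
```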
41
How can we assign names to values inside vectors?
names(x)<-c("name1","name2")
Assign a character vector of names that has the same length as the vector
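
A short sketch of naming and then looking up elements by name. The values 10 and 20 are made up.

```r
x <- c(10, 20)
names(x) <- c("name1", "name2")   # one name per element

print(x)
second <- x["name2"]              # elements can now be selected by name
print(second)
```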
42
How can we subset a vector?
x[3]
43
How can we remove one or multiple elements from a vector?
x[-3]
x[c(-3,-4,-6:-7)]
44
How can we extract elements from a vector using a logical condition?
x[x>10]
45
How can we negate a condition (say that something is not the case)?
!
46
How can we select NA elements in a vector?
x[is.na(x)]
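
The subsetting cards above can be tried on one made-up vector. The sketch also shows a detail the cards do not mention: a logical condition on a vector containing NA returns NA for those positions.

```r
x <- c(5, 12, NA, 20, 8, 15, 3)

third     <- x[3]                    # positive index: select one element
dropped   <- x[c(-3, -4, -6:-7)]     # negative indices: remove elements 3, 4, 6, 7
over_ten  <- x[x > 10]               # logical condition; keeps an NA for the missing value
missing   <- x[is.na(x)]             # only the NA elements
clean_big <- x[!is.na(x) & x > 10]   # `!` negates: not missing AND greater than 10

print(dropped)
print(over_ten)
print(clean_big)
```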
47