V3_L2: Data Understanding Flashcards

(22 cards)

1
Q

Automation

A

Using tools and techniques to streamline data collection and preparation processes.

2
Q

Data Collection

A

The phase of gathering and assembling data from various sources.

3
Q

Data Compilation

A

The process of organizing and structuring data to create a comprehensive data set.

4
Q

Data Formatting

A

The process of standardizing the data to ensure uniformity and ease of analysis.
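As a minimal sketch of what standardizing can look like, the example below brings inconsistent casing, whitespace, and date layouts into one uniform form. The records, the field names, and the list of accepted date layouts are assumptions made up for illustration, not part of the course material.

```python
from datetime import datetime

# Hypothetical raw records with inconsistent casing, whitespace, and date layouts.
raw_records = [
    {"city": " new york ", "visit_date": "2023-01-15"},
    {"city": "NEW YORK", "visit_date": "15/01/2023"},
    {"city": "Boston", "visit_date": "2023.01.16"},
]

# Date layouts assumed to occur in the source data.
KNOWN_DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%Y.%m.%d")


def parse_date(text):
    """Try each known layout and return the date as an ISO 8601 string."""
    for fmt in KNOWN_DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"Unrecognized date format: {text!r}")


def format_record(record):
    """Return a copy of the record with uniform casing and date layout."""
    return {
        "city": record["city"].strip().title(),
        "visit_date": parse_date(record["visit_date"]),
    }


formatted = [format_record(r) for r in raw_records]
```

After this step every record shares the same city spelling convention and the same date layout, so values can be compared and grouped directly.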

5
Q

Data Manipulation

A

The process of transforming data into a usable format.

6
Q

Data Preparation

A

The phase where data is cleaned, transformed, and formatted for further analysis, including feature engineering and text analysis.

7
Q

Data Preparation

A

The stage where data is transformed and organized to facilitate effective analysis and modeling.

8
Q

Data Quality

A

Assessment of data integrity and completeness, addressing missing, invalid, or misleading values.

9
Q

Data Quality Assessment

A

The evaluation of data integrity, accuracy, and completeness.

10
Q

Data Set

A

A collection of data used for analysis and modeling.

11
Q

Data Understanding

A

The stage in the data science methodology focused on exploring and analyzing the collected data to ensure that the data is representative of the problem to be solved.

12
Q

Descriptive Statistics

A

Summary statistics that data scientists use to describe and understand the distribution of variables, such as mean, median, minimum, maximum, and standard deviation.
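The statistics named on this card can be computed with Python's standard library. The sketch below uses a small made-up sample; the `describe` helper is a hypothetical name, and the standard deviation shown is the sample version (dividing by n - 1).

```python
import statistics

# Hypothetical sample of one numeric variable.
values = [2, 4, 4, 4, 5, 5, 7, 9]


def describe(data):
    """Return the common descriptive statistics for a list of numbers."""
    return {
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "min": min(data),
        "max": max(data),
        "stdev": statistics.stdev(data),  # sample standard deviation
    }


summary = describe(values)
```

Comparing the mean and median gives a quick hint about skew, while the minimum, maximum, and standard deviation show the spread of the variable.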

13
Q

Feature

A

A characteristic or attribute within the data that helps in solving the problem.

14
Q

Feature Engineering

A

The process of creating new features or variables based on domain knowledge to improve machine learning algorithms’ performance.
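One possible illustration: body mass index is a feature derived from two raw measurements using domain knowledge (weight in kilograms divided by height in metres squared). The patient records and field names below are invented for the example.

```python
# Hypothetical raw records holding two measured attributes.
patients = [
    {"weight_kg": 70.0, "height_m": 1.75},
    {"weight_kg": 90.0, "height_m": 1.80},
]


def add_bmi(record):
    """Return a copy of the record with an engineered 'bmi' feature."""
    bmi = record["weight_kg"] / record["height_m"] ** 2
    return {**record, "bmi": round(bmi, 1)}


engineered = [add_bmi(p) for p in patients]
```

The new `bmi` column combines two raw columns into a single value that a model may find easier to use than weight and height separately.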

15
Q

Feature Extraction

A

Identifying and selecting relevant features or attributes from the data set.

16
Q

Iterative Processes

A

Iterative and continuous refinement of the methodology based on insights and feedback from data analysis.

17
Q

Missing Values

A

Values that are absent or unknown in the data set, requiring careful handling during data preparation.
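A minimal sketch of two common handling steps: counting missing values per field, then filling them with the median of the observed values. Median imputation is only one option chosen here for illustration; the rows, field names, and helper names are hypothetical.

```python
from statistics import median

# Hypothetical rows where None marks a missing value.
rows = [
    {"age": 34, "income": 52000},
    {"age": None, "income": 48000},
    {"age": 29, "income": None},
    {"age": 41, "income": 61000},
]


def count_missing(data, field):
    """Count the rows where the given field is absent or None."""
    return sum(1 for row in data if row.get(field) is None)


def impute_with_median(data, field):
    """Return new rows with missing values of `field` replaced by the median."""
    observed = [row[field] for row in data if row.get(field) is not None]
    fill = median(observed)
    return [
        {**row, field: fill if row.get(field) is None else row[field]}
        for row in data
    ]


missing_age = count_missing(rows, "age")
cleaned = impute_with_median(rows, "age")
```

Whether to impute, drop the row, or flag the value depends on why the data is missing, so the count is usually inspected first.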

18
Q

Model Calibration

A

Adjusting model parameters to improve accuracy and alignment with the initial design.

19
Q

Pairwise Correlations

A

An analysis to determine the relationships and correlations between different variables.
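As a sketch, the Pearson correlation coefficient can be computed for every pair of variables. Pearson is an assumption here, since the card does not name a specific coefficient, and the three columns below are made up.

```python
from itertools import combinations
from math import sqrt

# Hypothetical numeric columns of equal length.
columns = {
    "x": [1, 2, 3, 4, 5],
    "y": [2, 4, 6, 8, 10],
    "z": [5, 4, 3, 2, 1],
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# One coefficient per unordered pair of column names.
pairwise = {
    (a, b): pearson(columns[a], columns[b])
    for a, b in combinations(columns, 2)
}
```

Coefficients near 1 or -1 indicate strongly related variables, which may be redundant for modeling; values near 0 indicate little linear relationship.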

20
Q

Text Analysis

A

Steps to analyze and manipulate textual data, extracting meaningful information and patterns.

21
Q

Text Analysis Groupings

A

Creating meaningful groupings and categories from textual data for analysis.
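One simple way to form such groupings is keyword matching, sketched below. The categories, keywords, and comments are invented for illustration, and real projects often use more robust methods than exact keyword lookup.

```python
import re
from collections import Counter

# Hypothetical categories, each defined by a set of trigger keywords.
CATEGORY_KEYWORDS = {
    "billing": {"invoice", "refund", "charge"},
    "delivery": {"shipping", "late", "package"},
}

# Hypothetical free-text customer comments.
comments = [
    "Refund still missing from my invoice",
    "Package arrived late",
    "Great service overall",
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def categorize(text):
    """Return the sorted categories whose keywords appear in the text."""
    tokens = set(tokenize(text))
    matches = sorted(
        name for name, words in CATEGORY_KEYWORDS.items() if tokens & words
    )
    return matches or ["other"]


# Number of comments falling into each grouping.
group_counts = Counter(cat for c in comments for cat in categorize(c))
```

The resulting counts turn unstructured text into a categorical variable that can be analyzed alongside the rest of the data set.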

22
Q

Visualization Techniques

A

Methods and tools that data scientists use to create visual representations or graphics that enhance the accessibility and understanding of data patterns, relationships, and insights.