What is data science and how is Python used in it? (AI) Flashcards

(8 cards)

1
Q

Data Science

A

An interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge and insights from structured and unstructured data.

2
Q

Three core disciplines of data science

A

1. Statistics/Math (hypothesis testing, modeling); 2. Computer Science (programming, algorithms); 3. Domain Expertise (business or scientific context).
3
Q

Typical data science workflow

A

1. Data Acquisition/Cleaning → 2. Exploration/Analysis → 3. Modeling/Prediction → 4. Visualization/Communication.
4
Q

Why Python is widely used for data science

A

Python has a simple syntax, a large, mature ecosystem of open-source libraries, and the versatility to handle both data analysis and production deployment.

5
Q

main Python library for data cleaning, manipulation, and analysis

A

pandas. It provides fast, flexible data structures, most notably the DataFrame.
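A small sketch of what cleaning and analysis with a DataFrame can look like. The column names and values are invented for illustration only.

```python
import pandas as pd

# Hypothetical sample data with one missing temperature reading.
df = pd.DataFrame({
    "city": ["Oslo", "Lima", "Oslo", "Lima"],
    "temp_c": [4.0, None, 6.0, 22.0],
})

# Cleaning: drop rows where the temperature is missing.
cleaned = df.dropna(subset=["temp_c"])

# Analysis: average temperature per city.
mean_by_city = cleaned.groupby("city")["temp_c"].mean()
print(mean_by_city)
```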

6
Q

main Python library for numerical operations and array handling

A

NumPy (Numerical Python). It provides powerful array objects and tools for working with them, forming the foundation for most other scientific libraries.
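To make "array objects and tools" concrete, the sketch below builds a 2-D array, computes column means, and subtracts them from every row using broadcasting. The numbers are arbitrary sample values.

```python
import numpy as np

# A 2x3 array of arbitrary sample values.
a = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

# Vectorized reduction: mean of each column (axis=0 collapses the rows).
col_means = a.mean(axis=0)

# Broadcasting: the 1-D means are subtracted from every row without a loop.
centered = a - col_means

print(col_means)
print(centered)
```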

7
Q

main Python library for machine learning and predictive modeling

A

scikit-learn. It offers a consistent interface to algorithms for tasks such as classification, regression, clustering, and dimensionality reduction.
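The "consistent interface" can be seen in how two unrelated classifiers are trained and scored with the same `fit` and `score` calls. This is a minimal sketch on the bundled iris dataset; the choice of models, split size, and `random_state` is arbitrary.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Load a small built-in dataset and hold out a quarter of it for testing.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Two different algorithms, same estimator interface.
models = {
    "logreg": LogisticRegression(max_iter=1000),
    "knn": KNeighborsClassifier(n_neighbors=5),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)                   # train
    scores[name] = model.score(X_test, y_test)    # mean accuracy on held-out data

print(scores)
```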

8
Q

two key Python libraries for data visualization

A

Matplotlib (the base plotting library) and Seaborn (a library based on Matplotlib that provides a high-level interface for drawing attractive statistical graphics).
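A minimal Matplotlib sketch, assuming a headless environment (hence the `Agg` backend) and made-up values. It renders the figure into an in-memory buffer rather than a file. Seaborn's `histplot` function offers a higher-level way to draw a similar chart on top of Matplotlib.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is required
import matplotlib.pyplot as plt

# Arbitrary sample values to plot.
values = [2, 3, 3, 4, 4, 4, 5, 5, 6]

fig, ax = plt.subplots()
counts, bin_edges, _ = ax.hist(values, bins=5)
ax.set_xlabel("value")
ax.set_ylabel("count")
ax.set_title("Distribution of sample values")

# Render the figure as PNG bytes in memory.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)
```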
