What is artificial intelligence
The creation of intelligent machines that react like humans
What is data science
The use of artificial intelligence to solve real-world problems with data
What is weak AI
AI designed for specific tasks
No self-awareness or understanding beyond its trained domain
What is strong AI
Hypothetical AI with human-like general reasoning
What is data mining
A subset of AI that focuses on extracting previously unknown patterns or knowledge from huge amounts of data
What kind of data is data mining done on
Temporal data
Structured data
Text databases
Images/videos
What is machine learning
Machine learning is the field of study that gives computers the ability to learn without being explicitly programmed
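A minimal sketch of the idea, assuming a toy dataset invented for illustration: the rule behind the data is never written into the program. Instead, the slope and intercept are estimated from example pairs with ordinary least squares. The function name `fit_line` and the data are hypothetical.

```python
def fit_line(xs, ys):
    """Estimate slope and intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical training examples; the relationship is learned, not hard-coded.
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]

slope, intercept = fit_line(xs, ys)
prediction = slope * 10 + intercept  # prediction for an input not in the training data
print(slope, intercept, prediction)
```

The same function would recover a different line if given different examples, which is the sense in which the behaviour comes from data rather than from explicit programming.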
What are the deep learning models
Deep neural network
Convolutional neural network
Recurrent neural network
Autoencoders
Machine learning vs Data mining
“Data Mining is a subset of Artificial Intelligence and it focuses on extracting knowledge from data — how to identify previously unknown patterns, relationships, or anomalies in the large data sets that humans can then use to solve a complex problem.”
This is a manual process that requires human intervention and decision making.
“Machine Learning is a subset of Artificial Intelligence and it focuses on teaching a computer: how to learn to analyse large datasets and “learn” the patterns in it (from the training dataset) that can help make predictions on new data.”
Why do deep learning models usually outperform classical statistical machine learning models when trained on very large datasets?
Deep learning models have many more parameters, which allows them to learn very complex patterns when enough data is available
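To make "many more parameters" concrete, here is a rough sketch that counts weights and biases in a fully connected network. The layer sizes are assumptions chosen for illustration, and `count_parameters` is a hypothetical helper, not part of any library.

```python
def count_parameters(layer_sizes):
    """Count weights and biases in a fully connected network.

    layer_sizes lists the units per layer, input first, output last.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out  # weight matrix plus one bias per output unit
    return total

linear_model = count_parameters([10, 1])           # 10 weights + 1 bias
small_network = count_parameters([10, 64, 64, 1])  # two hidden layers of 64 units
print(linear_model, small_network)
```

With these assumed sizes the linear model has 11 parameters and the small network has 4929, and real deep networks are far larger still.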
What is the primary goal of data exploration
Data visualisation
Summary Statistics
Why does data visualisation matter
summary statistics alone can be misleading
different datasets can share identical statistics but have very different structures
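A small sketch of this point, using two made-up datasets: both have the same mean and the same (population) variance, yet a crude text histogram shows that one is spread around the middle and the other sits in two separate clusters. The data and the helper `text_histogram` are assumptions for illustration.

```python
from collections import Counter
from statistics import mean, pvariance

# Hypothetical datasets with matching summary statistics.
spread_out = [2, 4, 4, 4, 5, 5, 7, 9]
two_clusters = [3, 3, 3, 3, 7, 7, 7, 7]

for name, data in [("spread_out", spread_out), ("two_clusters", two_clusters)]:
    print(name, "mean:", mean(data), "variance:", pvariance(data))

def text_histogram(data):
    """Draw one row per integer value, with one '#' per occurrence."""
    counts = Counter(data)
    return "\n".join(
        f"{value:>2} | {'#' * counts[value]}"
        for value in range(min(data), max(data) + 1)
    )

print(text_histogram(spread_out))
print(text_histogram(two_clusters))
```

The summary lines are identical for both datasets, while the two histograms look nothing alike.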
What is data visualisation
a way to communicate complex information
What is data visualisation able to do
It is a critical tool in AI: it provides an effective way to identify summaries, structure, relationships, differences, and abnormalities in the data.
When should you use the distribution graph
to find how the data is spread
When should you use the comparison graph
to find the differences between groups of data
bar charts
When should you use the composition graph
to find the parts of a whole
pie charts
When should you use the connection graph
to find networks and links between data
network graphs
When should you use the location graph
to show geographical patterns
maps/heatmaps
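The five graph families above can be summarised as a small lookup, sketched here as a study aid. The names `CHART_GUIDE` and `suggest_chart` are hypothetical, and the histogram example for distribution is an assumption, since the notes give no example for that family.

```python
# Goal -> (what it shows, example chart types), taken from the notes above.
CHART_GUIDE = {
    "distribution": ("how the data is spread", ["histogram"]),  # example assumed, not from the notes
    "comparison": ("differences between groups of data", ["bar chart"]),
    "composition": ("parts of a whole", ["pie chart"]),
    "connection": ("networks and links between data", ["network graph"]),
    "location": ("geographical patterns", ["map", "heatmap"]),
}

def suggest_chart(goal):
    """Return a one-line reminder for the given graph family."""
    purpose, examples = CHART_GUIDE[goal.lower()]
    return f"{goal.lower()}: shows {purpose}; try {' or '.join(examples)}"

print(suggest_chart("Composition"))
```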
What is the grammar of graphics
A structured, layered system for building data visualisations that separates core components such as data and aesthetic mappings
What happens as you increase the span value
The greater the value, the smoother (less wiggly) the fitted curve will be
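A rough sketch of why a larger span gives a smoother curve, assuming span means the fraction of points used for each local estimate. This uses a plain moving average as a stand-in, not the actual LOESS fit that plotting libraries typically use, and the data and function names are invented for illustration.

```python
def moving_average(ys, span):
    """Smooth ys by averaging a centred window covering about `span` of the points."""
    n = len(ys)
    half = max(1, int(span * n / 2))  # half-width of the window, in points
    smoothed = []
    for i in range(n):
        window = ys[max(0, i - half): i + half + 1]  # window is clipped at the edges
        smoothed.append(sum(window) / len(window))
    return smoothed

def roughness(ys):
    """Sum of absolute second differences: lower means a smoother curve."""
    return sum(abs(ys[i - 1] - 2 * ys[i] + ys[i + 1]) for i in range(1, len(ys) - 1))

# Hypothetical data: an upward trend with a zigzag on top.
ys = [i + (3 if i % 2 == 0 else -3) for i in range(20)]

rough_small_span = roughness(moving_average(ys, span=0.2))
rough_large_span = roughness(moving_average(ys, span=0.8))
print(rough_small_span, rough_large_span)
```

The larger span averages over more neighbours at each point, so the zigzag is damped more and the roughness score drops.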
What is important about the input data for AI
The quality of the output depends on the quality of the input, so input data quality dramatically affects the performance of the AI
What are common issues with data that affect the performance of AI
Missing values
Duplicate records
Noise
Invalid or inconsistent data
Outliers
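Three of these issues can be checked with very little code. The sketch below uses made-up records and hypothetical helper names; it flags missing values, exact duplicate records, and outliers using the common 1.5 x IQR rule of thumb. Noise and inconsistent data usually need domain-specific checks and are not covered here.

```python
from statistics import quantiles

# Hypothetical records invented for illustration.
rows = [
    {"id": 1, "age": 34},
    {"id": 2, "age": None},  # missing value
    {"id": 3, "age": 29},
    {"id": 3, "age": 29},    # duplicate record
    {"id": 4, "age": 31},
    {"id": 5, "age": 240},   # implausible outlier
    {"id": 6, "age": 27},
    {"id": 7, "age": 36},
]

def find_missing(rows):
    """Rows where any field is None."""
    return [r for r in rows if any(v is None for v in r.values())]

def find_duplicates(rows):
    """Rows that exactly repeat an earlier row."""
    seen, duplicates = set(), []
    for r in rows:
        key = tuple(sorted(r.items()))
        if key in seen:
            duplicates.append(r)
        else:
            seen.add(key)
    return duplicates

def find_outliers(values):
    """Values more than 1.5 * IQR below the first or above the third quartile."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]

ages = [r["age"] for r in rows if r["age"] is not None]
missing = find_missing(rows)
duplicates = find_duplicates(rows)
outliers = find_outliers(ages)
print(missing, duplicates, outliers)
```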
What does data wrangling involve
data cleansing
feature selection and transformation
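A minimal sketch of the two steps on made-up records: cleansing drops rows with missing values and exact duplicates, feature selection keeps only chosen columns, and min-max scaling stands in as one example of a transformation. All names and data here are assumptions for illustration; real projects often use a dataframe library instead.

```python
def cleanse(rows):
    """Drop rows with missing values, then drop exact duplicates."""
    seen, cleaned = set(), []
    for row in rows:
        if any(v is None for v in row.values()):
            continue
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            cleaned.append(row)
    return cleaned

def select_features(rows, keep):
    """Keep only the named columns in each row."""
    return [{k: row[k] for k in keep} for row in rows]

def min_max_scale(values):
    """Rescale values linearly to the range 0 to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical raw data.
raw = [
    {"id": 1, "age": 20, "height_cm": 170},
    {"id": 2, "age": None, "height_cm": 165},  # missing value, dropped
    {"id": 3, "age": 30, "height_cm": 180},
    {"id": 3, "age": 30, "height_cm": 180},    # duplicate, dropped
    {"id": 4, "age": 40, "height_cm": 175},
]

cleaned = cleanse(raw)
features = select_features(cleaned, ["age", "height_cm"])
scaled_age = min_max_scale([r["age"] for r in features])
print(features, scaled_age)
```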