software automation Flashcards

(46 cards)

1
Q

what role does machine learning play in automation

A

ML enables automation by training models to learn from data, allowing systems to make decisions or predictions without being explicitly programmed

2
Q

what is a machine learning model

A

a model is the output of a training process that represents learned patterns from data, used to make predictions or decisions

3
Q

what is devops and how does it relate to automation

A
  • set of practices, tools and a cultural philosophy that integrates software development and operations to automate and streamline the build, test, deploy and monitor stages (CI/CD)
  • emphasises teamwork, communication and collaboration, alongside technology automation
  • ML can support DevOps workflows by detecting anomalies, predicting failures and optimising pipelines
4
Q

what is MLOps and how is it different from devops

A

automated process of designing, training, deploying and managing ML models
- involves both data scientists and operations teams
- deals with more complex, data-driven systems than DevOps

5
Q

what is RPA

A

robotic process automation
- using software bots to automate routine and repetitive tasks performed by humans on computers
- subset of BPA (business process automation) - automates individual tasks within processes
- can interact with applications, manipulate data, trigger responses
- improves efficiency and reduces errors
- employees focus on more valuable/complex work
- streamline operations, improve productivity

6
Q

what is bpa

A

business process automation
- automating entire workflows (complex, multi-step processes that traditionally require input, coordination and oversight)
- arranging multiple tasks, systems, data sources and people through automated workflows
- eg automating customer onboarding process

7
Q

distinguish between ai and ml

A
  • AI is the capability of a machine to mimic human intelligence (reasoning, discovering new information, inferring data)
  • ML is a subset of AI that focuses on the ability of machines to learn from data and make predictions/decisions without being explicitly programmed
8
Q

what are ml training models

A
  • training involves using data to teach an algorithm to recognise patterns and make predictions/decisions
  • supervised, unsupervised, semi-supervised, reinforcement
9
Q

what is supervised learning

A

algorithm is trained on labelled data and learns to predict outcomes
- classification (label/category)
- regression (continuous value)
- eg spam detection

10
Q

what is unsupervised learning

A

model finds patterns or clusters in unlabelled data without predefined outcomes
- can find hidden patterns, but results can be inaccurate, training can take longer, and outputs need human validation
- eg customer segmentation

11
Q

what is semi-supervised learning

A

hybrid model that uses small amount of labelled data with large amount of unlabelled data
- trains on labelled data, then uses predictions on unlabelled data to create new points to add to training data
- cost effective, faster than supervised, more accurate than unsupervised
- but sensitive to noise/errors, computationally complex
- eg medical diagnosis

12
Q

what is reinforcement learning

A

agent interacts with an environment and receives feedback in the form of rewards or penalties
- learns to maximise the cumulative reward over time
- useful when difficult to define specific goal/provide labelled data
- solving complex problems, flexibility, finding best sequence of actions to achieve goal
- but significant computational power, effectiveness relies on quality of reward function, requires lots of data to learn
- eg learning to play a game

13
Q

what are common applications of key ml algorithms

A
  • data analysis and forecasting (examining large datasets to find patterns / using insights to predict future outcomes by applying regression/classification)
  • virtual personal assistants (software agents that use NLP and ML to interact with users and perform tasks - reinforcement learning)
  • image recognition (ability of software to detect, classify and interpret visual data - CNNs, KNN, neural networks, logistic regression)
14
Q

how ml algorithms can be used in data analysis

A
  • trained with subset of data to automatically identify patterns and trends
  • handle complex/nonlinear relationships
  • used for predictions and forecasting
  • learn from new data to improve accuracy over time
15
Q

what are key types of ml algorithms (list them out)

A
  • linear regression (predict continuous numeric values)
  • polynomial regression (predict outcomes with nonlinear relationships - curve)
  • logistic regression (predict categorical outcomes - uses sigmoid function)
  • KNN (classify new data based on the majority class of its ‘k’ nearest neighbours)
  • decision trees (make decisions by splitting data into branches based on features)
  • neural networks (complex pattern recognition and predictions - artificial brain)
16
Q

what are variables in ml

A

a characteristic, property or feature that represents data and provides information for models to learn
- features are independent input variables used to make predictions
- targets are dependent output variables that the model is trying to predict
- labels are the known target values used during training and testing in supervised learning

17
Q

what are decision trees and how are they used in ml

A
  • supervised learning model used for classification and regression
  • tree-like structure of nodes and branches to represent combinations of decisions and consequences
  • root/decision/leaf nodes
  • splitting/pruning
  • at each node the dataset is split based on value of feature - until stopping criteria are met
  • overfitting is when the model is too closely tailored to the training data and generalises poorly (pruning helps reduce it)
  • goal is to build an accurate tree that is generalisable (can make good predictions on unseen data)
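
The splitting step can be sketched for a single numeric feature. The card does not name a split criterion, so Gini impurity is an assumption here, and `gini` and `best_split` are hypothetical helper names.

```python
def gini(labels):
    """Gini impurity: 0.0 when every label in the group is the same."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(values, labels):
    """Try each observed value as a threshold; keep the lowest weighted impurity."""
    best_threshold, best_score = None, float("inf")
    for threshold in sorted(set(values)):
        left = [l for v, l in zip(values, labels) if v <= threshold]
        right = [l for v, l in zip(values, labels) if v > threshold]
        if not left or not right:
            continue  # a split must send data down both branches
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score

split_threshold, split_impurity = best_split(
    [1, 2, 3, 10, 11, 12], ["no", "no", "no", "yes", "yes", "yes"]
)
```

A full tree would repeat this search on each branch until a stopping criterion (such as a maximum depth) is met.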
18
Q

what are neural networks and how are they structured

A
  • set of algorithms designed to recognise patterns by mimicking the way the human brain works
  • suited to complex problems with lots of unstructured data
  • series of neurons (interconnected nodes) that receive inputs, process them, and produce an output
    1. input layer receives input (each neuron represents feature of input data)
    2. hidden layers perform computations/transformations on input data (connections have weights and thresholds, activation function)
    3. output layer produces the final result/prediction (each neuron = possible output class)
19
Q

describe the cycles or processes involved in neural networks

A
  • training cycle (network learns from data - by comparing output to label and adjusting weights with backpropagation)
  • execution cycle (the trained neural network is used to make real-world decisions - inference)
20
Q

what are weights and biases (in neural networks)

A
  • weights determine the strength of connections between neurons
  • biases provide an additional parameter that shifts the activation function’s output
  • activation function decides whether neuron should be activated based on weighted sum of inputs and bias
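
A minimal sketch of one neuron, assuming a sigmoid activation (the card does not say which activation function is used). `neuron_output` is a hypothetical name.

```python
import math

def neuron_output(inputs, weights, bias):
    """Weighted sum of inputs plus bias, passed through a sigmoid activation."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-weighted_sum))

no_bias = neuron_output([1.0, 2.0], [0.5, -0.25], bias=0.0)
with_bias = neuron_output([1.0, 2.0], [0.5, -0.25], bias=2.0)
```

With the same inputs and weights, raising the bias shifts the output upward, which is the "shift" described above.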
21
Q

describe linear regression

A
  • supervised learning algorithm for predicting continuous values
  • assumes linear relationship
  • residuals = actual output - predicted output
  • bias and weight (parameters) calculated from training
  • simple to implement/interpret, but outliers have big impact
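
A rough illustration with one feature, assuming the weight and bias are found with the ordinary least squares formula (the card only says they are "calculated from training"). `fit_line` is a hypothetical name and the data is made up.

```python
def fit_line(xs, ys):
    """Least squares weight and bias for y = weight * x + bias."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    weight = covariance / variance
    bias = mean_y - weight * mean_x
    return weight, bias

xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
weight, bias = fit_line(xs, ys)

# residual = actual output - predicted output
residuals = [y - (weight * x + bias) for x, y in zip(xs, ys)]

# Changing one target to an outlier pulls the fitted slope strongly.
outlier_weight, _ = fit_line(xs, [3.0, 5.0, 7.0, 30.0])
```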
22
Q

describe logistic regression

A
  • supervised ML algorithm that models the relationship between the features and a categorical outcome, then classifies data into discrete classes
  • predictive modelling (outputs the probability of an input belonging to a category)
  • sigmoid curve maps any value to a number between 0 and 1
  • works well for binary classification problems where the classes are close to linearly separable
  • logistic cost function (log loss) measures how close the predicted probability is to the actual label
  • gradient descent used to optimise the weights and bias (w, b)
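
The pieces above can be sketched together for one feature. This is an illustrative sketch, not a production implementation: the function names, learning rate, epoch count and data are all assumptions.

```python
import math

def sigmoid(z):
    """Map any real value to the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(x, w, b):
    return sigmoid(w * x + b)

def log_loss(y_true, p):
    """Logistic cost for one example with label 0 or 1."""
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))

def train(xs, ys, learning_rate=0.1, epochs=2000):
    """Gradient descent on the average log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [predict_proba(x, w, b) - y for x, y in zip(xs, ys)]
        w -= learning_rate * sum(e * x for e, x in zip(errors, xs)) / n
        b -= learning_rate * sum(errors) / n
    return w, b

xs = [0.5, 1.0, 1.5, 3.0, 3.5, 4.0]
ys = [0, 0, 0, 1, 1, 1]
w, b = train(xs, ys)
```

After training, `predict_proba(x, w, b)` above 0.5 is read as class 1 and below 0.5 as class 0.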
23
Q

describe k-nearest neighbour

A
  • supervised learning algorithm for classification and regression
  • ‘similar things exist in close proximity’
  • instance based learning (training data stored)
  • k is the number of nearby points the algorithm looks at to make a decision (an odd k avoids tied votes in binary classification)
  • classification (new point assigned most common class)
  • regression (value predicted by taking average)
  • prediction slows down as the dataset grows (each new point is compared with the stored training data), and results are sensitive to irrelevant features
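
A small sketch of both uses, assuming Euclidean distance (the card does not specify a distance measure). `knn_classify` and `knn_regress` are hypothetical names.

```python
import math
from collections import Counter

def nearest(train, query, k):
    """Return the k stored (features, target) pairs closest to the query."""
    return sorted(train, key=lambda item: math.dist(item[0], query))[:k]

def knn_classify(train, query, k=3):
    """Assign the most common class among the k nearest neighbours."""
    labels = [label for _, label in nearest(train, query, k)]
    return Counter(labels).most_common(1)[0][0]

def knn_regress(train, query, k=3):
    """Predict the average target value of the k nearest neighbours."""
    return sum(target for _, target in nearest(train, query, k)) / k

points = [((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.9, 1.1), "a"),
          ((5.0, 5.0), "b"), ((5.2, 4.9), "b")]
knn_label = knn_classify(points, (1.1, 1.0), k=3)

prices = [((1.0,), 10.0), ((2.0,), 20.0), ((3.0,), 30.0), ((10.0,), 100.0)]
knn_value = knn_regress(prices, (2.1,), k=3)
```

Note that there is no training step: the data is simply stored and searched at prediction time.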
24
Q

what is cost function and cost

A
  • cost function measures how well a regression model's predictions match the actual target values
  • calculates the loss (error between prediction and actual value) for each training example
  • aggregates the errors across all points into a single scalar value
  • cost is the combined loss over all data points (quantifies the model's overall error)
  • mean squared error (MSE): average of the squared differences between predicted and actual values
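
As a quick worked example of MSE (the function name and the numbers are illustrative only):

```python
def mean_squared_error(predicted, actual):
    """Average of the squared differences between predictions and targets."""
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return sum(squared_errors) / len(squared_errors)

# Errors are -0.5, 0.5 and 0.0, so the cost is (0.25 + 0.25 + 0.0) / 3.
cost = mean_squared_error([2.5, 0.0, 2.0], [3.0, -0.5, 2.0])
```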
25
what is gradient descent
process that automates learning by finding the parameter settings that give the lowest error
- adjusts parameters to make predictions more accurate
- calculates the gradient of the cost function and updates parameters in the direction that reduces error
- learning rate determines how big each step should be
- stops when the cost function reaches its minimum
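
A minimal sketch of these steps, assuming a one-feature linear model with an MSE cost. The function name, learning rate, step limit and tolerance are assumptions chosen for this toy data.

```python
def gradient_descent(xs, ys, learning_rate=0.05, max_steps=5000, tolerance=1e-9):
    """Fit y = weight * x + bias by repeatedly stepping against the gradient."""
    weight, bias = 0.0, 0.0
    n = len(xs)
    for _ in range(max_steps):
        errors = [(weight * x + bias) - y for x, y in zip(xs, ys)]
        # Gradient of the MSE cost with respect to each parameter.
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        if abs(grad_w) < tolerance and abs(grad_b) < tolerance:
            break  # the cost is (almost) flat, so we are at a minimum
        weight -= learning_rate * grad_w
        bias -= learning_rate * grad_b
    return weight, bias

gd_weight, gd_bias = gradient_descent([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
```

The data follows y = 2x + 1, so the parameters should settle close to a weight of 2 and a bias of 1.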
26
what is polynomial regression
- extension of linear regression that can model curved relationships and predict numeric values
- cost function calculates MSE to measure how far off predictions are
- gradient descent used to adjust weight and bias to reduce error
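
One way to picture this is a quadratic fit, where the model is still linear in its parameters but uses x squared as an extra feature. This is a sketch under assumed settings: `fit_quadratic`, the learning rate and the step count are not from the card.

```python
def fit_quadratic(xs, ys, learning_rate=0.05, steps=3000):
    """Fit y = w2 * x^2 + w1 * x + b by gradient descent on the MSE cost."""
    w2, w1, b = 0.0, 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [(w2 * x * x + w1 * x + b) - y for x, y in zip(xs, ys)]
        grad_w2 = 2 * sum(e * x * x for e, x in zip(errors, xs)) / n
        grad_w1 = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        w2 -= learning_rate * grad_w2
        w1 -= learning_rate * grad_w1
        b -= learning_rate * grad_b
    return w2, w1, b

# Toy data generated from y = x^2 + 1.
quad_w2, quad_w1, quad_b = fit_quadratic(
    [-2.0, -1.0, 0.0, 1.0, 2.0], [5.0, 2.0, 1.0, 2.0, 5.0]
)
```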
27
how is oop used in ml regression models
oop structures ml components (data handlers and regression models) into classes to build scalable, reusable systems
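
A possible shape for this, assuming one class for data handling and one for the model. Both class names and their methods are hypothetical, not a standard API.

```python
class DataHandler:
    """Holds (feature, target) rows and drops rows with missing values."""

    def __init__(self, rows):
        self.rows = [(x, y) for x, y in rows if x is not None and y is not None]

    def features(self):
        return [x for x, _ in self.rows]

    def targets(self):
        return [y for _, y in self.rows]


class LinearRegressionModel:
    """One-feature linear regression fitted with least squares."""

    def __init__(self):
        self.weight = 0.0
        self.bias = 0.0

    def fit(self, xs, ys):
        mean_x = sum(xs) / len(xs)
        mean_y = sum(ys) / len(ys)
        self.weight = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
                       / sum((x - mean_x) ** 2 for x in xs))
        self.bias = mean_y - self.weight * mean_x
        return self

    def predict(self, x):
        return self.weight * x + self.bias


data = DataHandler([(1.0, 3.0), (2.0, 5.0), (None, 4.0), (3.0, 7.0)])
model = LinearRegressionModel().fit(data.features(), data.targets())
prediction = model.predict(5.0)
```

Because each class has one job, the model could be swapped for another regression class without touching the data handling code.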
28
how are neural networks applied using oop
- developers create classes for layers, activation functions and training loops to structure predictions and improve modularity
- manage complexity of development + abstraction
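
A small sketch of that class structure, covering prediction only (no training loop). The class names `Sigmoid`, `DenseLayer` and `Network` and the fixed weights are assumptions for illustration.

```python
import math

class Sigmoid:
    """Activation function applied to every value in a layer's output."""

    def __call__(self, values):
        return [1.0 / (1.0 + math.exp(-v)) for v in values]


class DenseLayer:
    """Fully connected layer: one row of weights and one bias per neuron."""

    def __init__(self, weights, biases, activation):
        self.weights = weights
        self.biases = biases
        self.activation = activation

    def forward(self, inputs):
        sums = [sum(w * x for w, x in zip(row, inputs)) + b
                for row, b in zip(self.weights, self.biases)]
        return self.activation(sums)


class Network:
    """Passes the input through each layer in order."""

    def __init__(self, layers):
        self.layers = layers

    def predict(self, inputs):
        for layer in self.layers:
            inputs = layer.forward(inputs)
        return inputs


network = Network([
    DenseLayer([[0.5, -0.5], [0.25, 0.75]], [0.0, 0.0], Sigmoid()),  # hidden layer
    DenseLayer([[1.0, -1.0]], [0.0], Sigmoid()),                     # output layer
])
nn_output = network.predict([1.0, 1.0])
```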
29
what is data wrangling
the process of transforming and structuring raw data into a usable format for analysis
- addressing missing values, duplicates, outliers, inconsistencies
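
A tiny illustration of two of these steps (duplicates and missing values) on made-up rows. Filling gaps with the mean is only one possible choice, and `wrangle` is a hypothetical name.

```python
raw_rows = [
    {"id": 1, "age": 34},
    {"id": 2, "age": None},  # missing value
    {"id": 2, "age": None},  # duplicate row
    {"id": 3, "age": 29},
]

def wrangle(rows):
    """Drop exact duplicate rows, then fill missing ages with the mean age."""
    seen, unique = set(), []
    for row in rows:
        key = (row["id"], row["age"])
        if key not in seen:
            seen.add(key)
            unique.append(dict(row))
    known_ages = [row["age"] for row in unique if row["age"] is not None]
    mean_age = sum(known_ages) / len(known_ages)
    for row in unique:
        if row["age"] is None:
            row["age"] = mean_age
    return unique

clean_rows = wrangle(raw_rows)
```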
30
what is forward pass and back propagation in neural networks?
- forward pass: making a prediction (receives input, calculates weighted sum, applies activation function)
- back propagation: automates the learning process to fix mistakes
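
Both steps can be sketched for a single neuron, assuming a sigmoid activation and a squared error loss (neither is specified on the card). `forward` and `backward_step` are hypothetical names.

```python
import math

def forward(x, w, b):
    """Forward pass: weighted sum, then sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-(w * x + b)))

def backward_step(x, y, w, b, learning_rate):
    """Back propagation: chain rule from the loss back to w and b, then update."""
    pred = forward(x, w, b)
    dloss_dpred = 2 * (pred - y)    # derivative of (pred - y)^2
    dpred_dz = pred * (1 - pred)    # derivative of the sigmoid
    grad_w = dloss_dpred * dpred_dz * x
    grad_b = dloss_dpred * dpred_dz
    return w - learning_rate * grad_w, b - learning_rate * grad_b

w, b = 0.0, 0.0
loss_before = (forward(1.0, w, b) - 1.0) ** 2
for _ in range(100):
    w, b = backward_step(1.0, 1.0, w, b, learning_rate=0.5)
loss_after = (forward(1.0, w, b) - 1.0) ** 2
```

Each step nudges the weight and bias so the prediction moves toward the target, which is why the loss shrinks over the loop.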
31
what impacts does automation have on the individual, society, and environment?
- safety of workers (reduces exposure to dangerous tasks / job displacement or complacency)
- people with disability (assistive technology -> improve QOL / marginalisation)
- nature and skills required for employment (reduces repetitive jobs and creates new roles / requires changes in skill set)
- production efficiency, waste and the environment (improved efficiency and reduce waste / increased energy usage if poorly managed)
- the economy and distribution of wealth (boost economic output / widen inequality)
32
how do patterns in human behaviour influence ml
- psychological responses (how people think, behave and respond to technology - trust, bias, emotion)
- patterns related to acute stress response (understanding the stress response makes it possible to create adaptive systems)
- cultural protocols (cultural differences affect how data is collected, represented and interpreted - need to avoid bias and irrelevant data)
- belief systems (biased training data can reflect societal biases - systems need to respect different beliefs)
33
what is dataset source bias
- selection bias (doesn't represent whole population)
- confirmation bias (reinforces existing human biases in decision making)
- historical bias (data reflects past inequalities)
- causes biased predictions or unfair treatment
34
how can developers mitigate bias
- developers can audit datasets and involve diverse users in development
- ensure training data is representative of the problem to be solved by the model
- have data reviewed by individuals from diverse backgrounds
35
what are benefits of using machine learning in security
- improved efficiency and accuracy in detecting threats
- ability to respond to incidents in real time
- security and compliance while automating business processes
36
what are disadvantages of using machine learning in security
- complexity and learning curve
- dependency on data quality
- potential for bias
- security and privacy concerns
- integration challenges
- cost
37
what is the design stage of MLOps
- defining the business problem
- translating to ML problem
- define success metrics
- research available data (ensure quality, relevance and availability)
38
what is the model development stage of MLOps
- data wrangling (cleaning and formatting raw data)
- feature engineering (selecting variables that help model accuracy)
- model training
- testing and validation (evaluate using metrics)
39
what is the operations stage of MLOps
- model deployment (making available for real world use)
- supporting operations/use (ensuring correct functionality)
- monitoring model performance (track accuracy + updates)
40
what is the bias variance trade off in machine learning
balance between two types of error:
- bias (error from incorrect assumptions - underfitting)
- variance (errors from sensitivity to data fluctuations - overfitting)
- goal is to minimise both to build a model that generalises well
41
what is bias in a machine learning model
the error caused by overly simple assumptions in the model. high bias can lead to underfitting, where the model fails to capture important patterns
42
what is variance in a machine learning model
the error caused by the model being too sensitive to small fluctuations in the training data. high variance can lead to overfitting, where the model captures too much noise
43
how can we reduce variance in a model
- use more training data
- simplify model
- apply regularisation techniques
- use cross validation
44
what is the role of the learning rate in gradient descent
the learning rate controls how large each weight update is
- small learning rate means slow, stable learning (may get stuck at local minima)
- large learning rate can cause overshooting or instability (fail to converge or diverge entirely)
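
The effect is easy to see on a toy cost function. This sketch assumes cost(x) = x^2, whose gradient is 2x and whose minimum is at 0. The three learning rates are arbitrary illustrative values.

```python
def descend(learning_rate, steps=20, start=1.0):
    """Run gradient descent on cost(x) = x^2 and return the final x."""
    x = start
    for _ in range(steps):
        x -= learning_rate * 2 * x  # gradient of x^2 is 2x
    return x

slow = descend(0.01)      # small steps: still far from 0 after 20 updates
steady = descend(0.3)     # moderate steps: very close to 0
diverged = descend(1.1)   # steps too large: overshoots further each time
```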
45
what does convergence mean in gradient descent
convergence occurs when further updates produce minimal change in the loss function - ie the model has reached, or is near, a minimum
46
what are two stopping criteria to ensure convergence without overfitting (gradient descent)
1. early stopping: stop training when validation loss stops improving
2. gradient based: stop when magnitude of gradient is very small, indicating the model has nearly converged
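
Both criteria can be sketched as small checks. The `patience` and `tolerance` parameters and the function names are assumptions, and the validation losses are made-up numbers.

```python
def early_stopping_epoch(validation_losses, patience=2):
    """Return the epoch at which training stops: the first epoch where the
    validation loss has failed to improve for `patience` epochs in a row."""
    best_loss = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(validation_losses):
        if loss < best_loss:
            best_loss = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return len(validation_losses) - 1  # never triggered: ran every epoch

def gradient_is_small(gradient, tolerance=1e-6):
    """Gradient based check: the magnitude (Euclidean norm) is below a tolerance."""
    return sum(g * g for g in gradient) ** 0.5 < tolerance

# Validation loss improves until epoch 2, then gets worse for two epochs in a row.
stop_epoch = early_stopping_epoch([0.9, 0.7, 0.6, 0.61, 0.63, 0.7])
```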