software automation Flashcards by Caitlin Chung

what role does machine learning play in automation

ML enables automation by training models to learn from data, allowing systems to make decisions or predictions without being explicitly programmed

what is a machine learning model

a model is the output of a training process that represents learned patterns from data, used to make predictions or decisions

what is devops and how does it relate to automation

set of practices, tools and cultural philosophy that integrates software development and operations to automate/streamline code build, test, deploy and monitor processes (CI/CD)
emphasises teamwork, communication and collaboration, and technology automation
ML detects anomalies, predicts failures and optimises pipelines in workflows

what is MLOps and how is it different from devops

automated process of designing, training, deploying and managing ML models
- involves both data scientists and operations teams
- deals with more complex, data driven systems

what is RPA

robotic process automation
- using software bots to automate routine and repetitive tasks performed by humans on computers
- subset of bpa - automates individual tasks within processes
- can interact with applications, manipulate data, trigger responses
- improves efficiency and reduces errors
- employees focus on more valuable/complex work
- streamline operations, improve productivity

what is bpa

business process automation
- automating entire workflows (complex, multi-step processes that traditionally require input, coordination and oversight)
- arranging multiple tasks, systems, data sources and people through automated workflows
- eg automating customer onboarding process

distinguish between ai and ml

AI is the capability of a machine to mimic human intelligence (reasoning, discovering new information, inferring data)
ML is a subset of AI that focuses on the ability of machines to learn from data and make predictions/decisions without being explicitly programmed

what are ml training models

training involves using data to teach an algorithm to recognise patterns and make predictions/decisions
supervised, unsupervised, semi-supervised, reinforcement

what is supervised learning

algorithm is trained on labelled data and learns to predict outcomes
- classification (label/category)
- regression (continuous value)
- eg spam detection

what is unsupervised learning

model finds patterns or clusters in unlabelled data without predefined outcomes
- can find hidden patterns, but can be inaccurate/longer training times + need human validation
- eg customer segmentation

what is semi-supervised learning

hybrid model that uses small amount of labelled data with large amount of unlabelled data
- trains on labelled data, then uses predictions on unlabelled data to create new points to add to training data
- cost effective, faster than supervised, more accurate than unsupervised
- but sensitive to noise/errors, computationally complex
- eg medical diagnosis

what is reinforcement learning

agent interacts with an environment, and receives feedback in the form of rewards or penalties
- learns to maximise the cumulative reward over time
- useful when difficult to define specific goal/provide labelled data
- solving complex problems, flexibility, finding best sequence of actions to achieve goal
- but significant computational power, effectiveness relies on quality of reward function, requires lots of data to learn
- eg learning to play a game

what are common applications of key ml algorithms

data analysis and forecasting (examining large datasets to find patterns / using insights to predict future outcomes by applying regression/classification)
virtual personal assistants (software agents that use NLP and ML to interact with users and perform tasks - reinforcement learning)
image recognition (ability of software to detect, classify and interpret visual data - CNNs, KNN, neural networks, logistic regression)

how ml algorithms can be used in data analysis

trained with subset of data to automatically identify patterns and trends
handle complex/nonlinear relationships
used for predictions and forecasting
learn from new data to improve accuracy over time

what are key types of ml algorithms (list them out)

linear regression (predict continuous numeric values)
polynomial regression (predict outcomes with nonlinear relationships - curve)
logistic regression (predict categorical outcomes - uses sigmoid function)
KNN (classify new data based on the majority class of its ‘k’ nearest neighbours)
decision trees (make decisions by splitting data into branches based on features)
neural networks (complex pattern recognition and predictions - artificial brain)

what are variables in ml

a characteristic, property or feature that represents data and provides information for models to learn
- features are independent input variables used to make predictions
- targets are dependent output variables that the model is trying to predict
- labels are the known target values used during training and testing in supervised learning

what are decision trees and how are they used in ml

supervised learning model used for classification and regression
tree-like structure of nodes and branches to represent combinations of decisions and consequences
root/decision/leaf nodes
splitting/pruning
at each node the dataset is split based on value of feature - until stopping criteria are met
overfitting is when the model is too tailored to the training data (performs poorly on new data)
goal is to build an accurate tree that is generalisable (can make good predictions on unseen data)
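
The splitting step above can be sketched for a single node. This is a minimal illustration, assuming Gini impurity as the split criterion (the card does not name one) and one numeric feature; `gini`, `best_split` and the toy data are hypothetical names and values, not part of the original deck.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 0.0 means every label in the node is the same."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(values, labels):
    """Try each threshold on one feature; return (threshold, weighted impurity)."""
    best = (None, float("inf"))
    for threshold in sorted(set(values)):
        left = [lab for v, lab in zip(values, labels) if v <= threshold]
        right = [lab for v, lab in zip(values, labels) if v > threshold]
        if not left or not right:
            continue  # a split must send data down both branches
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (threshold, score)
    return best

# Hypothetical toy data: hours studied -> pass/fail
hours = [1, 2, 3, 6, 7, 8]
result = ["fail", "fail", "fail", "pass", "pass", "pass"]
threshold, impurity = best_split(hours, result)
```

A full tree would apply `best_split` recursively to each branch until a stopping criterion (for example a maximum depth or a pure node) is met.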

what are neural networks and how are they structured

set of algorithms designed to recognise patterns by mimicking the way the human brain works
suited to complex problems with lots of unstructured data
series of neurons/interconnected nodes that receive inputs, process them, and produce an output
1. input layer receives input (each neuron represents feature of input data)
2. hidden layers perform computations/transformations on input data (connections have weights and thresholds, activation function)
3. output layer produces the final result/prediction (each neuron = possible output class)

describe the cycles or processes involved in neural networks

training cycle (network learns from data - by comparing output to label and adjusting weights with backpropagation)
execution cycle is when the trained neural network is used to make real world decisions (inference)

what are weights and biases (in neural networks)

weights determine the strength of connections between neurons
biases provide an additional parameter that shifts the activation function’s output
activation function decides whether neuron should be activated based on weighted sum of inputs and bias
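
A single neuron makes these three roles concrete. The sketch below assumes a sigmoid activation (the card does not specify one); `neuron_output` and the input values are illustrative only.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron_output(inputs, weights, bias):
    """Weighted sum of inputs plus bias, passed through the activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)

# Same inputs and weights, different bias: the bias shifts the result.
out_zero = neuron_output([1.0, 2.0], [0.5, -0.25], 0.0)   # weighted sum is 0
out_shift = neuron_output([1.0, 2.0], [0.5, -0.25], 2.0)  # weighted sum is 2
```

With a bias of 0 the neuron sits exactly at the sigmoid midpoint (0.5); raising the bias pushes the output towards 1 without changing any weight.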

describe linear regression

supervised learning algorithm for predicting continuous values
assumes linear relationship
residuals = actual output - predicted output
bias and weight (parameters) calculated from training
simple to implement/interpret, but outliers have big impact
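
As a rough sketch of these points, the weight and bias for one feature can be computed with the ordinary least-squares formulas (one common way to "calculate from training"; the card does not say which method is intended). The data values are made up for illustration.

```python
def fit_line(xs, ys):
    """Least-squares weight and bias for y ≈ weight * x + bias."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    weight = sxy / sxx
    bias = mean_y - weight * mean_x
    return weight, bias

xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]  # generated from y = 2x + 1
weight, bias = fit_line(xs, ys)

# residual = actual output - predicted output
residuals = [y - (weight * x + bias) for x, y in zip(xs, ys)]

# One outlier is enough to drag the fitted slope far from 2.
weight_outlier, bias_outlier = fit_line(xs, [3.0, 5.0, 7.0, 30.0])
```

On the clean data the residuals are all zero; with the single outlier the slope jumps from 2 to about 8.3, which is the sensitivity the card warns about.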

describe logistic regression

supervised ml algorithm that predicts probability by analysing relationship and classifying data into discrete classes
predictive modelling (outputs probability of input belonging to category)
sigmoid curve that maps values between 0 and 1
works well when data is linearly separable (binary classification problems)
logistic cost function measures how close prediction is
gradient descent used to optimise w/b
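
The pieces above (sigmoid, logistic cost, gradient descent on w and b) fit together roughly as follows. This is a minimal one-feature sketch; the learning rate, step count and data are assumptions chosen for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_loss(xs, ys, w, b):
    """Logistic (cross-entropy) cost averaged over all examples."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(w * x + b)
        total += -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))
    return total / len(xs)

def train(xs, ys, learning_rate=0.1, steps=2000):
    """Gradient descent on the logistic cost for one feature."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum((sigmoid(w * x + b) - y) for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Hypothetical binary data: small x -> class 0, large x -> class 1
xs = [0.5, 1.0, 1.5, 3.0, 3.5, 4.0]
ys = [0, 0, 0, 1, 1, 1]
w, b = train(xs, ys)
p_low = sigmoid(w * 1.0 + b)    # probability that x = 1.0 is class 1
p_high = sigmoid(w * 3.5 + b)   # probability that x = 3.5 is class 1
```

Classifying is then a matter of thresholding the probability, typically at 0.5.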

describe k-nearest neighbour

supervised learning algorithm for classification and regression
‘similar things exist in close proximity’
instance based learning (training data stored)
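k is the number of neighbouring points the algorithm looks at to make a decision (an odd k avoids tied votes in binary classification)
classification (new point assigned the most common class among its k neighbours)
regression (value predicted by averaging the k neighbours' values)
prediction gets slower as the dataset grows (every stored point is compared), and results are sensitive to irrelevant features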
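
Both uses of KNN can be sketched in a few lines. This assumes Euclidean distance (the card does not name a distance measure); the function names and the toy points are hypothetical.

```python
import math
from collections import Counter

def k_nearest(train_points, train_targets, query, k):
    """Targets of the k stored points closest to the query."""
    ranked = sorted(
        (math.dist(point, query), target)
        for point, target in zip(train_points, train_targets)
    )
    return [target for _, target in ranked[:k]]

def knn_classify(train_points, train_labels, query, k=3):
    """Majority vote among the k nearest neighbours."""
    votes = Counter(k_nearest(train_points, train_labels, query, k))
    return votes.most_common(1)[0][0]

def knn_regress(train_points, train_values, query, k=3):
    """Average of the k nearest neighbours' values."""
    nearest = k_nearest(train_points, train_values, query, k)
    return sum(nearest) / len(nearest)

points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
labels = ["A", "A", "A", "B", "B", "B"]
values = [10, 12, 11, 50, 52, 51]

label_near_a = knn_classify(points, labels, (2, 2))
label_near_b = knn_classify(points, labels, (6, 5))
value_near_a = knn_regress(points, values, (2, 2))
```

There is no training step beyond storing the data, which is why every prediction has to scan all stored points.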

what is cost function and cost

cost function measures how well a regression model's predictions match actual target values
calculates total loss (prediction - actual) across all training examples
aggregates errors across all points into single scalar value
cost is the sum of the loss for all data points (quantifies error)
mean square error (average of squares of difference between predicted and actual values)
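
A small worked example of the loss/cost distinction, using squared error as the per-point loss and MSE as the cost. The prediction and target values are made up.

```python
def squared_loss(predicted, actual):
    """Loss for ONE data point."""
    return (predicted - actual) ** 2

def mse_cost(predictions, targets):
    """Cost: per-point losses aggregated into a single scalar (their mean)."""
    losses = [squared_loss(p, t) for p, t in zip(predictions, targets)]
    return sum(losses) / len(losses)

# Per-point losses are 1, 0 and 4, so the cost is 5 / 3.
cost = mse_cost([2.0, 4.0, 6.0], [3.0, 4.0, 8.0])
```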

what is gradient descent

process that automates learning by finding the parameter settings that give the lowest error
- adjusts parameters to make predictions more accurate
- calculates the gradient of the cost function and updates parameters in the direction that reduces error
- learning rate determines how big each step should be
- stops when the cost function reaches (or is close to) its minimum
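
The update loop can be sketched for a straight-line model with an MSE cost. The learning rate, step count and data are assumptions for illustration, and a fixed number of steps stands in for a real stopping test.

```python
def gradient_descent(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y ≈ w * x + b by repeatedly stepping against the MSE gradient."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = (2 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2 / n) * sum(errors)
        # Move each parameter in the direction that reduces the cost.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Data generated from y = 2x + 1, so the ideal parameters are w = 2, b = 1.
w, b = gradient_descent([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
```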

what is polynomial regression

- extension of linear regression that can model curved relationships and predict numeric values
- cost function calculates MSE to measure how far off predictions are
- gradient descent used to adjust weight and bias to reduce error
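
A minimal sketch of the idea for a degree-2 curve: the model is still a weighted sum, but one of the inputs is x squared. The degree, learning rate, step count and data are all assumptions chosen so the example converges quickly.

```python
def predict(x, w2, w1, b):
    """Quadratic model: a curve instead of a straight line."""
    return w2 * x ** 2 + w1 * x + b

def fit_quadratic(xs, ys, learning_rate=0.1, steps=3000):
    """Gradient descent on MSE for the three parameters."""
    w2, w1, b = 0.0, 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [predict(x, w2, w1, b) - y for x, y in zip(xs, ys)]
        w2 -= learning_rate * (2 / n) * sum(e * x ** 2 for e, x in zip(errors, xs))
        w1 -= learning_rate * (2 / n) * sum(e * x for e, x in zip(errors, xs))
        b -= learning_rate * (2 / n) * sum(errors)
    return w2, w1, b

# Hypothetical data generated from y = 3x^2 + 2x + 1
xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
ys = [3 * x ** 2 + 2 * x + 1 for x in xs]
w2, w1, b = fit_quadratic(xs, ys)
```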

how is oop used in ml regression models

oop structures ml components (data handlers and regression models) into classes to build scalable, reusable systems
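
One possible shape for this, as a sketch only: a data handler class plus a shared model interface, so different regression models can be swapped without changing the surrounding code. All class and function names here are hypothetical.

```python
from abc import ABC, abstractmethod

class DataHandler:
    """Holds a dataset and splits it into training and test portions."""
    def __init__(self, xs, ys):
        self.xs, self.ys = list(xs), list(ys)

    def split(self, train_fraction=0.75):
        cut = int(len(self.xs) * train_fraction)
        return (self.xs[:cut], self.ys[:cut]), (self.xs[cut:], self.ys[cut:])

class RegressionModel(ABC):
    """Common interface every model class must implement."""
    @abstractmethod
    def fit(self, xs, ys): ...
    @abstractmethod
    def predict(self, x): ...

class MeanModel(RegressionModel):
    """Baseline: always predicts the mean of the training targets."""
    def fit(self, xs, ys):
        self.mean = sum(ys) / len(ys)
        return self
    def predict(self, x):
        return self.mean

class LineModel(RegressionModel):
    """Straight-line model fitted with least squares."""
    def fit(self, xs, ys):
        mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
        sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        self.weight = sxy / sum((x - mx) ** 2 for x in xs)
        self.bias = my - self.weight * mx
        return self
    def predict(self, x):
        return self.weight * x + self.bias

def mean_squared_error(model, xs, ys):
    return sum((model.predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

handler = DataHandler([1, 2, 3, 4, 5, 6, 7, 8], [3, 5, 7, 9, 11, 13, 15, 17])
(train_x, train_y), (test_x, test_y) = handler.split()
scores = {
    type(m).__name__: mean_squared_error(m.fit(train_x, train_y), test_x, test_y)
    for m in (MeanModel(), LineModel())
}
```

Because both models share the `fit`/`predict` interface, the evaluation code is reused unchanged, which is the reusability benefit the card describes.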

how are neural networks applied using oop

- developers create classes for layers, activation functions and training loops to structure predictions and improve modularity
- manage complexity of development + abstraction

what is data wrangling

the process of transforming and structuring raw data into a usable format for analysis
- addressing missing values, duplicates, outliers, inconsistencies
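
A small sketch of those four clean-up steps on a list of records. The rules used here (deduplicate by normalised name, treat ages outside 0 to 120 as invalid, fill gaps with the median) are assumptions for illustration, not rules from the deck.

```python
from statistics import median

raw = [
    {"name": "Ana", "age": 34},
    {"name": "ana ", "age": 34},   # inconsistent formatting -> duplicate
    {"name": "Ben", "age": None},  # missing value
    {"name": "Cy", "age": 29},
    {"name": "Dee", "age": 420},   # implausible outlier
]

def wrangle(records, max_age=120):
    cleaned, seen = [], set()
    for record in records:
        name = record["name"].strip().title()  # fix inconsistencies
        if name in seen:                       # drop duplicates
            continue
        seen.add(name)
        cleaned.append({"name": name, "age": record["age"]})

    for record in cleaned:                     # treat outliers as missing
        age = record["age"]
        if age is not None and not (0 <= age <= max_age):
            record["age"] = None

    known = [r["age"] for r in cleaned if r["age"] is not None]
    fill = median(known)                       # fill missing values
    for record in cleaned:
        if record["age"] is None:
            record["age"] = fill
    return cleaned

clean = wrangle(raw)
names = [r["name"] for r in clean]
ages = [r["age"] for r in clean]
```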

what is forward pass and back propagation in neural networks?

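- forward pass: making a prediction (receives input, calculates weighted sum, applies activation function)
- back propagation: automates the learning process to fix mistakes (passes the error backwards through the network to work out how much each weight and bias should change)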
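
The two passes can be sketched on the smallest possible network: one sigmoid hidden neuron feeding one linear output. The architecture, squared-error loss, starting parameters and learning rate are all assumptions chosen to keep the example short.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, p):
    """Forward pass: input -> hidden activation -> prediction."""
    h = sigmoid(p["w1"] * x + p["b1"])
    y_hat = p["w2"] * h + p["b2"]
    return h, y_hat

def backward(x, y, h, y_hat, p):
    """Back propagation: chain rule from the loss back to each parameter."""
    d_yhat = 2 * (y_hat - y)     # d(loss)/d(prediction) for squared error
    d_h = d_yhat * p["w2"]       # error passed back to the hidden neuron
    d_z = d_h * h * (1 - h)      # through the sigmoid's derivative
    return {"w2": d_yhat * h, "b2": d_yhat, "w1": d_z * x, "b1": d_z}

params = {"w1": 0.5, "b1": 0.0, "w2": -0.3, "b2": 0.1}
x, y = 1.5, 1.0  # one hypothetical training example
losses = []
for _ in range(50):
    h, y_hat = forward(x, params)
    losses.append((y_hat - y) ** 2)
    grads = backward(x, y, h, y_hat, params)
    for name in params:
        params[name] -= 0.1 * grads[name]  # gradient descent update
```

Each loop iteration is one training cycle: predict, measure the error, send it backwards, adjust. The recorded losses shrink as the network corrects itself.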

what impacts does automation have on the individual, society, and environment?

- safety of workers (reduces exposure to dangerous tasks / job displacement or complacency)
- people with disability (assistive technology -> improve QOL / marginalisation)
- nature and skills required for employment (reduces repetitive jobs and creates new roles / requires changes in skill set)
- production efficiency, waste and the environment (improved efficiency and reduced waste / increased energy usage if poorly managed)
- the economy and distribution of wealth (boost economic output / widen inequality)

how do patterns in human behaviour influence ml

- psychological responses (how people think, behave and respond to technology - trust, bias, emotion)
- patterns related to acute stress response (understanding stress response -> can create adaptive system)
- cultural protocols (cultural differences affect how data is collected, represented and interpreted - need to avoid bias and irrelevant data)
- belief systems (biased training data can reflect societal biases - need to respect differing beliefs)

what is dataset source bias

- selection bias (doesn't represent whole population)
- confirmation bias (reinforces existing human biases in decision making)
- historical bias (data reflects past inequalities)
- causes biased predictions or unfair treatment

how can developers mitigate bias

- developers can audit datasets, involve diverse users in development
- ensure training data is representative of the problem to be solved by the model
- have data reviewed by individuals from diverse backgrounds

what are benefits of using machine learning in security

- improved efficiency and accuracy in detecting threats
- ability to respond to incidents in real time
- security and compliance while automating business processes

what are disadvantages of using machine learning in security

- complexity and learning curve
- dependency on data quality
- potential for bias
- security and privacy concerns
- integration challenges
- cost

what is the design stage of MLOps

- defining the business problem
- translating to ML problem
- define success metrics
- research available data (ensure quality, relevance and availability)

what is the model development stage of MLOps

- data wrangling (cleaning and formatting raw data)
- feature engineering (selecting variables that help model accuracy)
- model training
- testing and validation (evaluate using metrics)

what is the operations stage of MLOps

- model deployment (making available for real world use)
- supporting operations/use (ensuring correct functionality)
- monitoring model performance (track accuracy + updates)

what is the bias variance trade off in machine learning

balance between two types of error:
- bias (error from incorrect assumptions - underfitting)
- variance (errors from sensitivity to data fluctuations - overfitting)
- goal is to minimise both to build a model that generalises well

what is bias in a machine learning model

the error caused by overly simple assumptions in the model. high bias can lead to underfitting, where the model fails to capture important patterns

what is variance in a machine learning model

the error caused by the model being too sensitive to small fluctuations in the training data. high variance can lead to overfitting, where the model captures too much noise

how can we reduce variance in a model

- use more training data
- simplify model
- apply regularisation techniques
- use cross validation

what is the role of the learning rate in gradient descent

the learning rate controls how large each weight update is.
- small learning rate means slow, stable learning (may get stuck at local minima)
- large learning rate can cause overshooting or instability (fails to converge, or diverges entirely)
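
The effect is easy to see on a toy cost function. The sketch below assumes cost(w) = w², whose gradient is 2w and whose minimum is at w = 0; the three learning rates are illustrative values only.

```python
def descend(learning_rate, start=1.0, steps=20):
    """Run gradient descent on cost(w) = w**2 and return the final w."""
    w = start
    for _ in range(steps):
        gradient = 2 * w
        w -= learning_rate * gradient
    return w

w_small = descend(0.01)  # tiny steps: still far from 0 after 20 updates
w_good = descend(0.4)    # sensible steps: lands very close to 0
w_large = descend(1.1)   # oversized steps: overshoots further each time
```

After the same 20 updates, the small rate has only reached about 0.67, the moderate rate is essentially at the minimum, and the large rate has moved further away than where it started.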

what does convergence mean in gradient descent

convergence occurs when further updates produce minimal change in the loss function - ie the model has reached or is near a minimum

what are two stopping criteria to ensure convergence without overfitting (gradient descent)

1. early stopping: stop training when validation loss stops improving
2. gradient based: stop when magnitude of gradient is very small, indicating the model has nearly converged
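
Both criteria can be written as small checks. This is a sketch under assumptions: the `patience` setting (how many non-improving epochs to tolerate), the tolerance and the validation losses are hypothetical values, not from the deck.

```python
def early_stop_epoch(val_losses, patience=2):
    """Index of the epoch where training would stop under early stopping."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return len(val_losses) - 1  # never triggered: ran to the end

def gradient_is_small(gradient, tolerance=1e-4):
    """Gradient-based test: is the gradient's magnitude below the tolerance?"""
    return sum(g * g for g in gradient) ** 0.5 < tolerance

# Validation loss improves, then starts rising (a sign of overfitting).
val_losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.56, 0.61]
stop_at = early_stop_epoch(val_losses)
nearly_converged = gradient_is_small([3e-5, 4e-5])
```

Here the best validation loss is at epoch 3, and training stops at epoch 5 after two epochs without improvement.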

software automation Flashcards

(46 cards)