Block 1 Flashcards by Raffaele Palumbo

What is the relationship hierarchy between AI, Machine Learning, and Deep Learning?

AI is the overarching field; Machine Learning is a subset of AI; Deep Learning is a specialized subset of ML.

What is the primary characteristic of Machine Learning (ML) regarding programming?

ML allows systems to learn from data and improve performance without being explicitly programmed for every rule.

Which technology simulates human intelligence to solve problems, recognize images, and write text?

Artificial Intelligence (AI).

What specialized field of ML uses multi-layered neural networks inspired by the human brain?

Deep Learning.

What capability allows computers to derive insights and make decisions based on visual inputs like images and videos?

Computer Vision.

Which technology enables computers to interpret, manipulate, and comprehend human language (e.g., sentiment analysis)?

Natural Language Processing (NLP).

What are the three types of layers found in a Neural Network?

Input layer, Hidden layer(s), and Output layer.

What is the process called where a neural network learns from mistakes by adjusting weights to reduce error?

Backpropagation (or Backward Propagation).
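Worked example (a minimal sketch, not a production training loop): the tiny two-weight network below is an assumption made for illustration. It runs a forward pass, measures the error, and uses the chain rule to push that error backward so each weight is nudged in the direction that reduces it. The function name, starting weights, and learning rate are all hypothetical.

```python
def train_tiny_network(x, target, w1=0.5, w2=0.5, lr=0.05, steps=200):
    """Train y = w2 * (w1 * x) with backpropagation on a single example."""
    for _ in range(steps):
        # Forward pass: input -> hidden value -> output
        h = w1 * x
        y = w2 * h
        error = y - target  # derivative of 0.5 * error**2 with respect to y

        # Backward pass: chain rule from the output back to each weight
        grad_w2 = error * h
        grad_w1 = error * w2 * x

        # Adjust weights against the gradient to reduce the error
        w1 -= lr * grad_w1
        w2 -= lr * grad_w2
    return w1, w2


w1, w2 = train_tiny_network(x=1.0, target=2.0)
prediction = w2 * (w1 * 1.0)  # should end up close to the target of 2.0
```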

What is the term for using a trained ML model to make predictions on new, unseen data?

Inference.

Which inference type processes data points individually as they arrive, providing immediate, low-latency predictions?

Real-time (or Online) Inference.

Which inference type is best for processing large datasets all at once (e.g., nightly reporting) where immediate results are not critical?

Batch (or Offline) Inference.

What distinguishes Supervised Learning from other ML methods?

It requires a labeled dataset (input paired with correct output) to train the model.

Which ML method is used for finding patterns in unlabeled data, such as customer segmentation or anomaly detection?

Unsupervised Learning.

Which ML method involves training a model to make sequences of decisions through a system of rewards and penalties?

Reinforcement Learning.

What is “Labeled Data”?

Data that includes both the input features and the corresponding correct output (or target).

What type of data typically lacks a predefined model, such as text documents, images, and audio files?

Unstructured Data.

What specific architecture do Large Language Models (LLMs) use to understand and generate human-like text?

Transformer architecture.

What is “Bias” in the context of an AI model?

Systematic errors in a model that result in unfair, inaccurate, or discriminatory outcomes.

What is the term for how well a model learns patterns in training data to generalize to new data?

Fit (e.g., Underfitting or Overfitting).

What is the definition of a “feature” in a dataset?

An individual measurable property or characteristic of the data being observed.

Identify a scenario where AI/ML solutions are generally NOT appropriate.

When a specific, deterministic outcome is needed instead of a prediction, or when data is insufficient/poor.

What is “Accountable AI”?

The requirement for human oversight in AI decisions with significant consequences (e.

Which ML technique is used to predict numerical outcomes, such as stock prices or house values?

Regression.
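Worked example (a sketch under stated assumptions): one simple form of regression is fitting a straight line with ordinary least squares. The house sizes and prices below are made-up numbers for illustration, and `fit_line` is a hypothetical helper, not part of any AWS service.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training data: house size (square meters) and price (thousands)
sizes = [50, 80, 100, 120]
prices = [150, 240, 300, 360]

slope, intercept = fit_line(sizes, prices)
predicted_price = slope * 90 + intercept  # numerical prediction for an unseen 90 m^2 house
```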

Which Amazon SageMaker algorithm trains multiple models in parallel to optimize regression predictions?

Linear Learner algorithm.

How well did you know this?

Not at all

Perfectly

Which AWS service enables users to run ML prediction queries directly within their data warehouse using SQL?

Amazon Redshift ML.

Spam detection and sentiment analysis are examples of which Machine Learning method?

Classification (Supervised Learning).

Which Unsupervised Learning technique discovers patterns or groupings (like customer segmentation) without predefined labels?

Clustering.
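Worked example (a minimal sketch): k-means is one common clustering algorithm, chosen here as an assumption since the card does not name one. The one-dimensional version below groups made-up customer spend values around two starting centroids without using any labels.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Very small k-means for one-dimensional data."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: attach every value to its nearest centroid
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical monthly spend per customer; no labels are provided
spend = [10, 12, 11, 95, 100, 105]
centroids, clusters = kmeans_1d(spend, centroids=[0.0, 50.0])
# The low spenders and the high spenders end up in separate clusters.
```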

What are the four core functional stages of the Amazon SageMaker workflow?

Prepare, Build, Train & Tune, Deploy & Manage.

What is the "Amazon SageMaker Lakehouse" architecture?

A hybrid data management architecture that combines the scalability of data lakes with the performance of data warehouses.

Which Amazon SageMaker feature provides a managed Jupyter server for running code and experimenting with models?

SageMaker Notebooks.

How do we distinguish between Amazon SageMaker and Amazon Bedrock?

SageMaker is for building, training, and deploying your own models ("building the robot"), while Bedrock provides API access to pre-trained foundation models, including Large Language Models (LLMs).

In the context of SageMaker architecture, what is "Boto3"?

The AWS Software Development Kit (SDK) for Python used to interact with AWS services.

What policy language can be used with AWS Verified Access to create custom, zero-trust API permissions?

Cedar policy language.

What is the primary interface for building, training, and deploying ML models in the SageMaker ecosystem?

Amazon SageMaker Studio (referred to as SageMaker Unified Studio in newer iterations).

Which managed SageMaker component is used to leverage optimized algorithms for tasks like image classification without managing infrastructure?

SageMaker Managed Containers.

What is the primary function of Amazon Transcribe?

It is a fully managed Automatic Speech Recognition (ASR) service that converts speech to text.

Which Amazon Transcribe feature allows it to distinguish between different speakers in an audio file?

Speaker diarization (also called speaker partitioning).

If a developer needs to remove personally identifiable information (PII) from an audio stream automatically, which service should they use?

Amazon Transcribe (using its redaction feature).

What is the primary difference between Amazon Transcribe and Amazon Polly?

Amazon Transcribe converts speech-to-text (ASR), whereas Amazon Polly converts text-to-speech (Voice Generation).

Which service uses "Neural Machine Translation" to provide high-quality, fast language translation?

Amazon Translate.

How can Amazon Translate be configured to correctly translate industry-specific terminology or brand names?

By using Custom Terminology. (Custom Vocabulary is an Amazon Transcribe feature.)

Which AWS service is best suited for analyzing customer reviews to determine if the sentiment is positive, negative, neutral, or mixed?

Amazon Comprehend.

What specific feature of Amazon Comprehend is used to identify people, places, brands, and dates within a text document?

Entity Recognition.

What AWS service allows developers to build conversational interfaces (chatbots and voice bots) using voice and text?

Amazon Lex.

What specific AI technology does Amazon Lex use to determine the "intent" behind a user's input?

Natural Language Understanding (NLU).

Which AWS cloud-based contact center service integrates natively with Amazon Lex to create automated customer service flows?

Amazon Connect.

What is the primary use case for Amazon Polly?

To turn text into lifelike, natural-sounding speech (Text-to-Speech).

Which service enables the creation of audio for media from text (e.g., converting articles to speech) at a low cost?

Amazon Polly.

Is Amazon Polly an open-source framework that developers can download and modify?

No, it is a fully managed cloud AI service accessed via API.

Which service is capable of transcribing media (speech to text) in real time?

Amazon Transcribe.

To detect the dominant language in a document before translating it, which service API can be called?

Amazon Comprehend (or Amazon Translate, which also has automatic language identification).

What is the phase in the ML pipeline involved in analyzing and visualizing data to detect anomalies and understand its structure?

Exploratory Data Analysis (EDA).

Which AWS service is used to label datasets for supervised learning tasks?

Amazon SageMaker Ground Truth.

What is the process of creating new input variables from existing data to improve model performance called?

Feature Engineering.

What is "Data Preprocessing"?

Transforming raw data into a clean format (cleaning, normalization, encoding) suitable for ML training.

What are the settings that control the training process (e.g., learning rate, batch size) which are set before training begins?

Hyperparameters.

Which Hyperparameter Tuning strategy involves systematically exploring a predefined set of values?

Grid Search.
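Worked example (a sketch, assuming a stand-in scoring function): grid search tries every combination in a predefined grid and keeps the best one. Here `toy_score` is a hypothetical placeholder for "train the model and return a validation score", and the grid values are made up.

```python
from itertools import product


def grid_search(grid, score_fn):
    """Evaluate every combination in the grid and return the best one."""
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for combo in product(*(grid[name] for name in names)):
        params = dict(zip(names, combo))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(params):
    # Hypothetical stand-in for a real train-and-validate run
    return -abs(params["learning_rate"] - 0.01) - abs(params["batch_size"] - 64) / 1000


grid = {"learning_rate": [0.001, 0.01, 0.1], "batch_size": [32, 64, 128]}
best_params, best_score = grid_search(grid, toy_score)
```

With 3 values for each of 2 hyperparameters, the search runs 9 evaluations, which is why grid search becomes expensive as the grid grows.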

Which Hyperparameter Tuning strategy treats tuning as a regression problem to find optimal values?

Bayesian Optimization.

What is the term for taking a pre-trained model (like BERT or ResNet) and fine-tuning it on a new, specific dataset?

Transfer Learning.

What is the primary benefit of using a Managed API Service (like SageMaker Endpoint) over a self-hosted API?

It handles underlying infrastructure, automatic scaling, and security patches automatically.

Which deployment method gives the user full control over the hardware, software, and network configuration (e.g., on EC2)?

Self-hosted API.

Which Amazon SageMaker tool provides a low-code visual interface to import, clean, transform, and visualize data?

Amazon SageMaker Data Wrangler.

What is the specific purpose of the "Offline Store" in Amazon SageMaker Feature Store?

Storing historical feature data for model training and batch inference.

What is the specific purpose of the "Online Store" in Amazon SageMaker Feature Store?

Providing low-latency access to features for real-time inference.

What critical issue does SageMaker Feature Store help prevent by ensuring features are consistent between training and inference?

Training-serving skew.

Which tool captures data from incoming requests and predictions to compare them against a training baseline?

Amazon SageMaker Model Monitor.

What is "Data Drift"?

A change in the statistical properties of input data over time, causing model performance to degrade.

Which SageMaker feature automates the end-to-end ML lifecycle (workflow orchestration)?

Amazon SageMaker Pipelines.

What is the "Validation Dataset" used for?

Assessing the model's performance on unseen data during the training phase to tune hyperparameters.

Which SageMaker tool centralizes model tracking and versioning?

Amazon SageMaker Model Registry.

What is "Imputation" in the context of Data Cleaning?

The process of replacing missing values in a dataset with substituted values (e.
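Worked example (a minimal sketch): mean imputation is one common strategy, used here as an assumption for illustration. Missing entries are represented as `None`, and the `ages` list is made-up data.

```python
def impute_mean(values):
    """Replace missing entries (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


ages = [30, None, 40, 50, None]  # hypothetical column with two missing values
filled = impute_mean(ages)       # the two gaps are filled with the mean, 40.0
```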

What is the basic unit of text (word, part of a word, or character) that a Generative AI model processes and generates?

Token

What is the process of breaking down large documents into smaller, manageable sections to improve data processing and retrieval?

Chunking
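Worked example (a sketch under stated assumptions): the helper below chunks by word count with an optional overlap between neighboring chunks. Splitting on whitespace and the parameter names `chunk_size` and `overlap` are illustrative choices; real pipelines often chunk by tokens, sentences, or sections instead.

```python
def chunk_words(text, chunk_size, overlap=0):
    """Split text into chunks of `chunk_size` words that share `overlap` words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


chunks = chunk_words("one two three four five six seven", chunk_size=3, overlap=1)
```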

What are the numerical representations of real-world objects or data points that capture relationships and meanings for AI systems?

Embeddings (or Vectors)
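Worked example (a minimal sketch): cosine similarity is a common way to compare embeddings. The three-dimensional vectors below are hand-made for illustration only; real embeddings come from a model and have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (closer to 1.0 means more similar)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical, hand-made embeddings
embeddings = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.0],
    "car": [0.0, 0.1, 0.9],
}

cat_vs_kitten = cosine_similarity(embeddings["cat"], embeddings["kitten"])
cat_vs_car = cosine_similarity(embeddings["cat"], embeddings["car"])
# Related meanings sit closer together, so cat_vs_kitten is larger than cat_vs_car.
```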

What is the term for natural language text used to request a specific task from a Generative AI model?

Prompt

What is the iterative process of refining inputs (choosing specific words, formats, and phrases) to guide a Generative AI model to the desired output?

Prompt Engineering

Which neural network architecture uses "self-attention" to process entire sequences of text in parallel and handle long-range dependencies?

Transformer architecture (the basis of transformer-based Large Language Models, LLMs)

What type of model is trained on massive amounts of generalized, unlabeled data (like the World Wide Web) and serves as a starting point for various downstream tasks?

Foundation Model (FM)

Which type of AI model can process and integrate multiple data types simultaneously, such as text, images, audio, and video?

Multi-modal model

Which type of Generative AI model creates photorealistic images by gradually removing "Gaussian noise" through a reverse process?

Diffusion Model

What is "Gaussian Noise" in the context of training image generation models?

Random noise (white noise) following a normal distribution added to images so the model learns to reconstruct them.

What is "Planogram Optimization" in retail AI use cases?

Using Generative AI to dynamically update virtual shelf layouts based on sales trends and inventory.

What are the seven stages of the Foundation Model Lifecycle in order?

Data Selection, Model Selection, Pre-training, Fine-tuning, Evaluation, Deployment, Feedback

Which data preparation technique involves enhancing a dataset by adding variations (e.g., different formats) to improve model robustness?

Data Augmentation

Which data preparation technique involves ensuring the dataset represents different categories equally to prevent model bias?

Data Balancing

What is the "Context Window" of a Foundation Model?

The maximum number of input tokens the model can consider at one time when making predictions.

Which AWS Marketplace model provider is known for "Jurassic" models focused on language generation?

AI21 Labs

Which AWS Marketplace model provider is known for "Claude" models, which focus on AI safety and alignment?

Anthropic

Which model family is specifically developed by AWS for a range of applications?

Amazon Titan

What distinguishes "Fine-tuning" from "Pre-training"?

Fine-tuning trains a pre-existing model on a smaller, labeled, task-specific dataset, whereas pre-training uses massive generalized datasets.

What is "Transfer Learning"?

Leveraging a pre-trained model and fine-tuning it on a specific task to reduce training time and resources.

Which regularization technique involves randomly ignoring neurons during training to prevent overfitting?

Dropout
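Worked example (a sketch, assuming the "inverted dropout" variant): each activation is zeroed with probability `rate` during training, and the survivors are scaled up so the expected total stays the same. The function and variable names are hypothetical, and `rate` must be less than 1.

```python
import random


def dropout(activations, rate, rng):
    """Inverted dropout: zero each activation with probability `rate`."""
    keep = 1.0 - rate
    # Surviving activations are divided by `keep` to preserve the expected value
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(0)  # seeded so the example is repeatable
layer_output = [1.0, 2.0, 3.0, 4.0]
dropped = dropout(layer_output, rate=0.5, rng=rng)
```

At inference time dropout is switched off and the activations pass through unchanged.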

What is "ROUGE" or "BLEU" typically used for in AI?

Automatic Evaluation metrics (commonly for text summarization and translation).

Which evaluation type assesses a model's ability to handle noisy, adverse, or adversarial inputs?

Robustness Testing

Which metric measures the harmonic mean of Precision and Recall?

F1 Score
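Worked example (a minimal sketch): the harmonic mean is 2 * P * R / (P + R). The precision and recall values below are made up to show that F1 is pulled toward the lower of the two.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # avoid dividing by zero when both are zero
    return 2 * precision * recall / (precision + recall)


f1 = f1_score(precision=0.5, recall=1.0)  # about 0.667, below the simple average of 0.75
```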

Which metric specifically measures how well a probability distribution or language model predicts a sample?

Perplexity
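Worked example (a sketch under stated assumptions): perplexity can be computed as the exponential of the average negative log probability the model assigned to each actual token. The probabilities below are made-up values, not output from a real model.

```python
import math


def perplexity(token_probs):
    """exp of the average negative log probability of the observed tokens."""
    avg_neg_log = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log)


# Hypothetical: the model gave each of four tokens a probability of 0.25
ppl = perplexity([0.25, 0.25, 0.25, 0.25])  # about 4: like guessing among 4 equally likely options
```

Lower perplexity means the model was less "surprised" by the sample.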

What does a "Confusion Matrix" visualize?

The performance of a classification model by showing True Positives, False Positives, True Negatives, and False Negatives.
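Worked example (a minimal sketch): the four cells can be counted directly from actual and predicted labels. The labels below are made-up binary data, with 1 treated as the positive class, and `confusion_counts` is a hypothetical helper.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count the four cells of a binary confusion matrix."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1  # predicted positive, actually positive
        elif p == positive:
            fp += 1  # predicted positive, actually negative
        elif a == positive:
            fn += 1  # predicted negative, actually positive
        else:
            tn += 1  # predicted negative, actually negative
    return {"TP": tp, "FP": fp, "TN": tn, "FN": fn}


actual = [1, 0, 1, 1, 0, 0]
predicted = [1, 1, 0, 1, 0, 0]
counts = confusion_counts(actual, predicted)

precision = counts["TP"] / (counts["TP"] + counts["FP"])
recall = counts["TP"] / (counts["TP"] + counts["FN"])
```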

What are the three primary advantages of Generative AI described in this lesson?

Adaptability, Responsiveness, and Simplicity (in deployment and use).

Which GenAI advantage refers to the ability to be fine-tuned for industry-specific language or personalized user preferences?

Adaptability.

What is the main phenomenon where an AI system generates confident outputs that are factually incorrect, fabricated, or unreliable?

AI Hallucination.

What is the root cause of AI hallucinations?

LLMs predict responses based on probabilistic patterns in training data rather than having a true understanding of facts, leading to fabrications.

What specific AWS tool, used in Amazon Bedrock Guardrails, employs mathematical and logic-based verification to combat hallucinations?

Automated Reasoning checks.

What does the "Interpretability" disadvantage refer to?

The challenge in understanding and explaining how "black box" models (like generative adversarial networks or large language models) make decisions.

What is "Nondeterminism" in GenAI?

The characteristic of a system that can produce different outputs from the exact same input under varying conditions.

What is the primary operational risk posed by Nondeterminism?

It compromises the need for consistent, repeatable outcomes, making transparency and reliability difficult.

What is the first crucial step when selecting the appropriate Generative AI model type for a project?

Identifying the Use Cases (e.g., text summarization, image generation, etc.).

Which AWS service provides a Generative AI Best Practices Framework to help organizations continuously monitor and measure their compliance posture?

AWS Audit Manager.

What metric evaluates how well a GenAI model can perform across different domains or types of tasks?

Cross-Domain Performance (Generalization).

Which business metric measures the effectiveness of AI in influencing user behavior to achieve specific goals, such as sign-ups or purchases?

Conversion Rate.

What is "Ground Truth" in evaluation?

A set of known, correct answers used as a reliable benchmark against which AI-generated outputs are compared to determine accuracy.

Which three specific evaluation algorithms are mentioned in the course for measuring the accuracy of generated text?

BERTScore, ROUGE, and F1 Score.

Which specific evaluation metric is superior at measuring the semantic similarity (meaning) between generated text and the ground truth, even with paraphrasing?

BERTScore.

Which evaluation metric is the standard choice for measuring the recall of overlapping content (n-gram overlap) and is typically used for summarization tasks?

ROUGE (Recall-Oriented Understudy for Gisting Evaluation).
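Worked example (a simplified sketch, not an official ROUGE implementation): ROUGE-1 recall is the share of reference unigrams that also appear in the generated text. This version assumes lowercase whitespace tokenization and skips the stemming and other normalization that real tooling applies; the sentences are made up.

```python
from collections import Counter


def rouge1_recall(reference, candidate):
    """Fraction of reference unigrams that also appear in the candidate."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    total = sum(ref_counts.values())
    if total == 0:
        return 0.0
    # Clip each word's overlap to the number of times the candidate contains it
    overlap = sum(min(count, cand_counts[word]) for word, count in ref_counts.items())
    return overlap / total


score = rouge1_recall("the cat sat on the mat", "the cat lay on a mat")
# 4 of the 6 reference words are matched, so the recall is about 0.667
```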

Which financial metric measures the average revenue generated from each active user over a specific time period?

Average Revenue Per User (ARPU).

Block 1 Flashcards

(114 cards)