What are the two primary trade-offs associated with using larger Foundation Models (more parameters)?
They capture more complex patterns but require more computational resources (cost) and have higher latency.
Which model selection criterion ensures a model can handle long documents for tasks like summarization without truncation?
Input Length (or Context Window).
If a real-time application like a customer service chatbot requires faster responses, which Bedrock runtime API parameter should be adjusted?
Set the latency parameter (in the request's performance configuration) to “optimized”.
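As a sketch: the setting above maps to the `performanceConfig` block of a Bedrock Runtime Converse request. The model ID below is illustrative; this only builds the request arguments, so no AWS call is made.

```python
def build_converse_request(model_id: str, prompt: str) -> dict:
    """Keyword arguments for bedrock_runtime.converse()."""
    return {
        "modelId": model_id,
        "messages": [{"role": "user", "content": [{"text": prompt}]}],
        # "optimized" trades some cost for faster responses;
        # "standard" is the default.
        "performanceConfig": {"latency": "optimized"},
    }

# Usage (requires AWS credentials and boto3):
#   import boto3
#   client = boto3.client("bedrock-runtime")
#   response = client.converse(**build_converse_request(
#       "anthropic.claude-3-5-haiku-20241022-v1:0", "Hello"))
```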
What is the effect of setting a low temperature value (e.g., near 0) during inference?
The output becomes more deterministic, focused, and consistent (steep probability distribution).
What is the effect of setting a high temperature value during inference?
The output becomes more random, creative, and diverse (flat probability distribution).
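A minimal demonstration of why temperature changes the shape of the distribution: dividing the logits by the temperature before the softmax makes the distribution steep (low T) or flat (high T).

```python
import math

def softmax_with_temperature(logits, temperature):
    """Scale logits by 1/temperature, then apply softmax.
    Low temperature -> steep, near-deterministic distribution;
    high temperature -> flat, near-uniform distribution."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
cold = softmax_with_temperature(logits, 0.1)   # almost all mass on token 0
hot = softmax_with_temperature(logits, 20.0)   # close to uniform
```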
Which inference parameter allows you to control the verbosity or conciseness of the generated response?
Output Length (Max Tokens).
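These inference parameters sit together in the `inferenceConfig` block of a Converse request; the defaults below are illustrative, not prescribed values.

```python
def build_inference_config(max_tokens: int,
                           temperature: float = 0.7,
                           top_p: float = 0.9) -> dict:
    """inferenceConfig block for the Bedrock Converse API.
    A small maxTokens forces concise answers; a large one permits
    verbose output (generation may still stop earlier on its own)."""
    return {
        "maxTokens": max_tokens,
        "temperature": temperature,
        "topP": top_p,
    }
```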
What is Retrieval-Augmented Generation (RAG)?
A method to improve model accuracy by retrieving relevant data from external knowledge bases and adding it to the context before generation.
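RAG in miniature, with a toy word-overlap retriever standing in for a real vector search: retrieved passages are prepended to the prompt so the model answers from the supplied context.

```python
def retrieve(query, knowledge_base, top_k=2):
    """Toy retriever: rank documents by shared words with the query.
    A real RAG system would use embedding similarity search instead."""
    q = set(query.lower().split())
    scored = sorted(knowledge_base,
                    key=lambda d: len(q & set(d.lower().split())),
                    reverse=True)
    return scored[:top_k]

def build_augmented_prompt(query, knowledge_base):
    """Add retrieved context to the prompt before generation."""
    context = "\n".join(retrieve(query, knowledge_base))
    return f"Context:\n{context}\n\nQuestion: {query}"
```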
What are “Embeddings” in the context of RAG?
Numerical representations (vectors) of text/data that capture semantic meaning.
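Embeddings are typically produced by a dedicated embedding model. A sketch of the request for Amazon Titan Text Embeddings via `invoke_model` (the model ID and body shape are assumptions; check the model's documentation):

```python
import json

def build_titan_embedding_request(text: str) -> dict:
    """Request arguments for bedrock_runtime.invoke_model() against
    a Titan text-embedding model (IDs/shapes are assumptions)."""
    return {
        "modelId": "amazon.titan-embed-text-v2:0",
        "body": json.dumps({"inputText": text}),
    }

# Usage (requires AWS credentials and boto3):
#   client = boto3.client("bedrock-runtime")
#   resp = client.invoke_model(**build_titan_embedding_request("hello"))
#   vector = json.loads(resp["body"].read())["embedding"]
```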
What algorithm is typically used by vector databases to find relevant embeddings for a user query?
k-nearest neighbor (k-NN) similarity search.
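The idea behind k-NN over embeddings, shown as an exact linear scan with cosine similarity (production vector databases use approximate k-NN indexes such as HNSW instead):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def knn(query_vec, index, k=2):
    """Return the k stored documents most similar to the query vector."""
    scored = sorted(index.items(),
                    key=lambda kv: cosine_similarity(query_vec, kv[1]),
                    reverse=True)
    return [doc for doc, _ in scored[:k]]
```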
Which AWS service supports vector search using the pgvector extension?
Amazon Aurora PostgreSQL (or Amazon RDS for PostgreSQL).
Which AWS Graph Database service supports vector indexing via Neptune Analytics?
Amazon Neptune.
Which AWS document database (MongoDB compatible) supports storing and searching vector embeddings?
Amazon DocumentDB.
What is the primary cost advantage of “In-Context Learning” over “Fine-Tuning”?
It has lower initial costs because it requires no model training or weight updates.
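In-context learning can be as simple as embedding labeled examples directly in the prompt (few-shot prompting), as sketched below; the prompt format is illustrative.

```python
def build_few_shot_prompt(examples, query):
    """In-context learning: labeled examples go into the prompt
    itself, so no training or weight updates are required."""
    shots = "\n".join(f"Input: {x}\nOutput: {y}" for x, y in examples)
    return f"{shots}\nInput: {query}\nOutput:"
```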
Comparing RAG and Fine-Tuning, which method typically offers lower latency during the inference phase?
Fine-Tuning (because it avoids the external retrieval step).
What is “Continued Pre-training”?
Training a foundation model further on large amounts of unlabeled domain-specific data to improve general domain knowledge.
What are “Agents” used for in Generative AI applications?
To orchestrate multi-step tasks, break down complex problems, and interact with external APIs/systems.
In a multi-agent system, what is the role of a “Supervisor Agent”?
To coordinate and orchestrate the efforts of specialized agents to ensure tasks are executed in the correct sequence.
What is “Metadata Filtering” in a RAG application?
Refining search results based on custom attributes (tags) to reduce noise and retrieve more relevant context.
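A sketch of metadata filtering with Bedrock Knowledge Bases: the `retrievalConfiguration` for a Retrieve call can carry a filter on document metadata. The attribute name `department` is illustrative, and only the request structure is built here.

```python
def build_kb_retrieval_config(department: str) -> dict:
    """retrievalConfiguration for a Knowledge Bases Retrieve call,
    restricting results to documents whose 'department' metadata tag
    matches (field name is a hypothetical example)."""
    return {
        "vectorSearchConfiguration": {
            "numberOfResults": 5,
            "filter": {
                "equals": {"key": "department", "value": department}
            },
        }
    }
```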
Which customization method is best for adapting a model to a specific task using a labeled dataset?
Fine-tuning.
What are the four steps of the Amazon OpenSearch vector search process?
Configure permissions, Create collection, Upload data (embeddings), Perform similarity search.
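The final step above (similarity search) uses OpenSearch's k-NN query type. A sketch of the query body, assuming an index whose mapping defines a k-NN vector field; the field name `embedding` is an assumption.

```python
def build_knn_query(vector, k=5, field="embedding"):
    """OpenSearch k-NN query body for the similarity-search step.
    Assumes the index mapping declares `field` as a knn_vector."""
    return {
        "size": k,
        "query": {"knn": {field: {"vector": vector, "k": k}}},
    }

# Usage (requires an OpenSearch client, e.g. opensearch-py):
#   client.search(index="my-vectors", body=build_knn_query(query_vec))
```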