Dsign Considerations for Apps Flashcards

(20 cards)

1
Q

What are the two primary trade-offs associated with using larger Foundation Models (more parameters)?

A

They capture more complex patterns but require more computational resources (cost) and have higher latency.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

Which model selection criterion ensures a model can handle long documents for tasks like summarization without truncation?

A

Input Length (or Context Window).

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

If a real-time application like a customer service chatbot requires faster responses, which Bedrock runtime API parameter should be adjusted?

A

Set the latency parameter to “optimized”.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

What is the effect of setting a low temperature value (e.g., near 0) during inference?

A

The output becomes more deterministic, focused, and consistent (steep probability distribution).

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

What is the effect of setting a high temperature value during inference?

A

The output becomes more random, creative, and diverse (flat probability distribution).

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

Which inference parameter allows you to control the verbosity or conciseness of the generated response?

A

Output Length (Max Tokens).

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

What is Retrieval-Augmented Generation (RAG)?

A

A method to improve model accuracy by retrieving relevant data from external knowledge bases and adding it to the context before generation.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

What are “Embeddings” in the context of RAG?

A

Numerical representations (vectors) of text/data that capture semantic meaning.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

What algorithm is typically used by vector databases to find relevant embeddings for a user query?

A

k-nearest neighbor (k-NN) similarity search.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

Which AWS service supports vector search using the pgvector extension?

A

Amazon Aurora PostgreSQL (or Amazon RDS for PostgreSQL).

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

Which AWS Graph Database service supports vector indexing via Neptune Analytics?

A

Amazon Neptune.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

Which AWS document database (MongoDB compatible) supports storing and searching vector embeddings?

A

Amazon DocumentDB.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

What is the primary cost advantage of “In-Context Learning” over “Fine-Tuning”?

A

It has lower initial costs because it requires no model training or weight updates.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

Comparing RAG and Fine-Tuning, which method typically offers lower latency during the inference phase?

A

Fine-Tuning (because it avoids the external retrieval step).

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

What is “Continued Pre-training”?

A

Training a foundation model further on large amounts of unlabeled domain-specific data to improve general domain knowledge.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

What are “Agents” used for in Generative AI applications?

A

To orchestrate multi-step tasks, break down complex problems, and interact with external APIs/systems.

17
Q

In a multi-agent system, what is the role of a “Supervisor Agent”?

A

To coordinate and orchestrate the efforts of specialized agents to ensure tasks are executed in the correct sequence.

18
Q

What is “Metadata Filtering” in a RAG application?

A

Refining search results based on custom attributes (tags) to reduce noise and retrieve more relevant context.

19
Q

Which customization method is best for adapting a model to a specific task using a labeled dataset?

20
Q

What are the four steps of the Amazon OpenSearch vector search process?

A

Configure permissions, Create collection, Upload data (embeddings), Perform similarity search.