Context Poisoning
A hallucination or other error makes it into the context, where it is repeatedly referenced and compounds over subsequent turns.
Context Distraction
The context grows so long that the model over-focuses on the accumulated context, neglecting what it learned during training.
Context Confusion
Superfluous content in the context is used by the model to generate a low-quality response.
Context Quarantine
Isolating contexts in dedicated threads, often facilitated by subagents, to explore different aspects of a question in parallel and then condense the most important tokens for a lead agent.
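A minimal sketch of this pattern, assuming a hypothetical `call_model` stand-in for a real LLM client: each subagent runs in its own thread with a fresh, isolated context, and only the condensed findings reach the lead agent's context.

```python
from concurrent.futures import ThreadPoolExecutor

def call_model(prompt: str) -> str:
    # Hypothetical stand-in for a real model call; swap in your LLM client.
    return f"summary({prompt})"

def run_subagent(aspect: str, question: str) -> str:
    # Each subagent sees only its aspect and the question: a fresh context.
    prompt = f"Research the {aspect} aspect of: {question}. Reply with key findings only."
    return call_model(prompt)

def quarantined_research(question: str, aspects: list[str]) -> str:
    # Explore aspects in parallel, each in an isolated context.
    with ThreadPoolExecutor() as pool:
        findings = list(pool.map(lambda a: run_subagent(a, question), aspects))
    # Only the condensed findings enter the lead agent's context.
    lead_prompt = "Synthesize these findings:\n" + "\n".join(findings)
    return call_model(lead_prompt)
```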
Context Clash
New information or tools in the context directly conflict with other existing information, derailing the model’s reasoning.
Context Pruning
Removing irrelevant or otherwise unneeded information from the context before it is passed to the model.
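A toy sketch of pruning, using crude lexical overlap as the relevance signal (trained pruning models score far more accurately, but the shape of the technique is the same):

```python
def relevance(query: str, message: str) -> float:
    # Crude lexical-overlap score; real pruners use trained models or embeddings.
    q, m = set(query.lower().split()), set(message.lower().split())
    return len(q & m) / len(q) if q else 0.0

def prune_context(messages: list[str], query: str, threshold: float = 0.2,
                  keep_first: int = 1) -> list[str]:
    # Always keep the first message(s) (e.g. the system prompt), then drop
    # anything scoring below the relevance threshold for the current query.
    pinned, rest = messages[:keep_first], messages[keep_first:]
    return pinned + [m for m in rest if relevance(query, m) >= threshold]
```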
Context Summarization
Using a separate LLM to condense conversation history.
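A sketch of budget-triggered summarization, where `summarizer` stands in for a call to a separate (often cheaper) model and the 4-characters-per-token count is a rough heuristic:

```python
def approx_tokens(text: str) -> int:
    return len(text) // 4  # rough heuristic: ~4 characters per token

def compact_history(messages: list[str], summarizer, budget: int = 500,
                    keep_recent: int = 4) -> list[str]:
    # Only compress once the transcript exceeds the token budget.
    if approx_tokens("\n".join(messages)) <= budget or len(messages) <= keep_recent:
        return messages
    old, recent = messages[:-keep_recent], messages[-keep_recent:]
    summary = summarizer("Condense this conversation, keeping decisions "
                         "and open questions:\n" + "\n".join(old))
    return [f"[Summary of earlier conversation] {summary}"] + recent
```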
Context Offloading
Storing information outside the LLM’s active context via a tool, often referred to as a “scratchpad”. Particularly effective for detailed tool output analysis, policy-heavy environments, and sequential decision-making.
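A minimal scratchpad sketch (in practice this would be exposed to the model as a callable tool): the full tool output lives outside the context, and only a short pointer is kept in the prompt.

```python
class Scratchpad:
    """Stores information outside the model's active context, keyed by name."""
    def __init__(self):
        self._notes: dict[str, str] = {}

    def write(self, key: str, text: str) -> str:
        self._notes[key] = text
        # Return a short pointer for the context instead of the full text.
        return f"[saved {len(text)} chars to scratchpad as '{key}']"

    def read(self, key: str) -> str:
        return self._notes.get(key, f"[no note named '{key}']")

pad = Scratchpad()
big_tool_output = "row1,row2,row3..." * 200  # imagine a huge API response
pointer = pad.write("api_results", big_tool_output)  # only this enters the context
# Later, when the agent actually needs the details:
details = pad.read("api_results")
```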
What is Provence?
A trained context-pruning model for retrieval-augmented generation: it scores the sentences of retrieved passages against the query and drops the irrelevant ones before they enter the prompt.
Why should you use a map for conversation history?
Storing history as a structured map (rather than one flat string) lets you prune, update, or summarize individual entries by key, without reprocessing the entire transcript.
How many tokens does it take for a model to experience context distraction?
Beyond 100k tokens in some cases, or around 32k for smaller models like Llama 3.1 405B.
Rolling Window
Keeping only the most recent N messages (or tokens) in the active context and dropping the oldest as the conversation grows, usually while pinning the system prompt.
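A sketch of a rolling window, assuming the system prompt should stay pinned while older turns fall out:

```python
from collections import deque

def windowed_context(system_prompt: str, messages: list[str],
                     max_turns: int = 6) -> list[str]:
    # deque(maxlen=...) silently discards the oldest items once full.
    window = deque(messages, maxlen=max_turns)
    return [system_prompt] + list(window)
```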
RAG as a context management strategy
Rather than keeping all history in the active context, details are offloaded to an external database (like a vector database). When information is needed, the system retrieves only the necessary, relevant data based on the current query, ensuring the active context remains lean and focused.
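A toy sketch of the retrieve-then-prompt loop, using word overlap in place of a real vector database and embeddings:

```python
def overlap(query: str, doc: str) -> float:
    # Stand-in similarity score; real systems use embedding distance.
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q) if q else 0.0

def retrieve(query: str, store: list[str], k: int = 2) -> list[str]:
    # A real system would embed the query and run a nearest-neighbour search.
    return sorted(store, key=lambda doc: overlap(query, doc), reverse=True)[:k]

def build_prompt(query: str, store: list[str]) -> str:
    # Only the retrieved snippets enter the active context, keeping it lean.
    context = "\n".join(retrieve(query, store))
    return f"Context:\n{context}\n\nQuestion: {query}"
```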
Hierarchical Memory Management
This approach uses a tiered system where critical information (e.g., system instructions, key decisions) is preserved, recent history is kept in detail, and older or less relevant content is summarized or removed entirely.
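A sketch of tiered context assembly, with `summarizer` standing in for a call to a summarization model:

```python
def tiered_context(critical: list[str], history: list[str], summarizer,
                   recent_n: int = 4) -> list[str]:
    # Tier 1: critical items (system instructions, key decisions) always kept.
    context = list(critical)
    old, recent = history[:-recent_n], history[-recent_n:]
    # Tier 3: older turns collapsed into a single summary line.
    if old:
        context.append("[Summary of older turns] " + summarizer("\n".join(old)))
    # Tier 2: recent turns kept verbatim.
    context.extend(recent)
    return context
```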