Inner Developer Loop (seconds/minutes) Flashcards

(135 cards)

1
Q

What is the Inner Developer Loop?

A

You + an AI agent in a tight feedback cycle

Measured in seconds to a few minutes.

2
Q

What is the time scale for the Inner Loop?

A
  • Seconds to minutes
  • Rarely more than ~5–10 minutes per micro-iteration

It’s the loop felt while actively coding, not overnight jobs.

3
Q

List the basic cycle steps of the Inner Loop.

A
  • Observe
  • Instruct
  • Agent acts
  • Feedback / Inspect
  • Adjust

This cycle emphasizes rapid, low-cost experimentation.

4
Q

What are micro-tasks in the Inner Loop?

A
  • Small, sharply scoped tasks
  • Easy to judge quickly

Examples include renaming functions or adding input validation.

5
Q

What does rapid hypothesis / trial-and-error involve?

A
  • Trying changes through the agent
  • Seeing outcomes and iterating

This is experimental coding powered by AI.

6
Q

Define clarification and steering in the Inner Loop.

A
  • Repeatedly clarifying intent
  • Adjusting requests based on feedback

This is interactive negotiation to refine agent behavior.

7
Q

What three things should the user be checking in local validation during the Inner Loop?

A
  • Running unit tests
  • Linting/formatting code
  • Quick manual inspection

Fast checks validate the agent’s actions.

8
Q

True or false: Underspecified instructions can lead to unclear outcomes.

A

TRUE

Vague prompts can result in the agent doing something unintended.

9
Q

What are the consequences of overly large tasks in the Inner Loop?

A
  • Agent times out
  • Giant, messy diffs
  • Hard to review

Chunking tasks into smaller pieces is essential.

10
Q

What is the impact of weak feedback in the Inner Loop?

A
  • Vague signals lead to poor agent performance

Specific feedback helps guide the agent’s next steps.

11
Q

What does it mean to trust without verifying in the Inner Loop?

A
  • Accepting agent claims without checks

This can lead to undetected bugs and higher costs later.

12
Q

What happens when you let hallucinations slide?

A
  • Accepting incorrect references or functions

Immediate corrections keep the loop grounded in reality.

13
Q

What are the four characteristics of a good inner loop practice?

A
  • Micro-sized tasks
  • Specific and fast feedback
  • Tight coupling to fast checks
  • Small, reversible steps

These practices enhance the efficiency of the inner loop.

14
Q

What are the four roles of the developer in the inner loop?

A
  • Intent setting
  • Constraint setting
  • Quality judgment
  • Boundary enforcement

You are not just a typist; you guide the process.

15
Q

Describe a concrete example of the Inner Loop in action (all four steps)

A
  • Observe a test failure
  • Instruct the agent to fix it
  • Agent acts and shows output
  • Inspect and adjust if needed

This sequence can take one or two minutes.

16
Q

Why does the Inner Loop matter?

A
  • Affects trust in agents
  • Influences task scaling
  • Impacts overall workflow efficiency

A fast, reliable inner loop encourages using agents for larger tasks.

17
Q

What is the goal of Stage 0 in Triage & Task Sizing?

A

Make the task small, clear, and bounded

This helps prevent going off on wild tangents or timing out.

18
Q

What are common classifications for tasks in Stage 0?

A
  • Bugfix / small feature
  • Refactor / cleanup
  • Config / infra change
  • Docs / tests

These classifications help in organizing the tasks effectively.

19
Q

What should be constrained in scope during Stage 0?

A

An inner-loop chunk

This means focusing on one module or feature at a time.

20
Q

What is the target outcome of Stage 0?

A

Fix this test, add this config flag, or refactor this function

This keeps the focus on specific, manageable tasks.

21
Q

What are the "must not change" constraints in Stage 0?

A
  • Public APIs that must stay
  • Files that must remain untouched
  • Systems that must not be restarted

These constraints help mitigate risks during the task.

22
Q

What does Stage 0 aim to mitigate regarding task execution?

A
  • Literal but improvisational
  • Non-determinism
  • Timeouts
  • Last 10% gap

These mitigations help maintain focus and manageability in task execution.

23
Q

What is the goal of Stage 2 in the formatting process?

A

Make strict outputs robust by contract + auto-check

This stage focuses on ensuring the output adheres to a defined format and is validated for correctness.

24
Q

What should the response contain according to the defined format?

A

ONLY valid JSON following a specified schema

This ensures that the response is structured and easily interpretable.

25
What is the purpose of self-validation in the output process?
To ensure the JSON matches the schema and fix any issues before responding ## Footnote This step helps maintain the integrity and reliability of the output.
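A minimal sketch of the self-validation step, assuming a flat schema expressed as required keys and Python types. The schema contents and the function name `validate_response` are hypothetical, chosen only for illustration; a real project might use a full JSON Schema validator instead.

```python
import json

# Hypothetical schema for illustration: required key -> expected Python type.
SCHEMA = {"files_changed": list, "tests_passed": bool, "summary": str}


def validate_response(raw: str, schema: dict = SCHEMA) -> list:
    """Return a list of problems; an empty list means the response conforms."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg} (line {exc.lineno})"]
    if not isinstance(data, dict):
        return ["top-level value must be an object"]
    problems = []
    for key, expected in schema.items():
        if key not in data:
            problems.append(f"missing key: {key}")
        elif not isinstance(data[key], expected):
            problems.append(f"wrong type for {key}: expected {expected.__name__}")
    # A tight schema also rejects keys it does not know about.
    for key in data:
        if key not in schema:
            problems.append(f"unexpected key: {key}")
    return problems
```

Feeding the returned problem list back to the agent gives it specific feedback to fix before it responds again.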
26
What command can be used to run a syntax check on the JSON output?
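python -m json.tool for JSON; yamllint for YAML; python -m py_compile for Python source ## Footnote These commands check syntax only: json.tool rejects malformed JSON, while py_compile compiles Python files and does not validate JSON.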
27
What does Stage 2 aim to mitigate regarding output format fragility?
Explicit schemas + self-checks ## Footnote This approach reduces the risk of format errors and ensures consistency.
28
How does Stage 2 mitigate over-trust in claims?
You see actual validator output, not just 'it’s valid' ## Footnote This transparency helps in verifying the correctness of the output.
29
What does a tight schema tend to collapse variation into?
A consistent shape ## Footnote This helps in maintaining uniformity in the output format.
30
What is the **goal** of the Evidence-First Rule for Actions & Claims?
Never accept 'I did X' without a lightweight proof ## Footnote Train yourself and the agent into an evidence culture.
31
What must the agent show for any **non-trivial claim**?
Evidence ## Footnote This is a hard rule to ensure accountability.
32
For file edits, what is required to show evidence?
  • Show the full updated file or a unified diff
  • List the exact files changed and show their contents
## Footnote This ensures transparency in modifications.
33
What should be provided as evidence for **tests**?
  • Paste the test command
  • Include its exit code + summary
## Footnote Example: a short pytest summary, go test ./... exit status.
34
What evidence is required for **scripts/commands**?
  • Show the command you ran
  • Last 20 lines of output
## Footnote This helps verify the command's execution and results.
35
What should be shown if a command **fails**?
The error message verbatim ## Footnote This provides clarity on what went wrong.
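The evidence rules for tests and commands (the command, its exit code, the last 20 lines of output, and errors verbatim) can be sketched as a small wrapper. This is an illustrative sketch, not a prescribed tool; the function name `run_with_evidence` and the returned dictionary keys are assumptions.

```python
import subprocess
import sys


def run_with_evidence(cmd: list, tail: int = 20) -> dict:
    """Run cmd and return the command, its exit code, and the last `tail` output lines."""
    # stderr is merged into stdout so error messages appear verbatim in the evidence.
    proc = subprocess.run(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    lines = proc.stdout.splitlines()
    return {
        "command": " ".join(cmd),
        "exit_code": proc.returncode,
        "output_tail": lines[-tail:],
    }
```

For example, `run_with_evidence([sys.executable, "-m", "pytest", "tests/fast/"])` would capture the pytest summary and exit code, assuming pytest is installed and that directory exists.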
36
For **ops tasks**, what evidence should be provided when claiming a service was restarted?
Show the log snippet that proves it ## Footnote Example: last lines of the journal or status output.
37
What is the benefit of quickly scanning the evidence before marking a step as **done**?
Mitigates over-trust in agent claims ## Footnote Everything is backed by a log, diff, or output.
38
What does the Evidence-First Rule help to avoid regarding **invented commands**?
Hallucinated tools/paths ## Footnote You see the failure instead of a fake success.
39
What issue does the Evidence-First Rule address regarding **timeouts**?
Truncated runs are obvious in the partial logs ## Footnote This ensures that incomplete processes are identified.
40
What is the goal of **Stage 4** in the environment grounding?
Make it easy for the agent to know what actually exists, and forbid invention of tools/paths ## Footnote This stage emphasizes clarity in available commands and directories.
41
What are the **available commands** mentioned in Stage 4?
* [List of available commands not provided in the text] ## Footnote The commands should be explicitly listed in the environment brief.
42
Where are the **key directories** located in the environment?
  • src/
  • tests/
  • config/
## Footnote These directories are essential for organizing the project structure.
43
What is the **main app entry** file in the environment?
src/main.py ## Footnote This file serves as the starting point for the application.
44
What tool is NOT used in this environment according to Stage 4?
docker-compose ## Footnote The environment relies on just and make instead.
45
What should you do if you are unsure about commands or paths?
ASK; do not invent new commands or directories ## Footnote This encourages communication and prevents errors.
46
What happens when **invention** occurs in the environment?
It is corrected immediately ## Footnote This ensures adherence to the established commands and paths.
47
What command should be used instead of **yarn test**?
npm run test ## Footnote This is an example of correcting an invented command.
48
Where do the **configs** live in the environment?
/config/ ## Footnote This specifies the correct location for configuration files.
49
What does the explicit **no-invent** clause aim to mitigate?
  • Hallucinated tools
  • Over-trust in claims
  • Literal/improvisational errors
## Footnote This clause helps tighten the agent's internal mental model of the stack.
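One way to make the no-invent clause checkable is to compare each proposed command against the environment brief. The sketch below assumes an allowlist loosely based on the tools and directories named in the cards above (just, make, npm; src/, tests/, config/); the actual list of available commands is not given in the source, so treat these values and the name `check_command` as placeholders.

```python
import shlex

# Assumed environment brief, loosely based on the cards above; adjust to the real project.
ALLOWED_TOOLS = {"just", "make", "npm"}
ALLOWED_DIRS = ("src/", "tests/", "config/")


def check_command(command: str) -> list:
    """Return reasons to reject `command`; an empty list means nothing looks invented."""
    tokens = shlex.split(command)
    if not tokens:
        return ["empty command"]
    problems = []
    if tokens[0] not in ALLOWED_TOOLS:
        problems.append(f"unknown tool: {tokens[0]}")
    for tok in tokens[1:]:
        # Treat any argument containing a slash as a path and require a known directory.
        if "/" in tok and not tok.startswith(ALLOWED_DIRS):
            problems.append(f"path outside known directories: {tok}")
    return problems
```

A non-empty result is a prompt to correct the agent immediately rather than a verdict; the check is deliberately crude and only looks at the first token and path-like arguments.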
50
What is the goal of **Stage 5** in the process?
Make inner-loop runs repeatable enough that you can replay them and understand what changed ## Footnote This stage emphasizes the importance of logging and summarizing actions taken during the process.
51
What should you log in the **repo** or a scratch log?
  • Task description
  • Prompt (or summary)
  • Files touched
  • Tests run
## Footnote Keeping detailed logs helps in tracking changes and understanding the context of actions taken.
52
What is the purpose of the **agents.log** or **AGENT_NOTES.md**?
To log prompts and responses ## Footnote This log serves as a record of interactions and decisions made during the process.
53
What should you ask for at the end of a run?
A short 'run summary' from the agent ## Footnote This summary should include key details about the run for better understanding and documentation.
54
What should the run summary include?
  • Files changed
  • Commands run
  • Tests executed and their status
## Footnote This information provides a clear overview of what occurred during the run.
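The logging and run-summary cards above can be combined into one append-only record per run. This is a sketch under assumptions: the function name `append_run_summary`, the entry layout, and the use of a Markdown-style notes file such as AGENT_NOTES.md are illustrative choices, not a required format.

```python
from datetime import datetime, timezone
from pathlib import Path


def append_run_summary(log_path, task, prompt, files_changed, commands, tests):
    """Append one run record to a notes file; `tests` maps test command -> status string."""
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M UTC")
    lines = [f"## {stamp}: {task}", f"Prompt: {prompt}", "Files changed:"]
    lines += [f"  - {name}" for name in files_changed]
    lines.append("Commands run:")
    lines += [f"  - {cmd}" for cmd in commands]
    lines.append("Tests:")
    lines += [f"  - {cmd}: {status}" for cmd, status in tests.items()]
    # Append mode keeps earlier runs intact so they can be replayed and compared.
    with Path(log_path).open("a", encoding="utf-8") as fh:
        fh.write("\n".join(lines) + "\n\n")
```

Because every entry has the same structure, re-running a task produces records that are easy to compare side by side.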
55
What should you do if you need to **re-run** the process?
Re-use the same structure ## Footnote Consistency in structure helps maintain clarity and repeatability in the process.
56
What should you keep the same during multi-pass refactors?
Behavior ## Footnote Pinning behavior ensures that the core functionality remains unchanged throughout the refactoring process.
57
What should you reiterate every pass?
Constraints ## Footnote Reiterating constraints helps maintain focus on the goals and limitations of the task.
58
What does mitigating **2.5 Non-determinism** at the prompt level involve?
Having a record of 'what' + 'why' and consistent constraints ## Footnote This approach helps in understanding the rationale behind decisions and actions taken.
59
What does mitigating **2.3 Over-trust** involve?
An explicit list of tests/commands ## Footnote This anchors what actually happened during the process, reducing reliance on assumptions.
60
What does mitigating **2.7 Last 10% gap** help you see?
What remains ## Footnote Summaries provide clarity on outstanding tasks and areas that need further attention.
61
What is the goal of **chunking for timeouts** and long tasks?
Proactively avoid local timeouts by working in small batches ## Footnote This approach helps manage long-running tasks effectively.
62
What should be set to limit processing in a run?
Batch limits ## Footnote For example, 'Process at most 3 files in this run.'
63
What is required to ensure progress is tracked?
Progress report ## Footnote This includes listing completed items, pending items, and any blockers.
64
What design principle helps with resumability in tasks?
Simple checklist ## Footnote This allows for easy tracking of updates and tasks that remain.
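Batch limits, progress reports, and a simple checklist fit together naturally. The sketch below assumes a checklist stored as a mapping from item to one of three statuses ("done", "pending", "blocked"); the status names and both function names are hypothetical.

```python
def next_batch(checklist: dict, limit: int = 3) -> list:
    """Pick at most `limit` pending items from a checklist of item -> status."""
    pending = [item for item, status in checklist.items() if status == "pending"]
    return pending[:limit]


def progress_report(checklist: dict) -> str:
    """Summarize done, pending, and blocked items so a later run can resume."""
    # Any status other than these three raises KeyError, which surfaces typos early.
    groups = {"done": [], "pending": [], "blocked": []}
    for item, status in checklist.items():
        groups[status].append(item)
    return "\n".join(
        f"{name}: {', '.join(items) or '-'}" for name, items in groups.items()
    )
```

With the default limit, each run handles at most three files, matching the "Process at most 3 files in this run" example, and the report makes partial completion visible.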
65
True or false: Chunking tasks can mitigate non-determinism.
TRUE ## Footnote Chunked, checkpointed runs are easier to reproduce and reason about.
66
What does chunking help to avoid in task management?
Over-trust ## Footnote Progress reports make partial completion visible.
67
What is the goal of **Stage 7** in the process?
Accept that the agent does 80–90%; you handle the final fit-and-finish ## Footnote This stage emphasizes the importance of human oversight in the final steps of a process.
68
What should you ask the agent regarding potential issues?
  • List edge cases your change might not cover
  • List potential risks or performance impacts of this change
## Footnote Identifying these issues helps mitigate risks before deployment.
69
What is included in your own quick review checklist?
  • Match business intent
  • Keep API compatibility
  • Respect security constraints
  • Avoid obvious performance cliffs
## Footnote This checklist ensures that changes align with overall goals and standards.
70
How should you use the agent for micro-fixes?
Delegate them as tiny follow-up tasks ## Footnote Examples include adding tests for edge cases or optimizing code loops.
71
What is your responsibility regarding the final approval?
Merge, commit, or deploy only after your sanity check ## Footnote This ensures that you maintain control over the final output.
72
What does **Stage 7** mitigate?
  • Inner loop feels slower & last 10% gap
  • Literal/improvisational corrections
  • Over-trust in final judgment
## Footnote These points highlight the challenges and adjustments needed in the final stages of the process.
73
What is the **sandbox first** rule?
Work in a feature branch, never main ## Footnote This rule emphasizes the importance of using a controlled environment for development.
74
In the **sandbox first** rule, what type of environment should you work in?
A scratch DB / test environment ## Footnote This ensures that changes are made in a safe and controlled manner.
75
What should you use instead of **prod configs**?
Local files ## Footnote This practice helps prevent unintended changes to production settings.
76
If the agent wants to touch **infra or data**, what should you say?
Propose the changes, but don’t apply them. I’ll run them manually in staging. ## Footnote This approach maintains control over changes to infrastructure and data.
77
Why does the **sandbox first** rule keep you safe?
  • Bad edits are git reset-able
  • Mistakes stay in a controlled environment
  • You can be aggressive with the agent without risking prod
## Footnote These factors contribute to minimizing risks during development.
78
What should you do when faced with **dangerous commands**?
Tell the agent: "Do not execute commands. Instead, output a numbered list of shell commands / SQL statements you would run. I will review and run them myself." ## Footnote This approach prevents accidental execution of harmful commands.
79
For each command, what should you include?
  • A one-line explanation
  • Whether it’s safe to re-run
## Footnote This ensures clarity and safety in command execution.
80
Why is it important to review commands before execution?
To avoid surprise commands like rm -rf or DROP TABLE ## Footnote Reviewing commands allows for oversight and prevents unintended consequences.
81
What is the benefit of this command review process?
You get the automation (it writes commands) without blind execution ## Footnote This maintains efficiency while ensuring safety.
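As an aid to that review (not a replacement for it), a proposed command list can be pre-scanned for obviously destructive patterns. The two patterns below cover only the examples named in the cards, rm -rf and DROP TABLE; the pattern table and the name `flag_risky` are illustrative assumptions.

```python
import re

# Illustrative patterns only; a real review list would be longer and project-specific.
RISKY_PATTERNS = {
    # rm with a flag group containing both r and f, in either order (-rf, -fr).
    r"\brm\s+-(?=\w*r)(?=\w*f)\w+": "recursive force delete",
    r"\bdrop\s+table\b": "drops a table",
}


def flag_risky(commands: list) -> list:
    """Return (number, command, reason) for each proposed command matching a risky pattern."""
    flagged = []
    for number, command in enumerate(commands, start=1):
        for pattern, reason in RISKY_PATTERNS.items():
            if re.search(pattern, command, flags=re.IGNORECASE):
                flagged.append((number, command, reason))
                break
    return flagged
```

An empty result does not mean the list is safe; it only means none of these known patterns appeared, so each command still needs a human read.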
82
What mindset should you adopt toward the **agent**?
Treat it as a hyper-fast junior engineer, not a senior architect ## Footnote This mindset takes advantage of the agent’s speed without assuming its output is correct.
83
List the capabilities of the **agent**.
  • Draft code
  • Suggest commands
  • Propose refactors
## Footnote These capabilities highlight the agent's role in supporting development tasks.
84
What are **your** responsibilities as the developer?
  • Set the goal
  • Set constraints
  • Approve changes
## Footnote These responsibilities keep the agent focused and pointed in the right direction.
85
What actions should **you** always perform?
  • Review diffs
  • Scan logs
  • Run tests
  • Sanity check reasoning
## Footnote These actions are crucial for maintaining code quality and ensuring correctness.
86
True or false: You should assume that **AI = correct**.
FALSE ## Footnote Stay in review/guardrail mode to avoid incorrect assumptions.
87
Why is it important to stop assuming **AI = correct**?
So that you naturally fall into review/guardrail mode ## Footnote This approach aligns with reality: agents excel at brute-force tasks rather than judgment.
88
What is the **main principle** behind limiting the blast radius of every task?
To ensure safety by constraining potential damage ## Footnote This principle helps in managing risks associated with changes in code or services.
89
When requesting a change, what should you specify regarding the **scope**?
Specify the exact files or services to modify ## Footnote This prevents unintended changes to other parts of the system.
90
What is the **max-change rule** intended to do?
Reject large diffs to maintain safety ## Footnote If the proposed changes are too extensive, it prompts a reevaluation of the task.
91
True or false: You should let the agent modify files outside the scope you specified.
FALSE ## Footnote The guideline emphasizes limiting changes to specific files or services.
92
What should you do if the **diff** is huge?
Narrow the task and rerun ## Footnote This approach helps in managing the complexity and potential risks of changes.
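The max-change rule can be approximated by measuring a unified diff before accepting it. In this sketch the thresholds (3 files, 200 changed lines) and the function names are hypothetical, and the parsing is simplified: it counts `+++` headers as files, so a deleted file shows up as `/dev/null`, and a removed line whose content begins with `-- ` would be misread as a header.

```python
def diff_size(unified_diff: str) -> dict:
    """Count changed files and added/removed lines in a unified diff."""
    files, added, removed = set(), 0, 0
    for line in unified_diff.splitlines():
        if line.startswith("+++ "):
            files.add(line[4:].strip())  # new-side file header
        elif line.startswith("--- "):
            continue  # old-side file header, not a removed line
        elif line.startswith("+"):
            added += 1
        elif line.startswith("-"):
            removed += 1
    return {"files": len(files), "added": added, "removed": removed}


def within_limit(unified_diff: str, max_files: int = 3, max_lines: int = 200) -> bool:
    """True if the diff stays under the (hypothetical) file and line thresholds."""
    size = diff_size(unified_diff)
    return size["files"] <= max_files and size["added"] + size["removed"] <= max_lines
```

When `within_limit` returns False, the cards' advice applies: narrow the task and rerun rather than reviewing a giant diff.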
93
What should you do when the agent says, **“I did X”**?
Ask them to **show evidence** ## Footnote This ensures accountability and verifies the actions taken.
94
What is an example of a request for evidence when an agent claims to have made changes?
Request a **unified diff** of all files changed ## Footnote This provides a clear comparison of modifications.
95
When verifying a test command, what should you ask for?
The **pytest command** and its **exit code + summary** ## Footnote This helps confirm whether the tests were successful.
96
What command can you request to check the status of a service after a restart?
Ask for the **systemctl status** output ## Footnote This provides information on the current state of the service.
97
What should you print to verify configuration changes?
The **first 20 lines** of the updated config file ## Footnote This allows for a quick review of the changes made.
98
True or false: You should mentally mark something as **“done”** without evidence.
FALSE ## Footnote Always require evidence to ensure tasks are completed correctly.
99
What does requiring evidence help to expose?
  • Silent failures
  • Hallucinated actions
  • Invented actions
## Footnote This promotes transparency and accountability in actions taken.
100
Why is it important to force the agent into a more **honest, tool-driven posture**?
It encourages **transparency** and **accountability** ## Footnote This approach minimizes the risk of misinformation.
101
What should you do instead of saying: **Migrate the whole codebase**?
Migrate this module from library A to B ## Footnote This approach emphasizes making tasks smaller and more manageable.
102
What is a better alternative to saying: **Rewrite the entire pipeline**?
Add a step to run pytest tests/fast/ in CI ## Footnote This keeps tasks focused and reduces complexity.
103
Instead of saying: **Refactor the whole service**, what should you say?
Refactor this function into smaller helpers ## Footnote This allows for easier review and implementation.
104
Why does making tasks tiny keep you safe?
  • Less room for misinterpretation
  • Easier to review
  • Timeout and non-determinism matter less
## Footnote Smaller tasks are more repeatable and manageable.
105
What should you do if the agent references a **non-existent config file/path/command**?
Correct it immediately: tell the agent the item does not exist and to "use X instead" ## Footnote For example, replace an invented yarn test with npm run test.
106
What is the purpose of the environment description at the top of the prompt?
To provide context for the structure/function of the environment (containment tree) ## Footnote It helps clarify the tools and structure being used.
107
What is the strict rule regarding commands, tools, and paths?
You may only use the commands, tools, and paths I mention ## Footnote This rule ensures clarity and prevents confusion.
108
What are the benefits of restricting the agent to the commands, tools, and paths you mention?
Less time wasted chasing non-existent things, and hallucinations become obvious ## Footnote This helps in identifying errors quickly.
109
Fill in the blank: The project is structured to make hallucinations _______.
cheap to detect ## Footnote This is a key goal of the project setup.
110
What is the purpose of running quick validation tests after an agent finishes?
To provide an independent signal and catch subtle mistakes ## Footnote Running tests again is a cheap way to ensure safety and accuracy.
111
True or false: It is unnecessary to run tests again if the agent has already done so.
FALSE ## Footnote Re-running tests helps to catch any issues that may have been overlooked.
112
What should you do for config / YAML files after the agent finishes?
Run a linter or syntax check ## Footnote This ensures that the configuration files are correctly formatted.
113
What is a recommended action for documentation after the agent finishes?
Click through generated links ## Footnote This helps verify that the links are functional and lead to the correct destinations.
114
What is the benefit of catching subtle mistakes during validation?
It protects you from potential errors that could arise later ## Footnote Subtle mistakes can lead to significant issues if not addressed.
115
Fill in the blank: Running tests again is a _______ way to ensure safety and accuracy.
cheap ## Footnote This emphasizes the cost-effectiveness of re-validation.
116
What is the purpose of using **version control**?
To track changes and manage modifications in code or documents ## Footnote Version control allows for safe experimentation and easy reversion to previous states.
117
What should you do if an agent’s change looks **messy**?
  • Use git diff to review changes
  • Restore or reset to a previous state
  • Try again with smaller changes
## Footnote This process helps maintain clean and manageable edits.
118
True or false: You should feel **sentimental** about an agent’s edits when using version control.
FALSE ## Footnote Discarding messy edits is encouraged to maintain quality.
119
What does frequent small **commits** or **stashes** help you achieve?
  • Keeps changes manageable
  • Allows for easy reversion
  • Reduces the feeling of being trapped by bad edits
## Footnote Frequent commits encourage a more organized workflow.
120
What role does the user play when using version control?
Final gatekeeper ## Footnote The user is responsible for approving changes and ensuring quality.
121
What should you keep out of the loop to ensure safety?
Secrets and sensitive data ## Footnote This includes production secrets, private keys, and customer PII.
122
What is the recommended action if an agent needs a secret?
Abstract it ## Footnote For example, say 'Assume there is an environment variable PAYMENT_API_KEY.'
123
What should you do before sharing logs containing sensitive info?
Redact any logs ## Footnote This prevents accidental leaks of sensitive information.
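A rough sketch of log redaction, assuming secrets appear as `name=value` or `name: value` pairs and that email addresses are the PII of concern. The patterns and the name `redact` are illustrative and will miss many real cases, so treat this as a first pass before a manual read, not a guarantee.

```python
import re

# Illustrative patterns; real logs need patterns matched to the secrets and PII they can contain.
REDACTIONS = [
    # Key names ending a variable such as PAYMENT_API_KEY, followed by = or : and a value.
    (re.compile(r"(?i)(api[_-]?key|token|password|secret)\s*[=:]\s*\S+"), r"\1=[REDACTED]"),
    # Simple email shape: local part, @, domain with at least one dot.
    (re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"), "[REDACTED_EMAIL]"),
]


def redact(text: str) -> str:
    """Replace secret-looking values and email addresses before sharing a log."""
    for pattern, replacement in REDACTIONS:
        text = pattern.sub(replacement, text)
    return text
```

Note that the separator is normalized to `=` in the output, so `password: value` becomes `password=[REDACTED]`.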
124
True or false: Keeping secrets and sensitive data out of the loop helps avoid security and compliance issues.
TRUE ## Footnote It reduces the risk of obvious security/compliance landmines.
125
What does keeping secrets and sensitive data out of the loop help with?
Lowering mental overhead ## Footnote It alleviates concerns about accidentally leaking information.
126
What should you do if you catch yourself **arguing over the same function** three times?
Set a policy: If I haven’t made real progress with the agent in 10–15 minutes on this micro-task, I’ll do it manually and move on. ## Footnote This policy helps to prevent endless thrashing and frustration.
127
If you find yourself **fixing the same issue** over and over, what action should you take?
Set a policy: If I haven’t made real progress with the agent in 10–15 minutes on this micro-task, I’ll do it manually and move on. ## Footnote This approach helps to avoid shipping weird half-broken changes out of annoyance.
128
What should you do if you find yourself **cleaning up after** the agent more than it helps?
Set a policy: If I haven’t made real progress with the agent in 10–15 minutes on this micro-task, I’ll do it manually and move on. ## Footnote This reminder reinforces that the agent is a tool, not an obligation.
129
True or false: Setting a policy to move on after 10–15 minutes of no progress with the agent helps prevent frustration.
TRUE ## Footnote This policy is designed to keep you from getting stuck in unproductive cycles.
130
What is the **last 10%** of a project typically considered to be?
The final polishing and optimization phase ## Footnote This phase focuses on refining the work done to ensure quality and performance.
131
In the context of project completion, what percentage of work is assumed to be done by the agent?
80–90% ## Footnote The agent is expected to deliver a substantial portion of the project, leaving the final adjustments to be made.
132
What are some tasks you are responsible for in the final phase of a project? List at least three.
  • Polish edge cases
  • Optimize performance
  • Align naming and style
  • Sanity-check correctness
## Footnote These tasks ensure that the project meets quality standards and functions as intended.
133
True or false: You should wait for perfection before starting the final phase of a project.
FALSE ## Footnote The approach emphasizes planning for a human finish rather than waiting for an ideal outcome.
134
What should you explicitly ask the agent regarding potential issues in a project?
List 3–5 **edge cases** this change might not handle well ## Footnote Identifying edge cases helps in understanding potential shortcomings in the project.
135
What is the main goal of the final phase of a project?
To ensure quality and performance ## Footnote This phase is crucial for addressing any remaining issues and enhancing the overall product.