Inner Developer Loop (seconds/minutes) Flashcards

Question

What is the purpose of self-validation in the output process?

Answer 1

To ensure the JSON matches the schema and fix any issues before responding ## Footnote This step helps maintain the integrity and reliability of the output.
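A minimal sketch of such a self-check, using only the Python standard library. The schema below (`status`, `files_changed`, `exit_code`) is a hypothetical example and not taken from the cards; a real project would likely use a full JSON Schema validator instead.

```python
import json

# Hypothetical schema (an assumption for illustration): required key -> expected type.
SCHEMA = {"status": str, "files_changed": list, "exit_code": int}


def validate_output(raw: str, schema: dict = SCHEMA) -> list[str]:
    """Return a list of problems; an empty list means the output passed."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg} at line {exc.lineno}"]
    if not isinstance(data, dict):
        return ["top-level value must be an object"]
    problems = []
    for key, expected in schema.items():
        if key not in data:
            problems.append(f"missing key: {key}")
        elif not isinstance(data[key], expected):
            actual = type(data[key]).__name__
            problems.append(f"{key}: expected {expected.__name__}, got {actual}")
    return problems
```

Returning the list of problems, rather than a bare True/False, gives the agent something concrete to fix before it responds.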

Answer 2

python -m py_compile / yamllint ## Footnote These commands check for syntax errors: py_compile in Python source files, yamllint in YAML files.
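The Python half of this check can also be run from a script. The sketch below wraps the standard-library `py_compile` module; the helper name `syntax_ok` is an assumption made for illustration.

```python
import py_compile


def syntax_ok(path: str) -> tuple[bool, str]:
    """Compile one Python file; return (ok, error text from the compiler)."""
    try:
        # doraise=True turns a syntax error into an exception instead of a printed message.
        py_compile.compile(path, doraise=True)
        return True, ""
    except py_compile.PyCompileError as exc:
        return False, exc.msg
```

Passing the error text through unchanged matches the "actual validator output" idea: the reviewer sees what the compiler said, not a paraphrase.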

Answer 3

Explicit schemas + self-checks ## Footnote This approach reduces the risk of format errors and ensures consistency.

Answer 4

You see actual validator output, not just 'it’s valid' ## Footnote This transparency helps in verifying the correctness of the output.

Answer 5

A consistent shape ## Footnote This helps in maintaining uniformity in the output format.

Answer 6

Never accept 'I did X' without a lightweight proof ## Footnote Train yourself and the agent into an evidence culture.

Answer 7

Evidence ## Footnote This is a hard rule to ensure accountability.

Answer 8

* Show the full updated file or a unified diff * List the exact files changed and show their contents ## Footnote This ensures transparency in modifications.
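One way to produce a unified diff as evidence, sketched with the standard-library `difflib` module. The `a/` and `b/` path prefixes imitate git's convention and are an assumption, as is the helper name `unified`.

```python
import difflib


def unified(before: str, after: str, path: str) -> str:
    """Return a unified diff between two versions of one file's text."""
    diff = difflib.unified_diff(
        before.splitlines(keepends=True),
        after.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    )
    return "".join(diff)


if __name__ == "__main__":
    print(unified("x = 1\n", "x = 2\n", "src/main.py"), end="")
```

An empty string means the two versions are identical, which is itself useful evidence when an agent claims to have changed a file.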

Answer 9

* Paste the test command * Include its exit code + summary ## Footnote Example: a short pytest summary, go test ./... exit status.

Answer 10

* Show the command you ran * Last 20 lines of output ## Footnote This helps verify the command's execution and results.
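The evidence described above (the command, its exit code, and the last 20 lines of output) can be captured in one step. This is a sketch using `subprocess`; the function name `run_with_evidence` and the shape of the returned dictionary are assumptions.

```python
import shlex
import subprocess
import sys


def run_with_evidence(cmd: list[str], tail: int = 20) -> dict:
    """Run cmd; return the command line, exit code, and last `tail` output lines."""
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # interleave stderr so error messages stay verbatim
        text=True,
    )
    lines = proc.stdout.splitlines()
    return {
        "command": shlex.join(cmd),
        "exit_code": proc.returncode,
        "tail": lines[-tail:],
    }


if __name__ == "__main__":
    evidence = run_with_evidence([sys.executable, "-c", "print('hello')"])
    print(evidence)
```

Because the exit code is recorded separately from the output, a truncated or failed run cannot be mistaken for a success.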

Answer 11

The error message verbatim ## Footnote This provides clarity on what went wrong.

Answer 12

Show the log snippet that proves it ## Footnote Example: last lines of the journal or status output.

Answer 13

Mitigates over-trust in agent claims ## Footnote Everything is backed by a log, diff, or output.

Answer 14

Hallucinated tools/paths ## Footnote You see the failure instead of a fake success.

Answer 15

Truncated runs are obvious in the partial logs ## Footnote This ensures that incomplete processes are identified.

Answer 16

Make it easy for the agent to know what actually exists, and forbid invention of tools/paths ## Footnote This stage emphasizes clarity in available commands and directories.

Answer 17

* [List of available commands not provided in the text] ## Footnote The commands should be explicitly listed in the environment brief.

Answer 18

* src/ * tests/ * config/ ## Footnote These directories are essential for organizing the project structure.

Answer 19

src/main.py ## Footnote This file serves as the starting point for the application.

Answer 20

docker-compose ## Footnote The environment relies on just and make instead.

Answer 21

ASK; do not invent new commands or directories ## Footnote This encourages communication and prevents errors.
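The "ASK, do not invent" rule can be enforced mechanically with an allowlist. In this sketch the directory list comes from the cards, but the command set is an assumption, since the cards do not list the available commands.

```python
import shlex

# Assumption: the cards do not enumerate the allowed commands, so this set is illustrative.
ALLOWED_COMMANDS = {"just", "make", "pytest"}
# Directories named in the cards.
ALLOWED_DIRS = ("src/", "tests/", "config/")


def check_command(command_line: str) -> str:
    """Return 'ok', or an 'ASK: ...' message if the program is not allowlisted."""
    tokens = shlex.split(command_line)
    if not tokens:
        return "ASK: empty command"
    if tokens[0] not in ALLOWED_COMMANDS:
        return f"ASK: '{tokens[0]}' is not in the environment brief"
    return "ok"


def check_path(path: str) -> str:
    """Return 'ok', or an 'ASK: ...' message if the path is outside known directories."""
    if path.startswith(ALLOWED_DIRS):
        return "ok"
    return f"ASK: '{path}' is outside {', '.join(ALLOWED_DIRS)}"
```

A check like this makes an invented tool or path fail loudly at the first token instead of surfacing later as a confusing error.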

Answer 22

It is corrected immediately ## Footnote This ensures adherence to the established commands and paths.

Answer 23

npm run test ## Footnote This is an example of correcting an invented command.

Answer 24

/config/ ## Footnote This specifies the correct location for configuration files.

Answer 25

* Hallucinated tools * Over-trust in claims * Literal/improvisational errors ## Footnote This clause helps tighten the agent's internal mental model of the stack.

Answer 26

Make inner-loop runs repeatable enough that you can replay them and understand what changed ## Footnote This stage emphasizes the importance of logging and summarizing actions taken during the process.

Answer 27

* Task description * Prompt (or summary) * Files touched * Tests run ## Footnote Keeping detailed logs helps in tracking changes and understanding the context of actions taken.

Answer 28

To log prompts and responses ## Footnote This log serves as a record of interactions and decisions made during the process.

Answer 29

A short 'run summary' from the agent ## Footnote This summary should include key details about the run for better understanding and documentation.

Answer 30

* Files changed * Commands run * Tests executed and their status ## Footnote This information provides a clear overview of what occurred during the run.
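A run summary with these fields is easy to keep in a fixed shape. The sketch below uses a dataclass serialized as one JSON line per run; the class name, field names, and the extra `remaining` field are assumptions.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class RunSummary:
    """One record per agent run (hypothetical shape, for illustration)."""
    task: str
    files_changed: list[str] = field(default_factory=list)
    commands_run: list[str] = field(default_factory=list)
    tests: dict[str, str] = field(default_factory=dict)  # test command -> status
    remaining: list[str] = field(default_factory=list)

    def to_json_line(self) -> str:
        """Serialize with sorted keys so successive runs diff cleanly."""
        return json.dumps(asdict(self), sort_keys=True)


if __name__ == "__main__":
    summary = RunSummary(
        task="Refactor this function into smaller helpers",
        files_changed=["src/main.py"],
        commands_run=["pytest tests/fast/"],
        tests={"pytest tests/fast/": "passed"},
    )
    print(summary.to_json_line())
```

Appending each line to a log file gives a replayable record of what changed between runs.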

Answer 31

Re-use the same structure ## Footnote Consistency in structure helps maintain clarity and repeatability in the process.

Answer 32

Behavior ## Footnote Pinning behavior ensures that the core functionality remains unchanged throughout the refactoring process.

Answer 33

Constraints ## Footnote Reiterating constraints helps maintain focus on the goals and limitations of the task.

Answer 34

Having a record of 'what' + 'why' and consistent constraints ## Footnote This approach helps in understanding the rationale behind decisions and actions taken.

Answer 35

An explicit list of tests/commands ## Footnote This anchors what actually happened during the process, reducing reliance on assumptions.

Answer 36

What remains ## Footnote Summaries provide clarity on outstanding tasks and areas that need further attention.

Answer 37

Proactively avoid local timeouts by working in small batches ## Footnote This approach helps manage long-running tasks effectively.

Answer 38

Batch limits ## Footnote For example, 'Process at most 3 files in this run.'

Answer 39

Progress report ## Footnote This includes listing completed items, pending items, and any blockers.

Answer 40

Simple checklist ## Footnote This allows for easy tracking of updates and tasks that remain.
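Batch limits and the progress checklist can be combined in a few lines. This is a sketch: the default limit of 3 follows the example in the cards, while the function names and checklist format are assumptions.

```python
def next_batch(pending: list[str], limit: int = 3) -> tuple[list[str], list[str]]:
    """Split pending work into (this run's batch, what is left for later runs)."""
    return pending[:limit], pending[limit:]


def progress_report(done: list[str], pending: list[str], blockers: list[str]) -> str:
    """Render completed items, pending items, and blockers as a plain checklist."""
    lines = [f"[x] {item}" for item in done]
    lines += [f"[ ] {item}" for item in pending]
    lines += [f"BLOCKER: {item}" for item in blockers]
    return "\n".join(lines)


if __name__ == "__main__":
    batch, rest = next_batch(["a.py", "b.py", "c.py", "d.py"])
    print(progress_report(done=batch, pending=rest, blockers=[]))
```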

Answer 41

TRUE ## Footnote Chunked, checkpointed runs are easier to reproduce and reason about.

Answer 42

Over-trust ## Footnote Progress reports make partial completion visible.

Answer 43

Accept that the agent does 80–90%; you handle the final fit-and-finish ## Footnote This stage emphasizes the importance of human oversight in the final steps of a process.

Answer 44

* List edge cases your change might not cover * List potential risks or performance impacts of this change ## Footnote Identifying these issues helps mitigate risks before deployment.

Answer 45

* Match business intent * Keep API compatibility * Respect security constraints * Avoid obvious performance cliffs ## Footnote This checklist ensures that changes align with overall goals and standards.

Answer 46

Delegate them as tiny follow-up tasks ## Footnote Examples include adding tests for edge cases or optimizing code loops.

Answer 47

Merge, commit, or deploy only after your sanity check ## Footnote This ensures that you maintain control over the final output.

Answer 48

* Inner loop feels slower & last 10% gap * Literal/improvisational corrections * Over-trust in final judgment ## Footnote These points highlight the challenges and adjustments needed in the final stages of the process.

Answer 49

Work in a feature branch, never main ## Footnote This rule emphasizes the importance of using a controlled environment for development.

Answer 50

A scratch DB / test environment ## Footnote This ensures that changes are made in a safe and controlled manner.

Answer 51

Local files ## Footnote This practice helps prevent unintended changes to production settings.

Answer 52

Propose the changes, but don’t apply them. I’ll run them manually in staging. ## Footnote This approach maintains control over changes to infrastructure and data.

Answer 53

* Bad edits are git reset-able * Mistakes stay in a controlled environment * You can be aggressive with the agent without risking prod ## Footnote These factors contribute to minimizing risks during development.

Answer 54

Do not execute commands. Instead, output a numbered list of shell commands / SQL statements you would run. I will review and run them myself. ## Footnote This approach prevents accidental execution of harmful commands.

Answer 55

* A one-line explanation * Whether it’s safe to re-run ## Footnote This ensures clarity and safety in command execution.

Answer 56

To avoid surprise commands like rm -rf or DROP TABLE ## Footnote Reviewing commands allows for oversight and prevents unintended consequences.
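A proposed-commands plan can be rendered for review with destructive steps flagged. This is a sketch; the regular expression is a rough heuristic covering only the two examples named in the cards (`rm -rf` and `DROP TABLE`), and all names are assumptions.

```python
import re
from dataclasses import dataclass

# Heuristic only: matches `rm -r`/`rm -rf`-style flags and `DROP TABLE`; not a full safety check.
DESTRUCTIVE = re.compile(r"\brm\s+-[rf]+\b|\bdrop\s+table\b", re.IGNORECASE)


@dataclass
class PlannedCommand:
    command: str
    explanation: str
    safe_to_rerun: bool


def is_destructive(command: str) -> bool:
    """Return True if the command matches one of the known destructive patterns."""
    return DESTRUCTIVE.search(command) is not None


def render_plan(plan: list[PlannedCommand]) -> str:
    """Render a numbered list for human review; nothing is executed here."""
    lines = []
    for n, step in enumerate(plan, start=1):
        flag = " [DESTRUCTIVE: review carefully]" if is_destructive(step.command) else ""
        rerun = "safe to re-run" if step.safe_to_rerun else "NOT safe to re-run"
        lines.append(f"{n}. {step.command}{flag}")
        lines.append(f"   {step.explanation} ({rerun})")
    return "\n".join(lines)
```

The function only formats text, so the human remains the one who decides whether and where each command runs.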

Answer 57

You get the automation (it writes commands) without blind execution ## Footnote This maintains efficiency while ensuring safety.

Answer 58

Hyper-fast junior engineer, not a senior architect ## Footnote This mindset emphasizes speed and efficiency in tasks without assuming correctness.

Answer 59

* Draft code * Suggest commands * Propose refactors ## Footnote These capabilities highlight the agent's role in supporting development tasks.

Answer 60

* Set the goal * Set constraints * Approve changes ## Footnote These responsibilities ensure that the agent maintains focus and direction in development.

Answer 61

* Review diffs * Scan logs * Run tests * Sanity check reasoning ## Footnote These actions are crucial for maintaining code quality and ensuring correctness.

Answer 62

FALSE ## Footnote You must remain in review/guardrail mode rather than assume the agent's output is correct.

Answer 63

To naturally fall into review/guardrail mode ## Footnote This approach aligns with reality, recognizing that agents excel in brute-force tasks rather than judgment.

Answer 64

To ensure safety by constraining potential damage ## Footnote This principle helps in managing risks associated with changes in code or services.

Answer 65

Specify the exact files or services to modify ## Footnote This prevents unintended changes to other parts of the system.

Answer 66

Reject large diffs to maintain safety ## Footnote If the proposed changes are too extensive, it prompts a reevaluation of the task.
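A size gate on diffs can be approximated by counting added and removed lines in a unified diff. This is a sketch: the threshold of 50 lines is an arbitrary assumption, and the header check is a heuristic that would also skip a changed line whose content itself begins with `--` or `++`.

```python
def changed_line_count(diff_text: str) -> int:
    """Count added and removed lines in a unified diff, skipping file headers."""
    count = 0
    for line in diff_text.splitlines():
        if line.startswith(("+++", "---")):
            continue  # file header lines (heuristic, see note above)
        if line.startswith(("+", "-")):
            count += 1
    return count


def accept_diff(diff_text: str, max_changed: int = 50) -> bool:
    """Return False when the diff is large enough that the task should be narrowed."""
    return changed_line_count(diff_text) <= max_changed
```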

Answer 67

FALSE ## Footnote The guideline emphasizes limiting changes to specific files or services.

Answer 68

Narrow the task and rerun ## Footnote This approach helps in managing the complexity and potential risks of changes.

Answer 69

Ask them to **show evidence** ## Footnote This ensures accountability and verifies the actions taken.

Answer 70

Request a **unified diff** of all files changed ## Footnote This provides a clear comparison of modifications.

Answer 71

The **pytest command** and its **exit code + summary** ## Footnote This helps confirm whether the tests were successful.

Answer 72

Ask for the **systemctl status** output ## Footnote This provides information on the current state of the service.

Answer 73

The **first 20 lines** of the updated config file ## Footnote This allows for a quick review of the changes made.

Answer 74

FALSE ## Footnote Always require evidence to ensure tasks are completed correctly.

Answer 75

* Silent failures * Hallucinated actions * Invented actions ## Footnote This promotes transparency and accountability in actions taken.

Answer 76

It encourages **transparency** and **accountability** ## Footnote This approach minimizes the risk of misinformation.

Answer 77

Migrate this module from library A to B ## Footnote This approach emphasizes making tasks smaller and more manageable.

Answer 78

Add a step to run pytest tests/fast/ in CI ## Footnote This keeps tasks focused and reduces complexity.

Answer 79

Refactor this function into smaller helpers ## Footnote This allows for easier review and implementation.

Answer 80

* Less room for misinterpretation * Easier to review * Timeout and non-determinism matter less ## Footnote Smaller tasks are more repeatable and manageable.

Answer 81

Use X instead ## Footnote This is the suggested approach to handle such situations.

Answer 82

To provide context for the structure/function of the environment (containment tree) ## Footnote It helps clarify the tools and structure being used.

Answer 83

You may only use the commands, tools, and paths I mention ## Footnote This rule ensures clarity and prevents confusion.

Answer 84

Makes hallucinations obvious ## Footnote This helps in identifying errors quickly.

Answer 85

cheap to detect ## Footnote This is a key goal of the project setup.

Answer 86

To provide an independent signal and catch subtle mistakes ## Footnote Running tests again is a cheap way to ensure safety and accuracy.

Answer 87

FALSE ## Footnote Re-running tests helps to catch any issues that may have been overlooked.

Answer 88

Run a linter or syntax check ## Footnote This ensures that the configuration files are correctly formatted.

Answer 89

Click through generated links ## Footnote This helps verify that the links are functional and lead to the correct destinations.

Answer 90

It protects you from potential errors that could arise later ## Footnote Subtle mistakes can lead to significant issues if not addressed.

Answer 91

cheap ## Footnote This emphasizes the cost-effectiveness of re-validation.

Answer 92

To track changes and manage modifications in code or documents ## Footnote Version control allows for safe experimentation and easy reversion to previous states.

Answer 93

* Use git diff to review changes * Restore or reset to a previous state * Try again with smaller changes ## Footnote This process helps maintain clean and manageable edits.

Answer 94

FALSE ## Footnote Discarding messy edits is encouraged to maintain quality.

Answer 95

* Keeps changes manageable * Allows for easy reversion * Reduces the feeling of being trapped by bad edits ## Footnote Frequent commits encourage a more organized workflow.

Answer 96

Final gatekeeper ## Footnote The user is responsible for approving changes and ensuring quality.

Answer 97

Secrets and sensitive data ## Footnote This includes production secrets, private keys, and customer PII.

Answer 98

Abstract it ## Footnote For example, say 'Assume there is an environment variable PAYMENT_API_KEY.'

Answer 99

Redact any logs ## Footnote This prevents accidental leaks of sensitive information.
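Log redaction can be partly automated before anything is pasted to an agent. The sketch below masks `NAME=value` pairs whose name contains KEY, TOKEN, SECRET, or PASSWORD, plus email addresses; both patterns are illustrative assumptions and will not catch every secret or every kind of PII.

```python
import re

# Assumption: secrets appear as NAME=value where NAME contains one of these words.
SECRET_ASSIGNMENT = re.compile(
    r"\b([A-Z0-9_]*(?:KEY|TOKEN|SECRET|PASSWORD)[A-Z0-9_]*)\s*=\s*\S+",
    re.IGNORECASE,
)
# Rough email matcher, used here as a stand-in for customer PII.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def redact(text: str) -> str:
    """Mask secret-looking assignments and email addresses in a log snippet."""
    text = SECRET_ASSIGNMENT.sub(r"\1=[REDACTED]", text)
    return EMAIL.sub("[REDACTED_EMAIL]", text)


if __name__ == "__main__":
    print(redact("PAYMENT_API_KEY=example-value sent to alice@example.com"))
```

A pattern-based pass like this is a first filter, not a guarantee, so a quick manual scan of the redacted snippet is still worthwhile.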

Answer 100

TRUE ## Footnote It reduces the risk of obvious security/compliance landmines.

Answer 101

Lowering mental overhead ## Footnote It alleviates concerns about accidentally leaking information.

Answer 102

Set a policy: If I haven’t made real progress with the agent in 10–15 minutes on this micro-task, I’ll do it manually and move on. ## Footnote This policy helps to prevent endless thrashing and frustration.

Answer 103

Set a policy: If I haven’t made real progress with the agent in 10–15 minutes on this micro-task, I’ll do it manually and move on. ## Footnote This approach helps to avoid shipping weird half-broken changes out of annoyance.

Answer 104

Set a policy: If I haven’t made real progress with the agent in 10–15 minutes on this micro-task, I’ll do it manually and move on. ## Footnote This reminder reinforces that the agent is a tool, not an obligation.

Answer 105

TRUE ## Footnote This policy is designed to keep you from getting stuck in unproductive cycles.

Answer 106

The final polishing and optimization phase ## Footnote This phase focuses on refining the work done to ensure quality and performance.

Answer 107

80–90% ## Footnote The agent is expected to deliver a substantial portion of the project, leaving the final adjustments to be made.

Answer 108

* Polish edge cases * Optimize performance * Align naming and style * Sanity-check correctness ## Footnote These tasks ensure that the project meets quality standards and functions as intended.

Answer 109

FALSE ## Footnote The approach emphasizes planning for a human finish rather than waiting for an ideal outcome.

Answer 110

List 3–5 **edge cases** this change might not handle well ## Footnote Identifying edge cases helps in understanding potential shortcomings in the project.

Answer 111

To ensure quality and performance ## Footnote This phase is crucial for addressing any remaining issues and enhancing the overall product.

(135 cards)