What is the core purpose of the extract_invoice_data tool?
To take messy invoice text and convert it into a clean, consistent JSON structure using a fixed schema.
Why does the invoice extractor use a fixed schema?
To ensure all invoices are processed consistently and predictably, regardless of format.
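As a rough illustration of what a fixed schema buys, the sketch below forces every extracted record into the same set of keys. The field names other than the invoice number, and the conform_to_schema helper, are assumptions made for this example and do not come from the original tool.

```python
# Hypothetical fixed schema; only invoice_number is known to be required,
# the other field names are illustrative assumptions.
INVOICE_SCHEMA = {
    "type": "object",
    "required": ["invoice_number"],
    "properties": {
        "invoice_number": {"type": "string"},
        "date": {"type": "string"},
        "total_amount": {"type": "number"},
        "vendor": {"type": "string"},
    },
}


def conform_to_schema(raw: dict) -> dict:
    """Keep only schema fields and fill missing optional ones with None."""
    missing = [key for key in INVOICE_SCHEMA["required"] if not raw.get(key)]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    return {key: raw.get(key) for key in INVOICE_SCHEMA["properties"]}


# Two differently shaped inputs end up with an identical set of keys.
a = conform_to_schema({"invoice_number": "INV-1", "total_amount": 120.5, "notes": "x"})
b = conform_to_schema({"invoice_number": "INV-2", "vendor": "Acme"})
```

Whatever the source invoice looked like, downstream code can rely on the same keys being present.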
What is the role of the store_invoice tool?
It saves the extracted invoice data into a storage dictionary, indexed by invoice number.
Why must invoice numbers be required in the schema?
Because the storage tool uses invoice numbers as unique keys; without them, it cannot save or update invoices.
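A minimal sketch of such a storage tool, assuming the agent's context is a plain dict and the invoices live under an "invoices" key (both are assumptions, as is the shape of the return value):

```python
def store_invoice(context: dict, invoice_data: dict) -> dict:
    """Save an invoice in the context, keyed by its invoice number."""
    invoice_number = invoice_data.get("invoice_number")
    if not invoice_number:
        # Without a key there is nothing to index the record by.
        raise ValueError("invoice_number is required to store an invoice")

    storage = context.setdefault("invoices", {})
    action = "updated" if invoice_number in storage else "created"
    storage[invoice_number] = invoice_data
    return {"status": "success", "action": action, "invoice_number": invoice_number}


context = {}
first = store_invoice(context, {"invoice_number": "INV-1", "total_amount": 100.0})
second = store_invoice(context, {"invoice_number": "INV-1", "total_amount": 120.0})
```

Storing the same invoice number twice overwrites the earlier record, which is why the number works as both the save key and the update key.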
What are the three major tasks the agent must perform for each invoice?
Extract data → Store data → Confirm success.
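The three steps can be sketched as one small function. Here a regular expression stands in for the LLM-backed extractor so the example runs on its own; process_invoice and the confirmation wording are hypothetical.

```python
import re


def extract_invoice_data(text: str) -> dict:
    # Stand-in for the LLM-backed extractor: a regex pulls out only the invoice number.
    match = re.search(r"Invoice\s*#?\s*([A-Z0-9-]+)", text)
    return {"invoice_number": match.group(1) if match else None}


def process_invoice(context: dict, text: str) -> str:
    data = extract_invoice_data(text)                     # 1. extract
    number = data["invoice_number"]
    if not number:
        return "Could not find an invoice number; nothing stored."
    context.setdefault("invoices", {})[number] = data     # 2. store
    return f"Stored invoice {number}."                    # 3. confirm


context = {}
message = process_invoice(context, "Invoice #INV-42 from Acme Corp, total $300")
```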
How does self-prompting help in extraction?
The tool sends its own carefully crafted prompt to the LLM, enabling specialized invoice understanding without cluttering the agent’s main reasoning.
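One way to picture self-prompting: the tool owns its prompt template and calls the model itself, so the agent only ever sees the tool's name and its result. In this sketch the prompt wording, the field names, and the llm callable (any function that maps a prompt string to a response string) are assumptions; fake_llm replaces a real model call so the example runs offline.

```python
import json
from typing import Callable

# Hypothetical extraction prompt, kept inside the tool rather than the agent's system prompt.
EXTRACTION_PROMPT = """You are an invoice extraction specialist.
Return only a JSON object with the keys: invoice_number, date, total_amount.
Use null for any value that is not present in the text.

Invoice text:
{invoice_text}
"""


def extract_invoice_data(invoice_text: str, llm: Callable[[str], str]) -> dict:
    """Build the specialized prompt, call the model, and parse its JSON reply."""
    prompt = EXTRACTION_PROMPT.format(invoice_text=invoice_text)
    return json.loads(llm(prompt))


seen_prompts = []


def fake_llm(prompt: str) -> str:
    # Records the prompt and returns a canned reply in place of a real model.
    seen_prompts.append(prompt)
    return json.dumps({"invoice_number": "INV-7", "date": None, "total_amount": 99.0})


result = extract_invoice_data("Invoice INV-7, total due 99.00", fake_llm)
```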
Why is tool modularity important?
It hides complexity inside tools, preventing the core agent from becoming overloaded or fragile.
What is “horizontal scaling” of agents?
Expanding agent capabilities by adding tools rather than expanding the main system prompt.
Why is this architecture easier to maintain?
Tools can be updated or swapped independently without touching core agent logic.
How do tags (e.g., ["invoices", "document_processing"]) help?
They categorize each tool, which helps the agent understand when a particular tool applies (here, to invoice and document-processing tasks).
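A sketch of how tags might be attached and queried, assuming a simple in-process registry. The register_tool decorator, the TOOLS dict, and the schedule_meeting tool are hypothetical names for this example and are not taken from a specific framework.

```python
TOOLS = {}


def register_tool(tags):
    """Record a function in the registry along with its tags and docstring."""
    def decorator(func):
        TOOLS[func.__name__] = {
            "function": func,
            "tags": set(tags),
            "description": func.__doc__,
        }
        return func
    return decorator


@register_tool(tags=["invoices", "document_processing"])
def extract_invoice_data(text):
    """Extract structured data from invoice text."""
    return {}


@register_tool(tags=["calendar"])
def schedule_meeting(title):
    """Hypothetical unrelated tool, included to show filtering."""
    return title


def tools_with_tag(tag):
    # Names of all registered tools carrying the given tag, sorted for stable output.
    return sorted(name for name, tool in TOOLS.items() if tag in tool["tags"])


invoice_tools = tools_with_tag("invoices")
```

Adding a capability then means registering one more function, which is the horizontal scaling described above: the registry grows while the main system prompt stays the same.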
Why is the extraction prompt so detailed?
Rich guidance improves accuracy and helps avoid hallucinated fields.
What type of storage mechanism is used in this example?
A simple Python dictionary stored in the agent’s context; it can be replaced with a real database.
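To make the swap concrete, here is a sketch of two interchangeable stores with the same save/get interface: one backed by a dictionary, one by SQLite from the Python standard library. The class names, method names, and table layout are assumptions for illustration only.

```python
import json
import sqlite3


class DictInvoiceStore:
    """In-memory storage, comparable to the dictionary in the example."""

    def __init__(self):
        self._invoices = {}

    def save(self, invoice: dict) -> None:
        self._invoices[invoice["invoice_number"]] = invoice

    def get(self, invoice_number: str):
        return self._invoices.get(invoice_number)


class SqliteInvoiceStore:
    """Same interface, backed by a SQLite table with the invoice stored as JSON."""

    def __init__(self, path: str = ":memory:"):
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS invoices "
            "(invoice_number TEXT PRIMARY KEY, data TEXT)"
        )

    def save(self, invoice: dict) -> None:
        # INSERT OR REPLACE gives the same overwrite-on-same-key behavior as a dict.
        self._conn.execute(
            "INSERT OR REPLACE INTO invoices VALUES (?, ?)",
            (invoice["invoice_number"], json.dumps(invoice)),
        )
        self._conn.commit()

    def get(self, invoice_number: str):
        row = self._conn.execute(
            "SELECT data FROM invoices WHERE invoice_number = ?", (invoice_number,)
        ).fetchone()
        return json.loads(row[0]) if row else None


invoice = {"invoice_number": "INV-1", "total_amount": 10.0}
stores = [DictInvoiceStore(), SqliteInvoiceStore()]
for store in stores:
    store.save(invoice)
```

Because both classes expose the same two methods, the storage tool can be pointed at either one without changes to the agent.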