An eight-layer vertical stack showing what production agents require beyond a prototype. The stack is read and built from the foundation upward, so each layer rests on the ones below it. The layers fall into three groups, listed here from top (most operational) to bottom (most foundational): the Operations group contains Security and Observability; the Optimisation group contains Cost, Latency, and Quality; the Foundation group contains Evaluation / guardrails, Tools / memory / knowledge, and Task decomposition. Within the Optimisation group, Quality sits below Latency and Cost and therefore comes first in the build order, reflecting the article's principle of nailing quality before optimising for speed and spend.

  • Security
    Prevent unsafe actions, leakage, and resource abuse
  • Observability
    Zoom in on traces, zoom out on trends
  • Cost
    Measure, cache, batch, constrain outputs
  • Latency
    Baseline, parallelise, right-size, trim context
  • Quality
    Improve non-LLM and LLM components
  • Evaluation / guardrails
    Check quality before output ships
  • Tools / memory / knowledge
    Give the agent capability and context
  • Task decomposition
    Split work into small, checkable steps
Production agents are layered systems: first make them work well, then make them fast, affordable, observable, and safe.
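
The stack can also be written down as plain data, which makes the build order explicit. The following is a minimal sketch rather than anything prescribed by the article: the `Layer` class and the `build_order` and `prerequisites` helpers are hypothetical names introduced here for illustration, while the layer names, groups, and descriptions are copied from the figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    name: str
    group: str
    focus: str


# Layers listed top to bottom, exactly as they appear in the figure.
STACK = [
    Layer("Security", "Operations", "Prevent unsafe actions, leakage, and resource abuse"),
    Layer("Observability", "Operations", "Zoom in on traces, zoom out on trends"),
    Layer("Cost", "Optimisation", "Measure, cache, batch, constrain outputs"),
    Layer("Latency", "Optimisation", "Baseline, parallelise, right-size, trim context"),
    Layer("Quality", "Optimisation", "Improve non-LLM and LLM components"),
    Layer("Evaluation / guardrails", "Foundation", "Check quality before output ships"),
    Layer("Tools / memory / knowledge", "Foundation", "Give the agent capability and context"),
    Layer("Task decomposition", "Foundation", "Split work into small, checkable steps"),
]


def build_order(stack):
    """Return the layers bottom-up, i.e. in the order they are built."""
    return list(reversed(stack))


def prerequisites(stack, name):
    """Return the names of the layers beneath `name` (the ones it rests on)."""
    names = [layer.name for layer in stack]
    return names[names.index(name) + 1:]


if __name__ == "__main__":
    for step, layer in enumerate(build_order(STACK), start=1):
        print(f"{step}. [{layer.group}] {layer.name}: {layer.focus}")
```

Calling `prerequisites(STACK, "Latency")` in this sketch returns Quality and the three Foundation layers, which mirrors the caption: get the agent working well before making it fast.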