The most dangerous agent failures aren't the ones that throw errors. They're the ones that look fine on dashboards: an agent that confirms an order modification it never executed, or one that skips an approval step while the dashboard still shows a 99% success rate. AWS just shipped a set of capabilities on Bedrock AgentCore that turn production traces into a continuous improvement loop, closing the gap between a capable model and an agent that actually delivers in production.
Three Layers of Knowledge: Organizational, Web, and Paid
Your most valuable information sits in SharePoint, Google Drive, Confluence, and S3. Getting an agent to answer a basic question about your own business has traditionally meant months of building custom ingestion pipelines. Bedrock Managed Knowledge Base, now available on AgentCore, replaces that work with an agentic retriever that plans queries across knowledge bases, connects related concepts across documents, and re-ranks before answering. For complex multi-part queries, agentic retrieval surfaced noticeably broader and more complete coverage than basic RAG. No pipeline to build, no retrieval to tune.
Internal knowledge has gaps. Regulations change, markets shift. Web Search, a new tool on AgentCore, provides information from the live web while keeping queries within your AWS security boundary. It uses Amazon's search infrastructure (the same infrastructure that powers Alexa+) and combines public web results with Amazon's proprietary knowledge graph for structured entity data and real-time information such as stock prices. Your agent can now reason over the live web the same way it queries internal docs, with no extra vendor to onboard.
The best information isn't always free. AgentCore payments (in preview) lets agents discover and pay for services and content within their execution loop. WAF AI traffic monetization (generally available) lets providers control agent access: block, allow, or get paid. Both run on the same platform, so providers using WAF automatically recognize verified agents. Together they build infrastructure for both sides of the agent economy.
Optimization: From Guesswork to Data-Driven Fixes
Silent failures are the norm, not the exception. AgentCore now provides failure insights that discover recurring patterns (including the quiet ones with no error signal), explain root causes, and rank by impact. Intent insights cluster what users are actually trying to do. Trajectory insights show common paths and outliers. You can run these across hundreds of sessions in minutes or enable continuous monitoring.
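To make the idea of a silent failure concrete, here is a minimal sketch of one such check: flag sessions where the agent's final message claims success but no matching tool call succeeded. The `Session` and `ToolCall` shapes, the phrase list, and the `modify_order` tool name are all hypothetical, chosen for illustration; this is not AgentCore's trace format or how its failure insights work internally.

```python
from dataclasses import dataclass, field


@dataclass
class ToolCall:
    name: str
    succeeded: bool


@dataclass
class Session:
    # Hypothetical, simplified trace record (not an AgentCore schema).
    session_id: str
    final_message: str
    tool_calls: list[ToolCall] = field(default_factory=list)


# Assumed phrases that signal the agent is claiming the work was done.
CONFIRMATION_PHRASES = ("has been updated", "has been modified", "is confirmed")


def is_silent_failure(session: Session, required_tool: str) -> bool:
    """True when the agent claims success but the required tool never succeeded."""
    message = session.final_message.lower()
    claims_success = any(phrase in message for phrase in CONFIRMATION_PHRASES)
    executed = any(
        call.name == required_tool and call.succeeded
        for call in session.tool_calls
    )
    return claims_success and not executed


def silent_failure_rate(sessions: list[Session], required_tool: str) -> float:
    """Fraction of sessions flagged as silent failures (0.0 for an empty list)."""
    if not sessions:
        return 0.0
    flagged = sum(is_silent_failure(s, required_tool) for s in sessions)
    return flagged / len(sessions)
```

A session that says "Your order has been updated." with an empty `tool_calls` list is flagged, even though nothing in it raised an error, which is exactly the case an error-rate dashboard misses.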
Once you know what's wrong, recommendations analyze traces and evaluation outputs to suggest specific prompt or tool description improvements. Batch evaluation catches regressions before production. A/B testing splits live traffic between agent versions to prove a change works under real conditions. These capabilities work wherever your agents run: on AgentCore's runtime, Lambda, EKS, or non-AWS environments. Continuous improvement is built into the platform, not stitched on afterward.
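As an illustration of how live traffic can be split between two agent versions, the sketch below assigns each session to a variant by hashing its ID, so the same session always lands on the same version. The function name, the variant labels, and the 10% default share are assumptions for this example; the source does not describe how AgentCore implements its split.

```python
import hashlib


def assign_variant(session_id: str, treatment_share: float = 0.1) -> str:
    """Deterministically map a session ID to 'treatment' or 'control'.

    treatment_share is the fraction of sessions routed to the new version.
    """
    if not 0.0 <= treatment_share <= 1.0:
        raise ValueError("treatment_share must be between 0 and 1")
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    # Turn the first 8 bytes of the hash into a number in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return "treatment" if bucket < treatment_share else "control"
```

Hashing rather than drawing a random number per request keeps a multi-turn session on one agent version, which matters when the thing being compared is end-to-end task completion.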
Deterministic Guardrails on a Probabilistic Agent
More capable agents mean more attack surface. The new point of exposure is the agent's context, where prompt injection and memory poisoning don't require breaking in, only convincing the agent. AgentCore's policy engine already provides deterministic controls at the gateway. Now Bedrock Guardrails integration (GA) evaluates every agent action for prompt injection, harmful content, and sensitive data exposure at the gateway layer, outside the agent's code, where it can't be reasoned around. Coming soon, detection signals from Check Point, Zscaler, Rubrik, Netskope, and SentinelOne will be able to feed into the same policies. Detection can be probabilistic, but enforcement is always deterministic.
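The split between probabilistic detection and deterministic enforcement can be sketched in a few lines: a detector emits a score, and a fixed rule outside the agent turns that score into an allow or deny decision. The `DetectionSignal` fields, the `enforce` function, and the 0.8 threshold are hypothetical and do not reflect the Bedrock Guardrails or AgentCore policy APIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DetectionSignal:
    # Hypothetical detector output; real signals will have a different shape.
    prompt_injection_score: float  # probabilistic, in [0, 1]
    contains_sensitive_data: bool


def enforce(signal: DetectionSignal, injection_threshold: float = 0.8) -> str:
    """Apply fixed rules to a detection signal and return 'allow' or 'deny'.

    The same signal always yields the same decision; the agent's own
    reasoning never enters this function.
    """
    if signal.contains_sensitive_data:
        return "deny"
    if signal.prompt_injection_score >= injection_threshold:
        return "deny"
    return "allow"
```

Because the rule lives outside the agent loop, a persuasive prompt can influence the detector's score at most, never the rule itself.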
The Harness: Decoupled, Managed, and Ship-Ready
An agent is more than a model; the harness runs the orchestration loop, executes tools, manages context, and persists state. Building a durable one is where most teams spend their time. AgentCore harness (GA) turns that into configuration: choose a model, define tools and skills, and get a working agent in minutes with its own filesystem, shell, memory, and web browsing. The harness is decoupled from the model, so you can switch models mid-session without touching agent logic. Identity, memory, and observability come from the same platform, so every action is governed and traced from the first call.
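A toy sketch of what "decoupled from the model" can mean: the harness owns the conversation history and the tool registry, and the model is just a callable passed in on each turn, so it can be swapped between turns. The `run_turn` function, the `CALL <tool> <argument>` convention, and both toy models are invented for illustration and are not the AgentCore harness interface.

```python
from typing import Callable

Model = Callable[[list[str]], str]  # takes the history, returns a reply
Tool = Callable[[str], str]         # takes one string argument


def run_turn(model: Model, tools: dict[str, Tool],
             history: list[str], user_message: str) -> str:
    """Run one turn: ask the model, execute at most one tool call, reply."""
    history.append(f"user: {user_message}")
    reply = model(history)
    # Assumed convention: a reply of the form "CALL <tool> <argument>".
    if reply.startswith("CALL "):
        _, name, argument = reply.split(" ", 2)
        result = tools[name](argument)
        history.append(f"tool {name}: {result}")
        reply = model(history)
    history.append(f"agent: {reply}")
    return reply


def model_a(history: list[str]) -> str:
    # Toy model: calls a tool after user input, then reports what it saw.
    if history[-1].startswith("user:"):
        return "CALL upper hello"
    return "A saw " + history[-1]


def model_b(history: list[str]) -> str:
    # Toy model: answers from the shared history alone.
    return f"B has {len(history)} entries"
```

Calling `run_turn(model_a, ...)` and then `run_turn(model_b, ...)` against the same `history` list switches models mid-session while the state stays with the harness.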
Twilio's VP of Product Omar Paul notes that combining AgentCore harness with Twilio Conversations lets developers go from idea to live agent without rewiring infrastructure. That's the payoff: the agent you declare on day one is the agent you run at your thousandth, on the same foundation.
These capabilities are generally available today: managed harness, Managed Knowledge Base, Web Search, Guardrails integration, recommendations, and A/B testing. Insights and payments are in preview. Start in the console or with the AgentCore CLI.
Source: New in Amazon Bedrock AgentCore: Build agents with broader knowledge and continuous learning
Domain: aws.amazon.com