AWS goes beyond prompt-level safety with automated reasoning in AgentCore

AWS is leveraging automated reasoning, which uses math-based verification, to build out new capabilities in its Amazon Bedrock AgentCore platform as the company digs deeper into the agentic AI ecosystem.

At its annual re:Invent conference in Las Vegas, AWS announced three new capabilities for AgentCore: "policy," "evaluations" and "episodic memory." The new features aim to give enterprises more control over agent behavior and performance.

AWS also revealed what it calls "a new class of agents," or "frontier agents," that are autonomous, scalable and independent.

Swami Sivasubramanian, AWS VP for Agentic AI, told VentureBeat that many of AWS’s new features represent a shift in who becomes a builder.

“We are actually on the cusp of a major tectonic transformation with AI, but agentic AI is truly starting to transform what is the art of the possible, and it is going to make this one of the most truly transforming technologies,” Sivasubramanian said.

Policy agents

The new policy capability helps enterprises enforce guidelines even after the agent has already reasoned its way to a response.

AWS VP for AgentCore David Richardson told VentureBeat that the policy tool sits between the agent and the tools it calls, rather than being baked into the agent, as fine-tuning often is. The idea is to prevent an agent from violating enterprise rules and redirect it to re-evaluate its reasoning.

Richardson gave the example of a customer service agent: A company would write a policy stating that the agent can grant a refund of up to $100, but for anything higher, the agent would need to bounce the customer to a human. He noted that it remains easy to subvert an agent's reasoning loop through, for instance, prompt injection or poisoned data, leading agents to ignore guardrails.

“There are always these prompt injection attacks where people try to subvert the reasoning of the agent to get the agent to do things it shouldn’t do,” Richardson said. “That’s why we implemented the policy outside of the agent, and it works using the automated reasoning capabilities that we’ve spent years building up to help customers define their capabilities.”
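To make the placement concrete, here is a minimal Python sketch of a policy check that sits between an agent and its tools, using the $100 refund rule from Richardson's example. It is not the AgentCore API and does not use automated reasoning; it is a plain rule check, and every name in it (ToolCall, check_policy, grant_refund, execute) is hypothetical.

```python
from dataclasses import dataclass

REFUND_LIMIT_USD = 100.0  # limit taken from the article's example


@dataclass
class ToolCall:
    """A tool invocation the agent wants to make (hypothetical shape)."""
    tool: str
    args: dict


@dataclass
class Decision:
    allowed: bool
    reason: str


def check_policy(call: ToolCall) -> Decision:
    """Deterministic rule evaluated outside the agent's reasoning loop."""
    if call.tool == "grant_refund":
        amount = float(call.args.get("amount_usd", 0))
        if amount > REFUND_LIMIT_USD:
            return Decision(False, "refund exceeds limit; escalate to a human")
    return Decision(True, "ok")


def grant_refund(amount_usd: float) -> str:
    """Stand-in for a real refund tool."""
    return f"refunded ${amount_usd:.2f}"


TOOLS = {"grant_refund": grant_refund}


def execute(call: ToolCall) -> str:
    """Gateway: every tool call passes the policy check before it runs."""
    decision = check_policy(call)
    if not decision.allowed:
        # The agent receives the refusal and can re-plan or hand off.
        return f"BLOCKED: {decision.reason}"
    return TOOLS[call.tool](**call.args)


print(execute(ToolCall("grant_refund", {"amount_usd": 50})))
print(execute(ToolCall("grant_refund", {"amount_usd": 250})))
```

Because the check runs in the gateway rather than in the prompt, a prompt injection that talks the agent into requesting a $250 refund still hits the same limit.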

AWS unveiled Automated Reasoning Checks on Bedrock at last year’s re:Invent. These use neurosymbolic AI, or math-based validation, to prove correctness. The tool applies mathematical proofs to models to confirm that they haven’t hallucinated. AWS has been leaning heavily into neurosymbolic AI and automated reasoning, pushing for enterprise-grade security and safety in ways that differ from other AI model providers.

Episodic memories and evaluations

The other two AgentCore updates, "evaluations" and "episodic memory," give enterprises a better view of agent performance and let agents recall information from earlier interactions.

An enhancement of AgentCore memory, episodic memory refers to knowledge that agents tap into only occasionally, unlike longer-running preferences, which they have to refer back to constantly. Context window limits hamper some agents, so they sometimes forget information or conversations they haven’t tapped into for a while.

“The idea is to help capture information that a user really would wish the agent remembered when they came back," said Richardson. "For example, 'what is their preferred seat on an airplane for family trips?' Or 'what is the sort of price range they're looking for?'"

Episodic memory differs from the previously shipped AgentCore memory because, instead of relying on maintaining short- and long-term memory, agents built on AgentCore can recall certain information based on triggers. This can eliminate the need for custom instructions.

With AgentCore evaluations, organizations can use 13 pre-built evaluators or write their own. Developers can set alerts to warn them if agents begin to fail quality monitoring.
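As a rough illustration of what writing a custom evaluator and wiring up an alert could look like, the Python sketch below scores agent replies and flags a batch whose average falls under a threshold. It is an assumption-laden toy, not the AgentCore evaluations interface: the evaluator signature, the non_empty_reply check, the quality_alert helper and the 0.8 threshold are all hypothetical.

```python
from statistics import mean
from typing import Callable

# Hypothetical evaluator shape: (user_request, agent_reply) -> score in [0, 1].
Evaluator = Callable[[str, str], float]


def non_empty_reply(request: str, reply: str) -> float:
    """Toy custom evaluator: 1.0 if the agent said anything at all."""
    return 1.0 if reply.strip() else 0.0


def score_batch(evaluator: Evaluator, pairs: list[tuple[str, str]]) -> list[float]:
    """Run one evaluator over a batch of (request, reply) pairs."""
    return [evaluator(request, reply) for request, reply in pairs]


def quality_alert(scores: list[float], threshold: float = 0.8) -> bool:
    """Return True when the average score drops below the threshold."""
    return bool(scores) and mean(scores) < threshold


batch = [
    ("Where is my order?", "It ships tomorrow."),
    ("Can I get a refund?", ""),
    ("Change my seat", "   "),
]
scores = score_batch(non_empty_reply, batch)
if quality_alert(scores):
    print("ALERT: agent quality below threshold", scores)
```

A real evaluator would score something more meaningful, such as correctness or helpfulness, but the flow of score, aggregate and alert stays the same.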

Frontier agents

But perhaps AWS's strongest push into enterprise agentic AI is the release of frontier agents, or fully automated and independent agents that the company says can act as teammates with little direction.

The concept is similar, if not identical, to the asynchronous agents offered by competitors like Google and OpenAI. However, AWS seems to be releasing more than just autonomous coding agents.

Sivasubramanian called them a "new class" of agents, "not only a step function change in what you can do today; they move from assisting with individual tasks to complex projects."

The first is Kiro, an autonomous coding agent that has been in public preview since July. At the time, Kiro was billed as an alternative to vibe coding platforms like OpenAI’s Codex or Windsurf. Similar to Codex and Google’s myriad asynchronous coding agents, including Jules, Kiro can code, undertake reviews, fix bugs independently and determine the tasks it needs to accomplish.

AWS security agent, meanwhile, embeds deep security expertise into applications from the start. The company said in a press release that users “define security standards once and AWS security agent automatically validates them across your applications during its review — helping teams address the risks that matter to their business, not generic checklists.”

The AWS DevOps agent will help developers, especially those on call, proactively find system breaks or bugs. It can respond to incidents using its knowledge of the application or service. It also acknowledges the relationships between the application and the tools it taps, such as Amazon CloudWatch, Datadog and Splunk, to trace the root cause of the issue.

Enterprises are interested in deploying agents and, eventually, bringing more autonomous agents into their workflows. And, while companies like AWS continue to bolster these agents with security and control, organizations are slowly figuring out how to connect them all.

Original Source: https://venturebeat.com/ai/aws-goes-beyond-prompt-level-safety-with-automated-reasoning-in-agentcore

