July 28, 2025: The code and the steps of the demo have been updated to simplify the experience.
In just a few years, foundation models (FMs) have evolved from being used directly to create content based on a user's prompt to now powering AI agents, a new class of software applications that use FMs to reason, plan, act, learn, and adapt in pursuit of user-defined goals with limited human oversight. This new wave of agentic AI is enabled by the emergence of standardized protocols such as Model Context Protocol (MCP) and Agent2Agent (A2A) that simplify how agents connect with other tools and systems.
In fact, building AI agents that can reliably perform complex tasks has become increasingly accessible thanks to open source frameworks like CrewAI, LangGraph, LlamaIndex, and Strands Agents. However, moving from a promising proof of concept to a production-ready agent that can scale to thousands of users presents significant challenges.
Instead of being able to focus on the core capabilities of their agents, developers and AI engineers have had to spend months building foundational infrastructure for session management, identity controls, memory systems, and observability, all while supporting security and compliance.
Today, we're excited to announce the preview of Amazon Bedrock AgentCore, a comprehensive set of enterprise-grade services that help developers quickly and securely deploy and operate AI agents at scale using any framework and model, hosted on Amazon Bedrock or elsewhere.
More specifically, today we're introducing:
- AgentCore Runtime – Provides low-latency serverless environments with session isolation, supports any agent framework including popular open source frameworks, tools, and models, and handles multimodal workloads and long-running agents.
- AgentCore Memory – Manages session and long-term memory, providing relevant context to models while helping agents learn from past interactions.
- AgentCore Observability – Provides step-by-step visualization of agent execution with metadata tagging, custom scoring, trajectory inspection, and troubleshooting/debugging filters.
- AgentCore Identity – Enables AI agents to securely access AWS services and third-party tools and services such as GitHub, Salesforce, and Slack, either on behalf of users or by themselves with pre-authorized user consent.
- AgentCore Gateway – Transforms existing APIs and AWS Lambda functions into agent-ready tools, offering unified access across protocols, including MCP, and runtime discovery.
- AgentCore Browser – Provides managed web browser instances to scale agent web automation workflows.
- AgentCore Code Interpreter – Offers an isolated environment to run code generated by agents.
These services can be used individually and are optimized to work together, so developers don't need to spend time piecing components together. AgentCore can work with open source or custom AI agent frameworks, giving teams the flexibility to keep their preferred tools while gaining enterprise capabilities. To integrate these services into their existing code, developers can use the AgentCore SDK.
You can now also discover, buy, and run pre-built agents and agent tools from AWS Marketplace with AgentCore Runtime. With just a few lines of code, your agents can securely connect to API-based agents and tools from AWS Marketplace through AgentCore Gateway to help you run complex workflows while maintaining compliance and control.
AgentCore eliminates tedious infrastructure work and operational complexity so development teams can bring groundbreaking agentic solutions to market faster.
Let's see how this works in practice. I'll share more info on the services as we use them.
Deploying a production-ready customer support assistant with Amazon Bedrock AgentCore (Preview)
When customers send an email, it takes time to provide a reply. Customer support needs to check the validity of the email, find the actual customer in the customer relationship management (CRM) system, check their orders, and use product-specific knowledge bases to find the information needed to prepare an answer.
An AI agent can simplify this process by connecting to internal systems, retrieving contextual information using semantic data sources, and drafting replies for the support team. For this use case, I built a simple prototype using Strands Agents. For simplicity, and to validate the scenario, the internal tools are simulated using Python functions.
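For illustration, the simulated tools might look like the following sketch. The function names and the canned data here are hypothetical; in the actual prototype each function would be exposed to the model with the Strands Agents `@tool` decorator.

```python
# Hypothetical simulated internal tools for the customer support prototype.
# In the real agent, each function would be decorated with the Strands
# Agents @tool decorator so the model can call it.

def get_customer_id(email_address: str) -> dict:
    """Simulate looking up a customer in the CRM by email address."""
    if email_address == "me@example.net":
        return {"customer_id": 123}
    return {"message": "customer not found"}

def get_orders(customer_id: int) -> list[dict]:
    """Simulate retrieving a customer's orders from the order system."""
    if customer_id == 123:
        return [{
            "order_id": 1234,
            "items": ["smartphone", "smartphone USB charger"],
        }]
    return []

def get_knowledge_base_entry(topic: str) -> str:
    """Simulate a semantic lookup in the product knowledge base."""
    kb = {
        "charger": "The charger accepts 100-240V input, so it can be used worldwide.",
        "cover": "To remove the cover, press gently on the top-left corner.",
    }
    return kb.get(topic, "No entry found.")
```

Because these are plain Python functions returning canned data, the end-to-end scenario can be validated before any real integration work.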
When I talk to developers, they tell me that similar prototypes, covering different use cases, are being built in many companies. When these prototypes are demonstrated to company leadership and confirmed, the development team has to figure out how to get into production and meet the usual requirements for security, performance, availability, and scalability. This is where AgentCore can help.
Step 1 – Deploying to the cloud with AgentCore Runtime
AgentCore Runtime is a new service to securely deploy, run, and scale AI agents, providing isolation so that each user session runs in its own protected environment to help prevent data leakage, a critical requirement for applications handling sensitive data.
To match different security postures, agents can use different network configurations:
- Public – Runs with managed internet access.
- VPC-only (coming soon) – This option will allow access to resources hosted in a customer VPC or connected via AWS PrivateLink endpoints.
To deploy the agent to the cloud and get a secure, serverless endpoint with AgentCore Runtime, I add a few lines of code to the prototype using the AgentCore SDK to:
- Import the AgentCore SDK.
- Create the AgentCore app.
- Specify which function is the entry point to invoke the agent.
Using a different or custom agent framework is just a matter of replacing the agent invocation inside the entry point function.
Here's the code of the prototype. The three lines I added to use AgentCore Runtime are the ones preceded by a comment.
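Reconstructed from the description above, the prototype can be sketched as follows. The `BedrockAgentCoreApp` usage follows the AgentCore SDK; the system prompt and the simulated tool shown here are placeholders.

```python
from strands import Agent, tool

# Import the AgentCore SDK
from bedrock_agentcore.runtime import BedrockAgentCoreApp

# Create the AgentCore app
app = BedrockAgentCoreApp()

@tool
def get_customer_id(email_address: str) -> dict:
    """Simulated CRM lookup used by the prototype (hypothetical tool)."""
    if email_address == "me@example.net":
        return {"customer_id": 123}
    return {"message": "customer not found"}

agent = Agent(
    system_prompt="Draft replies to customer support emails.",  # placeholder
    tools=[get_customer_id],
)

# Specify the entry point function invoking the agent
@app.entrypoint
def invoke(payload: dict) -> dict:
    # The JSON syntax of the invocation is defined here:
    # the user message is expected in the "prompt" field.
    prompt = payload.get("prompt", "")
    result = agent(prompt)
    return {"result": str(result.message)}

if __name__ == "__main__":
    app.run()
```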
The previous code needs the Strands Agents modules installed in the Python environment. To install the dependencies, I create and activate a virtual environment:
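For example, using the built-in `venv` module:

```shell
# Create and activate a virtual environment for the project
python3 -m venv .venv
source .venv/bin/activate
```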
I add the Strands Agents modules, the AgentCore SDK, and the AgentCore starter toolkit to the dependency file (`requirements.txt`):
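Assuming the current PyPI package names, `requirements.txt` would contain:

```text
strands-agents
bedrock-agentcore
bedrock-agentcore-starter-toolkit
```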
I then install all the requirements in the virtual environment:
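With the virtual environment active:

```shell
# Install all project dependencies into the virtual environment
pip install -r requirements.txt
```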
The virtual environment now gives me access to the AgentCore command line interface (CLI) provided by the starter toolkit.
First, I use `agentcore configure --entrypoint my_agent.py` to configure the agent. I press Enter to auto-create the AWS Identity and Access Management (IAM) execution role and the Amazon Elastic Container Registry (Amazon ECR) repository and to confirm the detected dependency file.
In this case, the agent only needs access to Amazon Bedrock to invoke the model. The role can give access to other AWS resources used by an agent, such as an Amazon Simple Storage Service (Amazon S3) bucket or an Amazon DynamoDB table. The ECR repository is used to store the container image created when deploying the agent.
By default, the agent configuration enables observability. To enable trace delivery, I use the AWS Command Line Interface (AWS CLI) to set up Transaction Search in Amazon CloudWatch. This switches all trace ingestion for the entire account into a cost-effective collection mode using the CloudWatch Application Signals pricing plan.
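Transaction Search is configured through AWS X-Ray. The setup might look like the following; the exact commands and the indexing rule payload are assumptions to verify against the CloudWatch documentation.

```shell
# Send trace segments to CloudWatch Logs (assumed command)
aws xray update-trace-segment-destination --destination CloudWatchLogs

# Index a percentage of ingested traces (assumed payload shape)
aws xray update-indexing-rule --name "Default" \
    --rule '{"Probabilistic": {"DesiredSamplingPercentage": 1}}'
```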
I check the result of these commands with:
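For example, assuming the corresponding read commands exist in the X-Ray CLI:

```shell
# Verify the trace destination and the indexing rules
aws xray get-trace-segment-destination
aws xray get-indexing-rules
```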
I launch the agent locally with `agentcore launch --local`. When running locally, I can interact with the agent using `agentcore invoke --local <PAYLOAD>`. The payload is passed to the entry point function. Note that the JSON syntax of the invocations is defined in the entry point function. In this case, I look for `prompt` in the JSON payload, but you can use a different syntax depending on your use case.
When I'm satisfied with local testing, I use `agentcore launch` to deploy to the cloud.
After the deployment is successful and an endpoint has been created, I check the status of the endpoint with `agentcore status` and invoke it with `agentcore invoke <PAYLOAD>`. For example, I pass a customer support request in the invocation:
```shell
agentcore invoke '{"prompt": "From: me@example.net – Hi, I bought a smartphone from your store.
I am traveling to Europe next week, will I be able to use the charger?
Also, I struggle to remove the cover. Thanks, Danilo"}'
```
Step 2 – Enabling memory for context
After an agent has been deployed in the AgentCore Runtime, the context needs to be persisted to be available for a new invocation. I add AgentCore Memory to maintain session context using its short-term memory capabilities.
First, I create a memory client and the memory store for the conversations:
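Following the AgentCore SDK samples, this can be sketched as shown below; the `MemoryClient` interface and the response field names are assumptions to verify against the SDK documentation.

```python
from bedrock_agentcore.memory import MemoryClient

memory_client = MemoryClient(region_name="us-east-1")

# Create a memory store for support conversations.
# With no strategies configured, only short-term memory is available.
memory = memory_client.create_memory_and_wait(
    name="CustomerSupportMemory",
    description="Memory store for customer support conversations",
    strategies=[],
)
memory_id = memory["id"]  # response field name assumed
```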
I can now use `create_event` to store agent interactions in short-term memory:
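A sketch of storing a conversation turn; the argument names follow the AgentCore SDK samples and the identifiers are placeholders.

```python
from bedrock_agentcore.memory import MemoryClient

memory_client = MemoryClient(region_name="us-east-1")

# Identifiers for the memory store, the user, and the session (placeholders).
memory_id, actor_id, session_id = "<MEMORY_ID>", "customer-123", "session-1"

# Store the latest turn of the conversation in short-term memory.
memory_client.create_event(
    memory_id=memory_id,
    actor_id=actor_id,      # identifies the user across sessions
    session_id=session_id,  # identifies the current session
    messages=[
        ("I bought a smartphone from your store.", "USER"),
        ("Happy to help! Could you share your order number?", "ASSISTANT"),
    ],
)
```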
I can load the most recent turns of a conversation from short-term memory using `list_events`:
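Again following the SDK samples (identifiers are placeholders):

```python
from bedrock_agentcore.memory import MemoryClient

memory_client = MemoryClient(region_name="us-east-1")
memory_id, actor_id, session_id = "<MEMORY_ID>", "customer-123", "session-1"

# Load the most recent turns to rebuild the conversation context.
events = memory_client.list_events(
    memory_id=memory_id,
    actor_id=actor_id,
    session_id=session_id,
    max_results=10,
)
```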
With this capability, the agent can maintain context during long sessions. But when a user comes back with a new session, the conversation starts blank. Using long-term memory, the agent can personalize user experiences by retaining insights across multiple interactions.
To extract memories from a conversation, I can use built-in AgentCore Memory policies for user preferences, summarization, and semantic memory (to capture facts) or create custom policies for specialized needs. Data is stored encrypted using a namespace-based storage for data segmentation.
I change the previous code that creates the memory store to include long-term capabilities by passing a semantic memory strategy. Note that an existing memory store can be updated to add strategies. In that case, the new strategies are applied to newer events as they are created.
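The updated store creation might look like this; the strategy payload shape is an assumption to check against the AgentCore Memory documentation.

```python
from bedrock_agentcore.memory import MemoryClient

memory_client = MemoryClient(region_name="us-east-1")

# Same store as before, now with a built-in semantic strategy that
# extracts facts from conversations (payload shape assumed).
memory = memory_client.create_memory_and_wait(
    name="CustomerSupportMemory",
    description="Memory store for customer support conversations",
    strategies=[{
        "semanticMemoryStrategy": {
            "name": "semanticFacts",
            "namespaces": ["/facts/{actorId}"],
        }
    }],
)
```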
After long-term memory has been configured for a memory store, calling `create_event` will automatically apply those strategies to extract information from the conversations. I can then retrieve memories extracted from the conversation using a semantic query:
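A sketch of the semantic retrieval, with placeholder identifiers and a `retrieve_memories` signature assumed from the SDK samples:

```python
from bedrock_agentcore.memory import MemoryClient

memory_client = MemoryClient(region_name="us-east-1")
memory_id, actor_id = "<MEMORY_ID>", "customer-123"

# Retrieve facts extracted from past conversations with a semantic query.
memories = memory_client.retrieve_memories(
    memory_id=memory_id,
    namespace=f"/facts/{actor_id}",
    query="smartphone charger and cover issues",
)
```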
In this way, I can quickly improve the user experience so that the agent remembers customer preferences and facts that fall outside the scope of the CRM, and can use this information to improve its replies.
Step 3 – Adding identity and access controls
Without proper identity controls, the agent accesses internal tools with the same access level regardless of who is using it. To follow security requirements, I integrate AgentCore Identity so that the agent can use access controls scoped to the user's or agent's identity context.
I set up an identity client and create a workload identity, a unique identifier that represents the agent within the AgentCore Identity system:
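Following the AgentCore SDK samples (module path and method names are assumptions to verify against the SDK documentation):

```python
from bedrock_agentcore.services.identity import IdentityClient

identity_client = IdentityClient("us-east-1")

# A workload identity is a unique identifier representing the agent
# within the AgentCore Identity system.
workload_identity = identity_client.create_workload_identity(
    name="customer-support-agent",
)
```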
Then, I configure the credential providers, for example:
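The provider configuration might look like the following sketch; the method names, vendor identifiers, and field shapes are assumptions, and the credentials are placeholders.

```python
from bedrock_agentcore.services.identity import IdentityClient

identity_client = IdentityClient("us-east-1")

# An API-key provider for an internal tool (field names assumed).
api_key_provider = identity_client.create_api_key_credential_provider({
    "name": "support-api-key-provider",
    "apiKey": "<API_KEY>",
})

# An OAuth2 provider for a third-party service such as GitHub
# (vendor name and config shape assumed).
oauth2_provider = identity_client.create_oauth2_credential_provider({
    "name": "github-provider",
    "credentialProviderVendor": "GithubOauth2",
    "oauth2ProviderConfigInput": {
        "githubOauth2ProviderConfig": {
            "clientId": "<CLIENT_ID>",
            "clientSecret": "<CLIENT_SECRET>",
        }
    },
})
```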
I can then add the `@requires_access_token` Python decorator (passing the provider name, the scope, and so on) to the functions that need an access token to perform their activities.
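For example, a decorated function might look like this sketch; the provider name and scope are hypothetical, and the decorator parameters are assumptions to verify against the SDK documentation.

```python
from bedrock_agentcore.identity.auth import requires_access_token

@requires_access_token(
    provider_name="github-provider",  # credential provider configured earlier
    scopes=["repo"],                  # hypothetical scope for this tool
    auth_flow="USER_FEDERATION",      # act on behalf of the end user
)
async def file_github_issue(*, access_token: str, title: str) -> None:
    # The decorator injects a valid access token for the user,
    # triggering the consent flow the first time.
    ...
```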
Using this approach, the agent can verify identity through the company's existing identity infrastructure, operate as a distinct, authenticated identity, act with scoped permissions, and integrate across multiple identity providers (such as Amazon Cognito, Okta, or Microsoft Entra ID) and service boundaries, including AWS and third-party tools and services (such as Slack, GitHub, and Salesforce).
To offer robust and secure access controls while streamlining end-user and agent builder experiences, AgentCore Identity implements a secure token vault that stores users’ tokens and allows agents to retrieve them securely.
For OAuth 2.0 compatible tools and services, when a user first grants consent for an agent to act on their behalf, AgentCore Identity collects and stores the user’s tokens issued by the tool in its vault, along with securely storing the agent’s OAuth client credentials. Agents, operating with their own distinct identity and when invoked by the user, can then access these tokens as needed, reducing the need for frequent user consent.
When the user token expires, AgentCore Identity triggers a new authorization prompt to the user for the agent to obtain updated user tokens. For tools that use API keys, AgentCore Identity also stores these keys securely and gives agents controlled access to retrieve them when needed. This secure storage streamlines the user experience while maintaining robust access controls, enabling agents to operate effectively across various tools and services.
Step 4 – Expanding agent capabilities with AgentCore Gateway
Until now, all the internal tools have been simulated in the code. Many agent frameworks, including Strands Agents, natively support MCP to connect to remote tools. To have access to internal systems (such as CRM and order management) via an MCP interface, I use AgentCore Gateway.
With AgentCore Gateway, the agent can access AWS services using Smithy models, Lambda functions, and internal APIs and third-party providers using OpenAPI specifications. It employs a dual authentication model to provide secure access control for both incoming requests and outbound connections to target resources. Lambda functions can be used to integrate external systems, particularly applications that lack standard APIs or require multiple steps to retrieve information.
AgentCore Gateway facilitates cross-cutting features that most customers would otherwise need to build themselves, including authentication, authorization, throttling, custom request/response transformation (to match underlying API formats), multitenancy, and tool selection.
The tool selection feature helps find the most relevant tools for a specific agent's task. AgentCore Gateway brings a uniform MCP interface across all these tools, using AgentCore Identity to provide an OAuth interface for tools that do not support OAuth out of the box, such as AWS services.
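On the agent side, the gateway is just an MCP server. With Strands Agents and the MCP Python SDK, connecting might look like this sketch; the gateway URL is a placeholder, and the access token would come from AgentCore Identity.

```python
from mcp.client.streamable_http import streamablehttp_client
from strands import Agent
from strands.tools.mcp import MCPClient

gateway_url = "https://<gateway-id>.gateway.bedrock-agentcore.us-east-1.amazonaws.com/mcp"  # placeholder
access_token = "<ACCESS_TOKEN>"  # obtained via AgentCore Identity

# Connect to the gateway's MCP endpoint over streamable HTTP.
mcp_client = MCPClient(lambda: streamablehttp_client(
    gateway_url,
    headers={"Authorization": f"Bearer {access_token}"},
))

with mcp_client:
    # The CRM and order management tools exposed by the gateway
    # become regular tools for the agent.
    agent = Agent(tools=mcp_client.list_tools_sync())
```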
Step 5 – Adding capabilities with AgentCore Code Interpreter and Browser tools
To answer customer requests, the customer support agent needs to perform calculations. To simplify that, I use the AgentCore SDK to add access to the AgentCore Code Interpreter.
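Wrapped as a Strands tool, this might look like the following sketch; the `code_session` helper follows the AgentCore SDK samples, and the response shape is an assumption.

```python
import json

from strands import tool
from bedrock_agentcore.tools.code_interpreter_client import code_session

@tool
def execute_python(code: str) -> str:
    """Run agent-generated Python code in an isolated sandbox."""
    with code_session("us-east-1") as code_client:
        response = code_client.invoke("executeCode", {
            "language": "python",
            "code": code,
        })
        # The response streams events; return the first result (shape assumed).
        for event in response["stream"]:
            return json.dumps(event["result"])
    return ""
```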
Similarly, some of the integrations required by the agent don't offer a programmatic API but need to be accessed through a web interface. I give the agent access to the AgentCore Browser to let it navigate those websites autonomously.
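Starting a managed browser session might look like this sketch; the `browser_session` helper and the `generate_ws_headers` method are assumptions based on the AgentCore SDK samples.

```python
from bedrock_agentcore.tools.browser_client import browser_session

# Start a managed browser session and get the connection details that a
# browser automation library (for example, Playwright over CDP) can use.
with browser_session("us-east-1") as browser_client:
    ws_url, headers = browser_client.generate_ws_headers()
```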
Step 6 – Gaining visibility with observability
Now that the agent is in production, I need visibility into its activities and performance. AgentCore provides enhanced observability to help developers effectively debug, audit, and monitor their agent performance in production. It comes with built-in dashboards to track essential operational metrics such as session count, latency, duration, token usage, error rates, and component-level latency and error breakdowns. AgentCore also gives visibility into an agent's behavior by capturing and visualizing both end-to-end traces and "spans" that capture each step of the agent workflow, including tool invocations and memory operations.
The built-in dashboards offered by this service help reveal performance bottlenecks and identify why certain interactions might fail, enabling continuous improvement and reducing the mean time to detect (MTTD) and mean time to repair (MTTR) in case of issues.
AgentCore supports OpenTelemetry to help integrate agent telemetry data with existing observability platforms, including CloudWatch, Datadog, LangSmith, and Langfuse. I just need to enable observability in the agent configuration and launch it again to start sending telemetry data to CloudWatch. Check that the IAM role used by the agent has the necessary permissions to do so.
Step 7 – Conclusion
Through this journey, we transformed a local prototype into a production-ready system. Using AgentCore's modular approach, we implemented enterprise requirements incrementally, from basic deployment to sophisticated memory, identity management, and tool integration, all while maintaining the existing agent code.
Things to know
Amazon Bedrock AgentCore is available in preview in US East (N. Virginia), US West (Oregon), Asia Pacific (Sydney), and Europe (Frankfurt). You can start using AgentCore services through the AWS Management Console, the AWS Command Line Interface (AWS CLI), the AWS SDKs, or via the AgentCore SDK.
You can try AgentCore services at no charge until September 16, 2025. Standard AWS pricing applies to any additional AWS services used with AgentCore (for example, CloudWatch pricing applies to AgentCore Observability). Starting September 17, 2025, AWS will bill you for AgentCore service usage based on this page.
Whether you’re building customer support agents, workflow automation, or innovative AI-powered experiences, AgentCore provides the foundation you need to move from prototype to production with confidence.
To learn more and start deploying production-ready agents, visit the AgentCore documentation. For code examples and integration guides, check out the AgentCore samples GitHub repo.
Join the AgentCore Preview Discord server to provide feedback and discuss use cases. We’d like to hear from you!
— Danilo