AgentCore Runtime example, using new AgentCore bidirectional streaming (WebSocket) endpoint #125
Conversation
…for aws-agentcore-websocket
deployment/aws-agentcore/uv.lock (Outdated)
This was removed automatically when renaming aws-agentcore to aws-agentcore-webrtc, due to .gitignore. I could've fought it, but it felt like the right thing to go along with the examples-repo-wide .gitignore rules.
There is one other uv.lock in this repo, btw.
Makes sense.
…is now in its own sub-folder. This makes the dependencies make more sense: the root is concerned with managing agents (the CLI), the agent folder is concerned with the agent logic itself. Also, fix a bug in launch.sh. Also, use pyproject.toml rather than requirements.txt for the agent.
###############################################
# STEP 2 — Read AGENT ARN from agentcore status
Here I'm reading the Agent ARN from agentcore status rather than reading it from the agentcore YAML, as it's more robust—if you've configured multiple different agents, reading from YAML may not work. Reading from agentcore status will read the value for the default agent.
I've also added a note in the README about ensuring that you've configured the agentcore CLI with the right default agent.
After this PR, if this change seems good to you, @filipi87, we can make the corresponding change to the other AgentCore example.
Yes, I think it makes sense to keep it standard between the two examples. So if we think this is the best approach, it would be good to update the other example to use it as well. 👍
Here I'm using pyproject.toml rather than requirements.txt.
IIRC I ran into problems using pyproject.toml with the agentcore CLI before. But it seems to work just fine now!
@filipi87, if this seems like a good change we can work it into the other AgentCore example after this PR.
Agreed. I thought you had tried that before and it hadn’t worked, so I didn’t even try it when developing the other example.
Nice to hear that it’s working now. 🙂
> I thought you had tried that before and it hadn’t worked
Totally. Might've been an agentcore CLI update that fixed it?
Note that I've put scripts in the root folder rather than under the agent folder, for a couple of reasons:
- It affects both `agent` and `server`.
- It cleans up the story about what folder level is concerned with what. Now when you're in the root folder you're concerned with agent management. When you're in the `agent` folder you're concerned with agent logic itself. The dependencies reflect that: the CLI is in the root folder, the agent runtime dependencies are in the `agent` folder.
@filipi87, if this seems reasonable to you we can make the corresponding change to the other AgentCore example after this PR.
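To illustrate the dependency split described above, the two pyproject.toml files might look roughly like this (a sketch only; the package names and extras are assumptions, not taken from the PR):

```toml
# root pyproject.toml — agent *management* (the agentcore CLI)
[project]
name = "aws-agentcore-websocket"
version = "0.1.0"
dependencies = [
    # hypothetical CLI dependency name; check the actual PR for the real one
    "bedrock-agentcore-starter-toolkit",
]

# agent/pyproject.toml — the agent *logic* itself (separate file, shown here
# commented out for illustration)
# [project]
# name = "agent"
# version = "0.1.0"
# dependencies = ["pipecat-ai"]
```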
Yep, I think it makes sense. 👍
Follow the prompts to complete the configuration. It's fine to just accept all defaults.

## ⚠️ Before Proceeding
Here's the warning I was describing in another comment.
If this seems useful, we can update other AgentCore examples to follow suit after this PR. @filipi87
Yep, that makes sense. I think we should try to keep both examples as consistent as possible.
voice_id="71a7ad14-091c-4e8e-a314-022ece01c121",  # British Reading Lady
)

llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
I believe we should use AWSBedrockLLMService instead of OpenAILLMService. This is something they had asked us to change in the other example.
Ah yes, good call! Will update.
Done. I used a slightly different pattern for instantiating the LLM service than in the other example, and used a different model that seems to work better with the prompt. If you agree that these changes make sense, I'll add them to the list of things to change in the other example.
Yep, that would be great. Thank you.
As an improvement for the future, mostly as a reminder to myself 😅, we really should support all transports in our “prebuilt” UI, and probably rename it from pipecat-ai-small-webrtc-prebuilt to just pipecat-ai-prebuilt.
100%! While working on this, I had the thought "oh, too bad we can't use the WebRTC Prebuilt here..."
Yep, I will add this to my todo list. 🙂
Shouldn't we remove this and add .vscode to .gitignore?
Whoops! Good catch. Yes.
Fixed.
```bash
export AWS_SECRET_ACCESS_KEY=...
export AWS_ACCESS_KEY_ID=...
export AWS_REGION=...
```
We have also added this in the other example:

export AWS_DEFAULT_REGION=your_default_region

Should we add it here as well?
There's actually no need to export that in this example. There was some specific tool in the other example that didn't honor AWS_REGION, but I can't remember now exactly what it was.
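Since the two examples differ on AWS_REGION versus AWS_DEFAULT_REGION, a small fallback in the agent code would sidestep the question entirely. This is a hypothetical sketch, not code from the PR:

```python
import os


def resolve_aws_region(default: str = "us-east-1") -> str:
    """Prefer AWS_REGION, then AWS_DEFAULT_REGION, then a hard default.

    Sketch only: the example itself just exports AWS_REGION, but a helper
    like this would make the agent tolerant of either variable being set.
    """
    return os.getenv("AWS_REGION") or os.getenv("AWS_DEFAULT_REGION") or default


print(resolve_aws_region())
```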
Add your API keys:

- `OPENAI_API_KEY`: Your OpenAI API key
I think we should remove everything related to OPENAI from this example.
Done
Configure your Pipecat agent as an AgentCore agent:

```bash
uv run agentcore configure -e agent/agent.py
```
I know that here you don’t need to change the Dockerfile, but just for the sake of keeping both examples consistent, wouldn’t it make sense to create the configuration script as well?
And maybe one to destroy?
I thought about it. There were two competing interests. One was
> keeping both examples consistent
But there's also the desire not to introduce unnecessary abstraction, to not obscure what the example is trying to demonstrate, which is usage of AgentCore with Pipecat. For configuration and destruction, wrapping the AgentCore CLI in a script adds very little.
To be clear, I don't feel very strongly about this decision one way or another. Just wanted to explain my decision. If you feel strongly otherwise, happy to make the change.
One advantage of wrapping it inside the script is that we can force the agent to use the container during configuration. Otherwise, we need to tell the user to select it manually, because if they don’t, it won’t work.
We actually no longer need to use the container option. At least not for this example. 🎉
Selecting all defaults works.
from pipecat.runner.types import DailyRunnerArguments, RunnerArguments
from pipecat.runner.utils import create_transport
I believe we can remove DailyRunnerArguments and create_transport.
Done.
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
transport_params = {
    "daily": lambda: DailyParams(
        audio_in_enabled=True,
        audio_out_enabled=True,
        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
    ),
    "webrtc": lambda: TransportParams(
        audio_in_enabled=True,
        audio_out_enabled=True,
        vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
        turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
    ),
}
Is the idea to keep this for local development?
Hmm, no, maybe they can be removed. Although...in general, it might be helpful to update this example to make it possible/easy to do local testing. I can look into that.
Done.
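The lambda-wrapping trick in the snippet under discussion can be shown in isolation: construction is deferred until the selected transport's factory is called, so unused analyzers are never built. A minimal, self-contained illustration (stand-in class names, not Pipecat APIs):

```python
# Stand-in for a heavyweight object like SileroVADAnalyzer.
class ExpensiveAnalyzer:
    instances = 0  # count constructions to demonstrate laziness

    def __init__(self):
        ExpensiveAnalyzer.instances += 1


# Each value is a zero-argument factory, mirroring the transport_params dict.
transport_factories = {
    "daily": lambda: {"analyzer": ExpensiveAnalyzer()},
    "webrtc": lambda: {"analyzer": ExpensiveAnalyzer()},
}

# Building the dict constructs nothing:
assert ExpensiveAnalyzer.instances == 0

# Only the chosen transport's objects are built:
params = transport_factories["webrtc"]()
assert ExpensiveAnalyzer.instances == 1
```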
context = LLMContext(messages, tools)
context_aggregator = LLMContextAggregatorPair(context)

pipeline = Pipeline(
Are we using anything in the client which would require adding RTVI here as well?
Hm, not that I'm aware of.
Is it always best practice, though, to add RTVI anyway whenever there's a web client involved? Even if just for the "on_client_ready" event and setting rtvi.set_bot_ready()? I'm assuming yes. I'll go ahead and do that.
Speaking of which...looks like in the other AgentCore example we don't do the LLMRunFrame() in "on_client_ready" like in our other examples, we do it in "on_client_connected"...maybe updating that should also go on the list of improvements to port over.
Ah yes, with RTVI enabled we're getting a lot more info in the client interface. Cool cool.
Done.
@app.websocket
async def agentcore_bot(websocket, context):
    """Bot entry point for running on Amazon Bedrock AgentCore Runtime."""
    await websocket.accept()

    transport = FastAPIWebsocketTransport(
        websocket=websocket,
        params=FastAPIWebsocketParams(
            audio_in_enabled=True,
            audio_out_enabled=True,
            add_wav_header=False,
            vad_analyzer=SileroVADAnalyzer(),
            serializer=ProtobufFrameSerializer(),
        ),
    )

    await run_bot(transport, RunnerArguments())
So simple. Amazing! 👏🎉
Yeah, it's great! Was really glad that we already had all the infrastructure we needed in Pipecat.
@@ -0,0 +1,3 @@
OPENAI_API_KEY=...
Probably remove OPENAI_API_KEY
Done.
"@pipecat-ai/client-js": "^1.2.0",
"@pipecat-ai/websocket-transport": "^1.2.0",
Should we update it to use the latest versions?
Ah yes, probably. This was just copy-pasted from elsewhere in the examples repo.
(We should probably update versions everywhere, actually).
Done.
// The baseURL and endpoint of your bot server that the client will connect to
endpoint: "http://localhost:7860/start",
Considering that we’re creating environment variables for all the settings, should we extract this one into an environment variable as well?
Or maybe just leave it as /start, would it work? And then redirect to our server inside the vite.config.js?
> Or maybe just leave it as `/start`

Hmm, I think making it a relative path would maybe make it less clear to the reader that this should be the URL of your server, which is distinct from where the client is hosted.
Actually...I'm not sure we need to extract this out. This is an example and not a production thing and maybe it should stay focused only on the necessary configurable things. We hard-code 7860 all over the place in our examples.
// Proxy /api requests to the backend server
'/connect': {
Should we do the same for /start?
Actually...can we get rid of this for /connect? There is no connect endpoint on our server...
Done. Simplified the client project configuration.
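If `/start` were proxied the way `/connect` was, the dev-server config might look roughly like this (a sketch only; the target port matches the 7860 default used throughout these examples, and the real simplified vite.config.js may differ):

```javascript
// vite.config.js — hypothetical sketch, not the file from the PR.
import { defineConfig } from "vite";

export default defineConfig({
  server: {
    proxy: {
      // Forward /start requests from the Vite dev server to the bot server
      "/start": {
        target: "http://localhost:7860",
        changeOrigin: true,
      },
    },
  },
});
```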
)

@app.post("/start")
Nice to see it working. 👏🎉
filipi87 left a comment:
I have just tested it, and it is working nicely. Amazing work @kompfner. 👏🎉
I believe the only thing we should do before merging is switch to using AWSBedrockLLMService instead of OpenAILLMService.
And for the sake of the example, it might also be nice to enable RTVI on the server and show both user and bot transcriptions on the client. But feel free to add that in a follow up PR if you think it makes sense.
…gentcore-websocket example (and improve the prompt)
…ences to OpenAI, after switching to `AWSBedrockLLMService`
…eploying it to AgentCore Runtime