- Install Node.js 18+
- Install localtunnel to expose the local server to the internet: `npm i -g localtunnel`
- An AWS profile configured via the environment variable `AWS_PROFILE` (defaulting to `bedrock-test`) and a region configured via `AWS_REGION` (defaulting to `us-east-1`)
- Set up a Twilio account. Sign up for free
- Claim a Twilio phone number with voice capabilities. Instructions here
- (Optional) A programmable SIP domain to demo escalation to a call center agent. Check the SIP domain section in this blog
- Go to the account dashboard and capture the Account SID. We need this to set the `TWILIO_ACCOUNT_SID` environment variable.
- Click Generate API keys and capture the newly created API key. We need this to set the `TWILIO_API_SID` and `TWILIO_API_SECRET` environment variables.
- `AWS_PROFILE` - AWS IAM config profile that has the necessary permissions for Bedrock models
- `AWS_REGION` - AWS region name
- `TWILIO_ACCOUNT_SID`, `TWILIO_API_SID` & `TWILIO_API_SECRET` - Needed to configure the Twilio SDK client
- `TWILIO_FROM_NUMBER` - The Twilio phone number in E.164 format (needed only for the outbound calling demo)
- `TWILIO_VERIFIED_CALLER_ID` - The destination number in E.164 format (needed only for the outbound calling demo). In a Twilio trial/sandbox account, the destination number has to be a verified phone number
- `SIP_ENDPOINT` - The Twilio SIP domain endpoint (needed only if the call should be escalated to a human agent)
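
The variables above could be read and checked once at startup. The following is a minimal sketch, not the sample's actual code: the `AppConfig` shape and the `loadConfig` name are hypothetical, and only the variable names and the two defaults come from this README.

```typescript
// Hypothetical config shape; the sample itself may organize this differently.
interface AppConfig {
  awsProfile: string;
  awsRegion: string;
  twilioAccountSid: string;
  twilioApiSid: string;
  twilioApiSecret: string;
  twilioFromNumber: string | undefined;       // outbound demo only
  twilioVerifiedCallerId: string | undefined; // outbound demo only
  sipEndpoint: string | undefined;            // agent escalation only
}

type Env = Record<string, string | undefined>;

// Reads the variables listed above, applies the documented defaults,
// and fails fast when a Twilio credential is missing.
function loadConfig(env: Env): AppConfig {
  const required = ["TWILIO_ACCOUNT_SID", "TWILIO_API_SID", "TWILIO_API_SECRET"];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  return {
    awsProfile: env["AWS_PROFILE"] || "bedrock-test",
    awsRegion: env["AWS_REGION"] || "us-east-1",
    twilioAccountSid: env["TWILIO_ACCOUNT_SID"] as string,
    twilioApiSid: env["TWILIO_API_SID"] as string,
    twilioApiSecret: env["TWILIO_API_SECRET"] as string,
    twilioFromNumber: env["TWILIO_FROM_NUMBER"],
    twilioVerifiedCallerId: env["TWILIO_VERIFIED_CALLER_ID"],
    sipEndpoint: env["SIP_ENDPOINT"],
  };
}
```

In a Node.js process this would typically be called as `loadConfig(process.env)`. Taking the environment as a parameter keeps the function easy to test.
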
- Clone the repository and `cd` into it
- Install the dependencies by running `npm install`
- Build the app by running `npm run build`
- Run `npm start` to start the web server that interfaces with Amazon Nova Sonic via Bedrock. Make sure all the environment variables are set before running this command. Refer to the section above for the environment variables each feature requires.
- In a different terminal, run `lt --port 3000` to tunnel the app and capture the public endpoint. You could also use ngrok instead. The endpoint looks like `https://<random_domain>.loca.lt`
NOTE: ngrok or localtunnel is used here strictly for demonstration purposes, because the sample runs locally and has to be exposed to the public internet. In production, the application is typically deployed on EC2/ECS/EKS and exposed to the internet via an Application Load Balancer. However, running the app behind a load balancer does not by itself make it immune to attacks. Additional measures such as HTTP authentication and request signature validation are needed to make sure requests really originate from Twilio. For a production setup, please refer to Twilio's secure communication documentation.
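
To illustrate the request signature validation mentioned in the note, here is a minimal sketch of the scheme Twilio documents for webhook POST requests: an HMAC-SHA1 over the full request URL followed by the POST parameters (sorted by name, each appended as name then value), Base64-encoded and compared with the `X-Twilio-Signature` header. The function names are hypothetical. The key is the account's Auth Token, which is not among the environment variables this sample uses, so treat it as an assumption. The Twilio helper library ships its own validator, which is likely the better choice in practice.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Base64(HMAC-SHA1(authToken, url + sorted POST params appended as key + value)).
function computeTwilioSignature(
  authToken: string,
  url: string,
  params: Record<string, string>,
): string {
  const data = Object.keys(params)
    .sort()
    .reduce((acc, key) => acc + key + params[key], url);
  return createHmac("sha1", authToken)
    .update(Buffer.from(data, "utf-8"))
    .digest("base64");
}

// Compares the expected signature with the X-Twilio-Signature header value
// using a constant-time comparison.
function isValidTwilioRequest(
  authToken: string,
  url: string,
  params: Record<string, string>,
  headerSignature: string,
): boolean {
  const expected = Buffer.from(computeTwilioSignature(authToken, url, params));
  const received = Buffer.from(headerSignature);
  return expected.length === received.length && timingSafeEqual(expected, received);
}
```

The URL passed in must be the public one Twilio called (the tunnel or load balancer URL), not the local `localhost:3000` address. Otherwise the signatures will not match.
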
In this section of the demo, we'll call the Twilio number, and the call will be answered by the Amazon Nova Sonic speech-to-speech model.

For the active phone number, go to the corresponding Voice Configuration tab and paste the webhook URL for incoming calls. In our case, the URI is `incoming-call`, so the URL will be `https://<random_domain>.loca.lt/incoming-call`.
- Dial the phone number. It plays a welcome message and then connects to the websocket endpoint that connects to the Amazon Nova Sonic model
- Everything you say will be handled and answered by Sonic
- Say something like "I need to cancel my reservation" to invoke the tools
In this section of the demo, we'll make outbound calls from a Twilio number to a destination phone number. When the call is connected, the application instructs Twilio to connect to the media streams endpoint so that Amazon Nova Sonic can process and answer the call audio.
- Grab the public endpoint `https://<random_domain>.loca.lt` from the previous "Build & Run" section
- Make sure `TWILIO_FROM_NUMBER` and `TWILIO_VERIFIED_CALLER_ID` are set to the correct phone numbers (E.164 format) before running the app
- Trigger the outbound call using a curl command: `curl https://<random_domain>.loca.lt/outbound-call`. The `/outbound-call` endpoint initiates the call to the destination phone number and also connects it to the websocket endpoint that interfaces with the Sonic model.
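
Because both phone numbers must be in E.164 format, a quick check before starting the app can catch formatting mistakes early. This is a sketch only; `isE164` and `requireE164` are hypothetical helpers, not part of the sample.

```typescript
// E.164: a leading "+", then at most 15 digits, the first of which is non-zero.
const E164_PATTERN = /^\+[1-9]\d{1,14}$/;

function isE164(phoneNumber: string): boolean {
  return E164_PATTERN.test(phoneNumber);
}

// Returns the value if it is a valid E.164 number, otherwise throws an error
// that names the offending environment variable.
function requireE164(name: string, value: string | undefined): string {
  if (value === undefined || !isE164(value)) {
    throw new Error(`${name} must be a phone number in E.164 format, e.g. +14155550123`);
  }
  return value;
}
```

For example, `requireE164("TWILIO_FROM_NUMBER", process.env.TWILIO_FROM_NUMBER)` would reject values containing spaces, dashes, or a missing `+` prefix.
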
- Create SIP user credentials to be used with the softphone
- Create a SIP domain to forward the calls to a customer support agent. Make sure the user credentials are attached.
- Download a softphone like Zoiper and log in using the SIP user credentials generated above
- Set the environment variable `SIP_ENDPOINT` to the SIP user (e.g. `<username>@<domain>.sip.twilio.com`) and run `npm start` again
- While on the call, say something like "I need help with billing issues, connect me to an agent" to route the call to the agent.
Twilio routes all incoming calls to the webhook, which in our case is `/incoming-call`. The webhook should return TwiML containing our websocket endpoint (`/media-stream`) for Twilio media streams to connect to.
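
As a rough illustration of what the `/incoming-call` webhook might return, the sketch below builds TwiML that speaks a greeting and then connects the call to the websocket endpoint using the `<Connect>` and `<Stream>` verbs. The function names and the greeting text are hypothetical, and the sample's actual response may differ.

```typescript
// Escapes characters that are not allowed verbatim in XML text or attributes.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Builds the TwiML for an incoming call: play a welcome message, then open a
// bidirectional media stream to wss://<publicHost>/media-stream.
function buildIncomingCallTwiml(publicHost: string, welcomeMessage: string): string {
  const streamUrl = `wss://${publicHost}/media-stream`;
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    "<Response>",
    `  <Say>${escapeXml(welcomeMessage)}</Say>`,
    "  <Connect>",
    `    <Stream url="${escapeXml(streamUrl)}" />`,
    "  </Connect>",
    "</Response>",
  ].join("\n");
}
```

Here `publicHost` would be the tunnel host name, such as `<random_domain>.loca.lt`, and the response would be sent with a `text/xml` content type.
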
The Twilio Programmable Voice API connects to the websocket endpoint and streams the media (the call audio) to it. The application passes the audio to the Nova Sonic speech-to-speech model via Bedrock's bidirectional API, which allows incoming and outgoing audio to be exchanged asynchronously. When the Sonic model detects a tool use, the corresponding tool is invoked and the tool result is passed back to the model.
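
The tool-use step can be pictured as a lookup in a registry of handlers. The sketch below is purely illustrative: the tool names, input fields, and result shapes are assumptions, and the handlers are synchronous for brevity, whereas real tools would usually be asynchronous.

```typescript
type ToolInput = Record<string, unknown>;
type ToolResult = Record<string, unknown>;
type ToolHandler = (input: ToolInput) => ToolResult;

// Hypothetical tool registry; names and payloads are made up for illustration.
const toolRegistry: Record<string, ToolHandler> = {
  cancelReservation: (input) => ({
    status: "cancelled",
    reservationId: input["reservationId"],
  }),
  support: () => ({ status: "transferring" }),
};

// Looks up the tool the model asked for, runs it, and returns the result
// that would then be passed back to the model.
function handleToolUse(toolName: string, input: ToolInput): ToolResult {
  const handler = toolRegistry[toolName];
  if (!handler) {
    return { error: `Unknown tool: ${toolName}` };
  }
  return handler(input);
}
```

Returning an error object for unknown tools, rather than throwing, lets the model receive a result it can react to while the call stays connected.
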
When the Sonic model detects a "support" tool use, the current call leg is updated to dial a SIP endpoint. When an agent is connected to the endpoint using a softphone, that phone will ring.
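
One way to redirect the call leg is to update it with TwiML that dials the SIP endpoint using the `<Dial>` and `<Sip>` verbs. The helper below is a sketch with a hypothetical name. How the sample actually applies the update is not shown here; with the Twilio SDK it would presumably be something like updating the call by its SID with this TwiML.

```typescript
// Escapes characters that are not allowed verbatim in XML text.
function escapeXmlText(value: string): string {
  return value.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Builds TwiML that forwards the current call to a SIP endpoint.
// Accepts SIP_ENDPOINT with or without the "sip:" scheme prefix.
function buildSipDialTwiml(sipEndpoint: string): string {
  const sipUri = sipEndpoint.startsWith("sip:") ? sipEndpoint : `sip:${sipEndpoint}`;
  return `<Response><Dial><Sip>${escapeXmlText(sipUri)}</Sip></Dial></Response>`;
}
```

For a `SIP_ENDPOINT` of the form `<username>@<domain>.sip.twilio.com`, the softphone registered as that user is the one that rings.
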
An outbound call needs a trigger to initiate the call from Twilio to a destination phone number. In this sample, the trigger is exposed via the `/outbound-call` endpoint. It uses the Twilio SDK to initiate a call to the destination number and connect it to the media streams websocket endpoint (`/media-stream`). The rest of the call flow is similar to the inbound flow.
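
The outbound trigger can be sketched as assembling the parameters for a call-creation request: the destination, the caller ID, and inline TwiML that connects the answered call to `/media-stream`. The function and interface names are hypothetical. The `to`, `from`, and `twiml` fields mirror the options the Twilio Node SDK accepts when creating a call, but the SDK call itself is left out so the block stays self-contained.

```typescript
interface OutboundCallParams {
  to: string;    // destination number (TWILIO_VERIFIED_CALLER_ID)
  from: string;  // Twilio number (TWILIO_FROM_NUMBER)
  twiml: string; // instructions Twilio runs once the call is answered
}

// Builds the parameters for an outbound call whose audio is streamed to
// wss://<publicHost>/media-stream once the callee picks up.
function buildOutboundCallParams(
  publicHost: string,
  fromNumber: string,
  toNumber: string,
): OutboundCallParams {
  const twiml =
    `<Response><Connect><Stream url="wss://${publicHost}/media-stream" />` +
    `</Connect></Response>`;
  return { to: toNumber, from: fromNumber, twiml };
}

// With the Twilio SDK, these parameters would be passed to the client's
// call-creation method, e.g. client.calls.create(params).
```

From the point where Twilio opens the media stream, the audio handling is the same as for an inbound call.
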