
Commit 5fd2aa8

Merge pull request #194 from mrgiba/main
Use Cross-region Inference; Retry request in case of Bedrock throttling
2 parents d0aa6bc + f0477ef commit 5fd2aa8

File tree

4 files changed: +56 −9 lines changed

samples/contract-compliance-analysis/back-end/README.md

Lines changed: 11 additions & 6 deletions

```diff
@@ -128,17 +128,22 @@ You can then go the Amazon Cognito page at the AWS Console, search for the User
 #### Enable access to Bedrock models
 
 Models are not enabled by default on Amazon Bedrock, so if this is the first time you are going to use Amazon Bedrock,
-it is recommended to first check if the access is already enabled.
+it is recommended to first check if the access is already enabled.
 
-Go to the AWS Console, then go to Amazon Bedrock
+The default model is Anthropic Claude 3 Haiku v1, being used in [cross-region inference](https://docs.aws.amazon.com/bedrock/latest/userguide/cross-region-inference.html) mode. Please ensure this model is enabled in the regions listed in the **US Anthropic Claude 3 Haiku** section from the [Supported Regions and models for inference profiles](https://docs.aws.amazon.com/bedrock/latest/userguide/inference-profiles-support.html) page.
 
-Click Model access at the left side
+Steps:
+
+- Go to the AWS Console, then go to Amazon Bedrock
+
+- Click Model access at the left side
 
 ![Bedrock Model Access](images/bedrock-model-access.png)
 
-Click the **Enable specific models** button and enable the checkbox for Anthropic Claude models
+- Click the **Enable specific models** button and enable the checkbox for Anthropic Claude models
+
+- Click **Next** and **Submit** buttons
 
-Click **Next** and **Submit** buttons
 
 ## How to customize contract analysis according to your use case
 
@@ -172,7 +177,7 @@ The recommended sequence of steps:
 
 By default, the application uses Anthropic Claude 3 Haiku v1. Here are steps explaining how to update the model to use. For this example, we will use [Amazon Nova Pro v1](https://aws.amazon.com/blogs/aws/introducing-amazon-nova-frontier-intelligence-and-industry-leading-price-performance/):
 
-- Open the [app_properties.yaml](./app_properties.yaml) file and update the field ```claude_model_id``` to use the model you selected. In this case, we update the field to ```us.amazon.nova-pro-v1:0```. Replace it with the model id you want to use. The list of model ids available through Amazon Bedrock is available in the [documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html). Ensure the model you are selecting is enabled in the console (Amazon Bedrock -> Model access) and available in your region.
+- Open the [app_properties.yaml](./app_properties.yaml) file and update the field ```claude_model_id``` to use the model you selected. In this case, we update the field to ```us.amazon.nova-pro-v1:0```. Replace it with the model id you want to use. The list of model ids available through Amazon Bedrock is available in the [documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html). Ensure the model you are selecting is enabled in the console (Amazon Bedrock -> Model access) and available in your region. In case of using a predefined Inference Profile to use a model in a cross-region fashion, consult [documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/inference-profiles-support.html) of all regions that needs to have model access enabled.
 - Depending on the model selected, you might need to update some hardcoded values regarding the max number of new tokens generated. For instance, Amazon Nova Pro v1 supports 5000 output tokens, which doesn't require any modifications. However, some models might have a max output tokens of 3000, which requires some changes in the sample. Update the following lines if required:
   - In file [fn-preprocess-contract/index.py](./stack/sfn/preprocessing/fn-preprocess-contract/index.py), update line 96 to change the chunks size to a value smaller than the max tokens output for your model, as well as line 107 to match your model's max output tokens.
   - In file [scripts/utils/llm.py](./scripts/utils/llm.py), update the max tokens output line 28.
```

samples/contract-compliance-analysis/back-end/app_properties.yaml

Lines changed: 1 addition & 1 deletion

```diff
@@ -14,7 +14,7 @@ language: English
 
 # Claude Model ID (Global configuration). To switch to a smaller Language Model for cost savings).
 # Disabling the property will let each prompt execution to its default model id
-claude_model_id: anthropic.claude-3-haiku-20240307-v1:0
+claude_model_id: us.anthropic.claude-3-haiku-20240307-v1:0
 
 # Thresholds determine the maximum number of clauses with risk that a contract can have without requiring human review,
 # per risk level
```
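The only change in this file is the `us.` prefix, which turns the base model ID into a predefined cross-region inference profile ID. A minimal sketch of that relationship (the helper name and the set of geography prefixes are illustrative assumptions, not part of the sample):

```python
# Hypothetical helper (not part of the sample): derive a cross-region
# inference profile ID from a base Bedrock model ID. Geography prefixes
# such as "us", "eu", and "apac" denote predefined inference profiles.
def to_inference_profile_id(model_id: str, geo: str = "us") -> str:
    if model_id.split(".", 1)[0] in ("us", "eu", "apac"):
        return model_id  # already a profile ID, leave untouched
    return f"{geo}.{model_id}"

print(to_inference_profile_id("anthropic.claude-3-haiku-20240307-v1:0"))
# → us.anthropic.claude-3-haiku-20240307-v1:0
```

The resulting profile ID is passed to Bedrock anywhere a plain model ID would go, which is why only this one configuration value needed to change.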

samples/contract-compliance-analysis/back-end/stack/sfn/common-layer/llm.py

Lines changed: 42 additions & 2 deletions

```diff
@@ -15,6 +15,9 @@
 import logging
 import os
 
+from retrying import retry
+from botocore.config import Config
+from botocore.exceptions import ClientError
 from langchain_aws import ChatBedrock
 from langchain_core.messages import HumanMessage
 from langchain_core.prompts import ChatPromptTemplate
@@ -24,8 +27,45 @@
 logger = logging.getLogger()
 logger.setLevel(os.getenv("LOG_LEVEL", "INFO"))
 
-bedrock_client = boto3.client('bedrock-runtime')
+bedrock_client = boto3.client('bedrock-runtime', config=Config(
+    connect_timeout=180,
+    read_timeout=180,
+    retries={
+        "max_attempts": 50,
+        "mode": "adaptive",
+    },
+))
 
+class BedrockRetryableError(Exception):
+    """Custom exception for retryable Bedrock errors"""
+    pass
+
+@retry(
+    wait_fixed=10000,  # 10 seconds between retries
+    stop_max_attempt_number=None,  # Keep retrying indefinitely
+    retry_on_exception=lambda ex: isinstance(ex, BedrockRetryableError),
+)
+def invoke_chain_with_retry(chain):
+    """Invoke Bedrock with retry logic for throttling"""
+    try:
+        return chain.invoke({})
+    except ClientError as exc:
+        logger.warning(f"Bedrock ClientError: {exc}")
+
+        if exc.response["Error"]["Code"] == "ThrottlingException":
+            logger.warning("Bedrock throttling. Retrying...")
+            raise BedrockRetryableError(str(exc))
+        elif exc.response["Error"]["Code"] == "ModelTimeoutException":
+            logger.warning("Bedrock ModelTimeoutException. Retrying...")
+            raise BedrockRetryableError(str(exc))
+        else:
+            raise
+    except bedrock_client.exceptions.ThrottlingException as throttlingExc:
+        logger.warning("Bedrock ThrottlingException. Retrying...")
+        raise BedrockRetryableError(str(throttlingExc))
+    except bedrock_client.exceptions.ModelTimeoutException as timeoutExc:
+        logger.warning("Bedrock ModelTimeoutException. Retrying...")
+        raise BedrockRetryableError(str(timeoutExc))
 
 def invoke_llm(prompt, model_id, temperature=0.5, top_k=None, top_p=0.8, max_new_tokens=4096, verbose=False):
     model_id = (model_id or CLAUDE_MODEL_ID)
@@ -57,7 +97,7 @@ def invoke_llm(prompt, model_id, temperature=0.5, top_k=None, top_p=0.8, max_new
     ])
     chain = prompt | chat
 
-    response = chain.invoke({})
+    response = invoke_chain_with_retry(chain)
     content = response.content
 
     usage_data = None
```
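The `retrying` decorator in this diff retries `chain.invoke({})` every 10 seconds, indefinitely, whenever the wrapped call raises `BedrockRetryableError`; any other exception propagates immediately. A self-contained sketch of that same control flow using only the standard library (the function name and the flaky stand-in for `chain.invoke` are illustrative, not part of the sample):

```python
import time

class BedrockRetryableError(Exception):
    """Raised when a Bedrock call failed in a retryable way (throttling, model timeout)."""

def invoke_with_fixed_retry(fn, wait_seconds=10.0):
    """Equivalent control flow to the @retry decorator in the diff:
    retry forever on BedrockRetryableError with a fixed wait;
    any other exception propagates to the caller immediately."""
    while True:
        try:
            return fn()
        except BedrockRetryableError:
            time.sleep(wait_seconds)

# Stand-in for chain.invoke({}) that is throttled twice, then succeeds:
calls = {"n": 0}
def flaky_invoke():
    calls["n"] += 1
    if calls["n"] < 3:
        raise BedrockRetryableError("ThrottlingException")
    return "ok"

print(invoke_with_fixed_retry(flaky_invoke, wait_seconds=0.01))  # → ok
```

Note this client-side loop sits on top of botocore's own adaptive retries (`max_attempts: 50`), so the Lambda keeps waiting out sustained throttling rather than failing the Step Functions task.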
Lines changed: 2 additions & 0 deletions

```diff
@@ -0,0 +1,2 @@
+retrying==1.3.4
+botocore==1.38.9
```
