[#64] Queue and Storage triggers for AWS, GCP and Azure #201
base: master
Conversation
Very nice work! I left some comments for refactoring - effectively, the two missing pieces are integration into our existing storage APIs and abstractions & using SeBS cache to make sure that we allocate resources only once, and off the critical path of invocation :)
I recommend using a single-bucket approach for AWS and Azure. If you find it impossible to do that on Google Cloud, then please add there an option to create more than one bucket; AFAIK there's no hard limit on the number of buckets on this platform. Just make sure that each trigger remembers its bucket name for future invocation and has it cached; it also makes it easier to clean up buckets that are no longer needed.
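To make the caching idea concrete, here is a rough sketch (illustrative only, not the PR's implementation) of a trigger that remembers its bucket name and round-trips it through serialization, so the SeBS cache can restore it instead of re-creating cloud resources on every invocation; the class name, method names, and the exact cache format are placeholders:

```python
from typing import Optional


class StorageTriggerSketch:
    """Stand-in for a storage trigger; in SeBS it would extend the Trigger base class."""

    def __init__(self, fname: str, bucket_name: Optional[str] = None):
        self.name = fname
        # Reuse a previously created bucket when one is known; create it only once otherwise.
        self._bucket_name = bucket_name

    def serialize(self) -> dict:
        # Persisting the bucket name lets the cache restore the trigger
        # without touching the cloud API on the critical path of invocation.
        return {"type": "Storage", "name": self.name, "bucket_name": self._bucket_name}

    @staticmethod
    def deserialize(obj: dict) -> "StorageTriggerSketch":
        return StorageTriggerSketch(obj["name"], obj.get("bucket_name"))
```

With a shape like this, cleanup is also simpler, since each trigger knows exactly which bucket it owns.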
Walkthrough: The recent updates introduce new trigger types.
Actionable comments posted: 19
Outside diff range and nitpick comments (1)
docs/platforms.md (1)
88-90: Specify language for fenced code blocks. The added environment variable export commands should specify a language for the fenced code block, per Markdownlint guidelines.
Suggested fix (add a language to the fence):

```bash
export AZURE_SECRET_APPLICATION_ID=XXXXXXXXXXXXXXXX
export AZURE_SECRET_TENANT=XXXXXXXXXXXX
export AZURE_SECRET_PASSWORD=XXXXXXXXXXXXX
```

Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Commits
Files that changed from the base of the PR and between a2eb936737231f70a237832c91fb41e240dd32a5 and ba67b4a8d4e739364eed7dc2e86cbdaf432e633b.
Files selected for processing (28)
- benchmarks/wrappers/aws/python/handler.py (1 hunks)
- benchmarks/wrappers/aws/python/storage.py (1 hunks)
- benchmarks/wrappers/azure/python/handler.py (2 hunks)
- benchmarks/wrappers/gcp/python/handler.py (2 hunks)
- config/example.json (1 hunks)
- config/systems.json (1 hunks)
- docs/modularity.md (1 hunks)
- docs/platforms.md (1 hunks)
- requirements.azure.txt (1 hunks)
- sebs.py (3 hunks)
- sebs/aws/aws.py (14 hunks)
- sebs/aws/config.py (6 hunks)
- sebs/aws/function.py (2 hunks)
- sebs/aws/s3.py (7 hunks)
- sebs/aws/triggers.py (5 hunks)
- sebs/azure/azure.py (7 hunks)
- sebs/azure/function.py (4 hunks)
- sebs/azure/triggers.py (2 hunks)
- sebs/benchmark.py (1 hunks)
- sebs/cache.py (1 hunks)
- sebs/experiments/config.py (5 hunks)
- sebs/faas/function.py (1 hunks)
- sebs/faas/system.py (1 hunks)
- sebs/gcp/function.py (2 hunks)
- sebs/gcp/gcp.py (8 hunks)
- sebs/gcp/triggers.py (2 hunks)
- sebs/local/local.py (1 hunks)
- sebs/openwhisk/openwhisk.py (1 hunks)
Files skipped from review due to trivial changes (5)
- config/example.json
- requirements.azure.txt
- sebs/aws/config.py
- sebs/cache.py
- sebs/faas/function.py
Additional context used
Ruff
sebs/experiments/config.py
63-63: Use `config.get("flags", {})` instead of an `if` block. Replace with `config.get("flags", {})` (SIM401)
64-64: Use `config.get("trigger", {})` instead of an `if` block. Replace with `config.get("trigger", {})` (SIM401)
benchmarks/wrappers/azure/python/handler.py
2-2: `base64` imported but unused. Remove unused import: `base64` (F401)
80-80: Local variable `ret` is assigned to but never used. Remove assignment to unused variable `ret` (F841)
89-89: Local variable `ret` is assigned to but never used. Remove assignment to unused variable `ret` (F841)
benchmarks/wrappers/gcp/python/handler.py
3-3: `google.cloud.storage` imported but unused. Remove unused import: `google.cloud.storage` (F401)
72-72: Local variable `ret` is assigned to but never used. Remove assignment to unused variable `ret` (F841)
91-91: Local variable `ret` is assigned to but never used. Remove assignment to unused variable `ret` (F841)
sebs/gcp/triggers.py
5-5: `os` imported but unused. Remove unused import: `os` (F401)
8-8: `googleapiclient.errors.HttpError` imported but unused. Remove unused import: `googleapiclient.errors.HttpError` (F401)
212-212: Statement ends with an unnecessary semicolon. Remove unnecessary semicolon (E703)
sebs/azure/azure.py
257-257: Use context handler for opening files (SIM115)
Markdownlint
docs/platforms.md
87-87: Fenced code blocks should have a language specified (MD040, fenced-code-language)
Additional comments not posted (35)
sebs/azure/function.py (2)
1-2: Imports look good! The import statements are necessary for the changes in the `deserialize` method.
Line range hint 31-49: Changes to `deserialize` method look good! The updates ensure that the method can handle the new trigger types and improve type safety by using the `cast` function.
benchmarks/wrappers/aws/python/storage.py (1)
50-51: New `get_object` method looks good! The method is correctly implemented and follows the pattern used in other methods in the class.
sebs/gcp/function.py (2)
Line range hint 1-2: Imports look good! The import statements are necessary for the changes in the `deserialize` method.
Line range hint 33-51: Changes to `deserialize` method look good! The updates ensure that the method can handle the new trigger types and improve type safety by using the `cast` function.
sebs/aws/function.py (2)
Line range hint 1-2: Imports look good! The import statements are necessary for the changes in the `deserialize` method.
Line range hint 42-65: Changes to `deserialize` method look good! The updates ensure that the method can handle the new trigger types and improve type safety by using the `cast` function.
sebs/experiments/config.py (2)
35-38: LGTM! The `trigger` property method is correctly implemented.
50-50: LGTM! The `trigger` attribute is correctly added to the dictionary returned by the `serialize` method.
benchmarks/wrappers/aws/python/handler.py (3)
10-13: LGTM! The queue trigger handling block is correctly implemented.
14-23: LGTM! The storage trigger handling block is correctly implemented.
28-28: LGTM! The `request-id` and `income-timestamp` are correctly added to the event dictionary.
config/systems.json (1)
22-22: Approved: Addition of Python 3.9 base image. The addition of Python 3.9 to the list of base images for local experiments is consistent with the existing configuration.
sebs/aws/s3.py (6)
57-58: Approved: Improved logging message in `_create_bucket`. The log message now includes the bucket name and region, improving clarity.
74-75: Approved: Region-specific bucket creation logic. The logic ensures that buckets are created correctly in regions other than `us-east-1`.
91-92: Approved: Improved error message in `_create_bucket`. The error message now includes the bucket name and region, improving clarity.
116-117: Approved: Improved logging message in `uploader_func`. The log message now includes the file path and bucket name, improving clarity.
127-130: Approved: Improved logging message in `download`. The log message now includes the bucket name, key, and file path, improving clarity.
159-162: Approved: Improved logging message in `clean_bucket`. The log message now includes the bucket name and region, improving clarity.
sebs/gcp/triggers.py (1)
5-8: Remove unused imports and unnecessary semicolon. The imports `os` and `googleapiclient.errors.HttpError` are unused, and there is an unnecessary semicolon.

```diff
- import os
- from googleapiclient.errors import HttpError
+ from google.cloud import storage as gcp_storage
```

Likely invalid or redundant comment.
Tools
Ruff
5-5: `os` imported but unused. Remove unused import: `os` (F401)
8-8: `googleapiclient.errors.HttpError` imported but unused. Remove unused import: `googleapiclient.errors.HttpError` (F401)
sebs/aws/triggers.py (2)
138-152: Ensure correct initialization of `QueueTrigger` attributes. The `QueueTrigger` constructor initializes several attributes. Ensure that the attributes `_deployment_client`, `_queue_arn`, and `_queue_url` are correctly assigned and used.
Verification successful
Initialization and usage of `QueueTrigger` attributes are correct. The attributes `_deployment_client`, `_queue_arn`, and `_queue_url` are correctly initialized and used within the `QueueTrigger` class. The properties include assertions to ensure these attributes are set before being accessed.
sebs/aws/triggers.py:
- Lines 138-152: Initialization in the constructor.
- Usage in properties and methods within the class.
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify the correct initialization of `QueueTrigger` attributes. # Test: Search for the usage of `_deployment_client`, `_queue_arn`, and `_queue_url` within the `QueueTrigger` class. Expect: At least one occurrence for each attribute. rg --type python --pcre2 --context 5 $'class QueueTrigger.*\\n.*def .*\\n.*self\\._deployment_client.*\\n.*self\\._queue_arn.*\\n.*self\\._queue_url' sebs/aws/triggers.pyLength of output: 195
Script:
#!/bin/bash # Description: Verify the correct initialization of `QueueTrigger` attributes. # Test: Search for the usage of `_deployment_client`, `_queue_arn`, and `_queue_url` within the `QueueTrigger` class. Expect: At least one occurrence for each attribute. rg --pcre2 --context 5 'self\._deployment_client|self\._queue_arn|self\._queue_url' sebs/aws/triggers.pyLength of output: 3693
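For reference, the assert-guarded property pattern referred to above looks roughly like this (illustrative sketch, not the PR code):

```python
from typing import Optional


class QueueTriggerSketch:
    def __init__(self, queue_arn: Optional[str] = None, queue_url: Optional[str] = None):
        self._queue_arn = queue_arn
        self._queue_url = queue_url

    @property
    def queue_arn(self) -> str:
        # Fail loudly if the trigger is used before the queue was created or restored from cache.
        assert self._queue_arn
        return self._queue_arn

    @property
    def queue_url(self) -> str:
        assert self._queue_url
        return self._queue_url
```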
244-255: Ensure correct initialization of `StorageTrigger` attributes. The `StorageTrigger` constructor initializes several attributes. Ensure that the attributes `_deployment_client` and `_bucket_name` are correctly assigned and used.
docs/modularity.md (1)
Line range hint 1-1: Verify the usage and documentation of the new parameter in `some_function`. The function signature has been updated to include an additional parameter `trigger`. Ensure that the new parameter is used correctly within the function and that its usage is documented appropriately.
sebs.py (3)
227-230: Ensure the correct insertion of the `trigger` parameter into the experiment config. The `trigger` parameter is inserted into the experiment config. Verify that this is done correctly and consistently.
Verification successful
Verified the insertion of the `trigger` parameter into the experiment config. The `trigger` parameter is correctly and consistently inserted into the experiment config using `update_nested_dict`.
- Location: sebs.py, lines 227-230
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify the insertion of the `trigger` parameter into the experiment config. # Test: Search for the insertion of the `trigger` parameter. Expect: Correct insertion of the parameter. rg --type python $'update_nested_dict(config, ["experiments", "trigger"], trigger)'Length of output: 115
Script:
#!/bin/bash # Description: Verify the insertion of the `trigger` parameter into the experiment config. # Find all Python files and search for the pattern within them. fd --type f --extension py --exec rg 'update_nested_dict(config, \["experiments", "trigger"\], trigger)'Length of output: 106
Script:
#!/bin/bash # Description: Verify the insertion of the `trigger` parameter into the experiment config. # Find all Python files and search for any usage of `update_nested_dict` involving "trigger" in the arguments. fd --type f --extension py --exec rg 'update_nested_dict\(config, \[.*"trigger".*\], trigger\)'Length of output: 165
246-251: Ensure the correct handling of the `trigger` parameter for GCP and Azure. The `trigger` parameter is used to augment the function name for GCP and Azure. Verify that this is done correctly.
Verification successful
The `trigger` parameter is correctly handled for GCP and Azure. The `trigger` parameter is used to augment the function name, ensuring the correct handling for both GCP and Azure as described in the code snippet.
- sebs.py, lines 246-251: The `trigger` parameter is appended to the function name for GCP and Azure.
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify the handling of the `trigger` parameter for GCP and Azure. # Test: Search for the handling of the `trigger` parameter for GCP and Azure. Expect: Correct handling of the parameter. rg --type python $'if deployment_client.name() == "gcp" or deployment_client.name() == "azure":'Length of output: 128
Script:
#!/bin/bash # Description: Verify the handling of the `trigger` parameter for GCP and Azure. # Test: Search for the handling of the `trigger` parameter for GCP and Azure. rg $'if deployment_client.name() == "gcp" or deployment_client.name() == "azure":'Length of output: 83
Script:
#!/bin/bash # Description: Verify the handling of the `trigger` parameter for GCP and Azure. # Test: Search for variations of the condition and the usage of the `trigger` parameter. rg 'deployment_client.name\(\) == "gcp"' rg 'deployment_client.name\(\) == "azure"' rg '\btrigger\b'Length of output: 13037
176-176: Verify the usage of the new parameter `trigger` in `invoke`. The function signature has been updated to include an additional parameter `trigger`. Ensure that the new parameter is used correctly within the function.
Verification successful
Verify the usage of the new parameter `trigger` in `invoke`. The `trigger` parameter is correctly used within the `invoke` function in sebs.py. The parameter is checked, inserted into the configuration, and used to modify the function name and create triggers appropriately.
- Code locations:
  - `trigger` parameter initialization and default assignment: trigger = trigger if trigger is not None else "http"
  - Configuration update: update_nested_dict(config, ["experiments", "trigger"], trigger)
  - Function name augmentation: function_name = "{}-{}".format(function_name, trigger)
  - Trigger type retrieval and usage: trigger_type = Trigger.TriggerType.get(trigger), triggers = func.triggers(trigger_type), trigger = deployment_client.create_trigger(func, trigger_type), trigger = triggers[0], ret = trigger.sync_invoke(input_config)
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify the usage of the new parameter `trigger` in `invoke`. # Test: Search for the function definition and its usage. Expect: Correct usage of the new parameter. rg --type python $'def invoke(benchmark, benchmark_input_size, repetitions, trigger, memory, timeout, function_name, image_tag_prefix, **kwargs)'Length of output: 177
Script:
#!/bin/bash # Description: Verify the usage of the new parameter `trigger` in `invoke`. # Test: Search for the function definition and its usage. Expect: Correct usage of the new parameter. rg --type py 'def invoke(benchmark, benchmark_input_size, repetitions, trigger, memory, timeout, function_name, image_tag_prefix, **kwargs)'Length of output: 142
Script:
#!/bin/bash # Description: Verify the usage of the new parameter `trigger` in `invoke`. # Step 1: Search for the function definition `invoke`. rg --type py 'def invoke' # Step 2: Search for the usage of the parameter `trigger` within the `invoke` function. rg --type py 'invoke' | xargs rg 'trigger'Length of output: 836
Script:
#!/bin/bash # Description: Verify the usage of the new parameter `trigger` in `invoke`. # Step 1: Search for the function definition `invoke` in sebs.py. rg --type py 'def invoke' sebs.py # Step 2: Search for the usage of the parameter `trigger` within the `invoke` function in sebs.py. rg --type py 'trigger' sebs.pyLength of output: 900
sebs/aws/aws.py (2)
269-281: Ensure correct handling of the new trigger types in `cached_function`. The function `cached_function` has been updated to handle new trigger types (`QueueTrigger` and `StorageTrigger`). Verify that this is done correctly.
Verification successful
The new trigger types (`QueueTrigger` and `StorageTrigger`) are correctly handled in `cached_function`. The function sets the `logging_handlers` and assigns the `deployment_client` for these trigger types, similar to the existing trigger types.
- sebs/aws/aws.py: `cached_function` handles `Trigger.TriggerType.QUEUE` and `Trigger.TriggerType.STORAGE` correctly.
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify the handling of new trigger types in `cached_function`. # Test: Search for the handling of new trigger types. Expect: Correct handling of new trigger types. rg --type python $'for trigger in function.triggers(Trigger.TriggerType.QUEUE):'Length of output: 112
Script:
#!/bin/bash # Description: Verify the handling of new trigger types in `cached_function`. # Test: Search for the handling of new trigger types. Expect: Correct handling of new trigger types. rg 'for trigger in function.triggers(Trigger.TriggerType.QUEUE):' --glob '*.py' rg 'for trigger in function.triggers(Trigger.TriggerType.STORAGE):' --glob '*.py'Length of output: 165
Script:
#!/bin/bash # Description: Verify the handling of new trigger types in `cached_function`. # Locate the `cached_function` definition. rg 'def cached_function' --glob '*.py' # Verify if `cached_function` handles `Trigger.TriggerType.QUEUE`. rg 'Trigger.TriggerType.QUEUE' --glob '*.py' # Verify if `cached_function` handles `Trigger.TriggerType.STORAGE`. rg 'Trigger.TriggerType.STORAGE' --glob '*.py'Length of output: 1830
Script:
#!/bin/bash # Description: Extract the content of the `cached_function` from `sebs/aws/aws.py` to verify the handling of new trigger types. # Extract the `cached_function` definition and its content. rg -A 20 'def cached_function' sebs/aws/aws.pyLength of output: 1047
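The handling described above boils down to a loop over the cached triggers of each new type; a simplified sketch (assuming the cached `Function` object exposes `triggers(...)` as elsewhere in SeBS, with `Trigger` from `sebs.faas.function`) could look like this:

```python
def cached_function(self, function):
    # Re-attach runtime-only state that is not stored in the cache.
    for trigger_type in (Trigger.TriggerType.QUEUE, Trigger.TriggerType.STORAGE):
        for trigger in function.triggers(trigger_type):
            trigger.logging_handlers = self.logging_handlers
            trigger.deployment_client = self
```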
135-135: Verify the usage of the new parameter `trigger` in `package_code`. The function signature has been updated to include an additional parameter `trigger`. Ensure that the new parameter is used correctly within the function.
sebs/benchmark.py (1)
508-508: Verify the integration of the `trigger` argument. Ensure that the `deployment_build_step` function correctly handles the new `trigger` argument. This change might affect the behavior of the function, so thorough testing is recommended.
sebs/azure/azure.py (3)
160-204: Verify the correctness of the `create_function_json` function. Ensure that the function correctly generates the JSON configuration for different trigger types. Pay attention to the
`entryPoint` and `bindings` fields to ensure they match the expected structure for Azure functions.
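For orientation, a queue-triggered Azure function is typically described by a function.json along these lines (illustrative values only; the actual dictionary is assembled by `create_function_json`):

```python
# Hypothetical example of the expected structure; script file, entry point, and queue name are placeholders.
function_json = {
    "scriptFile": "handler.py",
    "entryPoint": "handler_queue",
    "bindings": [
        {
            "name": "msg",
            "type": "queueTrigger",
            "direction": "in",
            "queueName": "example-queue",
            "connection": "AzureWebJobsStorage",  # app setting holding the storage connection string
        }
    ],
}
```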
Verify the correctness of the
create_function_jsonfunction.The
create_function_jsonfunction generates JSON configurations for different trigger types (queue, storage, HTTP). The usage context shows it writes the configuration to afunction.jsonfile. Ensure the generated JSON structure matches Azure's expected format for each trigger type.
- Queue Trigger: Verify that the
bindingsfield contains aqueueTriggerwith the correct attributes.- Storage Trigger: Verify that the
bindingsfield contains ablobTriggerwith the correct attributes.- HTTP Trigger: Verify that the
bindingsfield contains anhttpTriggerand anhttpoutput with the correct attributes.Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify the correctness of the `create_function_json` function. # Test: Search for the function usage and ensure it generates the correct JSON configuration. rg --type python $'create_function_json'Length of output: 70
Script:
#!/bin/bash # Description: Verify the correctness of the `create_function_json` function. # Test: Search for the function usage and ensure it generates the correct JSON configuration. rg --glob '*.py' 'create_function_json'Length of output: 223
Script:
#!/bin/bash # Description: Extract the context in which `create_function_json` is used. # Extract 10 lines before and after the usage of `create_function_json`. rg --glob '*.py' -A 10 -B 10 'create_function_json' sebs/azure/azure.pyLength of output: 1495
Line range hint
221-258:
Verify the integration of thetriggerargument inpackage_code.Ensure that the
triggerargument is correctly handled in thepackage_codefunction. This change might affect the behavior of the function, so thorough testing is recommended.Verification successful
The
triggerargument is correctly integrated in thepackage_codefunction.The
triggerargument is used appropriately in thefunc_namestring formatting, and there are no other usages of thetriggerargument in the function.
sebs/azure/azure.py, lines 238-261: Thetriggerargument is correctly handled in thepackage_codefunction.Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify the usage of the `trigger` argument in `package_code`. # Test: Search for the function definition and its usage. Expect: Proper handling of the `trigger` argument. rg --type python $'def package_code'Length of output: 68
Script:
#!/bin/bash # Description: Verify the usage of the `trigger` argument in `package_code`. # Test: Search for the function definition and its usage. Expect: Proper handling of the `trigger` argument. rg 'def package_code'Length of output: 397
Script:
#!/bin/bash # Description: Extract the `package_code` function implementation from `sebs/azure/azure.py`. # Extract the function implementation for review. rg -A 30 'def package_code' sebs/azure/azure.pyLength of output: 1192
Script:
#!/bin/bash # Description: Extract the complete implementation of the `package_code` function from `sebs/azure/azure.py`. # Extract the function implementation for review. rg -A 50 'def package_code' sebs/azure/azure.pyLength of output: 1831
Script:
#!/bin/bash # Description: Extract the remaining implementation of the `package_code` function from `sebs/azure/azure.py`. # Extract the function implementation for review. rg -A 70 'def package_code' sebs/azure/azure.pyLength of output: 2502
Tools
Ruff
257-257: Use context handler for opening files
(SIM115)
577-613: Verify the correctness of thecreate_triggerfunction.Ensure that the function correctly creates the specified trigger type and integrates it with the Azure function. Pay attention to the
role assignmentandtrigger creationlogic to ensure they are implemented correctly.Verification successful
The
create_triggerfunction appears to be correctly implemented and used.The function correctly creates the specified trigger type and integrates it with the Azure function. The role assignment and trigger creation logic are appropriately handled.
sebs.py: The function is invoked to create a trigger when no existing triggers are found for the specified type.sebs/regression.py: Similar usage pattern, ensuring triggers are created if not already present.Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify the correctness of the `create_trigger` function. # Test: Search for the function usage and ensure it creates the correct trigger type. rg --type python $'create_trigger'Length of output: 64
Script:
#!/bin/bash # Description: Verify the correctness of the `create_trigger` function. # Search for the function definition of `create_trigger` in all Python files. rg 'def create_trigger' --glob '*.py' # Search for the usage of the `create_trigger` function in all Python files. rg 'create_trigger' --glob '*.py'Length of output: 2532
Script:
#!/bin/bash # Description: Verify the correctness of the `create_trigger` function in `sebs/azure/azure.py`. # Extract the specific usages of the `create_trigger` function in `sebs/azure/azure.py`. rg 'create_trigger' --glob 'sebs/azure/azure.py'Length of output: 162
Script:
#!/bin/bash # Description: Verify the correctness of the `create_trigger` function by analyzing its usage in other files. # Extract the specific lines of code where `create_trigger` is invoked in other files. rg 'create_trigger' --glob '*.py' -nLength of output: 1772
Script:
#!/bin/bash # Description: Verify the correctness of the `create_trigger` function by analyzing its usage in `sebs.py` and `sebs/regression.py`. # Extract the specific lines of code around the invocation of `create_trigger` in `sebs.py`. rg 'create_trigger' sebs.py -A 5 -B 5 # Extract the specific lines of code around the invocation of `create_trigger` in `sebs/regression.py`. rg 'create_trigger' sebs/regression.py -A 5 -B 5Length of output: 1117
sebs/gcp/gcp.py (3)
110-123: Verify the correctness of theget_trigger_resource_namefunction.Ensure that the function correctly generates the fully qualified name of the trigger resource for GCP functions. Pay attention to the string formatting to ensure it matches the expected structure.
Verification successful
The
get_trigger_resource_namefunction is correctly implemented.The function is used appropriately within the codebase to generate the fully qualified names of GCP trigger resources, and the string formatting matches the expected structure.
sebs/gcp/gcp.py: Lines 110-123Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify the correctness of the `get_trigger_resource_name` function. # Test: Search for the function usage and ensure it generates the correct resource name. rg --type python $'get_trigger_resource_name'Length of output: 75
Script:
#!/bin/bash # Description: Verify the correctness of the `get_trigger_resource_name` function. # Test: Search for the function usage and ensure it generates the correct resource name. rg --glob '*.py' 'get_trigger_resource_name'Length of output: 505
Script:
#!/bin/bash # Description: Extract lines around the occurrences of `get_trigger_resource_name` to understand its usage. # Extract 5 lines before and after each occurrence of `get_trigger_resource_name` to understand its context. rg --glob '*.py' -C 5 'get_trigger_resource_name'Length of output: 2747
143-193: Verify the correctness of thecreate_trigger_resourcefunction.Ensure that the function correctly creates the required resources for GCP functions. Pay attention to the error handling and resource creation logic to ensure they are implemented correctly.
Verification successful
The
create_trigger_resourcefunction is correctly implemented and used.The function is appropriately used to ensure the existence of necessary resources before creating or updating GCP functions. The error handling and resource creation logic are correctly implemented and utilized.
sebs/gcp/gcp.py: Lines around 143 and 193Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify the correctness of the `create_trigger_resource` function. # Test: Search for the function usage and ensure it creates the correct resources. rg --type python $'create_trigger_resource'Length of output: 73
Script:
#!/bin/bash # Description: Verify the correctness of the `create_trigger_resource` function. # Test: Search for the function usage and ensure it creates the correct resources. rg 'create_trigger_resource'Length of output: 300
Script:
#!/bin/bash # Description: Extract surrounding lines of code where `create_trigger_resource` is called. # Extract 10 lines before and after the function calls to understand the context. rg 'create_trigger_resource' -A 10 -B 10Length of output: 3607
310-313: Verify the integration of the `trigger` argument in `create_function`. Ensure that the `trigger` argument is correctly handled in the `create_function` function. This change might affect the behavior of the function, so thorough testing is recommended.
cfg._download_results = config["download_results"]
cfg._runtime = Runtime.deserialize(config["runtime"])
cfg._flags = config["flags"] if "flags" in config else {}
cfg._trigger = config["trigger"] if "trigger" in config else {}
Simplify the conditional statement.
The conditional statement can be simplified using the get method.
cfg._flags = config["flags"] if "flags" in config else {}
- cfg._trigger = config["trigger"] if "trigger" in config else {}
+ cfg._trigger = config.get("trigger", {})
Tools
Ruff
64-64: Use `config.get("trigger", {})` instead of an `if` block. Replace with `config.get("trigger", {})` (SIM401)
self._flags: Dict[str, bool] = {}
self._experiment_configs: Dict[str, dict] = {}
self._runtime = Runtime(None, None)
self._trigger: Trigger.TriggerType
Initialize the _trigger attribute in the constructor.
The _trigger attribute is added but not initialized in the constructor. This could lead to an AttributeError if accessed before being set.
self._runtime = Runtime(None, None)
self._trigger: Trigger.TriggerType
+ self._trigger = None
def sync_invoke(self, payload: dict) -> ExecutionResult:

    self.logging.debug(f"Invoke function {self.name}")

    sqs_client = boto3.client(
        'sqs', region_name=self.deployment_client.config.region)

    # Publish payload to queue
    serialized_payload = json.dumps(payload)
    sqs_client.send_message(
        QueueUrl=self.queue_url, MessageBody=serialized_payload)
    self.logging.info(f"Sent message to queue {self.name}")

    # TODO(oana): gather metrics

def async_invoke(self, payload: dict) -> concurrent.futures.Future:
Gather metrics in QueueTrigger.sync_invoke.
The sync_invoke method has a TODO comment to gather metrics. Ensure that metrics are gathered to monitor the performance and usage of the queue.
Do you want me to implement the metrics gathering logic or open a GitHub issue to track this task?
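As a starting point, the enqueue step itself can at least be timed on the client side; a rough sketch is shown below (placeholder only; end-to-end metrics from the function would still have to be collected separately, e.g. via a result queue or storage):

```python
import json
import time


def send_and_time(sqs_client, queue_url: str, payload: dict) -> float:
    # Measures only how long the SQS publish takes; the function itself runs asynchronously.
    begin = time.time()
    sqs_client.send_message(QueueUrl=queue_url, MessageBody=json.dumps(payload))
    return time.time() - begin
```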
sebs/aws/triggers.py (Outdated)
# When creating the trigger for the first time, also create and store
# queue information.
if (not self.queue_arn and not self.queue_url):
    # Init clients
    lambda_client = self.deployment_client.get_lambda_client()
    sqs_client = boto3.client(
        'sqs', region_name=self.deployment_client.config.region)

    # Create queue
    self.logging.debug(f"Creating queue {self.name}")

    self._queue_url = sqs_client.create_queue(QueueName=self.name)["QueueUrl"]
    self._queue_arn = sqs_client.get_queue_attributes(
        QueueUrl=self.queue_url,
        AttributeNames=["QueueArn"]
    )["Attributes"]["QueueArn"]

    self.logging.debug("Created queue")

    # Add queue trigger
    if (not len(lambda_client.list_event_source_mappings(EventSourceArn=self.queue_arn,
                                                         FunctionName=self.name)
                ["EventSourceMappings"])):
        lambda_client.create_event_source_mapping(
            EventSourceArn=self.queue_arn,
            FunctionName=self.name,
            MaximumBatchingWindowInSeconds=1
        )
Encapsulate queue creation logic in a separate method.
The queue creation logic within the QueueTrigger constructor can be encapsulated in a separate method to improve readability and maintainability.
+ def create_queue(self):
+ lambda_client = self.deployment_client.get_lambda_client()
+ sqs_client = boto3.client('sqs', region_name=self.deployment_client.config.region)
+ self.logging.debug(f"Creating queue {self.name}")
+ self._queue_url = sqs_client.create_queue(QueueName=self.name)["QueueUrl"]
+ self._queue_arn = sqs_client.get_queue_attributes(
+ QueueUrl=self.queue_url,
+ AttributeNames=["QueueArn"]
+ )["Attributes"]["QueueArn"]
+ self.logging.debug("Created queue")
+ if not len(lambda_client.list_event_source_mappings(EventSourceArn=self.queue_arn, FunctionName=self.name)["EventSourceMappings"]):
+ lambda_client.create_event_source_mapping(
+ EventSourceArn=self.queue_arn,
+ FunctionName=self.name,
+ MaximumBatchingWindowInSeconds=1
+ )
def __init__(self, fname: str, deployment_client: Optional[AWS] = None, queue_arn: Optional[str] = None, queue_url: Optional[str] = None):
super().__init__()
self.name = fname
self._deployment_client = None
self._queue_arn = None
self._queue_url = None
if deployment_client:
self._deployment_client = deployment_client
if queue_arn:
self._queue_arn = queue_arn
if queue_url:
self._queue_url = queue_url
if not self.queue_arn and not self.queue_url:
self.create_queue()
# When creating the trigger for the first time, also create and store
# storage bucket information.
if (not self.bucket_name):
    # Init clients
    s3 = boto3.resource('s3')
    lambda_client = self.deployment_client.get_lambda_client()

    # AWS disallows underscores in bucket names
    self._bucket_name = self.name.replace('_', '-')
    function_arn = lambda_client.get_function(FunctionName=self.name)[
        "Configuration"]["FunctionArn"]

    # Create bucket
    self.logging.info(f"Creating bucket {self.bucket_name}")

    region = self.deployment_client.config.region
    if (region == "us-east-1"):
        s3.create_bucket(Bucket=self.bucket_name)
    else:
        s3.create_bucket(
            Bucket=self.bucket_name,
            CreateBucketConfiguration={
                "LocationConstraint": region
            }
        )

    self.logging.info("Created bucket")

    lambda_client.add_permission(
        FunctionName=self.name,
        StatementId=str(uuid.uuid1()),
        Action="lambda:InvokeFunction",
        Principal="s3.amazonaws.com",
        SourceArn=f"arn:aws:s3:::{self.bucket_name}",
    )

    # Add bucket trigger
    bucket_notification = s3.BucketNotification(self.bucket_name)
    bucket_notification.put(
        NotificationConfiguration={'LambdaFunctionConfigurations': [
            {
                'LambdaFunctionArn': function_arn,
                'Events': [
                    's3:ObjectCreated:*'
                ],
            },
        ]})
Encapsulate bucket creation logic in a separate method.
The bucket creation logic within the StorageTrigger constructor can be encapsulated in a separate method to improve readability and maintainability.
+ def create_bucket(self):
+ s3 = boto3.resource('s3')
+ lambda_client = self.deployment_client.get_lambda_client()
+ self._bucket_name = self.name.replace('_', '-')
+ function_arn = lambda_client.get_function(FunctionName=self.name)["Configuration"]["FunctionArn"]
+ self.logging.info(f"Creating bucket {self._bucket_name}")
+ region = self.deployment_client.config.region
+ if region == "us-east-1":
+ s3.create_bucket(Bucket=self._bucket_name)
+ else:
+ s3.create_bucket(
+ Bucket=self._bucket_name,
+ CreateBucketConfiguration={"LocationConstraint": region}
+ )
+ self.logging.info("Created bucket")
+ lambda_client.add_permission(
+ FunctionName=self.name,
+ StatementId=str(uuid.uuid1()),
+ Action="lambda:InvokeFunction",
+ Principal="s3.amazonaws.com",
+ SourceArn=f"arn:aws:s3:::{self._bucket_name}",
+ )
+ bucket_notification = s3.BucketNotification(self._bucket_name)
+ bucket_notification.put(
+ NotificationConfiguration={'LambdaFunctionConfigurations': [
+ {
+ 'LambdaFunctionArn': function_arn,
+ 'Events': ['s3:ObjectCreated:*']
+ },
+ ]}
+ )
def __init__(self, fname: str, deployment_client: Optional[AWS] = None, bucket_name: Optional[str] = None):
super().__init__()
self.name = fname
self._deployment_client = None
self._bucket_name = None
if deployment_client:
self._deployment_client = deployment_client
if bucket_name:
self._bucket_name = bucket_name
if not self.bucket_name:
self.create_bucket()
Actionable comments posted: 10
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (9)
- sebs/aws/aws.py (4 hunks)
- sebs/aws/function.py (2 hunks)
- sebs/aws/triggers.py (2 hunks)
- sebs/azure/azure.py (7 hunks)
- sebs/azure/function.py (4 hunks)
- sebs/azure/triggers.py (2 hunks)
- sebs/benchmark.py (3 hunks)
- sebs/gcp/function.py (2 hunks)
- sebs/gcp/triggers.py (2 hunks)
Files skipped from review as they are similar to previous changes (3)
- sebs/aws/aws.py
- sebs/azure/triggers.py
- sebs/benchmark.py
Additional context used
Ruff
sebs/gcp/triggers.py
5-5: `os` imported but unused. Remove unused import: `os` (F401)
8-8: `googleapiclient.errors.HttpError` imported but unused. Remove unused import: `googleapiclient.errors.HttpError` (F401)
sebs/azure/azure.py
258-258: Use context handler for opening files (SIM115)
Additional comments not posted (16)
sebs/azure/function.py (2)
19-21: LGTM! The `typename` method is straightforward and correctly implemented.
Line range hint 31-48: Ensure the imports are necessary and correct. The `deserialize` method correctly imports and casts new trigger types. The logic for handling unknown trigger types is also appropriate. However, verify that all imports are necessary and used.
Verification successful
The imports are necessary and correctly used. The imports `from sebs.faas.function import Trigger` and `from sebs.azure.triggers import HTTPTrigger, QueueTrigger, StorageTrigger` are utilized in multiple files, including sebs/azure/function.py; therefore, they are essential for the functionality implemented in the `deserialize` method.
Files using these imports: sebs/azure/function.py, tests/aws/invoke_function_http.py, sebs/regression.py, sebs/benchmark.py, sebs/gcp/function.py, sebs/gcp/gcp.py, sebs/openwhisk/function.py, sebs/experiments/perf_cost.py, sebs/experiments/network_ping_pong.py, sebs/experiments/invocation_overhead.py, sebs/aws/function.py, sebs/azure/azure.py
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify the usage of imports in the `deserialize` method. # Test: Search for unused import `from sebs.faas.function import Trigger`. rg --type python $'from sebs.faas.function import Trigger' # Test: Search for unused import `from sebs.azure.triggers import HTTPTrigger, QueueTrigger, StorageTrigger`. rg --type python $'from sebs.azure.triggers import HTTPTrigger, QueueTrigger, StorageTrigger'Length of output: 216
Script:
#!/bin/bash # Description: Verify the usage of imports in the `deserialize` method. # Test: Search for usage of import `from sebs.faas.function import Trigger`. rg 'from sebs.faas.function import Trigger' -g '*.py' -A 5 # Test: Search for usage of import `from sebs.azure.triggers import HTTPTrigger, QueueTrigger, StorageTrigger`. rg 'from sebs.azure.triggers import HTTPTrigger, QueueTrigger, StorageTrigger' -g '*.py' -A 5Length of output: 5976
sebs/gcp/function.py (1)
Line range hint
33-51: Ensure the imports are necessary and correct.The
deserializemethod correctly imports and casts new trigger types. The logic for handling unknown trigger types is also appropriate.However, verify that all imports are necessary and used.
Verification successful
Verification successful: The imports
`from sebs.faas.function import Trigger` and `from sebs.gcp.triggers import LibraryTrigger, HTTPTrigger, QueueTrigger, StorageTrigger` are necessary and correctly used in the `deserialize` method. Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify the usage of imports in the `deserialize` method. # Test: Search for unused import `from sebs.faas.function import Trigger`. rg --type python $'from sebs.faas.function import Trigger' # Test: Search for unused import `from sebs.gcp.triggers import LibraryTrigger, HTTPTrigger, QueueTrigger, StorageTrigger`. rg --type python $'from sebs.gcp.triggers import LibraryTrigger, HTTPTrigger, QueueTrigger, StorageTrigger'Length of output: 230
Script:
#!/bin/bash # Description: Verify the usage of imports in the `deserialize` method. # Test: Search for the usage of `from sebs.faas.function import Trigger`. rg 'from sebs.faas.function import Trigger' -A 5 # Test: Search for the usage of `from sebs.gcp.triggers import LibraryTrigger, HTTPTrigger, QueueTrigger, StorageTrigger`. rg 'from sebs.gcp.triggers import LibraryTrigger, HTTPTrigger, QueueTrigger, StorageTrigger' -A 5Length of output: 5730
sebs/aws/function.py (1)
Line range hint
42-63: Ensure the imports are necessary and correct.The
deserializemethod correctly imports and casts new trigger types. The logic for handling unknown trigger types is also appropriate.However, verify that all imports are necessary and used.
Verification successful
Imports are necessary and correctly used.
The imports
`from sebs.faas.function import Trigger` and `from sebs.aws.triggers import LibraryTrigger, HTTPTrigger, QueueTrigger, StorageTrigger` are necessary and correctly used in the `deserialize` method.
sebs/aws/function.py: Usage offrom sebs.aws.triggers import LibraryTrigger, HTTPTrigger, QueueTrigger, StorageTriggerconfirmed.Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify the usage of imports in the `deserialize` method. # Test: Search for unused import `from sebs.faas.function import Trigger`. rg --type python $'from sebs.faas.function import Trigger' # Test: Search for unused import `from sebs.aws.triggers import LibraryTrigger, HTTPTrigger, QueueTrigger, StorageTrigger`. rg --type python $'from sebs.aws.triggers import LibraryTrigger, HTTPTrigger, QueueTrigger, StorageTrigger'Length of output: 230
Script:
#!/bin/bash # Description: Verify the usage of imports in the `deserialize` method. # Test: Search for usage of the import `from sebs.faas.function import Trigger`. rg -t py 'from sebs.faas.function import Trigger' # Test: Search for usage of the import `from sebs.aws.triggers import LibraryTrigger, HTTPTrigger, QueueTrigger, StorageTrigger`. rg -t py 'from sebs.aws.triggers import LibraryTrigger, HTTPTrigger, QueueTrigger, StorageTrigger'Length of output: 1170
sebs/aws/triggers.py (11)
183-184: LGTM! The `typename` method is straightforward and correct.
186-189: LGTM! The `queue_arn` property is straightforward and correct.
191-194: LGTM! The `queue_url` property is straightforward and correct.
197-199: LGTM! The `deployment_client` property is straightforward and correct.
201-203: LGTM! The `deployment_client` setter is straightforward and correct.
205-207: LGTM! The `trigger_type` method is straightforward and correct.
305-306: LGTM! The `typename` method is straightforward and correct.
309-311: LGTM! The `bucket_name` property is straightforward and correct.
314-316: LGTM! The `deployment_client` property is straightforward and correct.
318-320: LGTM! The `deployment_client` setter is straightforward and correct.
322-324: LGTM! The `trigger_type` method is straightforward and correct.
sebs/azure/azure.py (1)
39-41: LGTM! The `typename` method is straightforward and correct.
class StorageTrigger(Trigger):
    def __init__(
        self, fname: str, deployment_client: Optional[AWS] = None, bucket_name: Optional[str] = None
    ):
        super().__init__()
        self.name = fname

        self._deployment_client = None
        self._bucket_name = None

        if deployment_client:
            self._deployment_client = deployment_client
        if bucket_name:
            self._bucket_name = bucket_name

        # When creating the trigger for the first time, also create and store
        # storage bucket information.
        if not self.bucket_name:
            # Init clients
            s3 = boto3.resource("s3")
            lambda_client = self.deployment_client.get_lambda_client()

            # AWS disallows underscores in bucket names
            self._bucket_name = self.name.replace("_", "-")
            function_arn = lambda_client.get_function(FunctionName=self.name)["Configuration"][
                "FunctionArn"
            ]

            # Create bucket
            self.logging.info(f"Creating bucket {self.bucket_name}")

            region = self.deployment_client.config.region
            if region == "us-east-1":
                s3.create_bucket(Bucket=self.bucket_name)
            else:
                s3.create_bucket(
                    Bucket=self.bucket_name,
                    CreateBucketConfiguration={"LocationConstraint": region},
                )

            self.logging.info("Created bucket")

            lambda_client.add_permission(
                FunctionName=self.name,
                StatementId=str(uuid.uuid1()),
                Action="lambda:InvokeFunction",
                Principal="s3.amazonaws.com",
                SourceArn=f"arn:aws:s3:::{self.bucket_name}",
            )

            # Add bucket trigger
            bucket_notification = s3.BucketNotification(self.bucket_name)
            bucket_notification.put(
                NotificationConfiguration={
                    "LambdaFunctionConfigurations": [
                        {
                            "LambdaFunctionArn": function_arn,
                            "Events": ["s3:ObjectCreated:*"],
                        },
                    ]
                }
            )
Encapsulate bucket creation logic in a separate method.
The bucket creation logic within the StorageTrigger constructor can be encapsulated in a separate method to improve readability and maintainability.
+ def create_bucket(self):
+ s3 = boto3.resource('s3')
+ lambda_client = self.deployment_client.get_lambda_client()
+ self._bucket_name = self.name.replace('_', '-')
+ function_arn = lambda_client.get_function(FunctionName=self.name)["Configuration"]["FunctionArn"]
+ self.logging.info(f"Creating bucket {self._bucket_name}")
+ region = self.deployment_client.config.region
+ if region == "us-east-1":
+ s3.create_bucket(Bucket=self._bucket_name)
+ else:
+ s3.create_bucket(
+ Bucket=self._bucket_name,
+ CreateBucketConfiguration={"LocationConstraint": region}
+ )
+ self.logging.info("Created bucket")
+ lambda_client.add_permission(
+ FunctionName=self.name,
+ StatementId=str(uuid.uuid1()),
+ Action="lambda:InvokeFunction",
+ Principal="s3.amazonaws.com",
+ SourceArn=f"arn:aws:s3:::{self._bucket_name}",
+ )
+ bucket_notification = s3.BucketNotification(self._bucket_name)
+ bucket_notification.put(
+ NotificationConfiguration={'LambdaFunctionConfigurations': [
+ {
+ 'LambdaFunctionArn': function_arn,
+ 'Events': ['s3:ObjectCreated:*']
+ },
+ ]}
+ )
def __init__(self, fname: str, deployment_client: Optional[AWS] = None, bucket_name: Optional[str] = None):
super().__init__()
self.name = fname
self._deployment_client = None
self._bucket_name = None
if deployment_client:
self._deployment_client = deployment_client
if bucket_name:
self._bucket_name = bucket_name
if not self.bucket_name:
self.create_bucket()
class QueueTrigger(Trigger):
    def __init__(
        self,
        fname: str,
        deployment_client: Optional[AWS] = None,
        queue_arn: Optional[str] = None,
        queue_url: Optional[str] = None,
    ):
        super().__init__()
        self.name = fname

        self._deployment_client = None
        self._queue_arn = None
        self._queue_url = None

        if deployment_client:
            self._deployment_client = deployment_client
        if queue_arn:
            self._queue_arn = queue_arn
        if queue_url:
            self._queue_url = queue_url

        # When creating the trigger for the first time, also create and store
        # queue information.
        if not self.queue_arn and not self.queue_url:
            # Init clients
            lambda_client = self.deployment_client.get_lambda_client()
            sqs_client = boto3.client("sqs", region_name=self.deployment_client.config.region)

            # Create queue
            self.logging.debug(f"Creating queue {self.name}")

            self._queue_url = sqs_client.create_queue(QueueName=self.name)["QueueUrl"]
            self._queue_arn = sqs_client.get_queue_attributes(
                QueueUrl=self.queue_url, AttributeNames=["QueueArn"]
            )["Attributes"]["QueueArn"]

            self.logging.debug("Created queue")

            # Add queue trigger
            if not len(
                lambda_client.list_event_source_mappings(
                    EventSourceArn=self.queue_arn, FunctionName=self.name
                )["EventSourceMappings"]
            ):
                lambda_client.create_event_source_mapping(
                    EventSourceArn=self.queue_arn,
                    FunctionName=self.name,
                    MaximumBatchingWindowInSeconds=1,
                )
Encapsulate queue creation logic in a separate method.
The queue creation logic within the QueueTrigger constructor can be encapsulated in a separate method to improve readability and maintainability.
+ def create_queue(self):
+     lambda_client = self.deployment_client.get_lambda_client()
+     sqs_client = boto3.client('sqs', region_name=self.deployment_client.config.region)
+     self.logging.debug(f"Creating queue {self.name}")
+     self._queue_url = sqs_client.create_queue(QueueName=self.name)["QueueUrl"]
+     self._queue_arn = sqs_client.get_queue_attributes(
+         QueueUrl=self.queue_url,
+         AttributeNames=["QueueArn"]
+     )["Attributes"]["QueueArn"]
+     self.logging.debug("Created queue")
+     if not len(lambda_client.list_event_source_mappings(EventSourceArn=self.queue_arn, FunctionName=self.name)["EventSourceMappings"]):
+         lambda_client.create_event_source_mapping(
+             EventSourceArn=self.queue_arn,
+             FunctionName=self.name,
+             MaximumBatchingWindowInSeconds=1
+         )

  def __init__(self, fname: str, deployment_client: Optional[AWS] = None, queue_arn: Optional[str] = None, queue_url: Optional[str] = None):
      super().__init__()
      self.name = fname
      self._deployment_client = None
      self._queue_arn = None
      self._queue_url = None
      if deployment_client:
          self._deployment_client = deployment_client
      if queue_arn:
          self._queue_arn = queue_arn
      if queue_url:
          self._queue_url = queue_url
      if not self.queue_arn and not self.queue_url:
          self.create_queue()
    income_timestamp = datetime.datetime.now().timestamp()

    # Queue trigger
    if ("Records" in event and event["Records"][0]["eventSource"] == 'aws:sqs'):
One question here: are we certain we always receive a single event? do we need to add loop here?
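Worth noting: an SQS event source mapping can deliver a batch with more than one record (the batch size is configurable), so either the mapping should pin the batch size to 1 or the handler should iterate. A minimal sketch of the batch-aware variant, where measure is only a stand-in for the wrapper's existing invocation and measurement code:

import json

def measure(payload: dict) -> dict:
    # Stand-in for the existing invocation + measurement path.
    return {"processed": True, **payload}

def handler(event, context):
    # Hypothetical batch-aware version of the SQS branch.
    results = []
    if "Records" in event and event["Records"][0].get("eventSource") == "aws:sqs":
        for record in event["Records"]:
            payload = json.loads(record["body"])
            results.append(measure(payload))
    return results

Alternatively, setting BatchSize=1 on the event source mapping guarantees a single record per invocation.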
    from . import function
    ret = function.handler(payload)

    # TODO(oana)
are we missing something here?
It looks like this is missing the core part of making measurements and returning values. Maybe we can just do a bit of restructuring to have a single invocation + measurement code, called from three trigger interfaces?
Just pushed the measurements infrastructure - all invocations are end-to-end complete.
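For readers following the thread, the shared measurement path can be pictured roughly like this — a simplified sketch, not the code that was pushed; the stat field names and the hand-off to a result queue are assumptions:

import datetime
import json
from typing import Callable

def measure(handler: Callable[[dict], dict], payload: dict) -> str:
    # Time a single benchmark invocation and serialize the measurements.
    begin = datetime.datetime.now()
    ret = handler(payload)
    end = datetime.datetime.now()
    stats = {
        "begin": begin.timestamp(),
        "end": end.timestamp(),
        "results_time": (end - begin) / datetime.timedelta(microseconds=1),
        "result": ret,
    }
    return json.dumps(stats)

Each trigger entry point (HTTP, queue, storage) then calls this once and either returns the JSON directly or pushes it to the result queue that the client-side trigger polls.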
Actionable comments posted: 0
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (2)
- sebs/aws/triggers.py (2 hunks)
- sebs/gcp/gcp.py (8 hunks)
Files skipped from review as they are similar to previous changes (2)
- sebs/aws/triggers.py
- sebs/gcp/gcp.py
Actionable comments posted: 8
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (4)
- sebs/azure/azure.py (7 hunks)
- sebs/azure/triggers.py (2 hunks)
- sebs/gcp/gcp.py (8 hunks)
- sebs/gcp/triggers.py (2 hunks)
Files skipped from review as they are similar to previous changes (2)
- sebs/azure/triggers.py
- sebs/gcp/gcp.py
Additional context used
Ruff
sebs/gcp/triggers.py
5-5: os imported but unused. Remove unused import: os (F401)
8-8: googleapiclient.errors.HttpError imported but unused. Remove unused import: googleapiclient.errors.HttpError (F401)
sebs/azure/azure.py
258-258: Use context handler for opening files
(SIM115)
| self.logging.info(f"Invoke function {self.name}") | ||
|
|
||
| # Init clients | ||
| client = gcp_storage.Client() |
Initialize client once.
The client initialization should be done once and reused to avoid unnecessary overhead.
- client = gcp_storage.Client()
+ if not hasattr(self, '_gcp_storage_client'):
+     self._gcp_storage_client = gcp_storage.Client()
+ client = self._gcp_storage_client
        from sebs.azure.triggers import QueueTrigger, StorageTrigger

        azure_function = cast(AzureFunction, function)
        resource_group = self.config.resources.resource_group(self.cli_instance)
        storage_account = azure_function.function_storage.account_name

        user_principal_name = self.cli_instance.execute("az ad user list")

        storage_account_scope = self.cli_instance.execute(
            ("az storage account show --resource-group {} --name {} --query id").format(
                resource_group, storage_account
            )
        )

        self.cli_instance.execute(
            (
                'az role assignment create --assignee "{}" \
                --role "Storage {} Data Contributor" \
                --scope {}'
            ).format(
                json.loads(user_principal_name.decode("utf-8"))[0]["userPrincipalName"],
                "Queue" if trigger_type == Trigger.TriggerType.QUEUE else "Blob",
                storage_account_scope.decode("utf-8"),
            )
        )

        trigger: Trigger
        if trigger_type == Trigger.TriggerType.QUEUE:
            trigger = QueueTrigger(function.name, storage_account)
            self.logging.info(f"Created Queue trigger for {function.name} function")
        elif trigger_type == Trigger.TriggerType.STORAGE:
            trigger = StorageTrigger(function.name, storage_account)
            self.logging.info(f"Created Storage trigger for {function.name} function")
        else:
            raise RuntimeError("Not supported!")

        trigger.logging_handlers = self.logging_handlers
        function.add_trigger(trigger)
        self.cache_client.update_function(function)
        return trigger
Add logging handlers to each trigger.
Each trigger should have logging handlers assigned for better traceability.
def create_trigger(self, function: Function, trigger_type: Trigger.TriggerType) -> Trigger:
    from sebs.azure.triggers import QueueTrigger, StorageTrigger

    azure_function = cast(AzureFunction, function)
    resource_group = self.config.resources.resource_group(self.cli_instance)
    storage_account = azure_function.function_storage.account_name
    user_principal_name = self.cli_instance.execute("az ad user list")
    storage_account_scope = self.cli_instance.execute(
        ("az storage account show --resource-group {} --name {} --query id").format(
            resource_group, storage_account
        )
    )
    self.cli_instance.execute(
        (
            'az role assignment create --assignee "{}" \
            --role "Storage {} Data Contributor" \
            --scope {}'
        ).format(
            json.loads(user_principal_name.decode("utf-8"))[0]["userPrincipalName"],
            "Queue" if trigger_type == Trigger.TriggerType.QUEUE else "Blob",
            storage_account_scope.decode("utf-8"),
        )
    )
    trigger: Trigger
    if trigger_type == Trigger.TriggerType.QUEUE:
        trigger = QueueTrigger(function.name, storage_account)
        self.logging.info(f"Created Queue trigger for {function.name} function")
    elif trigger_type == Trigger.TriggerType.STORAGE:
        trigger = StorageTrigger(function.name, storage_account)
        self.logging.info(f"Created Storage trigger for {function.name} function")
    else:
        raise RuntimeError("Not supported!")
    trigger.logging_handlers = self.logging_handlers
    function.add_trigger(trigger)
    self.cache_client.update_function(function)
    return trigger
Actionable comments posted: 16
Outside diff range, codebase verification and nitpick comments (6)
benchmarks/wrappers/aws/python/handler.py (1)
Line range hint
10-105: Review the updated event handling logic.The modifications to handle different AWS event sources (SQS and S3) are well-implemented. The logic to check the event source and process it accordingly is clear and aligns with the PR objectives. However, there are a few areas that could be improved for clarity and robustness.
- Clarification on event handling: The previous comment by mcopik on line 15 raises a valid point about handling multiple events. This should be addressed to ensure the function can handle multiple records if they exist.
- Error handling: There is no error handling around the JSON parsing and other operations which might fail. Adding try-except blocks would improve the robustness.
- Code duplication: The logic to handle the HTTP trigger seems to be repeated and could be refactored into a separate function for better maintainability.
Consider these improvements:
if ("Records" in event and event["Records"][0]["eventSource"] == 'aws:sqs'): + try: event = json.loads(event["Records"][0]["body"]) + except json.JSONDecodeError: + logging.error("Invalid JSON format in SQS message") + return {"statusCode": 400, "body": "Invalid JSON format in SQS message"} if ("Records" in event and "s3" in event["Records"][0]): + try: bucket_name = event["Records"][0]["s3"]["bucket"]["name"] file_name = event["Records"][0]["s3"]["object"]["key"] obj = storage_inst.get_object(bucket_name, file_name) event = json.loads(obj['Body'].read()) + except Exception as e: + logging.error(f"Error processing S3 object: {e}") + return {"statusCode": 500, "body": "Error processing S3 object"}benchmarks/wrappers/azure/python/handler.py (1)
Line range hint
11-122: Review the updated event handling logic for Azure functions.The modifications to handle different Azure event sources (HTTP, Queue, and Blob Storage) are well-implemented. The logic to check the event source and process it accordingly is clear and aligns with the PR objectives. However, there are a few areas that could be improved for clarity and robustness.
- Clarification on event handling: The previous comments suggest that there might be missing core parts of making measurements and returning values. This should be addressed to ensure the function can handle multiple records if they exist.
- Error handling: There is no error handling around the JSON parsing and other operations which might fail. Adding try-except blocks would improve the robustness.
- Code duplication: The logic to handle the HTTP trigger seems to be repeated and could be refactored into a separate function for better maintainability.
Consider these improvements:
if 'connection_string' in req_json: + try: os.environ['STORAGE_CONNECTION_STRING'] = req_json['connection_string'] + except Exception as e: + logging.error(f"Failed to set connection string: {e}") + return func.HttpResponse(status_code=500, body="Failed to set connection string") payload = msg.get_json() + try: payload['request-id'] = context.invocation_id payload['income-timestamp'] = income_timestamp stats = measure(payload) queue_client.send_message(stats) + except Exception as e: + logging.error(f"Error processing queue message: {e}") + return func.HttpResponse(status_code=500, body="Error processing queue message")Tools
Ruff
2-2:
base64imported but unusedRemove unused import:
base64(F401)
5-5:
azure.identity.ManagedIdentityCredentialimported but unusedRemove unused import:
azure.identity.ManagedIdentityCredential(F401)
6-6:
azure.storage.queue.QueueClientimported but unusedRemove unused import:
azure.storage.queue.QueueClient(F401)
benchmarks/wrappers/gcp/python/handler.py (1)
Line range hint
68-123: Refactor suggested formeasurefunction.The
measurefunction is crucial as it is used by all handlers to gather measurements. However, its implementation is complex and could benefit from refactoring to improve clarity and maintainability.Consider breaking down the function into smaller, more manageable functions that handle specific parts of the measurement process. This will make the code easier to maintain and test.
Tools
Ruff
3-3:
google.cloud.storageimported but unusedRemove unused import:
google.cloud.storage(F401)
sebs/aws/aws.py (1)
Line range hint
495-530: Review the updates to thecreate_triggermethod for supportingQueueTriggerandStorageTrigger.The updates to the
create_triggermethod to supportQueueTriggerandStorageTriggerare significant. The method now includes conditions to instantiate these triggers, assign logging handlers, and log the creation of these triggers. This enhances the feedback mechanism for users regarding trigger management.Ensure that the creation of these new triggers is handled efficiently and that the logs provide clear and useful information about the trigger creation process. It's also important to check that these triggers are correctly registered and functional within the AWS environment.
sebs/benchmark.py (1)
Line range hint
475-514: Review the commented-out changes in thebuildmethod.The commented-out section in the
buildmethod suggests a planned change to include an optionalTrigger.TriggerTypeparameter. This change could significantly expand the method's functionality to accommodate more complex scenarios involving triggers.Clarify the status of this planned change. If it is to be implemented, ensure that all necessary updates are made consistently across the codebase. If not, consider removing the commented-out code to avoid confusion.
sebs/azure/azure.py (1)
Line range hint
222-260: Approved: Methodpackage_codecorrectly packages code for deployment.This method is well-implemented for handling different language configurations and trigger types. However, consider using a context manager for file operations to ensure files are properly closed after operations.
Use a context manager for file operations:
with open(json_out, "w") as file: json.dump(self.create_function_json(func_name, EXEC_FILES[language_name]), file, indent=2)Tools
Ruff
258-258: Use context handler for opening files
(SIM115)
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (18)
- benchmarks/wrappers/aws/python/handler.py (2 hunks)
- benchmarks/wrappers/aws/python/queue.py (1 hunks)
- benchmarks/wrappers/azure/python/handler.py (4 hunks)
- benchmarks/wrappers/azure/python/queue.py (1 hunks)
- benchmarks/wrappers/gcp/python/handler.py (2 hunks)
- benchmarks/wrappers/gcp/python/queue.py (1 hunks)
- config/systems.json (4 hunks)
- docs/modularity.md (2 hunks)
- requirements.gcp.txt (1 hunks)
- scripts/run_experiments.py (1 hunks)
- sebs/aws/aws.py (5 hunks)
- sebs/aws/triggers.py (2 hunks)
- sebs/azure/azure.py (7 hunks)
- sebs/azure/triggers.py (2 hunks)
- sebs/benchmark.py (3 hunks)
- sebs/gcp/gcp.py (9 hunks)
- sebs/gcp/triggers.py (2 hunks)
- tests/aws/create_function.py (1 hunks)
Additional context used
Ruff
benchmarks/wrappers/azure/python/handler.py
2-2:
base64imported but unusedRemove unused import:
base64(F401)
5-5:
azure.identity.ManagedIdentityCredentialimported but unusedRemove unused import:
azure.identity.ManagedIdentityCredential(F401)
6-6:
azure.storage.queue.QueueClientimported but unusedRemove unused import:
azure.storage.queue.QueueClient(F401)
benchmarks/wrappers/gcp/python/handler.py
3-3:
google.cloud.storageimported but unusedRemove unused import:
google.cloud.storage(F401)
sebs/gcp/triggers.py
5-5:
osimported but unusedRemove unused import:
os(F401)
8-8:
googleapiclient.errors.HttpErrorimported but unusedRemove unused import:
googleapiclient.errors.HttpError(F401)
sebs/azure/triggers.py
11-11:
azure.storage.queue.QueueClientimported but unusedRemove unused import:
azure.storage.queue.QueueClient(F401)
sebs/benchmark.py
8-8:
typing.Optionalimported but unusedRemove unused import:
typing.Optional(F401)
sebs/azure/azure.py
258-258: Use context handler for opening files
(SIM115)
Additional comments not posted (20)
benchmarks/wrappers/gcp/python/handler.py (3)
7-15: Functionhandler_httpis well-implemented.The renaming and modifications to handle HTTP requests with additional metadata are correctly implemented.
3-3: Remove unused import.The import
google.cloud.storageis flagged as unused and should be removed to clean up the code.Apply this diff to remove the unused import:
- from google.cloud import storage as gcp_storageLikely invalid or redundant comment.
Tools
Ruff
3-3:
google.cloud.storageimported but unusedRemove unused import:
google.cloud.storage(F401)
17-35: Functionhandler_queueis well-implemented.The function correctly handles messages from a queue, including decoding and processing base64-encoded data. Ensure that the encoding and decoding are consistent across all components that interact with this queue.
Run the following script to verify the consistency of data encoding:
Verification successful
Base64 encoding and decoding are consistently used across components.
The
handler_queuefunction inbenchmarks/wrappers/gcp/python/handler.pydecodes base64-encoded data, and other components likesebs/gcp/triggers.pyandsebs/azure/triggers.pyencode data using base64. This indicates a consistent approach to data encoding and decoding across the codebase.
benchmarks/wrappers/gcp/python/handler.py: Decodes base64 data.sebs/gcp/triggers.py: Encodes data using base64.sebs/azure/triggers.py: Encodes data using base64.Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify the consistency of data encoding across components. # Test: Search for base64 encoding and decoding usage. Expect: Consistent usage across components. rg --type python -A 5 $'base64'Length of output: 61
Script:
#!/bin/bash # Description: Verify the consistency of data encoding across components. # Test: Search for base64 encoding and decoding usage. Expect: Consistent usage across components. rg --type py -A 5 $'base64'Length of output: 3990
sebs/azure/triggers.py (2)
61-177: Review:QueueTriggerclass implementation.The
QueueTriggerclass is well-implemented with methods to handle queue operations effectively. It includes error handling and dynamic queue creation which are crucial for robustness.
- Constructor: Properly initializes and creates queues if they are not provided.
- Sync and Async Invocations: These methods are implemented to handle message sending and receiving correctly.
- Serialization/Deserialization: Ensures that the trigger's state can be saved and restored, which is useful for distributed systems.
Overall, the implementation aligns with the PR's objectives to enhance Azure's trigger capabilities.
Line range hint
180-398: Review:StorageTriggerclass implementation.The
StorageTriggerclass effectively manages interactions with Azure Blob Storage. It includes comprehensive handling of blob operations and dynamic resource management.
- Constructor: Handles the creation of blob containers if they do not exist and sets up necessary configurations.
- Sync and Async Invocations: Implements methods to upload blobs and wait for results, which are essential for responsive cloud applications.
- Serialization/Deserialization: Facilitates the saving and restoring of the trigger's state, enhancing the trigger's usability in distributed environments.
This class supports the PR's goals by extending the storage capabilities on Azure.
sebs/azure/azure.py (3)
39-41: Approved: Methodtypenamecorrectly implemented.This method is straightforward and correctly returns the string "Azure".
354-359: Approved: Methodupdate_functioncorrectly handles HTTP triggers.This method is correctly implemented to add an HTTP trigger based on the function name suffix. It is specific to its use case and does not require changes at this time.
575-642: Approved: Methodcreate_triggersupports multiple trigger types effectively.This method is well-implemented to support the creation of HTTP, queue, and storage triggers. The conditional logic and logging statements are appropriately used to manage different trigger types.
config/systems.json (3)
21-22: Approved: Addition of Python 3.9 base images.The addition of Python 3.9 base images across various platforms is correctly implemented and enhances the flexibility of the environment setup.
Also applies to: 74-75, 118-119
75-75: Approved: Addition of 'queue.py' to deployment files.The inclusion of 'queue.py' in the deployment files list for AWS, Azure, and GCP is correctly implemented and indicates an expansion of the deployment capabilities to include queue management.
Also applies to: 119-119, 170-170
123-124: Approved: Addition of Azure-specific packages.The inclusion of 'azure-storage-queue' and 'azure-identity' in the packages list for Azure is correctly implemented and crucial for integrating Azure queue services and identity management into the deployment process.
tests/aws/create_function.py (1)
38-39: Approved: Inclusion of 'function/queue.py' in package files.The addition of 'function/queue.py' to the package files list for both Python and Node.js is correctly implemented. This ensures that the
queue.pyfile is included in the deployment or testing process, enhancing the functionality related to the handling of queues.sebs/aws/triggers.py (2)
133-176: Review ofQueueTriggerclass:The
QueueTriggerclass is well-structured and handles the interaction with AWS SQS effectively. However, there are a few areas that could be improved:
- Error Handling: Ensure that all AWS SDK calls are wrapped in try-except blocks to handle potential exceptions, especially in the
create_queuemethod.- Resource Management: Consider checking if the queue exists before attempting to create it to avoid unnecessary API calls.
- Logging: Add more detailed logging at each step of the queue interaction to aid in debugging and monitoring.
248-326: Review ofStorageTriggerclass:The
StorageTriggerclass handles the interaction with AWS S3 effectively. However, there are a few areas that could be improved:
- Error Handling: Ensure that all AWS SDK calls are wrapped in try-except blocks to handle potential exceptions, especially in the
create_bucketmethod.- Resource Management: Consider checking if the bucket exists before attempting to create it to avoid unnecessary API calls.
- Logging: Add more detailed logging at each step of the bucket interaction to aid in debugging and monitoring.
scripts/run_experiments.py (1)
448-448: Clarification needed on the integration ofqueue.py.The AI-generated summary mentions the addition of
queue.pyto the initialization process, but it's not explicitly shown in the provided code. Could you clarify howqueue.pyis integrated and its impact on the functionality of the script?sebs/gcp/gcp.py (5)
143-190: Methodcreate_trigger_resourceis robust and well-implemented.The method handles the creation of queue and storage triggers effectively, with appropriate checks for existing resources and detailed logging. The return structure with the trigger configuration is well-designed to integrate with the function creation process.
Line range hint
311-331: Modifications tocreate_functioneffectively integrate trigger resource management.The inclusion of trigger resource creation before function deployment ensures that all necessary resources are available, which is crucial for the successful deployment of functions. The method is well-implemented and integrates the trigger information seamlessly into the function creation process.
380-416: Modifications tocreate_triggerexpand support and enhance feedback.The method now supports the creation of queue and storage triggers, which aligns with the PR's objectives to introduce these trigger types. The enhanced logging provides valuable feedback on the creation process, which is beneficial for monitoring and debugging.
Line range hint
457-474: Modifications toupdate_functionensure trigger resource availability and integration.The method effectively ensures that trigger resources are available before updating the function, which is important for consistency and reliability. The integration of trigger information into the function update request is well-handled and crucial for the successful update of functions.
428-441: Modifications tocached_functioneffectively manage different trigger types.The method now handles
LibraryTrigger,QueueTrigger, andStorageTrigger, assigning appropriate logging handlers and deployment clients. This is crucial for managing the lifecycle of functions and their triggers, ensuring that they can log appropriately and interact effectively with the deployment environment.
google-cloud-monitoring==2.0.0
google-api-python-client-stubs
google-cloud-logging==2.0.0
google-cloud-pubsub=2.23.0
Correct the version specifier typo.
The dependency google-cloud-pubsub should use == for specifying the version, not =. This typo could prevent the correct installation of the package.
Apply this diff to correct the typo:
-google-cloud-pubsub=2.23.0
+google-cloud-pubsub==2.23.0
class queue:
    client = None

    def __init__(self, queue_name: str, account_id: str, region: str):
        self.client = boto3.client('sqs', region_name=region)
        self.queue_url = f"https://sqs.{region}.amazonaws.com/{account_id}/{queue_name}"

    def send_message(self, message: str):
        self.client.send_message(
            QueueUrl=self.queue_url,
            MessageBody=message,
        )
Rename the class and add error handling.
Consider renaming the class from queue to avoid conflicts with Python's built-in queue module. Additionally, add error handling for AWS client interactions to improve robustness.
Suggested class name change and error handling:
-import boto3
+import boto3
+from botocore.exceptions import ClientError

-class queue:
+class AWSQueue:
     client = None

     def __init__(self, queue_name: str, account_id: str, region: str):
         self.client = boto3.client('sqs', region_name=region)
         self.queue_url = f"https://sqs.{region}.amazonaws.com/{account_id}/{queue_name}"

     def send_message(self, message: str):
+        try:
             self.client.send_message(
                 QueueUrl=self.queue_url,
                 MessageBody=message,
             )
+        except ClientError as e:
+            print(f"An error occurred: {e}")
class queue:
    client = None

    def __init__(self, topic_name: str, project_id: str):
        self.client = pubsub_v1.PublisherClient()
        self.topic_name = 'projects/{project_id}/topics/{topic}'.format(
            project_id=project_id,
            topic=topic_name,
        )

    def send_message(self, message: str):
        self.client.publish(self.topic_name, message.encode("utf-8"))
Rename the class and add error handling.
Consider renaming the class from queue to avoid conflicts with Python's built-in queue module. Additionally, add error handling for GCP client interactions to improve robustness.
Suggested class name change and error handling:
-from google.cloud import pubsub_v1
+from google.cloud import pubsub_v1
+from google.api_core.exceptions import GoogleAPICallError, RetryError

-class queue:
+class GCPQueue:
     client = None

     def __init__(self, topic_name: str, project_id: str):
         self.client = pubsub_v1.PublisherClient()
         self.topic_name = 'projects/{project_id}/topics/{topic}'.format(
             project_id=project_id,
             topic=topic_name,
         )

     def send_message(self, message: str):
+        try:
             self.client.publish(self.topic_name, message.encode("utf-8"))
+        except (GoogleAPICallError, RetryError) as e:
+            print(f"An error occurred: {e}")
    def send_message(self, message: str):
        self.client.send_message(message)
Add error handling in send_message.
The send_message method currently does not handle any potential exceptions that might occur during the message sending process. It's recommended to add error handling to improve the robustness of this method.
Consider wrapping the call in a try-except block:
 def send_message(self, message: str):
+    try:
         self.client.send_message(message)
+    except Exception as e:
+        logging.error(f"Failed to send message: {e}")
+        raise
class queue:
    client = None

    def __init__(self, queue_name: str, storage_account: str):
        account_url = f"https://{storage_account}.queue.core.windows.net"
        managed_credential = ManagedIdentityCredential()
        self.client = QueueClient(account_url,
                                  queue_name=queue_name,
                                  credential=managed_credential)

    def send_message(self, message: str):
        self.client.send_message(message)
Refactor class name and improve thread safety.
The class name queue should be in CamelCase to adhere to Python's PEP 8 style guide. Also, using a class-level variable for client is not thread-safe. Consider initializing the client in the __init__ method to ensure each instance has its own client.
Here's a suggested refactor:
-class queue:
+class Queue:
-    client = None
+
     def __init__(self, queue_name: str, storage_account: str):
+        self.client = None
         account_url = f"https://{storage_account}.queue.core.windows.net"
         managed_credential = ManagedIdentityCredential()
-        self.client = QueueClient(account_url,
+        self.client = QueueClient(account_url,
                                   queue_name=queue_name,
                                   credential=managed_credential)
function
- function.py
- storage.py
- queue.py
Review the updates to the package_code function with the new trigger parameter.
The inclusion of the trigger parameter in the package_code function is a significant update. It allows the function to handle different types of triggers when packaging code, which can affect how the function is deployed and triggered on AWS.
Ensure that the implementation of this parameter is consistent across different parts of the codebase where package_code is called. It's crucial that all calls to this function pass the correct trigger type to avoid deployment issues.
Also applies to: 136-136
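For illustration, a call site would then thread the trigger type through explicitly — a sketch only, since the exact package_code argument list differs per platform and the names below are assumed rather than taken from this PR:

# Hypothetical call site for a deployment client.
directory, code_size = deployment_client.package_code(
    code_location,
    "python",             # language
    "3.9",                # language version
    benchmark_name,
    is_cached,
    Trigger.TriggerType.QUEUE,
)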
        from sebs.aws.triggers import LibraryTrigger, QueueTrigger, StorageTrigger

        for trigger in function.triggers(Trigger.TriggerType.LIBRARY):
            trigger.logging_handlers = self.logging_handlers
            cast(LibraryTrigger, trigger).deployment_client = self
        for trigger in function.triggers(Trigger.TriggerType.HTTP):
            trigger.logging_handlers = self.logging_handlers
        for trigger in function.triggers(Trigger.TriggerType.QUEUE):
            trigger.logging_handlers = self.logging_handlers
            cast(QueueTrigger, trigger).deployment_client = self
        for trigger in function.triggers(Trigger.TriggerType.STORAGE):
            trigger.logging_handlers = self.logging_handlers
            cast(StorageTrigger, trigger).deployment_client = self
Review the handling of new trigger types in the cached_function method.
The modifications to the cached_function method to handle QueueTrigger and StorageTrigger are crucial for supporting these new types of triggers. The method now iterates over triggers of these types, setting their logging handlers and deployment clients accordingly.
Verify that the integration of these new triggers does not interfere with the existing functionalities of other trigger types. Additionally, ensure that the logging and deployment client settings are correctly applied to these triggers for consistent behavior across all trigger types.
import subprocess
from abc import abstractmethod
from typing import Any, Callable, Dict, List, Tuple
from typing import Any, Callable, Dict, List, Optional, Tuple
Remove unused import: Optional.
The import of Optional is unused in this file, which can lead to confusion and unnecessary clutter in the code.
Remove the unused import to clean up the code and improve its readability.
-from typing import Any, Callable, Dict, List, Optional, Tuple
+from typing import Any, Callable, Dict, List, Tuple
Tools
Ruff
8-8: typing.Optional imported but unused. Remove unused import: typing.Optional (F401)
class QueueTrigger(Trigger):
    def __init__(
        self,
        fname: str,
        queue_name: str,
        region: str,
        result_queue: Optional[GCPQueue] = None
    ):
        super().__init__()
        self.name = fname
        self._queue_name = queue_name
        self._region = region
        self._result_queue = result_queue

        # Create result queue for communicating benchmark results back to the
        # client.
        if (not self._result_queue):
            self._result_queue = GCPQueue(
                fname,
                QueueType.RESULT,
                self.region
            )
            self._result_queue.create_queue()

    @staticmethod
    def typename() -> str:
        return "GCP.QueueTrigger"

    @property
    def queue_name(self) -> str:
        assert self._queue_name
        return self._queue_name

    @property
    def region(self) -> str:
        assert self._region
        return self._region

    @property
    def result_queue(self) -> GCPQueue:
        assert self._result_queue
        return self._result_queue

    @staticmethod
    def trigger_type() -> Trigger.TriggerType:
        return Trigger.TriggerType.QUEUE

    def sync_invoke(self, payload: dict) -> ExecutionResult:

        self.logging.info(f"Invoke function {self.name}")

        # Init client
        pub_sub = build("pubsub", "v1", cache_discovery=False)

        # Prepare payload
        # GCP is very particular with data encoding...
        serialized_payload = base64.b64encode(json.dumps(payload).encode("utf-8"))

        # Publish payload to queue
        begin = datetime.datetime.now()
        pub_sub.projects().topics().publish(
            topic=self.queue_name,
            body={
                "messages": [{"data": serialized_payload.decode("utf-8")}],
            },
        ).execute()

        response = ""
        while (response == ""):
            response = self.result_queue.receive_message()

        end = datetime.datetime.now()

        result = ExecutionResult.from_times(begin, end)
        result.parse_benchmark_output(json.loads(response))
        return result

    def async_invoke(self, payload: dict) -> concurrent.futures.Future:

        pool = concurrent.futures.ThreadPoolExecutor()
        fut = pool.submit(self.sync_invoke, payload)
        return fut

    def serialize(self) -> dict:
        return {
            "type": "Queue",
            "name": self.name,
            "queue_name": self.queue_name,
            "region": self.region,
            "result_queue": self.result_queue.serialize()
        }

    @staticmethod
    def deserialize(obj: dict) -> Trigger:
        return QueueTrigger(
            obj["name"],
            obj["queue_name"],
            obj["region"],
            GCPQueue.deserialize(obj["result_queue"])
        )
Approved with suggestions: Implementation of QueueTrigger and StorageTrigger.
The implementation of both QueueTrigger and StorageTrigger classes is robust and tailored for GCP interactions. The methods for synchronous and asynchronous invocation are well-defined.
Suggestions:
- Consider adding more detailed logging for debugging and monitoring purposes.
- Review the error handling to ensure that all potential issues are gracefully managed.
- Optimize the resource management, especially in methods that involve network calls and data serialization.
Also applies to: 226-326
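On the resource-management point: sync_invoke above busy-waits on receive_message until the result shows up, which hammers the result queue. A timeout plus a small backoff avoids that; a sketch independent of the SeBS classes, assuming receive_message returns an empty string when the queue is empty:

import time
from typing import Callable

def wait_for_result(receive_message: Callable[[], str],
                    timeout: float = 300.0, max_delay: float = 5.0) -> str:
    # Poll the result queue with exponential backoff instead of a tight loop.
    delay = 0.1
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        response = receive_message()
        if response:
            return response
        time.sleep(delay)
        delay = min(delay * 2, max_delay)
    raise TimeoutError("no benchmark result received before the timeout")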
import concurrent.futures
import datetime
import json
import os
Approved: Removal of unused imports.
The removal of unused imports os and googleapiclient.errors.HttpError is correctly identified by static analysis tools and should be implemented to clean up the code.
- import os
- from googleapiclient.errors import HttpError

Also applies to: 8-8
Tools
Ruff
5-5: os imported but unused. Remove unused import: os (F401)
Actionable comments posted: 4
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (8)
- config/systems.json (3 hunks)
- docs/platforms.md (1 hunks)
- sebs.py (3 hunks)
- sebs/azure/azure.py (7 hunks)
- sebs/cache.py (1 hunks)
- sebs/faas/system.py (1 hunks)
- sebs/gcp/gcp.py (9 hunks)
- sebs/local/local.py (1 hunks)
Files skipped from review due to trivial changes (2)
- docs/platforms.md
- sebs/cache.py
Additional context used
Ruff
sebs/azure/azure.py
259-259: Use context handler for opening files
(SIM115)
Additional comments not posted (13)
config/systems.json (1)
74-75: LGTM!The changes look good:
- Adding
queue.pyto the deployment files across multiple sections indicates that this file is now part of the deployment process.- Adding
azure-storage-queueandazure-identitypackages suggests an expansion of functionality related to Azure services.- The formatting changes (new lines before the added packages) may improve readability.
Also applies to: 120-122
sebs.py (3)
181-181: LGTM!The code changes to add "queue" and "storage" trigger choices are approved.
Line range hint
232-257: LGTM!The code changes to handle the new trigger types in the
invokefunction are approved.
Line range hint
223-223: LGTM!The code change to add the optional
triggerparameter to thepackage_codefunction is approved.sebs/azure/azure.py (3)
257-261: LGTM!The code changes to utilize the
create_function_jsonmethod in thepackage_codefunction are approved.Tools
Ruff
259-259: Use context handler for opening files
(SIM115)
356-361: LGTM!The code changes to conditionally create an
HTTPTriggerin theupdate_functionmethod are approved.
585-647: Approved: Method `create_trigger` supports multiple trigger types.
The changes to the `create_trigger` method significantly enhance its functionality by supporting both queue and storage triggers. The method effectively sets up the necessary Azure resources and permissions. It also raises a `RuntimeError` for unsupported trigger types, which is good error-handling practice.
Verify that the `RuntimeError` is raised for unsupported trigger types:
Verification successful
Verified: `RuntimeError` is raised for unsupported trigger types in `create_trigger`.
The `create_trigger` method in `sebs/azure/azure.py` correctly raises a `RuntimeError` for unsupported trigger types, confirming the expected error handling.
- Location: `sebs/azure/azure.py`
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify that `create_trigger` raises a `RuntimeError` for unsupported trigger types.
# Test: Search for the `RuntimeError` being raised. Expect: The `RuntimeError` is raised for unsupported trigger types.
rg --type python -A 5 $'raise RuntimeError\("Not supported!"\)'
Length of output: 95
Script:
#!/bin/bash
# Description: Verify that `create_trigger` raises a `RuntimeError` for unsupported trigger types.
# Test: Search for the `RuntimeError` being raised. Expect: The `RuntimeError` is raised for unsupported trigger types.
rg --type py -A 5 $'raise RuntimeError\("Not supported!"\)'
Length of output: 1744
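For readers who have not opened the diff, the behaviour described above roughly follows the shape below. This is a simplified sketch under my own naming, assuming the azure-storage-queue and azure-storage-blob SDKs; only the final `RuntimeError("Not supported!")` is confirmed by the verification script, the rest is illustrative.

```python
from azure.core.exceptions import ResourceExistsError
from azure.storage.blob import BlobServiceClient
from azure.storage.queue import QueueClient


def create_trigger_resources(trigger: str, connection_string: str, name: str) -> None:
    """Simplified stand-in for the resource setup a trigger needs."""
    if trigger == "queue":
        queue = QueueClient.from_connection_string(connection_string, queue_name=name)
        try:
            queue.create_queue()
        except ResourceExistsError:
            pass  # queue left over from a previous deployment, reuse it
    elif trigger == "storage":
        blobs = BlobServiceClient.from_connection_string(connection_string)
        try:
            blobs.create_container(name)
        except ResourceExistsError:
            pass  # container already exists, reuse it
    else:
        raise RuntimeError("Not supported!")
```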
sebs/gcp/gcp.py (6)
143-192: LGTM! The `create_trigger_resource` method is well-implemented and enhances the functionality related to trigger resources in GCP. It effectively checks the trigger type, creates the necessary resources, logs the creation process, and handles potential conflicts if the resources already exist.
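As a rough illustration of the "create once, tolerate conflicts" behaviour described here. The real method lives in sebs/gcp/gcp.py and uses SeBS' own config and logging; the names below and the use of google-cloud-pubsub and google-cloud-storage are assumptions made for the sketch.

```python
import logging

from google.api_core.exceptions import AlreadyExists, Conflict
from google.cloud import pubsub_v1, storage

logger = logging.getLogger("sebs.gcp")


def ensure_trigger_resource(project_id: str, func_name: str, trigger: str) -> None:
    """Create the Pub/Sub topic or bucket a trigger needs, reusing it if it exists."""
    if trigger == "queue":
        publisher = pubsub_v1.PublisherClient()
        topic_path = publisher.topic_path(project_id, func_name)
        try:
            publisher.create_topic(name=topic_path)
            logger.info("Created Pub/Sub topic %s", topic_path)
        except AlreadyExists:
            logger.info("Topic %s already exists, reusing it", topic_path)
    elif trigger == "storage":
        client = storage.Client(project=project_id)
        try:
            client.create_bucket(func_name)
            logger.info("Created bucket %s", func_name)
        except Conflict:
            logger.info("Bucket %s already exists, reusing it", func_name)
```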
251-252: LGTM! The code changes to write additional dependencies to the `requirements.txt` file in the `package_code` function are approved.
311-314: LGTM! The code changes to call `create_trigger_resource` before creating the function in the `create_function` method are approved. This ensures that the required trigger resources are created before the function deployment process.
380-416: LGTM! The code changes to the `create_trigger` method to handle different trigger types, including HTTP, Queue, and Storage triggers, are approved. The method now imports the `QueueTrigger` and `StorageTrigger` classes and includes appropriate logging for each trigger type, providing a more structured approach to managing triggers.
428-441: LGTM! The code changes to the `cached_function` method to handle different trigger types, including `LibraryTrigger`, `QueueTrigger`, and `StorageTrigger`, are approved.
457-460: LGTM! The code changes to call `create_trigger_resource` before updating the function in the `update_function` method are approved. This ensures that the required trigger resources are verified before the function deployment process.
language_version: str,
benchmark: str,
is_cached: bool,
trigger: Optional[Trigger.TriggerType],
Ensure proper handling of the new trigger parameter.
The trigger parameter has been added to the package_code method but is not utilized within the method body. Consider adding logic to handle this parameter or removing it if unnecessary.
language_version: str,
benchmark: str,
is_cached: bool,
trigger: Optional[Trigger.TriggerType],
Ensure proper handling of the new trigger parameter in the actual implementation.
The trigger parameter has been added to the abstract package_code method. When implementing this method in the derived classes, ensure that the trigger parameter is being utilized correctly. If not needed, consider removing it from the abstract method definition.
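If the parameter is kept, one way to consume it (a hypothetical sketch, not a prescription for the PR) is to let it pick the wrapper entry point that the packaged code should expose, mirroring the handler_http / handler_queue / handler_storage names used elsewhere in this PR. The local `TriggerType` enum below stands in for SeBS' own `Trigger.TriggerType` so the snippet stays self-contained.

```python
from enum import Enum
from typing import Optional


class TriggerType(Enum):
    HTTP = "http"
    QUEUE = "queue"
    STORAGE = "storage"


def select_entry_point(trigger: Optional[TriggerType]) -> str:
    """Map the requested trigger type to the wrapper handler to package."""
    if trigger is None or trigger == TriggerType.HTTP:
        return "handler_http"
    if trigger == TriggerType.QUEUE:
        return "handler_queue"
    if trigger == TriggerType.STORAGE:
        return "handler_storage"
    raise ValueError(f"Unsupported trigger type: {trigger}")
```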
def create_function_json(self, benchmark, exec_files) -> Dict:
    trigger = benchmark.split("-")[-1]

    if trigger == "queue":
        return {
            "scriptFile": exec_files,
            "entryPoint": "handler_queue",
            "bindings": [
                {
                    "name": "msg",
                    "type": "queueTrigger",
                    "direction": "in",
                    "queueName": benchmark,
                    "connection": "AzureWebJobsStorage",
                }
            ],
        }
    elif trigger == "storage":
        return {
            "scriptFile": exec_files,
            "entryPoint": "handler_storage",
            "bindings": [
                {
                    "name": "blob",
                    "type": "blobTrigger",
                    "direction": "in",
                    "path": benchmark,
                    "connection": "AzureWebJobsStorage",
                }
            ],
        }
    return {  # HTTP
        "scriptFile": exec_files,
        "entryPoint": "handler_http",
        "bindings": [
            {
                "authLevel": "anonymous",
                "type": "httpTrigger",
                "direction": "in",
                "name": "req",
                "methods": ["get", "post"],
            },
            {"type": "http", "direction": "out", "name": "$return"},
        ],
    }
Approved: Method create_function_json enhances flexibility.
This method correctly handles different trigger types and dynamically generates the function configuration. However, consider refactoring the repeated dictionary structures into a helper function for improved readability and maintainability.
Consider using a helper function to manage repeated dictionary structures:
def binding_config(name, type, direction, connection=None, path=None, methods=None):
    config = {
        "name": name,
        "type": type,
        "direction": direction,
    }
    if connection:
        config["connection"] = connection
    if path:
        config["path"] = path
    if methods:
        config["methods"] = methods
    return config

# Example usage in create_function_json
"bindings": [binding_config("msg", "queueTrigger", "in", connection="AzureWebJobsStorage")]

def get_trigger_resource_name(self, func_name: str) -> str:
    trigger = func_name.split("-")[-1]

    assert trigger == "queue" or trigger == "storage"

    if trigger == "queue":
        return "projects/{project_name}/topics/{topic}".format(
            project_name=self.config.project_name, topic=func_name
        )
    else:
        return "projects/{project_name}/buckets/{bucket}".format(
            project_name=self.config.project_name, bucket=func_name
        )
Method get_trigger_resource_name is well-implemented but consider improving error handling.
The method effectively constructs the resource name for queue and storage triggers. However, using assert for error handling is not recommended in production code as it can be disabled with certain Python flags, and it's not informative for the user. Consider raising a more descriptive exception if the trigger type is not supported.
Replace the assert statement with a more descriptive exception:
if trigger == "queue":
    # ...
elif trigger == "storage":
    # ...
else:
    raise ValueError(f"Unsupported trigger type: {trigger}")
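Putting the suggestion together, the full method could read roughly as follows. The logic is the same as the snippet quoted above, with the assert replaced by an explicit exception.

```python
def get_trigger_resource_name(self, func_name: str) -> str:
    trigger = func_name.split("-")[-1]

    if trigger == "queue":
        return "projects/{project_name}/topics/{topic}".format(
            project_name=self.config.project_name, topic=func_name
        )
    elif trigger == "storage":
        return "projects/{project_name}/buckets/{bucket}".format(
            project_name=self.config.project_name, bucket=func_name
        )
    else:
        raise ValueError(f"Unsupported trigger type: {trigger}")
```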
Summary by CodeRabbit
New Features
- `QueueTrigger` and `StorageTrigger` for AWS, allowing interaction with SQS and S3 for handling asynchronous and storage-related events.
Enhancements