Merge pull request #2924 from redpanda-data/azure-data-lake-output
feat: add Azure Data Lake Gen2 Output
Showing 7 changed files with 408 additions and 3 deletions.
docs/modules/components/pages/outputs/azure_data_lake_gen2.adoc (146 additions & 0 deletions)
@@ -0,0 +1,146 @@
= azure_data_lake_gen2
:type: output
:status: beta
:categories: ["Services","Azure"]

////
THIS FILE IS AUTOGENERATED!

To make changes, edit the corresponding source file under:

https://github.com/redpanda-data/connect/tree/main/internal/impl/<provider>.

And:

https://github.com/redpanda-data/connect/tree/main/cmd/tools/docs_gen/templates/plugin.adoc.tmpl
////
// © 2024 Redpanda Data Inc.

component_type_dropdown::[]

Sends message parts as files to an Azure Data Lake Gen2 filesystem. Each file is uploaded with the filename specified in the `path` field.

Introduced in version 4.38.0.

```yml
# Config fields, showing default values
output:
  label: ""
  azure_data_lake_gen2:
    storage_account: ""
    storage_access_key: ""
    storage_connection_string: ""
    storage_sas_token: ""
    filesystem: messages-${!timestamp("2006")} # No default (required)
    path: ${!counter()}-${!timestamp_unix_nano()}.txt
    max_in_flight: 64
```

To set a different path for each file, use the function interpolations described xref:configuration:interpolation.adoc#bloblang-queries[here], which are calculated per message of a batch.

Multiple authentication methods are supported, but only one of the following is required:

- `storage_connection_string`
- `storage_account` and `storage_access_key`
- `storage_account` and `storage_sas_token`
- `storage_account` on its own, to authenticate via https://pkg.go.dev/github.com/Azure/azure-sdk-for-go/sdk/azidentity#DefaultAzureCredential[DefaultAzureCredential^]

If multiple are set, `storage_connection_string` takes priority. If the `storage_connection_string` does not contain the `AccountName` parameter, specify it in the `storage_account` field.
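
For illustration, a minimal sketch of the account-plus-access-key option, using only fields documented on this page; the quoted values are placeholders, not real credentials:

```yml
# Sketch: account name + access key authentication (placeholder values)
output:
  azure_data_lake_gen2:
    storage_account: "<storage account name>"
    storage_access_key: "<storage access key>"
    filesystem: messages-${!timestamp("2006")}
    path: ${!counter()}-${!timestamp_unix_nano()}.txt
```

To authenticate with a connection string instead, set only `storage_connection_string`; to rely on `DefaultAzureCredential`, set `storage_account` alone and leave the other credential fields empty.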

== Performance

This output benefits from sending multiple messages in flight in parallel for improved performance. You can tune the maximum number of in-flight messages (or message batches) with the field `max_in_flight`.
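
For example, a sketch of raising the limit; the value shown is arbitrary and should be tuned against your own throughput measurements:

```yml
output:
  azure_data_lake_gen2:
    storage_account: "<storage account name>"
    filesystem: messages-${!timestamp("2006")}
    path: ${!counter()}-${!timestamp_unix_nano()}.txt
    max_in_flight: 256 # arbitrary example value; the default is 64
```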

== Fields

=== `storage_account`

The storage account to access. This field is ignored if `storage_connection_string` is set.

*Type*: `string`

*Default*: `""`

=== `storage_access_key`

The storage account access key. This field is ignored if `storage_connection_string` is set.

*Type*: `string`

*Default*: `""`

=== `storage_connection_string`

A storage account connection string. This field is required if `storage_account` and `storage_access_key` / `storage_sas_token` are not set.

*Type*: `string`

*Default*: `""`

=== `storage_sas_token`

The storage account SAS token. This field is ignored if `storage_connection_string` or `storage_access_key` is set.

*Type*: `string`

*Default*: `""`

=== `filesystem`

The Data Lake Storage filesystem name to upload messages to.
This field supports xref:configuration:interpolation.adoc#bloblang-queries[interpolation functions].

*Type*: `string`

```yml
# Examples
filesystem: messages-${!timestamp("2006")}
```

=== `path`

The path within the filesystem to upload each message to.
This field supports xref:configuration:interpolation.adoc#bloblang-queries[interpolation functions].

*Type*: `string`

*Default*: `"${!counter()}-${!timestamp_unix_nano()}.txt"`

```yml
# Examples
path: ${!counter()}-${!timestamp_unix_nano()}.json
path: ${!meta("kafka_key")}.json
path: ${!json("doc.namespace")}/${!json("doc.id")}.json
```

=== `max_in_flight`

The maximum number of messages to have in flight at a given time. Increase this to improve throughput.

*Type*: `int`

*Default*: `64`
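
Putting the fields together, a hedged end-to-end sketch that writes one JSON file per document, grouped by year and by a namespace/ID structure; the account name and the `doc.namespace` / `doc.id` fields are assumptions for illustration, not part of the component's contract:

```yml
# Sketch: one JSON file per document, grouped by year and namespace
# (account name and doc.* fields are illustrative assumptions)
output:
  azure_data_lake_gen2:
    storage_account: "<storage account name>" # DefaultAzureCredential applies when no key, token, or connection string is set
    filesystem: messages-${!timestamp("2006")}
    path: ${!json("doc.namespace")}/${!json("doc.id")}.json
```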