Skip to content

CLDSRV-636: SSE with both internal & external KMS (HF 9.2.0.12) #5788

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 15 commits into from
May 14, 2025

Conversation

BourgoisMickael
Copy link
Contributor

@BourgoisMickael BourgoisMickael commented Apr 11, 2025

This is generic but simplified for file to aws use case and will be extended in latest version to handle any backend.

  • Store all KMS keys with an arn describing backend provider (see ARSN-485: Prefix KMS Keys with an arn (HF cldsrv 9.2.0.12) Arsenal#2349)
    • option (hideScalityArn) to keep backward compatibility
  • Migrate KMS Keys with sseMigration config
    • For bucket, replace file key with aws key
    • For object reformat key into an arn
  • Allow usage of keys from previous KMS provider (file) to decript and read files (or for ongoing MPU)

Overall example of a new Key Format and migration

flowchart TD
    GetObject:::red --> bktArn
    objArn --NO GetObject--> objUpdate[Update Object: Prefix KMS key with arn]:::red
    %% Get 0-1 Links

    %% Common route
    bktArn{Bucket SSE is arn?}:::purple
        --NO--> bktKey["
            Create new external KMS Key with arn prefix
            Update Bucket attribute
        "]:::purple
    bktArn
        --Then--> objArn{Object SSE is arn?}:::purple
        --Then--> ...:::purple
        --> KMS
    subgraph KMS[KMS#nbsp;Object#nbsp;Decrypt/Encrypt]
        extract[Extract KMS provider and KeyId from arn]:::purple
        --> client[Pick or instantiate KMS client based on provider from Key arn]:::purple
        --> send["
            Backend Decrypts/Encrypts with KeyId
            But if Arn is sent, client still extract KeyId from arn
        "]:::purple
    end
    %% 2-7 Links common

    PutObject:::blue --> bktArn
    objArn --NO PutObject--> PutPrefix[Prefix KMS Key with arn if missing]:::blue
    %% PUT 8-9 Links

    classDef red stroke:red
    classDef blue stroke:blue
    classDef purple stroke:purple

    linkStyle 0,1 stroke:red
    linkStyle 8,9 stroke:blue
    linkStyle 2,3,4,5,6,7 stroke:purple
Loading

@bert-e
Copy link
Contributor

bert-e commented Apr 11, 2025

Hello bourgoismickael,

My role is to assist you with the merge of this
pull request. Please type @bert-e help to get information
on this process, or consult the user documentation.

Available options
name description privileged authored
/after_pull_request Wait for the given pull request id to be merged before continuing with the current one.
/bypass_author_approval Bypass the pull request author's approval
/bypass_build_status Bypass the build and test status
/bypass_commit_size Bypass the check on the size of the changeset TBA
/bypass_incompatible_branch Bypass the check on the source branch prefix
/bypass_jira_check Bypass the Jira issue check
/bypass_peer_approval Bypass the pull request peers' approval
/bypass_leader_approval Bypass the pull request leaders' approval
/approve Instruct Bert-E that the author has approved the pull request. ✍️
/create_pull_requests Allow the creation of integration pull requests.
/create_integration_branches Allow the creation of integration branches.
/no_octopus Prevent Wall-E from doing any octopus merge and use multiple consecutive merge instead
/unanimity Change review acceptance criteria from one reviewer at least to all reviewers
/wait Instruct Bert-E not to run until further notice.
Available commands
name description privileged
/help Print Bert-E's manual in the pull request.
/status Print Bert-E's current status in the pull request TBA
/clear Remove all comments from Bert-E from the history TBA
/retry Re-start a fresh build TBA
/build Re-start a fresh build TBA
/force_reset Delete integration branches & pull requests, and restart merge process from the beginning.
/reset Try to remove integration branches unless there are commits on them which do not appear on the source branch.

Status report is not available.

@bert-e
Copy link
Contributor

bert-e commented Apr 11, 2025

Waiting for approval

The following approvals are needed before I can proceed with the merge:

  • the author

  • 2 peers

@scality scality deleted a comment from bert-e Apr 12, 2025
// this should trigger vault account key update as well
return kms.createBucketKey(bucket, log, (err, { masterKeyArn }) => {
if (err) {
return cb(err, bucket);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we log something?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

errors should already be logged by kms implementation. But we could append a message to the error to describe the error is due to the SSE migration

sse.masterKeyId = masterKeyArn;
}
if (updateConfigured) {
sse.configuredMasterKeyId = masterKeyArn;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are sse.masterKeyId and sse.configuredMasterKeyId not suppose to be different?
Does it break something if we have them both be the same value?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes they should be different, I need to fix this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually nothing should break, right now they are different because they are set at different times, but having the same key is fine

@@ -51,6 +52,7 @@ function objectGet(authInfo, request, returnTagCount, log, callback) {
};

return metadataValidateBucketAndObj(mdValParams, log,
(err, bucket, objMD) => updateEncryption(err, bucket, objMD, objectKey, log, {},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we fail the GET/HEAD object call if update encryption fails?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now it will fail, the error is passed to the next callback.

But we could provide a flag through config to ignore migration failure and still return the object.

The only reason where the update could fail but not the get is if you can't write, because disk is full but still wants to read from it.

@@ -309,17 +377,21 @@ class KMS {
cryptoScheme: serverSideEncryptionInfo.cryptoScheme,
decipher: null,
};

// shadowing global client for key - implName already used can't be shadowed here
const { client, implName: _impl, key } = getClientForKey(serverSideEncryptionInfo.masterKeyId);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We always need to check the client because there is no way to know that a migration never happened before? If we had a way to know that no migration happened before, we would be safe to use the default KMS, right? How is the impact of getClientForKey on the GET object latency?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we know the migration never happened before, we'd rather know we should use the previous KMS instead of the new default external KMS.

But the way to know for now for outscale is if the scality kms arnPrefix is present or not.

I've also pre instantiated the previous KMS as file is the only allowed for now in migration, to avoid latency of first GET after startup.

The impact on GET object latency is:

But as the GetObject migrate to new scality arn format, it should be expected the first Get on a bucket and object is slower for migration, and then we fall in the case where it's a scality arnPrefix

@@ -196,10 +261,13 @@ class KMS {
log.debug('using user configured kms master key id');
masterKeyId = configuredMasterKeyId;
}
// shadowing global client for key
// but should not happen to cipher for another client as Puts should use current KMS
const { client, implName, key } = getClientForKey(masterKeyId);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is getClientForKey needed for PUT object? We know that the client will always be the default one. What is the impact on our PUT object latency?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's needed if kms key is passed in scality arn format on a PUT request. Then the function will extract the KeyId from the arn and validate the arn scality format.

It also returns the client identifier so we can block usage of other client like if an arn for the file KMS is used when there is a configured external KMS, it should be forbidden: https://github.com/scality/cloudserver/pull/5788/files#diff-c506ce0254987c99cd5ff5a10cd1ecabc6e1bc59e643fe2401103b9feb7b27a6R332-R339


And for any future case where multiple KMS providers can be used it will become useful

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As for PUT latency, it will be same as GET:

The impact on PUT object latency is:

@BourgoisMickael BourgoisMickael force-pushed the improvement/CLDSRV-636-kms-provider branch from a7f5219 to b40068c Compare April 17, 2025 01:15
> This is a scheduled Ubuntu 20.04 brownout. Ubuntu 20.04 LTS runner will be removed on 2025-04-15. For more details, see actions/runner-images#11101

(cherry picked from commit 5c4a1b3)
Avoid print the warning if scality-kms is not configured:

```json
{"name":"S3","time":1742220545805,"error":{"code":"MODULE_NOT_FOUND","requireStack":["/home/mickael/scality/cloudserver/lib/kms/wrapper.js","/home/mickael/scality/cloudserver/lib/data/wrapper.js","/home/mickael/scality/cloudserver/lib/utilities/healthcheckHandler.js","/home/mickael/scality/cloudserver/lib/utilities/internalHandlers.js","/home/mickael/scality/cloudserver/lib/server.js","/home/mickael/scality/cloudserver/index.js"]},"level":"warn","message":"scality kms unavailable. Using file kms backend unless mem specified.","hostname":"mickael-xps","pid":139701}
```

(cherry picked from commit 9e8c505)
@BourgoisMickael BourgoisMickael force-pushed the improvement/CLDSRV-636-kms-provider branch 3 times, most recently from 642fb3f to 9cd6ec0 Compare May 2, 2025 03:42
@BourgoisMickael BourgoisMickael changed the base branch from hotfix/7.70.21 to hotfix/7.70.21-outscale May 2, 2025 18:16
@BourgoisMickael BourgoisMickael requested a review from anurag4DSB May 7, 2025 15:44
Copy link
Contributor

@anurag4DSB anurag4DSB left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes

@BourgoisMickael BourgoisMickael force-pushed the improvement/CLDSRV-636-kms-provider branch from 8ff7b58 to cd4bca8 Compare May 8, 2025 22:03
@BourgoisMickael BourgoisMickael changed the base branch from hotfix/7.70.21-outscale to hotfix/7.70.21.outscale May 13, 2025 16:13
Issue: S3C-9996

MPU was not accepting configuredMasterKeyId (from object or bucket)
for Create and Complete MPU.

But used the configuredMasterKeyId (from object or bucket)
for UploadPart making it possible to have parts encrypted with a
different key than the key in object metadata after completion.

Resulting in potential error on GetObject later:
 - If bucket had configuredMasterKeyId but no masterKeyId
   (no AES before or aws:kms without configuredKey)
   it would crash on decryption.
 - If bucket had configuredMasterKeyId and a masterKeyId
   it would be used to decrypt parts potentially
   encrypted with a configuredMasterKeyId, decrypting gibberish
 - On UploadPart object level encryption could be provided (outside sdk/cli)
   which is bad.

___

The expected behavior fixed here is to use the provided SSE on CreateMPU
and then for UploadPart and CompleteMPU use the SSE provided at CreateMPU.

The MPU calls also return the SSE headers now.
The CopyObject returned SSE headers are fixed as well, the configuredMasterKeyId was ignored

This fix allow seamless continuation of ongoing MPU during SSE migration.

___

There is no migration to fix already existing completed MPU.
@BourgoisMickael BourgoisMickael force-pushed the improvement/CLDSRV-636-kms-provider branch from cd4bca8 to c2f045f Compare May 13, 2025 17:44
@tcarmet tcarmet merged commit 37162e0 into hotfix/7.70.21.outscale May 14, 2025
8 checks passed
@tcarmet tcarmet deleted the improvement/CLDSRV-636-kms-provider branch May 14, 2025 22:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants