Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Broaden transaction data requirements #421

Open
wants to merge 7 commits into
base: main
Choose a base branch
from
Open

Conversation

sander
Copy link

@sander sander commented Feb 11, 2025

This enables the QES creation use case where transaction data contains parameters for advanced e-signature creation. In these cases, the transaction data do not get hashed in their JSON representation format using a single hash algorithm. Instead, the X.509 credential format specifies specific ways of processing, that still meet the broadened requirements.

Highlight of changes:

  • Leave it up to credential format specs to define transaction data support and requirements.
  • Clarify that W3C VC and AnonCreds do not have specified transaction data support.
  • Clarify how to support transaction data with mdoc.
  • Move transaction_data_hashes and transaction_data_hashes_alg into SD-JWT VC format spec.
    • Keep sha-256 a mandatory default.
    • Remove sha-256 interoperability promotion remark.
    • Clarify what to hash.
    • Clarify how to represent the digest.

Closes #423

Closes #418

Relates to #259

Does not address #442, #443

This enables the QES creation use case where transaction data contains parameters for advanced e-signature creation. In these cases, the transaction data do not get hashed in their JSON representation format using a single hash algorithm. Instead, the X.509 credential format specifies specific ways of processing, that still meet the broadened requirements.
@sander sander marked this pull request as draft February 11, 2025 09:57
@Sakurann Sakurann added this to the Final 1.0 milestone Feb 17, 2025
@babisRoutis
Copy link

Dear @sander

Can you please update the description of the PR and replace "Context #421" with "Closes #421"?
This will associate the PR with the issue.

@sander
Copy link
Author

sander commented Feb 18, 2025

Done, thanks @babisRoutis.

@babisRoutis
Copy link

Done, thanks @babisRoutis.

Pointed description to issue 423.

@@ -314,7 +314,7 @@ This enables the Wallet to assess the Verifier's capabilities, allowing it to tr

* `type`: REQUIRED. String that identifies the type of transaction data . This value determines parameters that can be included in the `transaction_data` object. The specific values are out of scope of this specification. It is RECOMMENDED to use collision-resistant names for `type` values.
* `credential_ids`: REQUIRED. Array of strings each referencing a Credential requested by the Verifier that can be used to authorize this transaction. In [@!DIF.PresentationExchange], the string matches the `id` field in the Input Descriptor. In the Digital Credentials Query Language, the string matches the `id` field in the Credential Query. If there is more than one element in the array, the Wallet MUST use only one of the referenced Credentials for transaction authorization.
* `transaction_data_hashes_alg`: OPTIONAL. Array of strings each representing a hash algorithm identifier, one of which MUST be used to calculate hashes in `transaction_data_hashes` response parameter. The value of the identifier MUST be a hash algorithm value from the "Hash Name String" column in the IANA "Named Information Hash Algorithm" registry [@IANA.Hash.Algorithms] or a value defined in another specification and/or profile of this specification. If this parameter is not present, a default value of `sha-256` MUST be used. To promote interoperability, implementations MUST support the sha-256 hash algorithm.
* `transaction_data_hashes_alg`: OPTIONAL. Array of strings each representing a hash algorithm identifier, one of which MUST be used to calculate hashes in `transaction_data_hashes` response parameter. The value of the identifier MUST be a hash algorithm value from the "Hash Name String" column in the IANA "Named Information Hash Algorithm" registry [@IANA.Hash.Algorithms] or a value defined in another specification and/or profile of this specification. If this parameter is not present, a default value of `sha-256` SHOULD be used. To promote interoperability, implementations SHOULD support the sha-256 hash algorithm.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should be more explicit if we remove the default behaviour. Something along the lines of "default value of sha-256 MUST be used unless otherwise conveyed in the credential" if I understood your motivation correctly?

I'd keep the "implementations MUST support the sha-256 hash algorithm." - we want everyone to at least support sha-256 for interop imho.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should be more explicit if we remove the default behaviour. Something along the lines of "default value of sha-256 MUST be used unless otherwise conveyed in the credential" if I understood your motivation correctly?

Agreed. Is “MUST unless” a proper formulation? Alternatively, specify here that the credential format SHOULD specify default behaviour, and that if the credential does not specify default behaviour, the default behaviour MUST be to use SHA-256.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the cleanest way for the spec would be to describe priorities - something like this?

  1. If transaction_data_hashes_alg is provided, the hash function provided in that parameter MUST be used
  2. If a hash function is provided in the credential and transaction_data_hashes_alg is omitted, the hash function contained in the credential MUST be used; the way this hash function is provided in the different Credential Formats is defined in Appendix B.
  3. sha-256 SHOULD be used otherwise

Copy link
Author

@sander sander Feb 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the cleanest way for the spec would be to describe priorities - something like this?

  1. If transaction_data_hashes_alg is provided, the hash function provided in that parameter MUST be used
  2. If a hash function is provided in the credential and transaction_data_hashes_alg is omitted, the hash function contained in the credential MUST be used; the way this hash function is provided in the different Credential Formats is defined in Appendix B.
  3. sha-256 SHOULD be used otherwise

Sounds good. Note re: 2, that Appendix B will only contain some credential formats: additional formats will be specified outside of the OID4VP spec.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the current proposal, instead sha-256 is a mandatory default for SD-JWT, and we leave it up to other credential format specs what they take as default behaviour. What do you think?

@@ -1196,7 +1196,7 @@ The Wallet that received the `transaction_data` parameter in the request MUST in

Where to include the`transaction_data_hashes` parameter in the response is specific to each credential format and is defined by each Credential Format, such as those in (#format_specific_parameters).

* `transaction_data_hashes`: Array of hashes, where each hash is calculated using a hash function over the strings received in the `transaction_data` request parameter. Each hash value ensures the integrity of, and maps to, the respective transaction data object. Where in the response this parameter is included is defined by each Credential Format, but it has to be included in the mechanism used for the proof of possession of the Credential that is signed using the user-controlled key.
* `transaction_data_hashes`: Representation of hashes, where each hash is calculated using a hash function over the data in the strings received in the `transaction_data` request parameter. Each hash value ensures the integrity of, and maps to, the respective transaction data object. Where in the response this parameter is included is defined by each Credential Format, but it has to be included in the mechanism used for the proof of possession of the Credential that is signed using the user-controlled key.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My interpretation of "a hash function over the strings" was to compute a hash over the base64-encoded data in the transaction_data, that is for sha-256: transaction_data_hashes[i] = base64_urlsafe_nopad_encode(sha256(transaction_data[i])).

Now with the phrasing "the data in the strings" this must mean that the hash is to be computed over the (base64-)decoded transaction data, that is transaction_data_hashes[i] = base64_urlsafe_nopad_encode(sha256(base64_urlsafe_decode(transaction_data[i]))).

Is this an intended change in behaviour or was my previous interpretation just incorrect?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not an intended change, and I share your interpretation. With “the data in the strings” I mean a generalisation over “the strings”, so that other representations of the data are allowed, such as needed in advanced e-signatures.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest we do not leave room for interpretation and specifically give this information, whether the hash is computed over encoded data or decoded data, etc. because not everyone will interpret the same way or understand all subtleties when reading the text.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the current version, I’ve tried to clarify this in the SD-JWT VC section. But upon writing, I started to prefer the second interpretation by @matzf. It seems to be somewhat more robust since base64url-encoded data seems harder to corrupt than an UTF-8 string representation. So I’ve included it like this in the proposal, what do you think?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Disclaimer, I only meant to ask for clarification not propose a change. 😉

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@matzf Clear. Still I tried to include a choice for one interpretation in this PR. Would you say this one is workable as well? Otherwise we may need to postpone to a future PR.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The phrasing "calculated using a hash function over the data in the base64url-decoded strings" seems clear now. The original phrasing "calculated using a hash function over the strings" also seemed clear enough. Only the intermediate state on which I commented seemed quite ambiguous.

Regarding which semantics to adopt, I don't see a compelling reason to prefer either variant, but I'd avoid changing it now.

@tlodderstedt
Copy link
Collaborator

@c2bo @sander After re-reading the spec and playing with some examples, I would argue transaction_data_hashes_alg and transaction_data_hashes are designed and might work well for sd-jwt. I'm not so sure for mdoc and I'm pretty sure it does not work well for x.509 (QES). For example, the transaction data type for QES (see below) already contains identifiers for algorithms (hashAlgorithmOID) based on pre-existing standards (ETSi and CSC) and QES/CMS standards define what needs to be incorporated into the signature, some of this data not even passed in the request but obtained from other sources, like time stamping services (in this cases the "T" in "Ades-B-T").

{
    "type": "https://cloudsignatureconsortium.org/2025/qes",
    "credential_ids": [
        "certificate_by_policy"
    ],
    "signatureQualifier": "eu_eidas_qes",
    "documentDigests": [
        {
            "signature_format": "P",
            "conformance_level": "Ades-B-T",
            "signed_envelope_property": "Certification",
            "label": "Example Contract",
            "href": "https://protected.rp.example/contract-01.pdf?token=HS9naJKWwp901hBcK348IUHiuH8374",
            "checksum": "sha256-sTOgwOm+474gFj0q0x1iSNspKqbcse4IeiqlDg/HWuI=",
            "access": {
                "type": "OTP",
                "oneTimePassword": "51623"
            }
        }
    ],
    "hashAlgorithmOID": "2.16.840.1.101.3.4.2.1"
}

Based on those thoughts, I would suggest to make the way transaction data is included or referenced truly a credential format specific thing. That would mean to move transaction_data_hashes_alg and transaction_data_hashes to appendix B.4 and change the second and third paragraph in section 8.4 to something like

"The Wallet that received the transaction_data parameter in the request MUST include a representation or reference to the data in the in the respective credential presentation. How this is done is specific to each credential format and is defined by each Credential Format, such as those in Appendix B. If the wallet does not support transaction_data parameter, it MUST return an error."

@sander
Copy link
Author

sander commented Mar 10, 2025

@tlodderstedt agreed. My original PR was to keep the change as small as possible, but your proposal is the better change.

For mdoc queries, the transaction data is expected to be returned as DeviceSignedItems. That means that the document type specification should include attribute definitions matching the transaction data requirements. This is something we cannot specify in detail in OpenID4VCI. Instead CSC should ensure that rulebooks get published/extended with device-signed attribute definitions confirming QES creation in the QTSP-centric model. This could be an extension of the PID rulebook in the case of one-time signatures for example. For the Wallet-centric model, we’re good already.

Since OID4VCI also specifies a W3C VC credential format, what would be the best way to include transaction data there?

@@ -2384,6 +2389,19 @@ __Claim `birthdate`__:
* Contents:
`["6Ij7tM-a5iVPGboS5tmvVA", "birthdate", "1940-01-01"]`

### Transaction Data

The SD JWT VC format supports `transaction_data` as specified in (#transaction_data). If used, it is recommended that the transaction data type specification includes the following parameter, in addition to `type` and `credential_ids` from (#new_parameters):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it just recommended or is this how the data must be represented in SD-JWTs?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we can still include new mandatory requirements at this stage, let’s indeed make it mandatory.


The SD JWT VC format supports `transaction_data` as specified in (#transaction_data). If used, it is recommended that the transaction data type specification includes the following parameter, in addition to `type` and `credential_ids` from (#new_parameters):

* `transaction_data_hashes_alg`: OPTIONAL. Array of strings each representing a hash algorithm identifier, one of which MUST be used to calculate hashes in `transaction_data_hashes` response parameter. The value of the identifier MUST be a hash algorithm value from the "Hash Name String" column in the IANA "Named Information Hash Algorithm" registry [@IANA.Hash.Algorithms] or a value defined in another specification and/or profile of this specification.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wouldn't it be better to have an array of objects containing the hash and the corresponding hash alg?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly, yes: #442. I did not yet look into this. Seems like it needs a separate PR.

If the Wallet received the `transaction_data` parameter in the request, it MUST include in the Key Binding JWT as a top level claim a `transaction_data_hashes` parameter defined below.

* `transaction_data_hashes`: Array of base64url-encoded hashes, where each hash is calculated using a hash function over the data in the base64url-decoded strings received in the `transaction_data` request parameter. Each hash value ensures the integrity of, and maps to, the respective transaction data object. If `transaction_data_hashes_alg` was specified in the request, the hash function MUST be one of its values. If `transaction_data_hashes_alg` was not specified in the request, the hash function MUST be `sha-256`.
* `transaction_data_hashes_alg`: REQUIRED when this parameter was present in the `transaction_data` request parameter. String representing the hash algorithm identifier used to calculate hashes in `transaction_data_hashes` response parameter.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why does the text have the transaction_data_hashes_alg twice?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which two instances do you mean?


* `transaction_data_hashes`: Array of hashes, where each hash is calculated using a hash function over the strings received in the `transaction_data` request parameter. Each hash value ensures the integrity of, and maps to, the respective transaction data object. Where in the response this parameter is included is defined by each Credential Format, but it has to be included in the mechanism used for the proof of possession of the Credential that is signed using the user-controlled key.
* `transaction_data_hashes_alg`: REQUIRED when this parameter was present in the `transaction_data` request parameter. String representing the hash algorithm identifier used to calculate hashes in `transaction_data_hashes` response parameter.
The Wallet that received the `transaction_data` parameter in the request MUST include a representation or reference to the data in the in the respective credential presentation. How this is done is part of the specification of any Credential Format that supports it, such as some of those in (#format_specific_parameters). If the Wallet does not support `transaction_data` parameter, it MUST return an error.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the Wallet does not support transaction_data parameter, it MUST return an error.

What would that error be? We have already defined invalid_transaction_data in clause 8.5 but this applies for cases where the wallet supports transaction_data but there is something wrong inside the structure. Should we define a new error code for when transaction_data is not supported?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so, yes. But I’m not sure we can still introduce a new error code at this stage. The ambiguity was already in the draft before this proposed change. It seems more appropriate to have a separate PR for it.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I will create a separate issue for this .

Co-authored-by: Torsten Lodderstedt <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Broaden transaction data requirements Clarification on transaction_data_hashes
7 participants