HIVE-29392: Add Support for CTR and GCM cipher transformations in AES…#6257

Closed
tanishq-chugh wants to merge 2 commits into apache:master from tanishq-chugh:HIVE-29392

Conversation

@tanishq-chugh

Contributor

… UDFs

What changes were proposed in this pull request?

Extend the cipher transformations supported by the AES UDFs with Counter (CTR) and Galois/Counter Mode (GCM). Both modes provide stronger security because they use an initialization vector (IV).
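As a rough illustration of what an IV buys, the sketch below encrypts the same plaintext twice with AES/GCM/NoPadding using only the standard JCE API. The class name `GcmIvDemo`, the helper names, and the choice to store the 12-byte IV in front of the ciphertext are assumptions made for this example; they are not taken from the patch.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.Arrays;

public class GcmIvDemo {
    static final int IV_LEN = 12;    // 96-bit IV, the size commonly recommended for GCM
    static final int TAG_BITS = 128; // authentication tag length in bits

    // Returns IV || ciphertext+tag. Prepending the IV is an assumption of this sketch.
    static byte[] encryptGcm(byte[] key, byte[] plain) throws Exception {
        byte[] iv = new byte[IV_LEN];
        new SecureRandom().nextBytes(iv); // fresh random IV per call
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
                new GCMParameterSpec(TAG_BITS, iv));
        byte[] ct = cipher.doFinal(plain);
        byte[] out = new byte[IV_LEN + ct.length];
        System.arraycopy(iv, 0, out, 0, IV_LEN);
        System.arraycopy(ct, 0, out, IV_LEN, ct.length);
        return out;
    }

    // Reads the IV back from the first IV_LEN bytes and verifies the tag while decrypting.
    static byte[] decryptGcm(byte[] key, byte[] data) throws Exception {
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"),
                new GCMParameterSpec(TAG_BITS, data, 0, IV_LEN));
        return cipher.doFinal(data, IV_LEN, data.length - IV_LEN);
    }

    public static void main(String[] args) throws Exception {
        byte[] key = "0123456789abcdef".getBytes(StandardCharsets.US_ASCII); // demo key only
        byte[] plain = "same input".getBytes(StandardCharsets.UTF_8);
        byte[] first = encryptGcm(key, plain);
        byte[] second = encryptGcm(key, plain);
        System.out.println("ciphertexts differ: " + !Arrays.equals(first, second));
        System.out.println("round trip ok: " + Arrays.equals(plain, decryptGcm(key, first)));
    }
}
```

The IV has to be available again at decryption time, which is why this sketch keeps it next to the ciphertext; how the patch itself handles the IV is not stated in the description.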

Why are the changes needed?

Currently, the AES UDFs support only one cipher transformation, AES/ECB/PKCS5Padding, which is inherently weak because it produces the same ciphertext for identical blocks of plaintext.
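The ECB weakness described above can be reproduced with a few lines of standard JCE code. This is a minimal sketch: the class name `EcbRepeatDemo`, the helper names, and the hard-coded demo key are illustrative assumptions, not code from Hive.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class EcbRepeatDemo {
    // Encrypts with AES/ECB/PKCS5Padding, the transformation named in the description.
    static byte[] encryptEcb(byte[] key, byte[] plain) throws Exception {
        Cipher cipher = Cipher.getInstance("AES/ECB/PKCS5Padding");
        cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"));
        return cipher.doFinal(plain);
    }

    // True when the first two 16-byte ciphertext blocks are byte-for-byte identical.
    static boolean firstTwoBlocksEqual(byte[] ct) {
        return Arrays.equals(Arrays.copyOfRange(ct, 0, 16), Arrays.copyOfRange(ct, 16, 32));
    }

    public static void main(String[] args) throws Exception {
        byte[] key = "0123456789abcdef".getBytes(StandardCharsets.US_ASCII); // demo key only
        // Two identical 16-byte plaintext blocks back to back.
        byte[] plain = "ABCDEFGHIJKLMNOPABCDEFGHIJKLMNOP".getBytes(StandardCharsets.US_ASCII);
        byte[] ct = encryptEcb(key, plain);
        System.out.println("identical ciphertext blocks: " + firstTwoBlocksEqual(ct));
    }
}
```

Because each block is encrypted independently under ECB, repeated plaintext blocks show up as repeated ciphertext blocks, which leaks structure to anyone who can see the stored values.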

Does this PR introduce any user-facing change?

No

How was this patch tested?

Manually Tested

@ayushtkn ayushtkn left a comment

Member

A fundamental question: if I write a table using this UDF with, say, X as the session value of HIVE_UDF_AES_CIPHER_TRANSFORMATION, and then read it in another session where the value is Y, does the data get corrupted?

  • Once the data is written, how do I find out which value the config had at write time?
  • If I wrote one column with X as the transformation and another existing column with Y, how do I read both columns in a single query?

@tanishq-chugh

tanishq-chugh commented Jan 6, 2026

Contributor Author

Thanks for checking this, @ayushtkn. The concern is absolutely valid. As of now, there is no way to determine the value once the write is done, and using different values at write time and read time will lead to data corruption.
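To make the failure mode concrete, here is a small sketch of one mismatch: a value encrypted with AES/ECB/PKCS5Padding and then decrypted as AES/CTR/NoPadding. The class name `TransformationMismatchDemo` and the convention that the first 16 bytes of a CTR value hold the IV are hypothetical choices for this example, since the thread does not say how the patch lays out the IV.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class TransformationMismatchDemo {
    // Writer side: legacy ECB transformation.
    static byte[] encryptEcb(byte[] key, byte[] plain) throws Exception {
        Cipher cipher = Cipher.getInstance("AES/ECB/PKCS5Padding");
        cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"));
        return cipher.doFinal(plain);
    }

    // Reader side: CTR, treating the first 16 bytes as the IV (hypothetical layout).
    static byte[] decryptCtr(byte[] key, byte[] data) throws Exception {
        Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"),
                new IvParameterSpec(data, 0, 16));
        return cipher.doFinal(data, 16, data.length - 16);
    }

    public static void main(String[] args) throws Exception {
        byte[] key = "0123456789abcdef".getBytes(StandardCharsets.US_ASCII); // demo key only
        byte[] plain = "sensitive-value-0001".getBytes(StandardCharsets.US_ASCII);
        byte[] stored = encryptEcb(key, plain);
        // No exception is thrown here; the reader simply gets bytes that are not the plaintext.
        byte[] wrong = decryptCtr(key, stored);
        System.out.println("matches original: " + Arrays.equals(plain, wrong));
    }
}
```

CTR has no padding check and no authentication tag, so in this direction the mismatch is silent: the query succeeds and returns garbage rather than failing.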

I am considering two options moving ahead.

  1. Modify the UDF definition to accept up to 3 arguments instead of relying on a session-level config.
    The 3rd argument would be optional and carry the transformation value, defaulting to the current transformation. This is needed for backward compatibility.
    Determining the transformation value at read time would still remain the user's responsibility (query history could help here).

  2. In addition to modifying the UDF definition, automate detection of the mode at decryption time. This is possible if the only options are GCM and the default AES mode, since decryption can fall back to the legacy mode when it runs into AEADBadTagException.
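The fallback idea in option 2 could look roughly like the sketch below, written against the standard JCE API only. Everything specific here is an assumption for illustration: the class name `AesFallbackDecrypt`, the `decryptAuto` helper, the 12-byte IV stored in front of the GCM ciphertext, and the length guard. It is not the UDF's actual code.

```java
import javax.crypto.AEADBadTagException;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;

public class AesFallbackDecrypt {
    static final int IV_LEN = 12;  // assumed GCM IV length, stored before the ciphertext
    static final int TAG_LEN = 16; // GCM tag length in bytes

    static byte[] encryptEcb(byte[] key, byte[] plain) throws Exception {
        Cipher c = Cipher.getInstance("AES/ECB/PKCS5Padding");
        c.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"));
        return c.doFinal(plain);
    }

    static byte[] encryptGcm(byte[] key, byte[] plain) throws Exception {
        byte[] iv = new byte[IV_LEN];
        new SecureRandom().nextBytes(iv);
        Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
        c.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
                new GCMParameterSpec(TAG_LEN * 8, iv));
        byte[] ct = c.doFinal(plain);
        byte[] out = new byte[IV_LEN + ct.length];
        System.arraycopy(iv, 0, out, 0, IV_LEN);
        System.arraycopy(ct, 0, out, IV_LEN, ct.length);
        return out;
    }

    // Tries GCM first; if tag verification fails, retries with the legacy ECB transformation.
    static byte[] decryptAuto(byte[] key, byte[] data) throws Exception {
        if (data.length >= IV_LEN + TAG_LEN) { // too short to be IV + tag: skip GCM
            try {
                Cipher gcm = Cipher.getInstance("AES/GCM/NoPadding");
                gcm.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"),
                        new GCMParameterSpec(TAG_LEN * 8, data, 0, IV_LEN));
                return gcm.doFinal(data, IV_LEN, data.length - IV_LEN);
            } catch (AEADBadTagException e) {
                // Not valid GCM output for this key; fall through to the legacy mode.
            }
        }
        Cipher ecb = Cipher.getInstance("AES/ECB/PKCS5Padding");
        ecb.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"));
        return ecb.doFinal(data);
    }

    public static void main(String[] args) throws Exception {
        byte[] key = "0123456789abcdef".getBytes(StandardCharsets.US_ASCII); // demo key only
        byte[] plain = "legacy or gcm, 20+ b".getBytes(StandardCharsets.US_ASCII);
        System.out.println(new String(decryptAuto(key, encryptGcm(key, plain)), StandardCharsets.US_ASCII));
        System.out.println(new String(decryptAuto(key, encryptEcb(key, plain)), StandardCharsets.US_ASCII));
    }
}
```

This only works because GCM authenticates the ciphertext, so legacy ECB data is rejected with AEADBadTagException instead of being decrypted into garbage. CTR has no such tag, which is consistent with the comment limiting automatic detection to GCM and the default mode. Note that a wrong key also triggers the fallback, in which case the ECB attempt is expected to fail as well.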

Let me know your thoughts on this.

@sonarqubecloud

sonarqubecloud Bot commented Jan 7, 2026

@github-actions

github-actions Bot commented Mar 9, 2026

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
Feel free to reach out on the dev@hive.apache.org list if the patch is in need of reviews.

@github-actions github-actions Bot added the stale label May 23, 2026
@github-actions github-actions Bot closed this May 30, 2026
3 participants