Skip to content

Conversation

@szehon-ho
Copy link
Member

@szehon-ho szehon-ho commented May 22, 2025

This bug was noticed by @gengliangwang , thanks!

What changes were proposed in this pull request?

Improve HiveExternalCatalog.alterTable to update column comments if they are changed in the schema.

Why are the changes needed?

Users use this method assuming comments should be updated.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Add test to HiveDDLSuite

Was this patch authored or co-authored using generative AI tooling?

No

@github-actions github-actions bot added the SQL label May 22, 2025
@szehon-ho szehon-ho force-pushed the hive_alter_table_comment branch 2 times, most recently from 24fdfe6 to 2cbbaeb Compare May 22, 2025 22:13
@szehon-ho szehon-ho force-pushed the hive_alter_table_comment branch from 2cbbaeb to 05b278f Compare May 22, 2025 22:14
@szehon-ho szehon-ho changed the title [SPARK-52272][SQL] HiveExternalCatalog alter table loses comment updates [SPARK-52272][SQL] HiveExternalCatalog alter table should update column comments May 22, 2025
}

// alter column comments
val newCommentMap = tableDefinition.schema.map(f => (f.name, f.getComment())).toMap
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This won't work with nested columns. For example, a.b.c

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea, let me add to this

val newDef = tableDefinition.copy(
storage = newStorage,
schema = newSchema,
schema = newSchemaWithComments,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder why we can't use the schema from the input parameter tableDefinition directly. cc @cloud-fan @yaooqinn do you know the context?

Copy link
Member Author

@szehon-ho szehon-ho May 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gengliangwang looking at this, actually there is a separate method for alter the schema https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala#L706 maybe we rushed a bit to solution and the problem is the caller should call this one instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But it is true, ideally this method should handle update the schema as well. Otherwise the caller needs to make two separate alter calls and its more expensive

@szehon-ho
Copy link
Member Author

szehon-ho commented May 24, 2025

As this look like its by design: #16944 (comment)

I close it in favor of : #51007

@szehon-ho szehon-ho closed this May 24, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants