-
Notifications
You must be signed in to change notification settings - Fork 28.6k
[SPARK-52272][SQL] HiveExternalCatalog alter table should update column comments #50991
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
24fdfe6
to
2cbbaeb
Compare
2cbbaeb
to
05b278f
Compare
@@ -684,11 +684,23 @@ private[spark] class HiveExternalCatalog(conf: SparkConf, hadoopConf: Configurat | |||
} else { | |||
(oldTableDef.schema, oldTableDef.partitionColumnNames) | |||
} | |||
|
|||
// alter column comments | |||
val newCommentMap = tableDefinition.schema.map(f => (f.name, f.getComment())).toMap |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This won't work with nested columns. For example, a.b.c
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yea, let me add to this
// // Add old table's owner if we need to restore | ||
val owner = Option(tableDefinition.owner).filter(_.nonEmpty).getOrElse(oldTableDef.owner) | ||
val newDef = tableDefinition.copy( | ||
storage = newStorage, | ||
schema = newSchema, | ||
schema = newSchemaWithComments, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder why we can't use the schema from the input parameter tableDefinition
directly. cc @cloud-fan @yaooqinn do you know the context?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gengliangwang looking at this, actually there is a separate method for alter the schema https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala#L706 maybe we rushed a bit to solution and the problem is the caller should call this one instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But it is true, ideally this method should handle update the schema as well. Otherwise the caller needs to make two separate alter calls and its more expensive
As this look like its by design: #16944 (comment) I close it in favor of : #51007 |
This bug was noticed by @gengliangwang , thanks!
What changes were proposed in this pull request?
Improve HiveExternalCatalog.alterTable to update column comments if they are changed in the schema.
Why are the changes needed?
Users use this method assuming comments should be updated.
Does this PR introduce any user-facing change?
No
How was this patch tested?
Add test to HiveDDLSuite
Was this patch authored or co-authored using generative AI tooling?
No