Commit 4717987

cutiechipan3793
authored and committed
[KYUUBI #7113] Skip Hadoop classpath check if flink-shaded-hadoop jar exists in Flink lib directory
### Why are the changes needed?

This change addresses an issue where the Flink engine in Kyuubi would perform a Hadoop classpath check even when a `flink-shaded-hadoop` jar is already present in the Flink `lib` directory. In such cases, the check is unnecessary and may cause confusion or warnings in environments where the shaded jar is used instead of a full Hadoop classpath. By skipping the check when a `flink-shaded-hadoop` jar exists, we improve compatibility and reduce unnecessary log output.

### How was this patch tested?

The patch was tested by deploying Kyuubi with a Flink environment that includes a `flink-shaded-hadoop` jar in the `lib` directory and verifying that the classpath check is correctly skipped. Additional tests ensured that the check still occurs when neither the Hadoop classpath nor the shaded jar is present. Unit tests and manual verification steps were performed to confirm the fix.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7113 from cutiechi/fix/flink-classpath-missing-hadoop-check.

Closes #7113

99a4bf8 [cutiechi] fix(flink): fix process builder suite
7b99987 [cutiechi] fix(flink): remove hadoop cp add
ea33258 [cutiechi] fix(flink): update flink hadoop classpath doc
6bb3b1d [cutiechi] fix(flink): optimize hadoop class path messages
c548ed6 [cutiechi] fix(flink): simplify classpath detection by merging hasHadoopJar conditions
9c16d54 [cutiechi] Update kyuubi-server/src/main/scala/org/apache/kyuubi/engine/flink/FlinkProcessBuilder.scala
0f729dc [cutiechi] fix(flink): skip hadoop classpath check if flink-shaded-hadoop jar exists

Authored-by: cutiechi <[email protected]>
Signed-off-by: Cheng Pan <[email protected]>
1 parent 8c5f461 commit 4717987

File tree

2 files changed: +32 -8 lines changed


docs/deployment/engine_on_yarn.md

Lines changed: 5 additions & 1 deletion

```diff
@@ -219,7 +219,11 @@ $ echo "export HADOOP_CONF_DIR=/path/to/hadoop/conf" >> $KYUUBI_HOME/conf/kyuubi

 #### Required Environment Variable

-The `FLINK_HADOOP_CLASSPATH` is required, too.
+The `FLINK_HADOOP_CLASSPATH` is required unless the necessary Hadoop client jars (such as `hadoop-client` or
+`flink-shaded-hadoop`) have already been placed in the Flink lib directory (`$FLINK_HOME/lib`).
+
+If the jars are not present in `$FLINK_HOME/lib`, you must set `FLINK_HADOOP_CLASSPATH` to include the appropriate
+Hadoop client jars.

 For users who are using Hadoop 3.x, Hadoop shaded client is recommended instead of Hadoop vanilla jars.
 For users who are using Hadoop 2.x, `FLINK_HADOOP_CLASSPATH` should be set to hadoop classpath to use Hadoop
```
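The doc change above describes two alternative setups. As a quick sketch (the jar filename and paths are illustrative examples, not taken from the patch), either of the following would satisfy the new requirement:

```shell
# Option 1: export the Hadoop classpath explicitly, as the docs suggest
echo 'export FLINK_HADOOP_CLASSPATH=$(hadoop classpath)' >> "$KYUUBI_HOME/conf/kyuubi-env.sh"

# Option 2: place a shaded Hadoop jar into Flink's lib directory instead
# (example version shown; use the jar matching your Hadoop distribution)
cp flink-shaded-hadoop-2-uber-2.8.3-10.0.jar "$FLINK_HOME/lib/"
```

With option 2 in place, the patched `FlinkProcessBuilder` detects the jar and skips the `FLINK_HADOOP_CLASSPATH` check entirely.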

kyuubi-server/src/main/scala/org/apache/kyuubi/engine/flink/FlinkProcessBuilder.scala

Lines changed: 27 additions & 7 deletions

```diff
@@ -225,21 +225,41 @@ class FlinkProcessBuilder(
     env.get("HBASE_CONF_DIR").foreach(classpathEntries.add)
     env.get("HIVE_CONF_DIR").foreach(classpathEntries.add)
     val hadoopCp = env.get(FLINK_HADOOP_CLASSPATH_KEY)
-    hadoopCp.foreach(classpathEntries.add)
     val extraCp = conf.get(ENGINE_FLINK_EXTRA_CLASSPATH)
     extraCp.foreach(classpathEntries.add)
-    if (hadoopCp.isEmpty && extraCp.isEmpty) {
-      warn(s"The conf of ${FLINK_HADOOP_CLASSPATH_KEY} and " +
-        s"${ENGINE_FLINK_EXTRA_CLASSPATH.key} is empty.")
+
+    val hasHadoopJar = {
+      val files = Paths.get(flinkHome)
+        .resolve("lib")
+        .toFile
+        .listFiles(new FilenameFilter {
+          override def accept(dir: File, name: String): Boolean = {
+            name.startsWith("hadoop-client") ||
+            name.startsWith("flink-shaded-hadoop")
+          }
+        })
+      files != null && files.nonEmpty
+    }
+
+    if (!hasHadoopJar) {
+      hadoopCp.foreach(classpathEntries.add)
+    }
+
+    if (!hasHadoopJar && hadoopCp.isEmpty && extraCp.isEmpty) {
+      warn(s"No Hadoop client jars found in $flinkHome/lib, and the conf of " +
+        s"$FLINK_HADOOP_CLASSPATH_KEY and ${ENGINE_FLINK_EXTRA_CLASSPATH.key} is empty.")
       debug("Detected development environment.")
       mainResource.foreach { path =>
         val devHadoopJars = Paths.get(path).getParent
           .resolve(s"scala-$SCALA_COMPILE_VERSION")
           .resolve("jars")
         if (!Files.exists(devHadoopJars)) {
-          throw new KyuubiException(s"The path $devHadoopJars does not exists. " +
-            s"Please set ${FLINK_HADOOP_CLASSPATH_KEY} or ${ENGINE_FLINK_EXTRA_CLASSPATH.key}" +
-            s" for configuring location of hadoop client jars, etc.")
+          throw new KyuubiException(
+            s"The path $devHadoopJars does not exist. Please set " +
+            s"${FLINK_HADOOP_CLASSPATH_KEY} or ${ENGINE_FLINK_EXTRA_CLASSPATH.key} " +
+            s"to configure the location of Hadoop client jars. Alternatively, " +
+            s"you can place the required hadoop-client or flink-shaded-hadoop jars " +
+            s"directly into the Flink lib directory: $flinkHome/lib.")
         }
         classpathEntries.add(s"$devHadoopJars${File.separator}*")
       }
```
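For readers who want to try the detection logic outside of Kyuubi, here is a minimal standalone sketch in Java mirroring the `hasHadoopJar` check in the Scala patch above (the class name `HadoopJarCheck`, the `main` demo, and the example jar filename are illustrative, not part of the commit):

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class HadoopJarCheck {

    // Returns true if FLINK_HOME/lib contains a bundled Hadoop jar,
    // using the same filename prefixes the patch checks.
    static boolean hasHadoopJar(String flinkHome) {
        File[] files = Paths.get(flinkHome)
            .resolve("lib")
            .toFile()
            .listFiles((dir, name) ->
                name.startsWith("hadoop-client") ||
                name.startsWith("flink-shaded-hadoop"));
        // listFiles returns null when the lib directory does not exist,
        // hence the null guard before checking for matches.
        return files != null && files.length > 0;
    }

    public static void main(String[] args) throws Exception {
        // Demo against a temporary FLINK_HOME with an empty lib directory.
        Path home = Files.createTempDirectory("flink-home");
        Path lib = Files.createDirectory(home.resolve("lib"));
        System.out.println(hasHadoopJar(home.toString())); // prints false

        // After dropping a shaded Hadoop jar into lib, the check passes.
        Files.createFile(lib.resolve("flink-shaded-hadoop-2-uber-2.8.3-10.0.jar"));
        System.out.println(hasHadoopJar(home.toString())); // prints true
    }
}
```

Note the null guard: `File.listFiles` returns `null` (not an empty array) when the path is missing or not a directory, which is why the patch checks `files != null` before `files.nonEmpty`.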
