Use a headless service to give a hostname to the driver. #483
Required since SPARK-21642 was added upstream.
s" managed via a Kubernetes service.")

val preferredServiceName = s"$kubernetesResourceNamePrefix$DRIVER_SVC_POSTFIX"
val resolvedServiceName = if (preferredServiceName.length <= MAX_SERVICE_NAME_LENGTH) {
Because the name prefix is derived from the app name, but we don't want the app name to be constrained by service name limits. Nevertheless, this is still far from ideal, and I would appreciate suggestions on how we can do this better.
Because the service name isn't nearly as visible to users as pod names are, I don't think naming is all that important here. What you have seems fine to me!
We might expose the driver UI using the DNS name now. So, I think it would be good to make sure it's intelligible.
There isn't a great way to shorten the name given a long app name though. It should be fine to make the name less intelligible in the corner cases where the given app name is long.
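For readers following this thread, here is a minimal sketch of the kind of fallback being discussed: use the name derived from the app name when it fits within the 63-character limit, and otherwise switch to a shorter, less readable name. The object name, the `-driver-svc` suffix, and the timestamp-based fallback are assumptions made for illustration; they are not taken from the diff.

```scala
object DriverServiceNames {
  // Kubernetes service names must be DNS labels, which are at most 63 characters.
  val MaxServiceNameLength = 63

  // Hypothetical suffix; the real DRIVER_SVC_POSTFIX value is not shown in this thread.
  val DriverSvcPostfix = "-driver-svc"

  // Returns the preferred name if it fits, otherwise a short generated name.
  // `nowMillis` is passed in so the result is deterministic and easy to test.
  def resolve(resourceNamePrefix: String, nowMillis: Long): String = {
    val preferred = s"${resourceNamePrefix}${DriverSvcPostfix}"
    if (preferred.length <= MaxServiceNameLength) {
      preferred
    } else {
      // Assumed fallback: drop the app-derived prefix entirely.
      s"spark-${nowMillis}${DriverSvcPostfix}"
    }
  }
}
```

With this shape, only unusually long app names lose the readable prefix, which matches the trade-off accepted above.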
We should cherry pick SPARK-21642 into this PR to confirm this works with it
I wouldn't consider that a requirement for merging this into branch-2.2-kubernetes. We can double check when we propose to upstream master.
If we're doing this change so that this works with that commit, we should test against it at the same time as we make the fix. Otherwise we're not sure this works and we'll have to figure this out again in the future, but with context lost. What's the harm of cherry picking it in?
It's fine because we're hard-setting
Speaking of which - I just thought of another way to do this, which is to just have But I like the idea of having a FQDN associated with every driver pod, and using a FQDN based approach fits the spirit of what SPARK-21642 was trying to accomplish in the first place, anyways. The tradeoffs are worth considering though.
The upstream JIRA had FQDN to solve an SSL cert issue. Maybe we conditionally create the DNS name IF we have SSL turned on? I think that would be reasonable. In the other cases,
How much overhead is it to create the headless service? I would rather not fork the code paths if possible to reduce the complexity.
.build()

val namespace = submissionSparkConf.get(KUBERNETES_NAMESPACE)
val driverHostname = s"${driverService.getMetadata.getName}.$namespace.svc.cluster.local"
Does this entire string need to be <= 63 characters?
just each component between the dots
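To make the per-component limit concrete, here is a small sketch that builds the hostname in the same shape as the snippet under review and reports any dot-separated label longer than 63 characters. The object and method names are hypothetical, and the cluster domain is assumed to be `cluster.local` as in the diff.

```scala
object DriverHostnames {
  // Each dot-separated DNS label may be at most 63 characters.
  val MaxDnsLabelLength = 63

  // Same shape as the hostname in the diff; assumes the "cluster.local" domain.
  def driverHostname(serviceName: String, namespace: String): String =
    s"${serviceName}.${namespace}.svc.cluster.local"

  // Returns the labels that exceed the limit; empty when every label fits.
  def oversizedLabels(hostname: String): Seq[String] =
    hostname.split('.').toSeq.filter(_.length > MaxDnsLabelLength)
}
```

Since the service name is the only label derived from user input besides the namespace, capping the service name at 63 characters should be enough to keep the service label valid.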
Yeah, it wouldn't be too bad. I think we should just create the service then.
It is possible for
Is the instability significant and frequent enough in most clusters to warrant the extra complexity here? Basically, I don't think the cases we would optimize for with a forked approach (using the IP address vs. the headless service) would be worthwhile if the headless service works for us most of the time. I also foresee needing a service in general to expose the Spark UI in the not so distant future.
I don't know the answer. I used to see it fail more often in kube 1.5, and kube 1.6 seems to have made it better. It will really depend on how many resources kube-dns has. Or maybe it has HA that will mitigate the risk often enough. @foxish Do we know how well kube-dns performs in general?
@kimoonkim I haven't seen kube-dns perform badly in production clusters. It may occasionally take a bit longer to resolve DNS names, but that's not common. Let's add the headless service anyway, and keep our eyes peeled for issues reported because of it; at that point, we can consider forking the code paths and overriding
SGTM.
I'm taking that as a +1 from @kimoonkim and @foxish so will merge this when builds are green.
Thanks @mccheah!
…k-on-k8s#483)
* Use a headless service to give a hostname to the driver. Required since SPARK-21642 was added upstream.
* Fix scalastyle.
* Add back import
* Fix conflict properly.
* Fix orchestrator test.
Required since SPARK-21642 was added upstream. Closes #482.