Support running task pods in other Kubernetes namespaces #17738
Conversation
...verlord-extensions/src/main/java/org/apache/druid/k8s/overlord/common/DruidK8sConstants.java
```diff
@@ -264,6 +264,7 @@ private Map<String, String> getJobLabels(KubernetesTaskRunnerConfig config, Task
     return ImmutableMap.<String, String>builder()
         .putAll(config.getLabels())
         .put(DruidK8sConstants.LABEL_KEY, "true")
+        .put(DruidK8sConstants.ORIGINAL_NAMESPACE_KEY, config.getOverlordNamespace())
```
does it work well when the namespace is undefined?
Under KubernetesTaskRunnerConfig, the namespace is annotated as non-null. Hence, we should be able to get the namespace.
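For illustration, here is a minimal sketch of the pattern described above: a Jackson-bound config whose namespace field defaults to a non-null empty string when the property is omitted. The class and field names are hypothetical stand-ins, not the actual KubernetesTaskRunnerConfig.

```java
import com.fasterxml.jackson.annotation.JsonProperty;
import javax.validation.constraints.NotNull;

// Hypothetical sketch: an omitted property falls back to a non-null default,
// so label construction can always read a namespace value.
public class ExampleTaskRunnerConfig
{
  @NotNull
  @JsonProperty
  private String overlordNamespace = "";

  public String getOverlordNamespace()
  {
    // Never null: either the configured value or the empty-string default.
    return overlordNamespace;
  }
}
```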
The extension uses `druid.indexer.runner.capacity` to limit the number of k8s jobs in flight. A good initial value for this would be the sum of the total task slots of all the middle managers you were running before switching to K8s-based ingestion. The K8s task runner uses one thread per Job that is created, so setting this number too large can cause memory issues on the overlord. Additionally, set the variable `druid.indexer.runner.namespace` to the namespace in which you are running Druid.

Other configurations required are:
`druid.indexer.runner.type: k8s` and `druid.indexer.task.encapsulatedTask: true`

### Running Task Pods in Another Namespace

It is possible to run task pods in a different namespace from the rest of your Druid cluster.
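For reference, a hedged sketch of what the combined runner configuration could look like in the Overlord's runtime.properties; the namespace names and the capacity value below are illustrative placeholders, not defaults.

```properties
# Illustrative overlord runtime.properties; all values below are example placeholders.
druid.indexer.runner.type=k8s
druid.indexer.task.encapsulatedTask=true

# Roughly the sum of the task slots of the middle managers being replaced.
druid.indexer.runner.capacity=16

# Namespace in which the task pods (Kubernetes Jobs) are created.
druid.indexer.runner.namespace=druid-tasks

# Namespace in which the Overlord itself runs, used to label and filter its own jobs.
druid.indexer.runner.overlordNamespace=druid
```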
since this feature is for pretty advanced kubernetes users and some of the complexity seems to be around trying to support the K8sTaskAdapter based pod adapters, would it make sense to just require using the pod template adapter if you want to use this feature?
hi @georgew5656, do you mean that we only add this functionality for PodTemplateTaskAdapter, and remove support in K8sTaskAdapter because Single/Multi ContainerTaskAdapter do not need to use this feature?
That makes sense because for non-pod-template mode, tasks are scheduled in the same namespace as the overlord.
done
Co-authored-by: Frank Chen <[email protected]>
Force-pushed from 8001e3f to a14434c
LGTM
…e adapter for deploying jobs in other namespaces
Changed logic to only support pod template adapters as suggested. Please take a look again.
@GWphua I am also going through the PR. Will come back to you by the end of the week.
I looked at the changes since @FrankChen021's review and they look good to me. There is documentation and test coverage, and the change is done in a way where the risk appears low if the new `druid.indexer.runner.overlordNamespace` parameter is not set.
@GWphua thank you for your contribution!
Description
This PR aims to allow task pods to run in a namespace different from the rest of the cluster.
Running Kubernetes Task Pods in Another Namespace
Currently, Druid already supports running task pods in another namespace by specifying `druid.indexer.runner.namespace: taskPodNamespace`. If we only have one Druid cluster that spins all these task pods up, we will not need this PR. However, running task pods from multiple clusters will be a problem. Let us imagine this scenario: a Druid cluster running in namespace C1 schedules its task pods into a shared task namespace X, alongside task pods created by other Druid clusters.
Dealing with Kubernetes Task Pods from Multiple Namespaces
The Overlord service filters peon jobs by labels. Hence, the intuition behind this PR is to simply label the peon jobs (`druid.overlord.namespace=C1`) so that the Overlord can filter the jobs that are relevant to its cluster. To that end, we introduce the new config `druid.indexer.runner.overlordNamespace`. Here's a usage example using the above-mentioned namespaces C1 and X: the task pods are still scheduled into namespace X through `druid.indexer.runner.namespace: X`, and the Overlord additionally sets `druid.indexer.runner.overlordNamespace: C1` so that the jobs it creates are labelled with its own namespace.
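To make the label-based filtering concrete, here is a minimal sketch assuming the fabric8 Kubernetes client (which the k8s-overlord extension builds on). The class name, label key, and namespace values are illustrative; this is not the PR's actual KubernetesPeonClient code.

```java
import io.fabric8.kubernetes.api.model.batch.v1.Job;
import io.fabric8.kubernetes.client.KubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClientBuilder;

import java.util.List;

// Illustrative only: list the Jobs in the shared task namespace X that carry
// this Overlord's label, so an Overlord in namespace C1 ignores jobs created
// by Druid clusters running in other namespaces.
public class OverlordJobFilterExample
{
  public static void main(String[] args)
  {
    final String taskNamespace = "X";       // druid.indexer.runner.namespace
    final String overlordNamespace = "C1";  // druid.indexer.runner.overlordNamespace

    try (KubernetesClient client = new KubernetesClientBuilder().build()) {
      List<Job> myJobs = client.batch().v1().jobs()
          .inNamespace(taskNamespace)
          .withLabel("druid.overlord.namespace", overlordNamespace)
          .list()
          .getItems();

      myJobs.forEach(job -> System.out.println(job.getMetadata().getName()));
    }
  }
}
```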
Documentation
Added documentation for running Kubernetes Task Pods in another namespace.
Release note
You can now run task pods in a namespace different from the rest of the cluster.
Key changed/added classes in this PR
- `k8s-jobs.md`
- `KubernetesTaskRunnerConfig.java`
- `KubernetesTaskRunnerFactory.java`
- `KubernetesPeonClient.java`
- `K8sTaskAdapter.java`
- `PodTemplateTaskAdapter.java`
- `DruidK8sConstants.java`
This PR has: