feat: Airflow Listener integration #604

Open: adwk67 wants to merge 39 commits into main

Member

@adwk67 adwk67 commented Apr 4, 2025

Description

fixes: #580

Jenkins tests 🟢
OpenShift tests 🟢 (the logging test failed in the full run shown here, but passed when re-run in isolation):

--- FAIL: kuttl (1785.90s)
    --- FAIL: kuttl/harness (0.00s)
        --- PASS: kuttl/harness/ldap_airflow-latest-2.10.4_ldap-authentication-server-verification-tls_openshift-true_executor-kubernetes (300.02s)
        --- PASS: kuttl/harness/opa_airflow-2.10.4_opa-latest-1.0.1_openshift-true (333.59s)
        --- PASS: kuttl/harness/orphaned-resources_airflow-latest-2.10.4_openshift-true (177.55s)
        --- PASS: kuttl/harness/smoke_airflow-2.10.4_openshift-true_executor-kubernetes (172.15s)
        --- PASS: kuttl/harness/external-access_airflow-2.10.4_openshift-true (191.09s)
        --- PASS: kuttl/harness/cluster-operation_airflow-latest-2.10.4_openshift-true (289.47s)
        --- PASS: kuttl/harness/oidc_airflow-2.10.4_openshift-true (169.49s)
        --- PASS: kuttl/harness/resources_airflow-latest-2.10.4_openshift-true (169.28s)
        --- FAIL: kuttl/harness/logging_airflow-2.10.4_openshift-true_executor-kubernetes (836.07s)
        --- PASS: kuttl/harness/mount-dags-configmap_airflow-latest-2.10.4_openshift-true_executor-kubernetes (168.22s)
        --- PASS: kuttl/harness/overrides_airflow-latest-2.10.4_openshift-true (191.15s)
        --- PASS: kuttl/harness/mount-dags-gitsync_airflow-latest-2.10.4_openshift-true_executor-kubernetes (311.44s)
FAIL

Definition of Done Checklist

  • Not all of these items are applicable to all PRs; the author should update this template to leave in only the boxes that are relevant
  • Please make sure all these things are done and tick the boxes

Author

Reviewer

Acceptance

@adwk67 adwk67 marked this pull request as ready for review April 10, 2025 13:12
@adwk67 adwk67 moved this to Development: Waiting for Review in Stackable Engineering Apr 10, 2025
@adwk67 adwk67 self-assigned this Apr 10, 2025
@maltesander maltesander self-requested a review April 16, 2025 07:53
@maltesander maltesander moved this from Development: Waiting for Review to Development: In Review in Stackable Engineering Apr 16, 2025
Member

@maltesander maltesander left a comment

First batch, tests work fine, will do some more manual testing.

Member

@maltesander maltesander left a comment

Tested and works. Just some more nitpicking / clarification.

Member

@nightkr nightkr left a comment

(Not a complete review, but a few notes..)

Comment on lines 924 to 932
// all listeners will use ephemeral volumes as they can/should
// be removed when the pods are *terminated* (ephemeral PVCs will
// survive re-starts)
pb.add_listener_volume_by_listener_class(
    LISTENER_VOLUME_NAME,
    &listener_class.to_string(),
    &recommended_labels,
)
.context(AddVolumeSnafu)?;
Member

Clients don't care about which airflow webserver they connect to, right? In that case, we should use a group listener instead (https://docs.stackable.tech/home/stable/listener-operator/volume/#_listeners_stackable_techlistener_name), so we only end up with one load balancer that knows how to route everywhere.

Member Author

@adwk67 adwk67 Apr 25, 2025

If I've understood this correctly, the operator would then be responsible for creating a listener-by-name (as we do for the Kafka bootstrap listener) if the webserver role, or one of its rolegroups, declares a listener class of external-stable or external-unstable. This group listener is then passed as the name to ListenerReference::ListenerClass for the ListenerOperatorVolumeSourceBuilder for the PVCs? And this listener doesn't care if webserver rolegroups specify different classes (e.g. one as external-stable and another as external-unstable)? I'll do this per rolegroup, as we have done for Kafka.

Member Author

@adwk67 adwk67 Apr 25, 2025

Hmmm, but for Kafka we create a bootstrap-listener per role-group.
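
To make the trade-off in this thread concrete, here is a minimal sketch contrasting one shared (group) Listener per role group with one Listener per pod. The function names and the `<cluster>-<role>-<rolegroup>` naming scheme are assumptions made up for illustration; they are not taken from this PR or from listener-operator.

```rust
/// Hypothetical naming scheme: a single shared Listener per role group,
/// so the name does not depend on the number of replicas.
pub fn group_listener_name(cluster: &str, role: &str, rolegroup: &str) -> String {
    format!("{}-{}-{}", cluster, role, rolegroup)
}

/// For contrast (also hypothetical): with one Listener per pod, the number
/// of exposed endpoints grows with the replica count.
pub fn per_pod_listener_names(
    cluster: &str,
    role: &str,
    rolegroup: &str,
    replicas: u16,
) -> Vec<String> {
    (0..replicas)
        .map(|i| format!("{}-{}", group_listener_name(cluster, role, rolegroup), i))
        .collect()
}
```

With three webserver replicas, the per-pod variant yields three names (and potentially three load balancers), while the group variant yields one name that every pod in the role group would reference.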

Comment on lines 924 to 926
// all listeners will use ephemeral volumes as they can/should
// be removed when the pods are *terminated* (ephemeral PVCs will
// survive re-starts)
Member

We should use persistent volumes for web UIs IMO, since upstream load balancers will probably end up hard-coding the target addresses.

Comment on lines 508 to 535
  #[derive(Clone, Debug, Default, Display, Deserialize, Eq, JsonSchema, PartialEq, Serialize)]
  #[serde(rename_all = "PascalCase")]
- pub enum CurrentlySupportedListenerClasses {
+ pub enum SupportedListenerClasses {
      #[default]
      #[serde(rename = "cluster-internal")]
      #[strum(serialize = "cluster-internal")]
      ClusterInternal,

      #[serde(rename = "external-unstable")]
      #[strum(serialize = "external-unstable")]
      ExternalUnstable,

      #[serde(rename = "external-stable")]
      #[strum(serialize = "external-stable")]
      ExternalStable,
  }

- impl CurrentlySupportedListenerClasses {
-     pub fn k8s_service_type(&self) -> String {
+ impl Atomic for SupportedListenerClasses {}
+
+ impl SupportedListenerClasses {
+     pub fn discoverable(&self) -> bool {
          match self {
-             CurrentlySupportedListenerClasses::ClusterInternal => "ClusterIP".to_string(),
-             CurrentlySupportedListenerClasses::ExternalUnstable => "NodePort".to_string(),
-             CurrentlySupportedListenerClasses::ExternalStable => "LoadBalancer".to_string(),
+             SupportedListenerClasses::ClusterInternal => false,
+             SupportedListenerClasses::ExternalUnstable => true,
+             SupportedListenerClasses::ExternalStable => true,
          }
      }
  }
Member

Now that we're using listener-operator, the listener class name should just be an opaque string.

Member

Users are free to define their own custom listenerclasses, or even to redefine our preset ones (though the latter isn't really recommended since it could lead to upgrade strife..).
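
One way to read the "opaque string" suggestion is to replace the closed enum with a newtype that accepts any class name and leaves its interpretation to listener-operator. The following is a minimal, std-only sketch under that assumption; `ListenerClassName` and `EmptyListenerClassName` are hypothetical names, and serde/schema derives are deliberately left out to keep the example self-contained.

```rust
use std::fmt;
use std::str::FromStr;

/// Hypothetical newtype: the operator stores the class name verbatim and
/// does not try to interpret it.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct ListenerClassName(String);

/// Hypothetical error for the only input this sketch rejects.
#[derive(Debug, PartialEq, Eq)]
pub struct EmptyListenerClassName;

impl FromStr for ListenerClassName {
    type Err = EmptyListenerClassName;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // Only the empty string is rejected; any other name may refer to
        // a preset or a user-defined ListenerClass.
        if s.is_empty() {
            Err(EmptyListenerClassName)
        } else {
            Ok(ListenerClassName(s.to_owned()))
        }
    }
}

impl Default for ListenerClassName {
    fn default() -> Self {
        // Mirrors the `#[default]` variant of the enum under review.
        ListenerClassName("cluster-internal".to_owned())
    }
}

impl fmt::Display for ListenerClassName {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(&self.0)
    }
}
```

With this shape, `"my-custom-class".parse::<ListenerClassName>()` succeeds just like the preset names do, which is the property the enum cannot offer.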

Member Author

Looking at the implementation for kafka-operator, a PVC is created for the bootstrap listener even when it is not defined by the user (defaulting to cluster-internal). I guess there is no easy way around that, other than looking up the listener class and checking its type?

Member

kafka-operator behaves correctly there; even when running cluster-internally we want the Listener -> Service to be the ingress point that we expose.

Member Author

See dcc21f9. As we discussed offline, I've used a consistent address for the Webserver UI whatever the listener class.

Member Author

adwk67 commented Apr 28, 2025

Re-ran tests locally: 🟢

@adwk67 adwk67 requested review from maltesander and nightkr April 28, 2025 12:44
Labels
None yet
Projects
Status: Development: In Review
Development

Successfully merging this pull request may close these issues.

Integrate Airflow Operator with Listener Operator
3 participants