ALB is not created by the controller #4028

Open
Mostafa-wael opened this issue Jan 22, 2025 · 2 comments

@Mostafa-wael

I installed the AWS Load Balancer Controller using Helm, but it fails to provision an ALB on AWS when I create new Ingresses.

Steps to reproduce
For most of the steps, I was following guide 1, guide 2, and guide 3.

First of all, I associated the IAM OIDC provider with my EKS cluster using eksctl utils associate-iam-oidc-provider --region=us-east-1 --cluster=<name> --approve.
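
To confirm the association took effect, a quick verification sketch (not part of the original steps; <name> and the region are the same placeholders as above):

# Print the cluster's OIDC issuer, then check that a matching IAM OIDC provider exists.
aws eks describe-cluster --name <name> --region us-east-1 \
  --query "cluster.identity.oidc.issuer" --output text
aws iam list-open-id-connect-providers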

Roles & Policies

I used this Terraform file to create the required roles and policies and install the Helm chart:

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
    helm = {
      source  = "hashicorp/helm"
      version = "~> 2.9"
    }
  }
  required_version = ">= 1.5.0"
}

provider "aws" {
  region  = "us-east-1" # Change to your desired region
  profile = "default"   # Change to your AWS CLI profile if necessary
}

variable "cluster_name" {
  description = "The name of the EKS cluster"
  type        = string
}

variable "vpc_id" {
  description = "The ID of the VPC"
  type        = string
}

data "aws_iam_policy_document" "aws_lbc" {
  statement {
    effect = "Allow"

    principals {
      type        = "Service"
      identifiers = ["pods.eks.amazonaws.com"]
    }

    actions = [
      "sts:AssumeRole",
      "sts:TagSession"
    ]
  }
}

resource "aws_iam_role" "aws_lbc" {
  name               = "AmazonEKSLoadBalancerControllerRole"
  assume_role_policy = data.aws_iam_policy_document.aws_lbc.json
}


# I tried this command too: (aws iam create-policy --policy-name AWSLoadBalancerControllerIAMPolicy  --policy-document file://AWSLoadBalancerControllerIAMPolicy.json)
resource "aws_iam_policy" "aws_lbc" {
  # curl -O https://raw.githubusercontent.com/kubernetes-sigs/aws-load-balancer-controller/v2.11.0/docs/install/iam_policy.json
  policy = file("./iam/AWSLoadBalancerControllerIAMPolicy.json")
  name   = "AWSLoadBalancerControllerIAMPolicy"
}

resource "aws_iam_role_policy_attachment" "aws_lbc" {
  policy_arn = aws_iam_policy.aws_lbc.arn
  role       = aws_iam_role.aws_lbc.name
}

resource "aws_eks_pod_identity_association" "aws_lbc" {
  cluster_name    = var.cluster_name
  namespace       = "kube-system"
  service_account = "aws-load-balancer-controller"
  role_arn        = aws_iam_role.aws_lbc.arn
}

data "aws_eks_cluster" "eks" {
  name = "<name>"
}

data "aws_eks_cluster_auth" "eks" {
  name = "name>"
}

provider "helm" {
  kubernetes {
    host                   = data.aws_eks_cluster.eks.endpoint
    cluster_ca_certificate = base64decode(data.aws_eks_cluster.eks.certificate_authority[0].data)
    token                  = data.aws_eks_cluster_auth.eks.token
  }
}

# I deployed this after creating the service account
resource "helm_release" "aws_lbc" {
  name       = "aws-load-balancer-controller"
  repository = "https://aws.github.io/eks-charts"
  chart      = "aws-load-balancer-controller"
  namespace  = "kube-system"
  version    = "1.11.0"
  cleanup_on_fail = true

  set {
    name  = "clusterName"
    value = var.cluster_name
  }

  set {
    name  = "serviceAccount.create"
    value = "false"
  }

  set {
    name  = "serviceAccount.name"
    value = "aws-load-balancer-controller"
  }

  set {
    name  = "vpcId"
    value = var.vpc_id
  }
  set {
    name  = "region"
    value = "us-east-1"
  }

  set {
    name  = "replicaCount"
    value = 1
  }
  
}
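
For reference, a typical way to apply the file above (a sketch; the variable values are placeholders, not taken from the original report):

terraform init
terraform apply -var="cluster_name=<name>" -var="vpc_id=<vpc-id>"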

I also tried this AmazonEKSLoadBalancerControllerRole trust relationship:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "",
            "Effect": "Allow",
            "Principal": {
                "Federated": "arn:aws:I am::<accountID>:oidc-provider/oidc.eks.us-east-1.amazonaws.com/id/<id>"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringEquals": {
                    "oidc.eks.us-east-1.amazonaws.com/id/<id>:sub": "system:serviceaccount:kube-system:aws-load-balancer-controller"
                }
            }
        }
    ]
}
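
A quick way to double-check which trust policy actually ended up on the role (a hypothetical verification step; the role name matches the one created by the Terraform above):

# Show the assume-role (trust) policy currently attached to the role.
aws iam get-role --role-name AmazonEKSLoadBalancerControllerRole \
  --query "Role.AssumeRolePolicyDocument" --output json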

Furthermore, I tried installing the Helm chart using the Helm CLI instead of Terraform:

helm install aws-load-balancer-controller eks/aws-load-balancer-controller \
  --version 1.11.0 \
  --namespace kube-system \
  --set clusterName=<name> \
  --set serviceAccount.create=false \
  --set serviceAccount.name=aws-load-balancer-controller \
  --set vpcId=<> \
  --set region=us-east-1 \
  --set replicaCount=1 
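
To confirm the controller itself is running after either install path, a verification sketch (not part of the original commands):

kubectl -n kube-system get deployment aws-load-balancer-controller
kubectl -n kube-system logs deploy/aws-load-balancer-controller --tail=50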

Ingress

This is my ingress manifest:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: <>
  namespace: testing
  annotations:
    alb.ingress.kubernetes.io/certificate-arn: arn:aws:acm:us-east-1:<>:certificate/<>
    alb.ingress.kubernetes.io/group.name: new-shared-k8s-alb-group
    alb.ingress.kubernetes.io/healthcheck-path: /healthz
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTP": 80}, {"HTTPS":443}]'
    alb.ingress.kubernetes.io/load-balancer-name: new-shared-k8s-alb-group
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/ssl-policy: ELBSecurityPolicy-TLS-1-2-2017-01
    alb.ingress.kubernetes.io/ssl-redirect: "443"
    alb.ingress.kubernetes.io/target-type: IP
    # kubernetes.io/ingress.class: alb # I tried this too
spec:
  ingressClassName: alb
  rules:
    - host: <>
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: <>
                port:
                  number: 80
  tls:
    - hosts:
        - <>

Service Account

I created a service account before installing the chart using:

eksctl create iamserviceaccount \
    --cluster=<name> \
    --namespace=kube-system \
    --name=aws-load-balancer-controller \
    --attach-policy-arn=arn:aws:iam::<>:policy/AWSLoadBalancerControllerIAMPolicy \
    --override-existing-serviceaccounts \
    --approve

It created the service account with this auto-generated annotation:

Name:                aws-load-balancer-controller
Namespace:           kube-system
Labels:              app.kubernetes.io/managed-by=eksctl
Annotations:         eks.amazonaws.com/role-arn: arn:aws:iam::<>:role/eksctl-<clusterName>-addon-iamservice-Role1-mP2d1JUPOppx
Image pull secrets:  <none>
Mountable secrets:   <none>
Tokens:              <none>
Events:              <none>

It didn't work, so I updated the eks.amazonaws.com/role-arn annotation to arn:aws:iam::<>:role/AmazonEKSLoadBalancerControllerRole:

Name:                aws-load-balancer-controller
Namespace:           kube-system
Labels:              app.kubernetes.io/managed-by=eksctl
Annotations:         eks.amazonaws.com/role-arn: arn:aws:iam::<>:role/AmazonEKSLoadBalancerControllerRole
Image pull secrets:  <none>
Mountable secrets:   <none>
Tokens:              <none>

Unfortunately, it didn't make any difference.

Expected outcome

An ALB should be created and I should see an ALB assigned to my ingress resources.

Current outcome

  • In my ingress, I got: Failed build model due to operation error Elastic Load Balancing v2: DescribeLoadBalancers, get identity: get credentials: failed to refresh cached credentials, failed to load credentials, exceeded maximum number of attempts, 10, request send failed, Get "http://169.254.170.23/v1/credentials": dial tcp 169.254.170.23:80: i/o timeout
  • In the controller: {"level":"error","ts":"2025-01-21T20:37:36Z","msg":"Reconciler error","controller":"ingress","object":{"name":"new-shared-k8s-alb-group"},"namespace":"","name":"new-shared-k8s-alb-group","reconcileID":"18b4f2a0-0d46-46b3-a89a-e43cf2c58dd1","error":"operation error Elastic Load Balancing v2: DescribeLoadBalancers, get identity: get credentials: failed to refresh cached credentials, failed to load credentials, exceeded maximum number of attempts, 10, request send failed, Get \"http://169.254.170.2/ and 2025/01/21 20:49:32 http: TLS handshake error from 10.0.21.244:35898: EOF
  • Then I installed cert-manager using kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.15.0/cert-manager.yaml and the error became {"level":"error","ts":"2025-01-21T21:48:06Z","msg":"Reconciler error","controller":"ingress","object":{"name":"new-shared-k8s-alb-group"},"namespace":"","name":"new-shared-k8s-alb-group","reconcileID":"6cbe3d55-9104-469f-855f-c1d279b76d36","error":"operation error Elastic Load Balancing v2: DescribeLoadBalancers, get identity: get credentials: failed to refresh cached credentials, failed to load credentials, exceeded maximum number of attempts, 10, request send failed, Get \"http://169.254.170.23/v1/credentials\": dial tcp 169.254.170.23:80: i/o timeout"}

Environment

  • AWS Load Balancer controller version v2.11.0
  • Chart version 1.11.0
  • Using EKS (yes/no), if so version? Yes, v1.25.16-eks-2d5f260

Additional Context

I installed these CRDs too:

wget https://raw.githubusercontent.com/aws/eks-charts/master/stable/aws-load-balancer-controller/crds/crds.yaml
kubectl apply -f crds.yaml
# I tried this too
kubectl apply -k "github.com/aws/eks-charts/stable/aws-load-balancer-controller/crds?ref=master"
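
A quick check that the CRDs were actually registered (a hypothetical verification step, not from the original report):

kubectl get crd ingressclassparams.elbv2.k8s.aws targetgroupbindings.elbv2.k8s.aws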

@andreybutenko
Contributor

Hi! Sorry for the trouble.

As an initial troubleshooting step, could you please increase the hop limit for reaching IMDSv2 according to this guide? https://kubernetes-sigs.github.io/aws-load-balancer-controller/latest/deploy/installation/#using-the-amazon-ec2-instance-metadata-server-version-2-imdsv2

My teammates have recently resolved similar issues with this solution :)
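
For reference, the linked guide boils down to raising the IMDSv2 hop limit on the worker nodes, roughly like this (a sketch; the instance ID is a placeholder):

# Allow pods (one extra network hop) to reach the instance metadata service.
aws ec2 modify-instance-metadata-options \
  --instance-id i-0123456789abcdef0 \
  --http-put-response-hop-limit 2 \
  --http-tokens required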

@ivan-gta

ivan-gta commented Feb 6, 2025

I had the same issue; it appears that in my ingress definition I had to use the old annotation and not the new one:
kubernetes.io/ingress.class: alb - works (despite the deprecation warning when applied)
kubernetes.io/ingressClassName: alb - does not create any load balancer! I spent a few hours debugging and verifying everything before I tried this older annotation. Please fix.
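
For anyone who wants to try the same workaround on an existing Ingress, switching to the deprecated annotation can be done in place (a sketch; the Ingress name is a placeholder and the namespace matches the manifest above):

# Add the deprecated class annotation, then drop the ingressClassName field.
kubectl -n testing annotate ingress <name> kubernetes.io/ingress.class=alb
kubectl -n testing patch ingress <name> --type=json \
  -p='[{"op": "remove", "path": "/spec/ingressClassName"}]'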
