There is no debug level output in DescribeImageQuery #7723

Open
spkane opened this issue Feb 11, 2025 · 4 comments
Labels
  • feature: New feature or request
  • good-first-issue: Good for newcomers
  • help-wanted: Extra attention is needed
  • triage/accepted: Indicates that the issue has been accepted as a valid issue

Comments

spkane commented Feb 11, 2025

Description

func (a AL2) DescribeImageQuery(ctx context.Context, ssmProvider ssm.Provider, k8sVersion string, amiVersion string) (DescribeImageQuery, error) {
	ids := map[string][]Variant{}
	for path, variants := range map[string][]Variant{
		fmt.Sprintf("/aws/service/eks/optimized-ami/%s/amazon-linux-2/%s/image_id", k8sVersion, lo.Ternary(
			amiVersion == v1.AliasVersionLatest,
			"recommended",
			fmt.Sprintf("amazon-eks-node-%s-%s", k8sVersion, amiVersion),
		)): {VariantStandard},
		fmt.Sprintf("/aws/service/eks/optimized-ami/%s/amazon-linux-2-arm64/%s/image_id", k8sVersion, lo.Ternary(
			amiVersion == v1.AliasVersionLatest,
			"recommended",
			fmt.Sprintf("amazon-eks-arm64-node-%s-%s", k8sVersion, amiVersion),
		)): {VariantStandard},
		fmt.Sprintf("/aws/service/eks/optimized-ami/%s/amazon-linux-2-gpu/%s/image_id", k8sVersion, lo.Ternary(
			amiVersion == v1.AliasVersionLatest,
			"recommended",
			fmt.Sprintf("amazon-eks-gpu-node-%s-%s", k8sVersion, amiVersion),
		)): {VariantNeuron, VariantNvidia},
	} {
		imageID, err := ssmProvider.Get(ctx, ssm.Parameter{
			Name:      path,
			IsMutable: amiVersion == v1.AliasVersionLatest,
		})
		if err != nil {
			continue
		}
		ids[imageID] = variants
	}
	// Failed to discover any AMIs, we should short circuit AMI discovery
	if len(ids) == 0 {
		return DescribeImageQuery{}, fmt.Errorf(`failed to discover any AMIs for alias "al2@%s"`, amiVersion)
	}
	return DescribeImageQuery{
		Filters: []ec2types.Filter{{
			Name:   lo.ToPtr("image-id"),
			Values: lo.Keys(ids),
		}},
		KnownRequirements: lo.MapValues(ids, func(variants []Variant, _ string) []scheduling.Requirements {
			return lo.Map(variants, func(v Variant, _ int) scheduling.Requirements { return v.Requirements() })
		}),
	}, nil
}

Observed Behavior:
Info and Debug logging levels produce the same messages regarding AMI discovery.

karpenter-6bfb558fcf-v5k2k controller {"level":"ERROR","time":"2025-02-10T23:49:31.847Z","logger":"controller","caller":"controller/controller.go:261","message":"Reconciler error","commit":"62a726c","controller":"nodeclass.status","controllerGroup":"karpenter.k8s.aws","controllerKind":"EC2NodeClass","EC2NodeClass":{"name":"default"},"namespace":"","name":"default","reconcileID":"bcad9de4-242f-4799-94f6-a829ce1566fa","error":"getting amis, getting AMI queries, failed to discover any AMIs for alias \"al2@latest\""}

Expected Behavior:
With Debug logging enabled, I would expect additional messages showing the exact SSM parameter name that was queried and the response that was received (network timeout, permission denied, or some other unexpected response from the API).

e.g.

karpenter-6bfb558fcf-v5k2k controller {"level":"ERROR","time":"2025-02-10T23:49:31.847Z","logger":"controller","caller":"controller/controller.go:261","message":"Reconciler error","commit":"62a726c","controller":"nodeclass.status","controllerGroup":"karpenter.k8s.aws","controllerKind":"EC2NodeClass","EC2NodeClass":{"name":"default"},"namespace":"","name":"default","reconcileID":"bcad9de4-242f-4799-94f6-a829ce1566fa","error":"getting amis, getting AMI queries, failed to discover any AMIs for alias \"al2@latest\""}

karpenter-6bfb558fcf-v5k2k controller {"ssm_query": "/aws/service/eks/optimized-ami/1.29/amazon-linux-2/recommended/image_id", "region": "us-east-2", "response": "None", "error": true}

Reproduction Steps (Please include YAML):

Versions:

  • Chart Version: karpenter-1.0.1
  • Kubernetes Version (kubectl version): Server Version: v1.29.12-eks-2d5f260
  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment
spkane added the bug (Something isn't working) and needs-triage (Issues that need to be triaged) labels Feb 11, 2025
Contributor

jmdeal commented Feb 11, 2025

Closing as a duplicate of #7544.

jmdeal closed this as completed Feb 11, 2025
Author

spkane commented Feb 11, 2025

@jmdeal This isn't really a duplicate of the other issue unless you are going to add a note in the other issue to add debug output, which was really my point here, while the other issue is about the functionality not working in all circumstances.

Contributor

jmdeal commented Feb 11, 2025

That was the last outstanding action item in that issue; the root cause of the actual issue was an IAM misconfiguration. It does seem reasonable to close that issue out and track this here, though. I'll reopen.

Contributor

jmdeal commented Feb 11, 2025

I think it would also be appropriate to provide this information through the existing error log, rather than through additional debug logs. IIRC the main reason this wasn't logged originally is that we shipped a version of Karpenter with support for the AL2023 NVIDIA and Neuron AMIs before they were published, and we didn't want to show errors for SSM parameters which didn't exist. Given that's no longer the case, including this info in an error seems reasonable.

jmdeal added the feature (New feature or request), triage/accepted (Indicates that the issue has been accepted as a valid issue), and good-first-issue (Good for newcomers) labels and removed the bug (Something isn't working) and needs-triage (Issues that need to be triaged) labels Feb 12, 2025
jonathan-innis added the help-wanted (Extra attention is needed) label Feb 17, 2025