Modification of the pod observation #14

janLo · 2020-12-16T19:15:27Z

This is mainly an RFC, as I'm aware that it changes the current behavior quite a bit, and I'm fine with keeping it as a fork. Still, I'm curious what others think of these changes:

- I've removed the PodFailed branch from the pod observation because, in my case, the reason and message were always empty whenever I hit this branch. So I got a lot of "Pod/xyz: " events without any further helpful metadata. Now, whenever a pod changes, I always search for terminated containers.
- Every failed container now causes a Sentry event, not only the first one. They are grouped by a common fingerprint, though.
- I've used the generateName field instead of the UUID for the fingerprint (if the object has no owner), as it is stable for a lot of solutions that work with some kind of pod template (in my case Jenkins and Gitlab-Runner). If generateName is not provided, the name is used in a mangled form, like the sentry-kubernetes implementation did, to allow better grouping.
- I took the message out of the fingerprint, as it often contains very specific information like container IDs, which prevents any grouping.
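
To make the grouping idea concrete, here is a minimal sketch of how a grouping key could be derived for an object without an owner. The `ObjectMeta` struct, the `GroupKey` function and the `generatedSuffix` pattern are hypothetical stand-ins for illustration; in particular, the regular expression is an assumption and not the exact mangling rule used by sentry-kubernetes or by this PR.

```go
package main

import "regexp"

// ObjectMeta is a hypothetical stand-in for the two Kubernetes metadata
// fields this sketch needs (real code would read them from the pod object).
type ObjectMeta struct {
	Name         string
	GenerateName string
}

// generatedSuffix matches trailing "-segment" parts that contain at least
// one digit, e.g. "-5d4f8b7c9-x2x4z". This pattern is an assumption made
// for illustration only.
var generatedSuffix = regexp.MustCompile(`(-[a-z0-9]*[0-9][a-z0-9]*)+$`)

// GroupKey returns the value used for grouping: generateName when present,
// otherwise the name with generated-looking suffixes stripped.
func GroupKey(m ObjectMeta) string {
	if m.GenerateName != "" {
		return m.GenerateName
	}
	return generatedSuffix.ReplaceAllString(m.Name, "")
}
```

With this sketch, every pod created from the same template (same generateName) shares one key, and a pod named `web-5d4f8b7c9-x2x4z` without generateName falls back to `web`. Suffix segments without any digit are left untouched, which is a known limitation of this simplified pattern.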

The PodFailed case seemed pretty useless to me, as the message and reason were always empty, so I got a lot of meaningless events. I also think that any pod failure that is not due to failed containers will show up as an event and is better reported as such. Another disadvantage of the former implementation was that it only ever reported one container failure per pod.
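
As a rough illustration of reporting every failed container rather than only the first, the following sketch collects all terminated containers with a non-zero exit code. The `ContainerStatus` and `Terminated` types are simplified, hypothetical stand-ins for the Kubernetes container status types, and treating "failed" as "non-zero exit code" is an assumption of this sketch, not necessarily the check the PR performs.

```go
package main

// Terminated is a simplified stand-in for the terminated state of a
// container (exit code, reason and message).
type Terminated struct {
	ExitCode int32
	Reason   string
	Message  string
}

// ContainerStatus is a simplified stand-in for a container status.
// Terminated is nil while the container has not terminated.
type ContainerStatus struct {
	Name       string
	Terminated *Terminated
}

// failedContainers returns every container that terminated with a non-zero
// exit code, so that each one can be reported as its own event.
func failedContainers(statuses []ContainerStatus) []ContainerStatus {
	var failed []ContainerStatus
	for _, s := range statuses {
		if s.Terminated != nil && s.Terminated.ExitCode != 0 {
			failed = append(failed, s)
		}
	}
	return failed
}
```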

…VENT_LEVELS...

- Leaving SKIP_EVENT_LEVELS unset: defaults to the old behavior, "normal" type events will be skipped.
- Setting SKIP_EVENT_LEVELS to empty: no event type will be skipped.
- Setting SKIP_EVENT_LEVELS to a comma-separated event type list like "SKIP_EVENT_LEVELS=normal,warning": all events of those types will be skipped.

The filtering mechanism is case-insensitive, so "normal,warning" is the same as "nOrMal,WARning".
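
The three cases described above (unset, empty, list) could be handled roughly as follows. This is only a sketch: the function names `parseSkipLevels`, `skipLevelsFromEnv` and `shouldSkip` are hypothetical and not taken from the PR; the one real API it relies on is `os.LookupEnv`, which distinguishes an unset variable from one set to the empty string.

```go
package main

import (
	"os"
	"strings"
)

// parseSkipLevels builds the set of event types to skip.
// isSet tells an unset variable apart from one set to "".
func parseSkipLevels(raw string, isSet bool) map[string]bool {
	skip := map[string]bool{}
	if !isSet {
		// Unset: keep the old behavior and skip "normal" events.
		skip["normal"] = true
		return skip
	}
	for _, part := range strings.Split(raw, ",") {
		// Lower-casing makes the filter case-insensitive.
		if level := strings.ToLower(strings.TrimSpace(part)); level != "" {
			skip[level] = true
		}
	}
	return skip
}

// skipLevelsFromEnv reads SKIP_EVENT_LEVELS from the environment.
func skipLevelsFromEnv() map[string]bool {
	raw, isSet := os.LookupEnv("SKIP_EVENT_LEVELS")
	return parseSkipLevels(raw, isSet)
}

// shouldSkip reports whether an event of the given type is filtered out.
func shouldSkip(skip map[string]bool, eventType string) bool {
	return skip[strings.ToLower(eventType)]
}
```

Keeping the parsing separate from the environment lookup makes the unset/empty distinction easy to exercise without touching the process environment.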

added:
- NS annotation secunet.sentry/skip-event-reasons: allow configuration of skip event reasons per namespace
- env SKIP_EVENT_REASONS: global skip by reason config if NS has no skip config annotation
- Pod annotation secunet.sentry/ignore-pod-updates=true: allows for suppression of pod update event handling through the forwarder

changed:
- skip config declaration format now supports specifying a resource type before a criteria: [involved object type:]criteria[,...]
  E.g. normal,Pod:warning,Service:error
  See: parseSkipConfig(...) at skip.go:71
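
For illustration, a declaration in the `[involved object type:]criteria[,...]` format could be parsed along these lines. This is a sketch under assumptions and not the `parseSkipConfig(...)` from skip.go: the `skipRule` type and the `parseSkipRules` and `matches` functions are hypothetical, and comparing object types case-insensitively is an assumption of this sketch.

```go
package main

import "strings"

// skipRule is a hypothetical representation of one declaration entry.
// An empty Kind means the rule applies to every involved object type.
type skipRule struct {
	Kind     string
	Criteria string
}

// parseSkipRules splits a declaration such as
// "normal,Pod:warning,Service:error" into individual rules.
func parseSkipRules(decl string) []skipRule {
	var rules []skipRule
	for _, entry := range strings.Split(decl, ",") {
		entry = strings.TrimSpace(entry)
		if entry == "" {
			continue
		}
		kind, criteria := "", entry
		if i := strings.Index(entry, ":"); i >= 0 {
			kind = strings.TrimSpace(entry[:i])
			criteria = strings.TrimSpace(entry[i+1:])
		}
		rules = append(rules, skipRule{
			Kind:     strings.ToLower(kind),
			Criteria: strings.ToLower(criteria),
		})
	}
	return rules
}

// matches reports whether any rule applies to the given involved object
// type and event value (for example an event type or reason).
func matches(rules []skipRule, kind, value string) bool {
	kind, value = strings.ToLower(kind), strings.ToLower(value)
	for _, r := range rules {
		if (r.Kind == "" || r.Kind == kind) && r.Criteria == value {
			return true
		}
	}
	return false
}
```

With the example declaration from above, `normal` applies to every object type, while `warning` is skipped only for Pods and `error` only for Services.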

salzig · 2021-07-28T12:44:31Z

@wichert what's your opinion on this?

janLo added 4 commits December 16, 2020 20:14

feat: use the event reason as message if message is empty

75666ce

feat: add the controller that caused the event as extra

c22b10a

feat: Group on the generateName if available else the name.

c313bd1

janLo changed the title ~~Refactor pod observation~~ Modification of the pod observation Dec 16, 2020

janLo and others added 11 commits December 17, 2020 10:30

feat(fingerprint): mangle names in the same way as sentry-kubernetes

742d422

This introduces name mangling, as the sentry-kubernetes implementation does, for more aggressive issue grouping.

fix: take the event message out of the fingerprint

c6a8f6f

The message often contains unique information like container IDs or node names, so basically no grouping happens.

fix: Access proper term state and debork isNewTermination(...)

d6d4f63

Merge pull request #1 from paxbit/refactor-pod-observation

75683bc

Fix Pod Term Detection and Make SkipLevels Configurable

feat: favor a namespace's skipLevel annotation for the global skipLevel setting

9546afb

feat: safeguard ns init procedure

477771d

Merge pull request #2 from paxbit/refactor-pod-observation

9ea63f7

feat: favor a namespace's skipLevel annotation for the global skipLevel env

fix: annotation prefix "mismatch"

402512f

Merge pull request #4 from paxbit/refactor-pod-observation

f28ead0

feat: skip by reason and skip pod updates...

Tenzer mentioned this pull request Sep 27, 2021

Failed CronJob runs get re-raised until cleaned up (and don't have message) #13

Open
