cloudapi: git sha version api and journal for workers #4484

Open
wants to merge 5 commits into
base: main
from

Contributor

@lzap lzap commented Nov 20, 2024

Adds runtime build information, similar to what image-builder has, plus a logrus hook that attaches the git SHA to every log message. This tells us which version produced a given log line, since composer and workers sometimes run different versions.
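As a rough illustration of the "stamp every log line with the commit" idea, here is a minimal sketch. It deliberately swaps logrus for the standard library's log/slog so it runs without third-party modules; the PR itself uses a logrus hook, and the field name build_commit and the helpers newLogger and logOnce are assumptions made up for this example.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"log/slog"
)

// newLogger returns a JSON logger that attaches build_commit to every
// record it emits (hypothetical field name, not taken from the PR).
func newLogger(w io.Writer, commit string) *slog.Logger {
	return slog.New(slog.NewJSONHandler(w, nil)).With("build_commit", commit)
}

// logOnce emits a single info record and returns it decoded, so the
// attached field can be inspected.
func logOnce(commit, msg string) map[string]any {
	var buf bytes.Buffer
	newLogger(&buf, commit).Info(msg)

	rec := map[string]any{}
	if err := json.Unmarshal(buf.Bytes(), &rec); err != nil {
		panic(err)
	}
	return rec
}

func main() {
	rec := logOnce("ff2660", "Dequeued job")
	fmt.Println(rec["msg"], rec["build_commit"])
}
```

With logrus the equivalent would be a type implementing the Hook interface (Levels and Fire) that sets the field on each entry.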

This data is then exposed through a new, simple /version cloud API call that returns it in the same shape as IB does. My plan is to call this from IB too, so that IB reports both its own and composer's version info.

Finally, I noticed that workers produce pretty unreadable logs; each message is string-encoded JSON:

{"message":"{\"channel\":\"staging\",\"file\":\"/opt/app-root/src/internal/common/slogger/logrus.go:29\",\"func\":\"github.com/osbuild/osbuild-composer/internal/common/slogger.(*simpleLogrus).log\",\"job_dependencies\":\"[8402b9c0-ac94-4952-8609-1ad4e575a28a e30715cd-0a59-4993-a530-fdb2189ad2bb]\",\"job_id\":\"be17e941-d8f5-4de2-9eb3-e7a1b3ee41d8\",\"job_type\":\"manifest-id-only\",\"level\":\"info\",\"msg\":\"Dequeued job\",\"time\":\"2024-11-20T12:10:37Z\"}\n","ident":"osbuild-composer","host":"composer-5dcd8bfdf9-bhrv6"}
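To make the problem concrete: a consumer of the line above has to decode JSON twice, once for the outer envelope and once for the string inside message. A minimal sketch, assuming only the envelope keys visible in the sample (message, ident, host); the type envelope and the function unwrap are names invented for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// envelope mirrors the outer keys seen in the sample log line.
type envelope struct {
	Message string `json:"message"`
	Ident   string `json:"ident"`
	Host    string `json:"host"`
}

// unwrap decodes the outer envelope, then decodes the JSON document
// that is stored as a plain string in its "message" field.
func unwrap(line []byte) (map[string]any, error) {
	var env envelope
	if err := json.Unmarshal(line, &env); err != nil {
		return nil, fmt.Errorf("outer envelope: %w", err)
	}
	inner := map[string]any{}
	if err := json.Unmarshal([]byte(env.Message), &inner); err != nil {
		return nil, fmt.Errorf("inner record: %w", err)
	}
	return inner, nil
}

func main() {
	// Shortened version of the sample line above.
	line := []byte(`{"message":"{\"level\":\"info\",\"msg\":\"Dequeued job\"}\n","ident":"osbuild-composer"}`)
	rec, err := unwrap(line)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(rec["level"], rec["msg"])
}
```

Log backends that index only the outer document see message as one opaque string, which is why fields such as job_id are hard to search on.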

This patch builds on the native journald capability which I added last year and which is used in on-prem mode. Instead of using standard output/error, the worker now detects whether systemd-journald is running and uses its native API to send logs in structured form.

This means all logs are native now: they can be inspected with journalctl and, more importantly, they should reach Kibana/Splunk as regular JSON, improving our search capabilities:

journalctl -o verbose
...
    BUILD_TIME=2024-11-20T11:24:49Z
    BUILD_COMMIT=ff2660
    PRIORITY=3
    MESSAGE=Requesting job failed: Post "https://0:8881/api/worker/v1/jobs": dial tcp 0.0.0.0:8881: connect: connection refused
    _PID=254883
    _COMM=__debug_bin3968
    _EXE=/home/lzap/osbuild-composer/cmd/osbuild-worker/__debug_bin396826732
    _CMDLINE=/home/lzap/osbuild-composer/cmd/osbuild-worker/__debug_bin396826732 0:8881
    _SOURCE_REALTIME_TIMESTAMP=1732108805201853
Wed 2024-11-20 14:20:05.201871 CET [s=83cfd176837d4a15b19c452be25c8593;i=1ed8;b=3d1a7f8752454afda1a56b25debf50d9;m=664c2be936;t=62758020be16c;x=299933d4e616281a]
    _SELINUX_CONTEXT=kernel
    _BOOT_ID=3d1a7f8752454afda1a56b25debf50d9
    _MACHINE_ID=90304ac7a1274d6ca3abfee98b7975c1
    _HOSTNAME=dev.home.lan
    _RUNTIME_SCOPE=system
...
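For readers unfamiliar with the native journal protocol, here is a stdlib-only sketch of the two steps described above: detecting journald and sending structured fields. It is not the PR's code (which builds on the existing journald support in composer); the socket path is the standard systemd one, while journalAvailable, encodeEntry, send and the field type are hypothetical names. Large entries, which the real protocol passes via a memfd, are not handled here.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"net"
	"os"
	"strings"
)

// Standard location of the journald native-protocol datagram socket.
const journalSocket = "/run/systemd/journal/socket"

type field struct{ Key, Value string }

// journalAvailable reports whether the journald socket exists.
func journalAvailable() bool {
	fi, err := os.Stat(journalSocket)
	return err == nil && fi.Mode()&os.ModeSocket != 0
}

// encodeEntry serializes fields in the native journal format:
// "KEY=value\n" for single-line values, and for values containing a
// newline "KEY\n" + 64-bit little-endian length + value + "\n".
func encodeEntry(fields []field) []byte {
	var buf bytes.Buffer
	for _, f := range fields {
		if strings.ContainsRune(f.Value, '\n') {
			buf.WriteString(f.Key)
			buf.WriteByte('\n')
			var size [8]byte
			binary.LittleEndian.PutUint64(size[:], uint64(len(f.Value)))
			buf.Write(size[:])
			buf.WriteString(f.Value)
			buf.WriteByte('\n')
			continue
		}
		buf.WriteString(f.Key + "=" + f.Value + "\n")
	}
	return buf.Bytes()
}

// send writes one entry to journald as a single datagram.
func send(fields []field) error {
	conn, err := net.Dial("unixgram", journalSocket)
	if err != nil {
		return err
	}
	defer conn.Close()
	_, err = conn.Write(encodeEntry(fields))
	return err
}

func main() {
	entry := []field{
		{"MESSAGE", "Dequeued job"},
		{"PRIORITY", "6"},
		{"BUILD_COMMIT", "ff2660"},
	}
	if !journalAvailable() {
		// Fall back to plain output when journald is not present.
		fmt.Printf("journald not detected; entry would be:\n%s", encodeEntry(entry))
		return
	}
	if err := send(entry); err != nil {
		fmt.Fprintln(os.Stderr, "journal send failed:", err)
	}
}
```

Custom fields such as BUILD_COMMIT then show up next to the trusted underscore-prefixed fields in journalctl -o verbose, as in the output above.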

I know this is sort of two features in one PR but these are tiny changes, the API endpoint is super small and I wrote a test.

}
return ctx.JSON(http.StatusOK, version)
}

Contributor Author

We have the very same code in image-builder too; this is literally a copy-paste.

Contributor

@mvo5 mvo5 left a comment

Quick drive-by wondering inline

import "runtime/debug"

var (
// Git SHA commit (first 4 characters)
Contributor

This comment (4 chars) does not seem to match the implementation, AFAICT.

BuildCommit = "HEAD"
}

BuildGoVersion = bi.GoVersion
Contributor

Is bi actually defined (i.e. non-nil) if we get a !ok before? If not, we would crash here. I wonder if it wouldn't be cleaner to define the defaults directly, as in var BuildCommit = "N/A", and just return early with if !ok { return }(?)
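A minimal sketch of the shape suggested here, with defaults set up front and an early return when no build information is available. As far as I know, debug.ReadBuildInfo returns a nil pointer when ok is false, so the guard matters. The buildInfo type and newBuildInfo function are hypothetical names for this example, and the full revision is kept rather than a shortened one.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// buildInfo holds the values a version endpoint or log hook might
// report (hypothetical type for this sketch).
type buildInfo struct {
	Commit    string
	Time      string
	GoVersion string
}

// newBuildInfo starts from safe defaults and only overrides them when
// build information is actually present.
func newBuildInfo(bi *debug.BuildInfo, ok bool) buildInfo {
	info := buildInfo{Commit: "N/A", Time: "N/A", GoVersion: "N/A"}
	if !ok || bi == nil {
		return info // nothing to read; never dereference bi
	}
	info.GoVersion = bi.GoVersion
	for _, s := range bi.Settings {
		switch s.Key {
		case "vcs.revision":
			info.Commit = s.Value
		case "vcs.time":
			info.Time = s.Value
		}
	}
	return info
}

func main() {
	fmt.Printf("%+v\n", newBuildInfo(debug.ReadBuildInfo()))
}
```

The vcs.* settings are only populated when the binary is built from a VCS checkout with stamping enabled, so the defaults also cover builds made with go run or from a tarball.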

Contributor

@schuellerf schuellerf left a comment

No major comments beyond the ones already mentioned.
In image builder we changed the way some paths are handled so they don't need authentication, which might be neat for /version, but I'd approve as is, since this might be easier in a follow-up PR.

cmd/osbuild-worker/config.go (outdated, resolved)
@lzap lzap force-pushed the version-api branch 2 times, most recently from c1efe97 to 5df2627, November 27, 2024 08:38
RegisterHandlers(e.Group(path, mws...), &handler)

// no auth endpoints
e.GET("/status", func(c echo.Context) error {
Contributor

I like the paths without auth 😎
But should it be

Suggested change
- e.GET("/status", func(c echo.Context) error {
+ e.GET("/version", func(c echo.Context) error {

?

Contributor Author

I know right, but this is what is in the image-builder :-) Do you want me to break this pattern? I am fine with that, can also create both maybe?

Contributor

I think /version exists as it's in the openapi.v2.yml, so this might already result in both existing (which is good).
Rather, somehow ensure that /version is not authenticated.

Contributor Author

OK, added /version with non-auth middleware, and also added status/ready as fixed paths that simply return 200 without ANY middleware, because we do not want these to be logged or counted in metrics, as k8s calls them every now and then.
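The routing split described here can be sketched as follows. This uses the standard library's net/http instead of echo so it runs without extra modules; the PR registers routes on an echo instance. The /api/ prefix, the requireToken middleware, the JSON field name and the statusFor helper are all assumptions for illustration only.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// requireToken is a stand-in auth middleware: it rejects requests
// that carry no Authorization header.
func requireToken(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("Authorization") == "" {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}

func newRouter() http.Handler {
	mux := http.NewServeMux()

	// Regular API paths go through the auth middleware.
	api := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	mux.Handle("/api/", requireToken(api))

	// /version is served without authentication.
	mux.HandleFunc("/version", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		_ = json.NewEncoder(w).Encode(map[string]string{"build_commit": "HEAD"})
	})

	// Probe endpoints: fixed paths, plain 200, no middleware at all.
	ok := func(w http.ResponseWriter, r *http.Request) { w.WriteHeader(http.StatusOK) }
	mux.HandleFunc("/status", ok)
	mux.HandleFunc("/ready", ok)
	return mux
}

// statusFor performs an in-memory GET and returns the HTTP status code.
func statusFor(path, token string) int {
	req := httptest.NewRequest(http.MethodGet, path, nil)
	if token != "" {
		req.Header.Set("Authorization", "Bearer "+token)
	}
	rec := httptest.NewRecorder()
	newRouter().ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(statusFor("/version", ""))       // 200
	fmt.Println(statusFor("/api/jobs", ""))      // 401
	fmt.Println(statusFor("/api/jobs", "token")) // 200
}
```

Keeping the probe paths outside every middleware chain means frequent k8s health checks add neither log lines nor metric samples.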

@lzap lzap force-pushed the version-api branch 2 times, most recently from 5953c49 to ac5e75d, November 27, 2024 11:08
@@ -1,3 +1,6 @@
exclude-dirs:
- vendor

Contributor Author

For full transparency, I sneaked this hunk in with the last amend: I realized that the Go linter also lints vendor directories, which slows it down significantly. Can do a separate PR if requested.

@@ -17,6 +17,21 @@ servers:
description: current domain

paths:
/version:
Contributor

I think this is good, but when thinking about it … that's the version of osbuild-composer not the worker, right? (in contrast to the PR title and commit subject :-/ )

Contributor

The API is "push" in this case, so the worker would have to "push" its version, maybe?

Contributor Author

Oh yeah, I started the patch with the intention of including the git SHA in all log messages, which is done. Then the version endpoint was added; yeah, this is just for the composer.

We can definitely add workers at some point, out of scope for this patch I would say.

Contributor

So this could either be split into two commits, or at least the description (PR and commit) should note that this touches both worker and composer?

cmd/osbuild-worker/main.go (outdated, resolved)
@lzap lzap force-pushed the version-api branch 2 times, most recently from e229e0d to c4c9682, November 29, 2024 10:54
Contributor Author

lzap commented Nov 29, 2024

I have split it into multiple commits, rebased.

Contributor

@schuellerf schuellerf left a comment

lgtm, thanks!

Contributor Author

lzap commented Nov 29, 2024

The only test failing is edge-farm, rebasing.

Contributor Author

lzap commented Dec 4, 2024

@croissanne @ezr-ondrej @mvo5 please re-review; tests are good. I want to get this one merged to increase the readability of logs from composer and to continue improving logs with standardized correlation. Thanks!

4 participants