Skip to content

Commit

Permalink
RHOAIENG-16517: chore(tests): add new changed files rebuild helper fo…
Browse files Browse the repository at this point in the history
…r GitHub Actions PRs (#800)
  • Loading branch information
jiridanek authored Dec 11, 2024
1 parent 7419685 commit 059a835
Show file tree
Hide file tree
Showing 11 changed files with 516 additions and 0 deletions.
4 changes: 4 additions & 0 deletions .github/workflows/build-notebooks-pr.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,10 @@ jobs:
steps:
- uses: actions/checkout@v4

- uses: actions/setup-go@v5
with:
cache-dependency-path: "**/*.sum"

- name: Determine targets to build based on changed files
run: |
set -x
Expand Down
13 changes: 13 additions & 0 deletions Makefile
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
# https://stackoverflow.com/questions/18136918/how-to-get-current-relative-directory-of-your-makefile
ROOT_DIR := $(dir $(realpath $(lastword $(MAKEFILE_LIST))))
SELF ::= $(firstword $(MAKEFILE_LIST))

# https://tech.davis-hansson.com/p/make/
SHELL := bash
Expand Down Expand Up @@ -98,6 +99,18 @@ define image
)
endef

####################################### Build helpers #######################################

# https://stackoverflow.com/questions/78899903/how-to-create-a-make-target-which-is-an-implicit-dependency-for-all-other-target
skip-init-for := deploy% undeploy% test% scan-image-vulnerabilities
ifneq (,$(filter-out $(skip-init-for),$(MAKECMDGOALS) $(.DEFAULT_GOAL)))
$(SELF): bin/buildinputs
endif

bin/buildinputs: scripts/buildinputs/buildinputs.go scripts/buildinputs/go.mod scripts/buildinputs/go.sum
$(info Building a Go helper for Dockerfile dependency analysis...)
go build -C "scripts/buildinputs" -o "$(ROOT_DIR)/$@" ./...

####################################### Buildchain for Python 3.9 using ubi9 #######################################

# Build and push base-ubi9-python-3.9 image to the registry
Expand Down
17 changes: 17 additions & 0 deletions ci/cached-builds/gha_pr_changed_files.py
Original file line number Diff line number Diff line change
@@ -1,3 +1,4 @@
import json
import logging
import os
import pathlib
Expand Down Expand Up @@ -48,9 +49,22 @@ def should_build_target(changed_files: list[str], target_directories: list[str])
"""Returns truthy if there is at least one changed file necessitating a build.
Falsy (empty) string is returned otherwise."""
for directory in target_directories:
# detect change in the Dockerfile directory
for changed_file in changed_files:
if changed_file.startswith(directory):
return changed_file
# detect change in any of the files outside
stdout = subprocess.check_output([PROJECT_ROOT / "bin/buildinputs", directory + "/Dockerfile"],
text=True, cwd=PROJECT_ROOT)
logging.debug(f"{directory=} {stdout=}")
if stdout == "\n":
# no dependencies
continue
dependencies: list[str] = json.loads(stdout)
for dependency in dependencies:
for changed_file in changed_files:
if changed_file.startswith(dependency):
return changed_file
return ""


Expand Down Expand Up @@ -78,3 +92,6 @@ def test_analyze_build_directories(self):
assert set(directories) == {"base/ubi9-python-3.9",
"intel/base/gpu/ubi9-python-3.9",
"jupyter/intel/pytorch/ubi9-python-3.9"}

def test_should_build_target(self):
assert "" == should_build_target(["README.md"], ["jupyter/datascience/ubi9-python-3.11"])
12 changes: 12 additions & 0 deletions scripts/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -47,3 +47,15 @@ Create a Python 3.11 version based on the Python 3.9 version for each one in the
```sh
python scripts/new_python_based_image.py --context-dir . --source 3.9 --target 3.11 --match ./ --log-level DEBUG
```

## buildinputs/

CLI tool written in Go that computes the list of input files required to build a Dockerfile.
This is useful to determine what images need to be built on CI when testing a GitHub Pull Request.

### Examples

```shell
make bin/buildinputs
bin/buildinputs jupyter/datascience/ubi9-python-3.11/Dockerfile 2>/dev/null
```
36 changes: 36 additions & 0 deletions scripts/buildinputs/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,36 @@
# buildinputs

Tool to determine what files are required to build a given Dockerfile.

## Design

There are multiple possible solutions to this problem.

### Convert to LLB and analyze the LLB (chosen approach)

At present, the tool works by converting the Dockerfile into LLB (BuildKit's protobuf-based representation) and then
analysing that to get the referenced files.

### Parse the Dockerfile and resolve the AST

Docker provides Go functions to parse a Dockerfile to AST.
It does not provide public functions for the AST resolution, however.

The procedure can be copied either from the LLB convertor code or from regular build code.

It requires handling the following instructions (at minimum):

* `ARG`, `ENV` (file paths will require arg substitution)
* `ADD`, `COPY` (the two main ways to pull files into the build)
* `RUN` (has a `--mount=bind` flag)
* `ONBUILD` in a parent image (that would be probably fine to ignore, as it's not OCI)

### Parse (lex) the Dockerfile ourselves

This is also doable, and it would avoid having to use Go.

The main limitation is that if the implementation we have is incomplete,
then using a new Dockerfile feature (HEREDOC, ...) would require first implementing it in our parser,
which limits further progress.

Another difficulty is that we would surely have bugs and that would decrease the team's confidence in the CI system.
14 changes: 14 additions & 0 deletions scripts/buildinputs/buildinputs.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
package main

import (
"flag"
"fmt"
)

func main() {
flag.Parse()
for _, dockerfile := range flag.Args() {
deps := getDockerfileDeps(dockerfile)
fmt.Println(deps)
}
}
49 changes: 49 additions & 0 deletions scripts/buildinputs/buildinputs_test.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,49 @@
package main

import (
"encoding/json"
"os"
"path/filepath"
"testing"
)

func globDockerfiles(dir string) ([]string, error) {
files := make([]string, 0)
err := filepath.Walk(dir, func(path string, f os.FileInfo, err error) error {
if filepath.Base(path) == "Dockerfile" {
files = append(files, path)
}
return nil
})

return files, err
}

// TestParseAllDockerfiles checks there are no panics when processing all Dockerfiles we have
func TestParseAllDockerfiles(t *testing.T) {
projectRoot := "../../"
dockerfiles := noErr2(globDockerfiles(projectRoot))

if len(dockerfiles) < 6 {
t.Fatalf("not enough Dockerfiles found, got %+v", dockerfiles)
}

for _, dockerfile := range dockerfiles {
t.Run(dockerfile, func(t *testing.T) {
result := getDockerfileDeps(dockerfile)
if result == "" {
// no deps in the dockerfile
return
}
data := make([]string, 0)
noErr(json.Unmarshal([]byte(result), &data))
for _, path := range data {
stat := noErr2(os.Stat(filepath.Join(projectRoot, path)))
if stat.IsDir() {
// log this very interesting observation
t.Logf("dockerfile copies in a whole directory: %s", path)
}
}
})
}
}
114 changes: 114 additions & 0 deletions scripts/buildinputs/dockerfile.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,114 @@
package main

import (
"context"
"encoding/json"
"fmt"
"github.com/containerd/platforms"
"github.com/moby/buildkit/client/llb"
"github.com/moby/buildkit/client/llb/sourceresolver"
"github.com/moby/buildkit/frontend/dockerfile/dockerfile2llb"
"github.com/moby/buildkit/frontend/dockerfile/parser"
"github.com/moby/buildkit/frontend/dockerui"
"github.com/moby/buildkit/solver/pb"
"github.com/opencontainers/go-digest"
ocispecs "github.com/opencontainers/image-spec/specs-go/v1"
"github.com/pkg/errors"
"log"
"os"
"strings"
)

func getDockerfileDeps(dockerfile string) string {
ctx := context.Background()
data := noErr2(os.ReadFile(dockerfile))

st, _, _, _, err := dockerfile2llb.Dockerfile2LLB(ctx, data, dockerfile2llb.ConvertOpt{
// building an image requires fetching the metadata for its parent
// this fakes a parent so that this tool does not need to do network i/o
MetaResolver: &testResolver{
// random digest value
digest: "sha256:a1c7d58d98df3f9a67eda799200655b923ebc7a41cad1d9bb52723ae1c81ad17",
dir: "/",
platform: "linux/amd64",
},
Config: dockerui.Config{
BuildArgs: map[string]string{"BASE_IMAGE": "fake-image"},
BuildPlatforms: []ocispecs.Platform{{OS: "linux", Architecture: "amd64"}},
},
Warn: func(rulename, description, url, fmtmsg string, location []parser.Range) {
log.Printf(rulename, description, url, fmtmsg, location)
},
})
noErr(err)

definition := noErr2(st.Marshal(ctx))
return getOpSourceFollowPaths(definition)
}

func getOpSourceFollowPaths(definition *llb.Definition) string {
// https://earthly.dev/blog/compiling-containers-dockerfiles-llvm-and-buildkit/
// https://stackoverflow.com/questions/73067660/what-exactly-is-the-frontend-and-backend-of-docker-buildkit

ops := make([]llbOp, 0)
for _, dt := range definition.Def {
var op pb.Op
if err := op.UnmarshalVT(dt); err != nil {
panic("failed to parse op")
}
dgst := digest.FromBytes(dt)
ent := llbOp{Op: &op, Digest: dgst, OpMetadata: definition.Metadata[dgst].ToPB()}
ops = append(ops, ent)
}

for _, op := range ops {
switch op := op.Op.Op.(type) {
case *pb.Op_Source:
if strings.HasPrefix(op.Source.Identifier, "docker-image://") {
// no-op
} else if strings.HasPrefix(op.Source.Identifier, "local://") {
paths := op.Source.Attrs[pb.AttrFollowPaths]
return paths
} else {
panic(fmt.Errorf("unexpected prefix %v", op.Source.Identifier))
}
}
}
return ""
}

// llbOp holds data for a single loaded LLB op
type llbOp struct {
Op *pb.Op
Digest digest.Digest
OpMetadata *pb.OpMetadata
}

// testResolver provides a fake parent image manifest for the build
type testResolver struct {
digest digest.Digest
dir string
platform string
}

func (r *testResolver) ResolveImageConfig(ctx context.Context, ref string, opt sourceresolver.Opt) (string, digest.Digest, []byte, error) {
var img struct {
Config struct {
Env []string `json:"Env,omitempty"`
WorkingDir string `json:"WorkingDir,omitempty"`
User string `json:"User,omitempty"`
} `json:"config,omitempty"`
}

img.Config.WorkingDir = r.dir

if opt.Platform != nil {
r.platform = platforms.Format(*opt.Platform)
}

dt, err := json.Marshal(img)
if err != nil {
return "", "", nil, errors.WithStack(err)
}
return ref, r.digest, dt, nil
}
60 changes: 60 additions & 0 deletions scripts/buildinputs/go.mod
Original file line number Diff line number Diff line change
@@ -0,0 +1,60 @@
module dockerfile

go 1.22.0

require (
github.com/containerd/platforms v0.2.1
github.com/moby/buildkit v0.18.1
github.com/opencontainers/go-digest v1.0.0
github.com/opencontainers/image-spec v1.1.0
github.com/pkg/errors v0.9.1
)

require (
github.com/AdaLogics/go-fuzz-headers v0.0.0-20230811130428-ced1acdcaa24 // indirect
github.com/agext/levenshtein v1.2.3 // indirect
github.com/containerd/containerd v1.7.24 // indirect
github.com/containerd/errdefs v0.3.0 // indirect
github.com/containerd/log v0.1.0 // indirect
github.com/containerd/ttrpc v1.2.5 // indirect
github.com/containerd/typeurl/v2 v2.2.3 // indirect
github.com/distribution/reference v0.6.0 // indirect
github.com/docker/go-connections v0.5.0 // indirect
github.com/docker/go-units v0.5.0 // indirect
github.com/felixge/httpsnoop v1.0.4 // indirect
github.com/go-logr/logr v1.4.2 // indirect
github.com/go-logr/stdr v1.2.2 // indirect
github.com/gogo/protobuf v1.3.2 // indirect
github.com/golang/protobuf v1.5.4 // indirect
github.com/google/shlex v0.0.0-20191202100458-e7afc7fbc510 // indirect
github.com/google/uuid v1.6.0 // indirect
github.com/hashicorp/errwrap v1.1.0 // indirect
github.com/hashicorp/go-multierror v1.1.1 // indirect
github.com/in-toto/in-toto-golang v0.5.0 // indirect
github.com/klauspost/compress v1.17.11 // indirect
github.com/moby/docker-image-spec v1.3.1 // indirect
github.com/moby/locker v1.0.1 // indirect
github.com/moby/patternmatcher v0.6.0 // indirect
github.com/moby/sys/signal v0.7.1 // indirect
github.com/planetscale/vtprotobuf v0.6.1-0.20240319094008-0393e58bdf10 // indirect
github.com/secure-systems-lab/go-securesystemslib v0.4.0 // indirect
github.com/shibumi/go-pathspec v1.3.0 // indirect
github.com/sirupsen/logrus v1.9.3 // indirect
github.com/tonistiigi/fsutil v0.0.0-20241121093142-31cf1f437184 // indirect
github.com/tonistiigi/go-csvvalue v0.0.0-20240710180619-ddb21b71c0b4 // indirect
go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc v0.46.1 // indirect
go.opentelemetry.io/contrib/instrumentation/net/http/httptrace/otelhttptrace v0.46.1 // indirect
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.53.0 // indirect
go.opentelemetry.io/otel v1.28.0 // indirect
go.opentelemetry.io/otel/metric v1.28.0 // indirect
go.opentelemetry.io/otel/sdk v1.28.0 // indirect
go.opentelemetry.io/otel/trace v1.28.0 // indirect
golang.org/x/crypto v0.27.0 // indirect
golang.org/x/net v0.29.0 // indirect
golang.org/x/sync v0.8.0 // indirect
golang.org/x/sys v0.26.0 // indirect
golang.org/x/text v0.18.0 // indirect
google.golang.org/genproto/googleapis/rpc v0.0.0-20240701130421-f6361c86f094 // indirect
google.golang.org/grpc v1.66.3 // indirect
google.golang.org/protobuf v1.35.1 // indirect
)
Loading

0 comments on commit 059a835

Please sign in to comment.