feat: Add kubernetes daemonsets for fluentd 1.17 #36618

Merged: 6 commits into wolfi-dev:main from fluentd-daemonset, Dec 17, 2024

EyeCantCU
Member

@EyeCantCU EyeCantCU commented Dec 12, 2024

Adds Kubernetes DaemonSets for fluentd. Note that fluentd is vendored in the bundle, so it is not included as a package dependency. Upstream includes two separate installations of fluentd, which is unnecessary. Packages are added for all supported Ruby versions (3.1-3.3) and for 3.4 (in preview).

There isn't value in splitting the entrypoint scripts or configuration out for each variant, so those have been bundled together.
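
To make the scope concrete, here is a rough sketch of the package matrix this implies. The naming pattern is copied from the linter messages quoted later in this thread (for example `ruby3.1-fluentd-kubernetes-daemonset-1.17-kinesis`), and the variant names are the ones the build summaries mention. Treating the result as a full cross product of Ruby versions and variants is an assumption, and `package_names` is a hypothetical helper, not something from the PR.

```shell
#!/bin/sh
set -eu

# Hypothetical helper: print one package name per Ruby version and variant.
# The variant list is partial (only names mentioned in this thread), and the
# full cross product is an assumption.
package_names() {
  for ruby in 3.1 3.2 3.3 3.4; do
    for variant in elasticsearch7 azureblob cloudwatch kinesis; do
      echo "ruby${ruby}-fluentd-kubernetes-daemonset-1.17-${variant}"
    done
  done
}

package_names
```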

@octo-sts octo-sts bot added the bincapz/pass label (Bincapz, aka malcontent, scan didn't detect any CRITICALs on the scanned packages) Dec 12, 2024
@EyeCantCU EyeCantCU force-pushed the fluentd-daemonset branch 6 times, most recently from 8094e34 to 9ea59b6 Compare December 13, 2024 00:06
@mritunjaysharma394 mritunjaysharma394 enabled auto-merge (squash) December 13, 2024 02:15
Contributor

octo-sts bot commented Dec 13, 2024

Gen AI suggestions to solve the build error:

Looking at the build output, I actually don't see any errors - the build appears to have completed successfully. Let me break down what I see:

• Detected Error: None - the build completed successfully
• Error Category: N/A
• Failure Point: N/A
• Root Cause Analysis: The build output shows successful:

  • Package dependency scanning
  • Shared object dependency resolution
  • Package creation for all fluentd variants
  • APK index generation
  • Workspace cleanup

• Suggested Fix: No fix needed - the build is working as intended

• Explanation: The output shows:

  1. All shared library dependencies were found
  2. All packages were generated successfully
  3. The APK index was updated properly
  4. The build environment was cleaned up normally

• Additional Notes:

  • The build generated multiple fluentd variant packages (elasticsearch7, azureblob, etc.)
  • All required Ruby dependencies were properly resolved
  • The package versions align with the specified 1.17.1.1.2 version
  • The build system handled the multi-package generation correctly

• References: N/A - since there's no error to resolve

If you're seeing a different error in the CI system, please share that specific error message, as the build output provided shows a successful build completion.

octo-sts bot commented Dec 13, 2024

Gen AI suggestions to solve the build error:

• Detected Error: Failed to open usr/bin/fluentd: open usr/bin/fluentd: no such file or directory

• Error Category: Configuration

• Failure Point: Package scanning phase when looking for binary dependencies

• Root Cause Analysis: The build system expects to find the fluentd executable in usr/bin/fluentd, but it's not being installed to the correct location during the build process. This is likely because the bundle install is putting the executable in the vendor/bundle directory instead of the standard system path.

• Suggested Fix: Add a step to copy the fluentd executable to the correct location in the pipeline:

pipeline:
  - working-directory: ./docker-image/v${{vars.fluentdMM}}/debian-${{range.key}}
    runs: |
      # Existing steps...
      
      # Create bin directory and copy fluentd executable
      mkdir -p ${{targets.contextdir}}/usr/bin
      cp ${{targets.contextdir}}/fluentd/vendor/bundle/ruby/3.2.0/bin/fluentd ${{targets.contextdir}}/usr/bin/fluentd

• Explanation: The fix ensures the fluentd executable is placed in the standard system path (usr/bin) where the package scanner expects to find it. This maintains proper filesystem hierarchy standard (FHS) compliance while preserving the bundled gems in the vendor directory.

• Additional Notes:

  • The error is non-fatal for package creation but should be fixed for proper package structure
  • The fix maintains isolation of bundled gems while providing system-wide executable access
  • Similar pattern is common for Ruby applications packaged as system packages

• References:

octo-sts bot commented Dec 13, 2024

Gen AI suggestions to solve the build error:

• Detected Error: "Failed to open usr/bin/fluentd: open usr/bin/fluentd: no such file or directory"

• Error Category: Build Configuration

• Failure Point: Package scanning phase, specifically when looking for executable files

• Root Cause Analysis: The build system expects to find the fluentd executable in usr/bin/fluentd, but it's not being installed there during the build process. This is because the bundle install is putting the executables in the vendor/bundle directory instead of the standard system locations.

• Suggested Fix: Add a step in the pipeline to copy the fluentd executable to the correct location:

pipeline:
  - working-directory: ./docker-image/v${{vars.fluentdMM}}/debian-${{range.key}}
    runs: |
      # Previous steps...
      
      # Create bin directory and copy fluentd executable
      mkdir -p ${{targets.contextdir}}/usr/bin
      cp ${{targets.contextdir}}/fluentd/vendor/bundle/ruby/3.3.0/bin/fluentd ${{targets.contextdir}}/usr/bin/fluentd

• Explanation: The build is successfully installing fluentd and its dependencies, but the executable needs to be in the standard system path (usr/bin) for the package scanner to find it. This fix ensures the executable is properly placed where the system expects it.

• Additional Notes:

  • The error is non-fatal (packages are still being generated) but should be fixed for proper packaging
  • This is a common issue when packaging Ruby gems that install executables
  • The fix maintains the bundler-managed environment while providing system-level access to the executable

• References:

octo-sts bot commented Dec 13, 2024

Gen AI suggestions to solve the build error:

• Detected Error: "Failed to open usr/bin/fluentd: open usr/bin/fluentd: no such file or directory"

• Error Category: Configuration

• Failure Point: Package scanning phase when looking for binary files

• Root Cause Analysis: The build system is expecting to find the fluentd executable in usr/bin/fluentd, but it's not being installed there during the build process. This is likely because the bundle install is being done to a custom path but the executable isn't being properly symlinked.

• Suggested Fix:
Add these steps to the pipeline section after the bundle install:

- uses: bundle/copy
  with:
    directory: ${{targets.contextdir}}/fluentd/vendor/bundle
    name: fluentd

- runs: |
    mkdir -p ${{targets.contextdir}}/usr/bin
    ln -s /fluentd/vendor/bundle/ruby/3.2.0/bin/fluentd ${{targets.contextdir}}/usr/bin/fluentd

• Explanation:
The fix does two things:

  1. Properly copies the bundled gems to the final package location
  2. Creates the expected symlink in usr/bin pointing to the actual executable
    This matches the expected filesystem layout that the package scanner is looking for.

• Additional Notes:

  • The error occurs during package scanning, not during runtime
  • This is a common pattern for Ruby applications installed via bundler
  • The symlink approach maintains the proper Ruby load path while satisfying package requirements
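
One detail of the symlink approach is worth seeing in isolation: a link with an absolute target only resolves once the package is installed at `/`, so inside a staging directory it dangles. The sketch below is an illustration under assumptions, not the PR's actual pipeline. `stage` stands in for `${{targets.contextdir}}`, the bundled executable is an empty stub, and `fluentd-rel` is a hypothetical second link added only for comparison.

```shell
#!/bin/sh
set -eu

# Hypothetical staging directory standing in for ${{targets.contextdir}}.
stage=$(mktemp -d)
trap 'rm -rf "$stage"' EXIT

# Empty stub in place of the executable that `bundle install` would produce.
bin_rel=fluentd/vendor/bundle/ruby/3.2.0/bin/fluentd
mkdir -p "$stage/$(dirname "$bin_rel")"
touch "$stage/$bin_rel"

mkdir -p "$stage/usr/bin"

# Absolute target, as in the suggestion above: correct after install at /,
# but it does not point into the staging tree at build time.
ln -s "/$bin_rel" "$stage/usr/bin/fluentd"
abs_target=$(readlink "$stage/usr/bin/fluentd")

# Relative target (hypothetical alternative): resolves in the staging tree
# and after install alike.
ln -s "../../$bin_rel" "$stage/usr/bin/fluentd-rel"
if [ -e "$stage/usr/bin/fluentd-rel" ]; then rel_state=resolves; else rel_state=dangling; fi

echo "absolute link target: $abs_target"
echo "relative link in staging tree: $rel_state"
```

Whether a build-time scanner follows such links inside the staging tree depends on the tool, so it is worth checking both forms against the actual build.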

• References:

octo-sts bot commented Dec 13, 2024

Gen AI suggestions to solve the build error:

• Detected Error: "Failed to open usr/bin/fluentd: open usr/bin/fluentd: no such file or directory"

• Error Category: Configuration

• Failure Point: Package scanning phase when looking for binary files

• Root Cause Analysis: The build system expects to find the fluentd binary in usr/bin/fluentd, but it's not being installed to the correct location during the build process. This is likely because the package configuration isn't properly setting up the binary installation path.

• Suggested Fix:
Add a step in the pipeline to install the fluentd binary:

pipeline:
  - uses: git-checkout
    # ... existing checkout config ...
  
  # Add this new step after bundle install
  - runs: |
      mkdir -p ${{targets.destdir}}/usr/bin
      ln -s /fluentd/vendor/bundle/ruby/$(basename $(gem env home))/bin/fluentd ${{targets.destdir}}/usr/bin/fluentd

• Explanation:

  • The error occurs because the package scanner expects executables in standard system paths
  • The fix creates the expected directory structure and symlinks the fluentd binary from the gem installation to the standard system path
  • This maintains the bundled gem installation while satisfying the package scanner requirements

• Additional Notes:

  • The binary path issue doesn't affect functionality since the entrypoint script uses the full gem path
  • This is primarily to satisfy the package build system requirements
  • Alternative approach could be to modify the scanner configuration, but symlinking follows standard practices

• References:

octo-sts bot commented Dec 13, 2024

Gen AI suggestions to solve the build error:

• Detected Error: "Failed to open usr/bin/fluentd: open usr/bin/fluentd: no such file or directory"

• Error Category: Configuration

• Failure Point: Package scanning phase when looking for binary files

• Root Cause Analysis: The build system expects to find the fluentd binary in usr/bin/fluentd, but it's not being installed to the correct location during the build process. This is likely because the bundle install is placing executables in the vendor/bundle directory instead of the standard system binary location.

• Suggested Fix:
Add a step in the pipeline to symlink or copy the fluentd binary:

pipeline:
  # ... existing steps ...
  - runs: |
      mkdir -p ${{targets.destdir}}/usr/bin
      ln -s /fluentd/vendor/bundle/ruby/3.3.0/bin/fluentd ${{targets.destdir}}/usr/bin/fluentd

• Explanation: The fix creates the expected usr/bin directory structure and creates a symlink to the fluentd executable from its installed location in the vendored gems directory to the standard system binary location. This satisfies the package scanner's expectation while maintaining the bundled gem structure.

• Additional Notes:

  • The error is common when packaging Ruby applications that use bundler
  • The fix maintains the isolation provided by vendor/bundle while making the binary accessible
  • Alternative approach would be to modify the package scanner configuration, but providing the expected binary location is more standard

• References:

octo-sts bot commented Dec 13, 2024

Gen AI suggestions to solve the build error:

• Detected Error: "Failed to open usr/bin/fluentd: open usr/bin/fluentd: no such file or directory"

• Error Category: Configuration

• Failure Point: Package scanning phase when looking for binary files

• Root Cause Analysis: The build system expects to find the fluentd binary in usr/bin/fluentd, but it's not being installed to the correct location during the build process. This is likely because the package configuration isn't specifying where to install the binary.

• Suggested Fix: Add an installation step in the pipeline to copy the fluentd binary to the correct location:

pipeline:
  - working-directory: ./docker-image/v${{vars.fluentdMM}}/debian-${{range.key}}
    runs: |
      # Previous steps...
      
      # Create binary directory and install fluentd
      mkdir -p ${{targets.contextdir}}/usr/bin
      ln -s /fluentd/vendor/bundle/ruby/$(basename $(gem env home))/bin/fluentd ${{targets.contextdir}}/usr/bin/fluentd

• Explanation: The error occurs because Melange expects certain files to be in standard system locations. By creating the usr/bin directory and linking the fluentd binary there, we satisfy the package scanner's requirements while maintaining the correct runtime paths.

• Additional Notes:

  • The fix uses a symlink to maintain compatibility with the bundled gem installation
  • The binary path is dynamically determined using gem env home to handle different Ruby versions
  • This maintains the existing functionality while satisfying package build requirements
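
The suggestion above derives the Ruby ABI directory with `gem env home`. As a swapped-in alternative that does not need `gem` on the path, the directory can be read from the vendored tree itself. This is a sketch under assumptions: `stage` stands in for `${{targets.contextdir}}`, the tree is faked with an empty stub, and exactly one ABI directory is expected under `vendor/bundle/ruby`.

```shell
#!/bin/sh
set -eu

# Hypothetical staging tree standing in for ${{targets.contextdir}}.
stage=$(mktemp -d)
trap 'rm -rf "$stage"' EXIT
mkdir -p "$stage/fluentd/vendor/bundle/ruby/3.3.0/bin"
touch "$stage/fluentd/vendor/bundle/ruby/3.3.0/bin/fluentd"

# Expand the glob into the positional parameters and require exactly one
# existing directory, so an unexpected layout fails loudly.
set -- "$stage"/fluentd/vendor/bundle/ruby/*/
if [ "$#" -ne 1 ] || [ ! -d "$1" ]; then
  echo "expected exactly one Ruby ABI directory" >&2
  exit 1
fi
ruby_abi=$(basename "$1")

mkdir -p "$stage/usr/bin"
ln -s "/fluentd/vendor/bundle/ruby/$ruby_abi/bin/fluentd" "$stage/usr/bin/fluentd"
echo "linked against Ruby ABI directory $ruby_abi"
```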

• References:

octo-sts bot commented Dec 13, 2024

Gen AI suggestions to solve the build error:

• Detected Error: "Network error while fetching https://rubygems.org/quick/Marshal.4.8/fluentd-1.17.1.gemspec.rz (Failed to open TCP connection to rubygems.org:443 (execution expired))"

• Error Category: Network/Dependency

• Failure Point: Bundle install step during gem fetching

• Root Cause Analysis: The build is failing due to a network timeout while trying to fetch Ruby gems from rubygems.org. This is likely due to either temporary network issues or restrictive timeout settings.

• Suggested Fix:

  1. Add retry mechanism to bundler:
pipeline:
  - working-directory: ./docker-image/v${{vars.fluentdMM}}/debian-${{range.key}}
    runs: |
      # Configure bundler with a longer timeout and retries
      bundle config timeout 60
      bundle config retry 3
      bundle config specific_platform true
      
      # Rest of the existing bundle commands...
      bundle config silence_root_warning true
      bundle config set path ${{targets.contextdir}}/fluentd/vendor/bundle
      bundle config set clean 'true'
      bundle config set deployment 'true'
      bundle install --gemfile ./Gemfile --jobs $(nproc)

• Explanation:

  • The increased timeout gives more time for network operations
  • Retry mechanism helps handle temporary network issues
  • Setting specific_platform ensures proper gem selection for the target architecture

• Additional Notes:

  • Consider adding explicit gem source mirrors if rubygems.org continues to be unreliable
  • The error shows bundler version mismatch (2.5.23 vs 2.4.19) but this isn't the primary issue
  • The timeout issue might be more prevalent in CI environments due to network constraints

• References:

octo-sts bot commented Dec 13, 2024

Gen AI suggestions to solve the build error:

Looking at the build output, I don't actually see an error - the build appears to have completed successfully. The output shows:

  1. All packages were built successfully
  2. The index was updated with the new packages
  3. The workspace was cleaned up properly
  4. No error messages or failures were reported

The output includes normal scanning of shared object dependencies and package generation for all the fluentd variants (elasticsearch7, azureblob, cloudwatch, etc.).

• Detected Error: None found - build completed successfully
• Error Category: N/A
• Failure Point: N/A
• Root Cause Analysis: The build process executed as expected
• Suggested Fix: No fix needed - the build is working correctly
• Explanation: The output shows successful:

  • Package builds
  • Dependency scanning
  • APK index generation
  • Cleanup operations

• Additional Notes: The large amount of output is normal for this type of build as it generates multiple subpackages and scans many dependencies
• References: N/A

If you're seeing a specific error that's not shown in this output, please provide that error message so I can help diagnose the actual issue.

octo-sts bot commented Dec 13, 2024

Gen AI suggestions to solve the build error:

• Detected Error: "Network error while fetching https://rubygems.org/quick/Marshal.4.8/fluentd-1.17.1.gemspec.rz (Failed to open TCP connection to rubygems.org:443 (execution expired))"

• Error Category: Network/Dependency

• Failure Point: Bundle install step during subpackage build for azureblob variant

• Root Cause Analysis: The build is failing due to network connectivity issues when trying to fetch Ruby gems from rubygems.org. This appears to be a timeout issue during gem installation.

• Suggested Fix:

  1. Add retry mechanism for bundle install:
runs: |
  # Add retry logic for bundle install
  for i in $(seq 1 3); do
    bundle install --gemfile ./Gemfile --jobs $(nproc) && break || {
      if [ $i -lt 3 ]; then
        echo "Bundle install attempt $i failed! Retrying..."
        sleep 5
      else
        echo "Bundle install failed after 3 attempts!"
        exit 1
      fi
    }
  done
  2. Add explicit timeout configuration:
environment:
  contents:
    packages:
      - ...existing packages...
  variables:
    BUNDLE_TIMEOUT: "60"
    BUNDLE_RETRY: "3"

• Explanation: Network issues during gem installation are common in CI environments. Adding retry logic with backoff helps handle temporary network glitches, while explicit timeout configuration gives more time for large dependencies to download.

• Additional Notes:

  • The error occurs during the bundle install phase when trying to fetch the fluentd gem
  • Current behavior shows no retry mechanism
  • Bundle version mismatch (2.5.23 vs 2.4.19) is not the root cause but should be monitored
  • Consider adding a bundle config timeout setting if issues persist
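
The explanation above mentions retry logic with backoff, while the loop shown sleeps a fixed 5 seconds. A minimal sketch of an exponential-backoff wrapper follows. `retry` and `flaky` are hypothetical names, `flaky` is a stand-in that fails twice before succeeding, and the delay is set to 0 only so the demonstration finishes instantly.

```shell
#!/bin/sh
set -eu

# retry MAX DELAY CMD...: run CMD until it succeeds, at most MAX times,
# doubling DELAY (in seconds) after each failed attempt.
# POSIX sh has no local variables, so these names are global.
retry() {
  max=$1
  delay=$2
  shift 2
  attempt=1
  while :; do
    "$@" && return 0
    if [ "$attempt" -ge "$max" ]; then
      echo "failed after $attempt attempts: $*" >&2
      return 1
    fi
    echo "attempt $attempt failed; retrying in ${delay}s" >&2
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}

# Stand-in for a flaky network command: fails on calls 1 and 2, succeeds on 3.
count_file=$(mktemp)
trap 'rm -f "$count_file"' EXIT
echo 0 > "$count_file"
flaky() {
  n=$(( $(cat "$count_file") + 1 ))
  echo "$n" > "$count_file"
  [ "$n" -ge 3 ]
}

retry 5 0 flaky
echo "succeeded after $(cat "$count_file") calls"
```

In a build step the call would look something like `retry 3 5 bundle install --gemfile ./Gemfile`, which waits 5 and then 10 seconds between attempts and returns non-zero if all three fail.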

• References:

octo-sts bot commented Dec 13, 2024

Gen AI suggestions to solve the build error:

• Detected Error: "Network error while fetching https://rubygems.org/quick/Marshal.4.8/fluentd-1.17.1.gemspec.rz (Failed to open TCP connection to rubygems.org:443 (execution expired))"

• Error Category: Network/Dependency

• Failure Point: Bundle install step during gem fetching

• Root Cause Analysis: The build is failing due to a network timeout when trying to fetch Ruby gems from rubygems.org. This is typically caused by either network connectivity issues or temporary rubygems.org service disruptions.

• Suggested Fix:

  1. Add retry mechanism to bundler:
pipeline:
  - working-directory: ./docker-image/v${{vars.fluentdMM}}/debian-${{range.key}}
    runs: |
      # Configure bundler with retry
      bundle config set --local retry 3
      bundle config set --local timeout 60
      
      # Rest of your existing bundle commands...
  2. Alternative: Add explicit source mirror configuration:
pipeline:
  - working-directory: ./docker-image/v${{vars.fluentdMM}}/debian-${{range.key}}
    runs: |
      # Placeholder mirror URL; substitute a RubyGems mirror you actually operate
      bundle config mirror.https://rubygems.org https://gems.example.com
      # Rest of your existing bundle commands...

• Explanation:

  • Adding retry configuration allows bundler to attempt multiple times if the initial connection fails
  • Increasing the timeout gives more time for slow connections
  • Using a mirror can provide more reliable access to gems

• Additional Notes:

  • This is a transient error that might resolve on retry
  • Consider adding error handling around the bundle install command
  • If the issue persists, you might want to consider vendoring the gems or using a gem server proxy

• References:

octo-sts bot commented Dec 13, 2024

Gen AI suggestions to solve the build error:

• Detected Error: "linter 'object' failed on package 'ruby3.1-fluentd-kubernetes-daemonset-1.17-kinesis': package contains intermediate object file 'fluentd/vendor/bundle/ruby/3.1.0/gems/cool.io-1.9.0/ext/cool.io/buffer.o'"

• Error Category: Build Configuration

• Failure Point: Package linting phase during the build process

• Root Cause Analysis: The build process is including intermediate compilation objects (.o files) in the final package, which is not allowed by the Wolfi OS packaging guidelines. These object files are artifacts from the compilation process that should not be included in the final package.

• Suggested Fix: Add a cleanup step in the pipeline to remove object files after gem installation:

pipeline:
  - working-directory: ./docker-image/v${{vars.fluentdMM}}/debian-${{range.key}}
    runs: |
      # ... existing bundle install commands ...
      
      # Add cleanup of object files after bundle install
      find ${{targets.contextdir}}/fluentd/vendor/bundle -name "*.o" -type f -delete
      find ${{targets.contextdir}}/fluentd/vendor/bundle -name "*.lo" -type f -delete

• Explanation: The cleanup step removes intermediate compilation artifacts while keeping the necessary compiled shared objects (.so files) needed for the gems to function. This satisfies the Wolfi OS packaging requirements while maintaining package functionality.

• Additional Notes:

  • This is a common issue with Ruby gems that include native extensions
  • The .o files are only needed during compilation and can be safely removed
  • This approach is consistent with other distribution packaging practices
  • Make sure to test the package functionality after cleanup to ensure no required files are removed
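
The cleanup can be tried against a throwaway tree before touching the pipeline. In this sketch `buffer.o` is the file named in the linter error, while `libev.lo` and `cool.io_ext.so` are made-up file names used only to show what is deleted and what survives. The two `find` calls from the suggestion are combined into one.

```shell
#!/bin/sh
set -eu

# Throwaway tree imitating a vendored gem with a native extension.
root=$(mktemp -d)
trap 'rm -rf "$root"' EXIT
ext="$root/gems/cool.io-1.9.0/ext/cool.io"
mkdir -p "$ext"
touch "$ext/buffer.o" "$ext/libev.lo" "$ext/cool.io_ext.so"

# Delete intermediate object files; shared objects (.so) are not matched.
find "$root" -type f \( -name '*.o' -o -name '*.lo' \) -delete

# List what is left, relative to the tree root.
remaining=$(find "$root" -type f | sed "s|^$root/||")
echo "$remaining"
```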

• References:

octo-sts bot commented Dec 13, 2024

Gen AI suggestions to solve the build error:

• Detected Error: "linter 'object' failed on package 'ruby3.2-fluentd-kubernetes-daemonset-1.17-kinesis': package contains intermediate object file 'fluentd/vendor/bundle/ruby/3.2.0/gems/llhttp-ffi-0.5.0/ext/aarch64-darwin/llhttp/api.o'"

• Error Category: Build Configuration

• Failure Point: Package linting phase, specifically the object file linter

• Root Cause Analysis: The package includes compiled object files for a different architecture (aarch64-darwin) that shouldn't be included in the final package.

• Suggested Fix:
Add a cleanup step in the pipeline to remove intermediate object files:

pipeline:
  - working-directory: ./docker-image/v${{vars.fluentdMM}}/debian-${{range.key}}
    runs: |
      # ... existing build steps ...
      
      # Add cleanup step before packaging
      find ${{targets.contextdir}}/fluentd/vendor/bundle -name "*.o" -type f -delete

• Explanation:

  • The cleanup step removes all .o files which are intermediate compilation artifacts
  • This prevents architecture-specific object files from being included in the package
  • Removing these files doesn't affect functionality since the final compiled versions are already built
  • This addresses the linter error while maintaining package compatibility

• Additional Notes:

  • This is a common issue with Ruby gems that include native extensions
  • The object files are only needed during compilation, not at runtime
  • Similar cleanup may be needed for other intermediate build artifacts
  • Consider adding cleanup for other temp files (.a, .la, etc.) if present
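
A local check that mimics what the linter complains about can catch leftovers before the build does. This is a sketch, not the linter's real implementation: `check_no_objects` is a hypothetical helper that only looks for `.o` files, and the tree reuses the path from the error message with an empty file.

```shell
#!/bin/sh
set -eu

# check_no_objects DIR: succeed only if DIR contains no .o files;
# otherwise list the offenders on stderr and return non-zero.
check_no_objects() {
  leftovers=$(find "$1" -type f -name '*.o')
  if [ -n "$leftovers" ]; then
    echo "intermediate object files found:" >&2
    echo "$leftovers" >&2
    return 1
  fi
}

# Throwaway tree with the path from the error message above.
root=$(mktemp -d)
trap 'rm -rf "$root"' EXIT
mkdir -p "$root/gems/llhttp-ffi-0.5.0/ext/aarch64-darwin/llhttp"
touch "$root/gems/llhttp-ffi-0.5.0/ext/aarch64-darwin/llhttp/api.o"

if check_no_objects "$root" 2>/dev/null; then before=clean; else before=dirty; fi
find "$root" -type f -name '*.o' -delete
if check_no_objects "$root"; then after=clean; else after=dirty; fi

echo "before cleanup: $before, after cleanup: $after"
```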

• References:

octo-sts bot commented Dec 13, 2024

Gen AI suggestions to solve the build error:

• Detected Error: "Network error while fetching https://rubygems.org/quick/Marshal.4.8/fluentd-1.17.1.gemspec.rz (Failed to open TCP connection to rubygems.org:443 (execution expired))"

• Error Category: Network/Dependency

• Failure Point: Bundle install step during gem fetching

• Root Cause Analysis: The build is failing due to a network timeout while trying to fetch Ruby gems from rubygems.org. This is commonly caused by temporary network issues or restrictive timeouts.

• Suggested Fix:

  1. Add network retry settings to the bundle install command:
runs: |
  # Add environment variables for bundler
  export BUNDLE_RETRY=5
  export BUNDLE_JOBS=4
  export BUNDLE_TIMEOUT=60
  
  # Install bundle with modified settings
  bundle config silence_root_warning true
  bundle config set path ${{targets.contextdir}}/fluentd/vendor/bundle
  bundle config set clean 'true'
  bundle config set deployment 'true'
  # Retry, job count, and timeout come from the BUNDLE_* variables exported above
  bundle install --gemfile ./Gemfile

• Explanation:

  • The increased retry count (5) allows multiple attempts to fetch gems
  • Setting explicit timeout (60 seconds) gives more time for network operations
  • Reducing concurrent jobs can help prevent network congestion
  • These settings make the build more resilient to temporary network issues

• Additional Notes:

  • This is a common issue in CI environments with varying network conditions
  • The error is not related to the package itself but rather the build environment
  • If issues persist, consider using a gem mirror or caching service
  • Alternative approach would be to vendor the gems, but this isn't recommended for Wolfi's security model

• References:

octo-sts bot commented Dec 13, 2024

Gen AI suggestions to solve the build error:

• Detected Error: "Network error while fetching https://rubygems.org/quick/Marshal.4.8/fluentd-1.17.1.gemspec.rz (Failed to open TCP connection to rubygems.org:443 (execution expired))"

• Error Category: Network/Dependency

• Failure Point: Bundle install step when fetching gem metadata

• Root Cause Analysis: The build is failing due to a network timeout while attempting to fetch Ruby gems from rubygems.org. This is likely due to either network connectivity issues or timeout settings being too restrictive.

• Suggested Fix:

  1. Add retry mechanism to the bundle install command:
BUNDLE_TIMEOUT=60 bundle install --gemfile ./Gemfile --jobs $(nproc) --retry 5
  2. Add explicit source configuration:
- runs: |
    bundle config mirror.https://rubygems.org https://rubygems.org
    bundle config --local frozen false
    bundle config --local build.nokogiri --use-system-libraries
  3. Consider adding network-related packages to environment:
environment:
  contents:
    packages:
      - ca-certificates
      - openssl
      # ... existing packages ...

• Explanation:

  • Increasing retry attempts and timeout values helps handle temporary network issues
  • Explicit mirror configuration ensures proper source resolution
  • Additional SSL/TLS packages ensure proper network connectivity
  • The current error is typical of network connectivity issues in isolated build environments

• Additional Notes:

  • This is a common issue in CI environments with restrictive networking
  • The fix addresses both immediate connectivity and potential SSL certificate issues
  • Consider checking if rubygems.org is accessible from the build environment

• References:

octo-sts bot commented Dec 13, 2024

Gen AI suggestions to solve the build error:

• Detected Error: "Network error while fetching https://rubygems.org/quick/Marshal.4.8/fluentd-1.17.1.gemspec.rz (Failed to open TCP connection to rubygems.org:443 (execution expired))"

• Error Category: Network/Dependency

• Failure Point: Bundle install step during package build

• Root Cause Analysis: The build is failing due to a network timeout when trying to fetch Ruby gems from rubygems.org. This is likely due to either temporary network issues or restrictive timeout settings.

• Suggested Fix:

  1. Add retry logic to the bundle install command:
runs: |
  bundle config set retry 5
  bundle config set timeout 60
  # Rest of your existing bundle commands...
  bundle install --gemfile ./Gemfile --jobs $(nproc) --retry 5
  2. Consider adding explicit sources to the Gemfile:
source 'https://rubygems.org'

• Explanation:

  • Increasing retry attempts helps handle temporary network issues
  • Setting a longer timeout gives more time for gem downloads
  • Explicit source definition can help with connection reliability
  • These changes make the build more resilient to network instability

• Additional Notes:

  • This is a common issue in CI environments with network constraints
  • The error is not related to the package itself but rather the build environment
  • Consider caching gems if builds are frequent
  • If issues persist, you might want to consider using a gem mirror

• References:

@EyeCantCU EyeCantCU force-pushed the fluentd-daemonset branch 2 times, most recently from 37a7600 to cf6dd1b Compare December 13, 2024 22:14

octo-sts bot commented Dec 13, 2024

Gen AI suggestions to solve the build error:

• Detected Error: "Network error while fetching https://rubygems.org/quick/Marshal.4.8/fluentd-1.17.1.gemspec.rz (Failed to open TCP connection to rubygems.org:443 (execution expired))"

• Error Category: Network/Dependency

• Failure Point: Bundle install step during gem fetching

• Root Cause Analysis: The build is failing due to a network timeout while trying to fetch Ruby gems from rubygems.org. This appears to be a temporary network connectivity issue or timeout during the gem installation process.

• Suggested Fix:

  1. Add retry mechanism for bundle install:
pipeline:
  - working-directory: ./docker-image/v${{vars.fluentdMM}}/debian-${{range.key}}
    runs: |
      # Install bundle with an increased timeout and retries
      bundle config timeout 60
      ok=0
      for i in $(seq 1 3); do
        bundle install --gemfile ./Gemfile --jobs $(nproc) --retry 5 && ok=1 && break
        sleep 15
      done
      # Fail the step if every attempt failed
      [ "$ok" -eq 1 ] || exit 1

• Explanation:

  • Adding timeout configurations helps with slow connections
  • The retry loop attempts the bundle install up to 3 times
  • The sleep between retries allows for network recovery
  • The increased retry count in bundle install itself provides additional resilience

• Additional Notes:

  • This is likely a transient network issue rather than a package problem
  • The error occurs after successfully switching bundler versions
  • The build environment has all required system dependencies installed
  • Consider adding --verbose to bundle install for more detailed debugging if issues persist

• References:

@EyeCantCU EyeCantCU force-pushed the fluentd-daemonset branch 2 times, most recently from 3df98dd to 078b7c0 Compare December 16, 2024 02:19
@mritunjaysharma394 mritunjaysharma394 merged commit 192ea53 into wolfi-dev:main Dec 17, 2024
9 of 10 checks passed
3 participants