Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Please support something like "allow-failure" for a given job #2347

Closed
mvo5 opened this issue Apr 1, 2020 · 172 comments
Closed

Please support something like "allow-failure" for a given job #2347

mvo5 opened this issue Apr 1, 2020 · 172 comments
Labels
enhancement New feature or request external

Comments

@mvo5
Copy link

mvo5 commented Apr 1, 2020

Edit from @vanZeben:

As this is a GitHub Actions platform feature request, discussion for this feature has been moved to https://github.com/orgs/community/discussions/15452, please go and put your support behind that community discussion instead of this issue.


We use github actions in our "snapd" project and we love them.

One small feature we would love to see is a way to mark a test job as "allow-failure" (or a term along these lines) [0]. This would simply mean that the overall /pulls overview page would show the PR as with the little green tick-mark (and maybe in the tooltip 5/6 OK, 1 ignored). It would still show as a failure in the details view (maybe with a different icon?).

Our use-case is that we have some CI environments that fail frequently because of external factors like repository mirrors that are our of sync etc. We still want to run the CI on these systems but not get distracted too much by these out-of-our-control issues.

Hope this makes sense.

Thanks!
Michael

[0] E.g.

jobs:
  spread:
    runs-on: ubuntu-latest
    allow-failure: true
    steps:
    - do-some-flaky-stuff
@mvo5 mvo5 added the enhancement New feature or request label Apr 1, 2020
@thboop
Copy link
Collaborator

thboop commented Apr 1, 2020

Hey @mvo5 Thanks for the feature request!

We do support marking a step to allow failure via continue-on-error.

We also support marking specific checks as required
That will allow you to merge pr's without a particular job succeeding.

I think the latter should solve your issue at the job level, is there a reason that doesn't well for you?

@mvo5
Copy link
Author

mvo5 commented Apr 1, 2020

Hey @thboop. Thanks for your quick reply!

Yeah, it's really just about the little green tick at the pull-request overview page. AFAICT when one job (even if it's not required) fails the overview PR list will show this PR as failed. Having a way to mark certain jobs as not rquired would still show the pulls as green (or yellow?) instead of red.

But I do understand this is a bit of a specific request, so feel free to close it if you think it's a bit too odd. We had it with our old CI system and I liked it.

@fingolfin
Copy link

I don't think it is that specific. People have requested this before: https://github.community/t5/GitHub-Actions/continue-on-error-allow-failure-UI-indication/td-p/37033

I just came here via Google as I was surprised I couldn't find anything like this in the documentation. It's standard with e.g. Travis CI

@jhenstridge
Copy link

I guess one way to handle this would be for jobs with an allow-failure flag set to generate a neutral conclusion on failure instead of failure. As I understand it, this would let the overall check suite conclusion be success if such a job fails.

@jadephilipoom
Copy link

I'm also an Actions user who would love to see this feature. My team's project has a long build and a few jobs that are flaky and frequently take a while to fix. We do just ignore failures on those for merging PRs, but being able to make the green tick agree with that convention would be really nice.

@jakesylvestre
Copy link

This would be great

@ghostsquad
Copy link

in the interim, what if there was a step that simply posted the "allowed failures" in a comment (that same comment updated each time the workflow ran). Similar CodeCov.

according to the docs, this is a detectable condition:

https://docs.github.com/en/actions/reference/context-and-expression-syntax-for-github-actions#steps-context

steps..outcome string The result of a completed step before continue-on-error is applied. Possible values are success, failure, cancelled, or skipped. When a continue-on-error step fails, the outcome is failure, but the final conclusion is success.
steps..conclusion string The result of a completed step after continue-on-error is applied. Possible values are success, failure, cancelled, or skipped. When a continue-on-error step fails, the outcome is failure, but the final conclusion is success.

@henryiii
Copy link

I thought that jobs.<job_id>.continue-on-error sounded exactly like this feature, but it doesn't seem to work like that. For me, at least, setting a job to "continue-on-error" still causes a red x for the workflow; it doesn't seem to have any effect at all.

@dopplershift
Copy link

By default, when any job within a workflow fails, all the other jobs are cancelled (unless 'fail-fast' is False).continue-on-error specifies that a certain job failing should not trigger cancelling other jobs. (I would definitely love some way to keep a job from signalling a failing check.)

@ramsey
Copy link

ramsey commented Sep 10, 2020

Not only does continue-on-error not work as expected (as mentioned by everyone on this thread), but the example in the documentation appears to result in a syntax error that I cannot figure out how to get around. As a result, I've been trying all sorts of permutations to figure out if this syntax error is causing continue-on-error not to work as expected.

Maybe it is?

Here's the documentation example verbatim:

runs-on: ${{ matrix.os }}
continue-on-error: ${{ matrix.experimental }}
strategy:
  fail-fast: false
  matrix:
    node: [11, 12]
    os: [macos-latest, ubuntu-18.04]
    experimental: [false]
    include:
      - node: 13
        os: ubuntu-18.04
        experimental: true

My version looks similar:

unit-tests:
  name: "Unit Tests"

  runs-on: ${{ matrix.operating-system }}
  continue-on-error: ${{ matrix.experimental }}

  strategy:
    fail-fast: false
    matrix:
      dependencies:
        - "lowest"
        - "highest"
      php-version:
        - "7.4"
        - "8.0"
      operating-system:
        - "ubuntu-latest"
      experimental: [false]
      include:
        - php: "8.0"
          composer-options: "--ignore-platform-reqs"
          experimental: true

When this runs, it fails at the unit-tests job, due to the following error. (See the failure here.)

Error when evaluating 'runs-on' for job 'unit-tests'. (Line: 121, Col: 14): Unexpected value ''

Somehow, it's getting an empty string for runs-on, or at least, that's what this error seems to indicate to me.

When I remove the line that defines experimental: [false] (and only this line), the build runs successfully (see here).

So, maybe the successful build is not actually taking into account the continue-on-error property? Does this syntax error help point to some kind of culprit that could resolve this issue for everyone on this thread?

@nschonni
Copy link

@ramsey your issue doesn't seem related to the thread. Your job is failing because your include doesn't have a value for operating-system

@ramsey
Copy link

ramsey commented Sep 11, 2020

Why does my include work without operating-system when I remove experimental: [false] from the matrix?

@ramsey
Copy link

ramsey commented Sep 17, 2020

@nschonni You were right. Turns out, I had my matrix messed up. Sorry for the churn here, y'all. 😀

@szepeviktor
Copy link

szepeviktor commented Sep 18, 2020

This is one solution

      - name: Install dependencies
        id: composer-run
        continue-on-error: true
        run: composer update --${{ matrix.dependency-version }} --prefer-dist --no-interaction --no-suggest

      - name: Execute tests
        if: steps.composer-run.outcome == 'success' && steps.composer-run.conclusion == 'success'
        run: vendor/bin/phpunit

continue-on-error causes difference in outcome and conclusion!

ch1bo added a commit to cardano-scaling/hydra that referenced this issue Sep 13, 2024
A bit of a hack to only run the network tests we expect to succeed. This
at least ensures we don't get any worse, even if it doesn't directly
allow us to track if we're getting better.

See also actions/runner#2347
dead-claudia added a commit to dead-claudia/mithril.js that referenced this issue Sep 23, 2024
Tests still appear to fail per MithrilJS#2898. Unfortunately, I need actions/runner#2347 to ignore the test failures properly - I need them to be warnings, not hard errors.
@mathis-la-debrouille
Copy link

Drone and others have it, and while it may seem minor, good UI for CI is incredibly important. I hope to see this 'on' soon.

@wsanchez
Copy link

I got sufficiently tired of this that I came up with a work-around. Posting it here in case others find it useful:

https://github.com/burningmantech/ranger-ims-server/pull/1347/files

What I did gets me:

  • a green checkmark for the job (not great, but this is the trade-off we are making)
  • a green checkmark for the workflow (the goal here)
  • a comment in the PR indicating that an optional build failed (replacement for red checkmark for the job)
Screenshot 2024-10-31 at 10 42 26 AM

@Caellian
Copy link

@wsanchez Thank you. That support for a yellow exclamation mark instead of a green chackmark is basically the "good UI for CI is incredibly important" part.

So until we get the (apparently) super difficult to implement different status value in DB, complicated backend support for if (step.allow_failure && step.failed) { status = warn } else { existing logic }, and advanced yellow color support the likes of which even CIE can't comprehend in frontend, that's sadly the best DX that can be supported.

Management is probably too busy bringing 3 new LM models to Copilot - a gimmick for investors that will wow them for precisely maybe another quarter - to push for actually useful features to the product, that most devs want and need.

@pboling
Copy link

pboling commented Nov 16, 2024

GitHub is actively anti-developer at this point.

@sthibaul
Copy link

I would need this as well... To be able to have non-regression testing even when we know we have some failures that we will handle later.

@bswck
Copy link

bswck commented Feb 12, 2025

Check out https://github.com/marketplace/actions/alls-green which allows to you to specify allowed failures explicitly, like:

  check:
    if: always()

    needs:
    - docs
    - linters
    - package-linters
    - tests

    runs-on: ubuntu-latest

    steps:
    - name: Decide whether the needed jobs succeeded or failed
      uses: re-actors/alls-green@release/v1
      with:
        allowed-failures: docs, linters
        allowed-skips: non-voting-flaky-job
        jobs: ${{ toJSON(needs) }}

@ljharb
Copy link

ljharb commented Feb 12, 2025

That seems to only work if all your jobs are in the same workflow file, unfortunately.

@jacobtomlinson
Copy link

It looks like a good solution for branch protection rules, but it will still show the PR with an ❌ as the overall status.

@bswck
Copy link

bswck commented Feb 13, 2025

It looks like a good solution for branch protection rules, but it will still show the PR with an ❌ as the overall status.

Yeah, I ended up finally not using it just because of that :/

@mansona
Copy link

mansona commented Feb 13, 2025

I built something that takes a different approach: https://github.com/mainmatter/continue-on-error-comment

It shows the overall PR as a ✅ and adds a comment if any of the jobs with continue-on-error have failed. The only drawback right now is I haven't closed the loop on being able to make this work for PRs from people's forks but I know there are some workarounds for that

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request external
Projects
None yet
Development

No branches or pull requests