Consider high risk evaluation result from other unrelated PRs #2317
base: master
Conversation
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: xueqzhan

The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing `/approve` in a comment.
if summary.RiskLevel == api.FailureRiskLevelHigh {
	riskAnalysisPRTestRiskMetric.WithLabelValues(org, repo, number, jobName, jobID, testSummary.Name).Set(float64(testSummary.Risk.Level.Level))
} else {
	riskAnalysisPRTestRiskMetric.DeleteLabelValues(org, repo, number, jobName, jobID, testSummary.Name)
I struggled to unset the metric gauge. At first I thought this would work, but in reality a failed test might pass in the next run and then will not be evaluated for risk analysis at all, which means it will not appear in the summary.
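One possible way around that, sketched here with hypothetical names (`highRiskGauge`, `testSummary`, `activeSeries`, and `publishHighRisk` are stand-ins, not the repository's real types), is to remember which label sets were published and explicitly delete any series that does not reappear as high risk, so a test that passes on the next run still gets cleaned up:

```go
package metrics

import (
	"strings"

	"github.com/prometheus/client_golang/prometheus"
)

// Hypothetical stand-ins for the PR's metric and summary shape.
var highRiskGauge = prometheus.NewGaugeVec(
	prometheus.GaugeOpts{Name: "risk_analysis_pr_test_risk"},
	[]string{"org", "repo", "number", "job_name", "job_id", "test"},
)

type testSummary struct {
	Name      string
	HighRisk  bool
	RiskLevel float64
}

// activeSeries remembers the label sets we last published, so a test that
// passes on the next run (and therefore never shows up in the summary again)
// still has its stale series deleted.
var activeSeries = map[string][]string{}

func publishHighRisk(summaries []testSummary, org, repo, number, jobName, jobID string) {
	seen := map[string]bool{}
	for _, ts := range summaries {
		if !ts.HighRisk {
			continue
		}
		labels := []string{org, repo, number, jobName, jobID, ts.Name}
		key := strings.Join(labels, "\x00")
		highRiskGauge.WithLabelValues(labels...).Set(ts.RiskLevel)
		activeSeries[key] = labels
		seen[key] = true
	}
	// Anything published previously but not seen this pass is stale: delete it.
	for key, labels := range activeSeries {
		if !seen[key] {
			highRiskGauge.DeleteLabelValues(labels...)
			delete(activeSeries, key)
		}
	}
}
```

This still assumes the publisher eventually revisits every org/repo/job combination it ever set, so it only papers over the problem that the daemon idea below would solve more cleanly.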
I wonder if we need a separate daemon processor for metrics, something that could look at the previous 24 hours and track the active count until it drops to 0 for 24 hours or so.
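A rough sketch of that daemon shape, assuming a hypothetical `fetchHighRiskCounts` query over the trailing window (the metric name and helper are assumptions, not part of this PR); rebuilding the gauge on every tick means a series naturally drops out once nothing in the last 24 hours references it:

```go
package metrics

import (
	"context"
	"time"

	"github.com/prometheus/client_golang/prometheus"
	log "github.com/sirupsen/logrus"
)

var recentHighRiskGauge = prometheus.NewGaugeVec(
	prometheus.GaugeOpts{Name: "risk_analysis_recent_high_risk_failures"},
	[]string{"test"},
)

// fetchHighRiskCounts is a hypothetical helper standing in for a query over
// the last `window` of risk analysis results, keyed by test name.
func fetchHighRiskCounts(ctx context.Context, window time.Duration) (map[string]int, error) {
	return map[string]int{}, nil
}

// runRiskMetricsDaemon rebuilds the gauge on every tick, so a test that stops
// failing ages out of the 24h window and its series disappears on the next pass.
func runRiskMetricsDaemon(ctx context.Context, interval time.Duration) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			counts, err := fetchHighRiskCounts(ctx, 24*time.Hour)
			if err != nil {
				log.WithError(err).Error("failed to refresh risk metrics")
				continue
			}
			recentHighRiskGauge.Reset()
			for test, count := range counts {
				recentHighRiskGauge.WithLabelValues(test).Set(float64(count))
			}
		}
	}
}
```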
It could be better to leave the metrics out and focus that work on TRT-1704.
At the moment, we only set this for FailureRiskLevelHigh. I would be interested in a metric that publishes the count of how many failures we have seen recently for a High Risk Failure; the greater the count, the more likely we are detecting a regression. Again, this work would likely be better in the other card.
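For the count idea, a minimal sketch (the metric name, label set, and `recordHighRiskFailure` hook are assumptions, not part of this PR) could be a counter incremented each time a high risk failure is observed, which a dashboard can then `rate()` over a short window:

```go
package metrics

import "github.com/prometheus/client_golang/prometheus"

// Hypothetical counter; name and labels are assumptions for the sketch.
var highRiskFailureTotal = prometheus.NewCounterVec(
	prometheus.CounterOpts{
		Name: "risk_analysis_high_risk_failure_total",
		Help: "High risk test failures observed, by test name.",
	},
	[]string{"test"},
)

// recordHighRiskFailure would be called wherever a FailureRiskLevelHigh result
// is produced; a rate over the counter then approximates "failures seen
// recently", where a growing value suggests a regression.
func recordHighRiskFailure(testName string) {
	highRiskFailureTotal.WithLabelValues(testName).Inc()
}
```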
Force-pushed from 5e6cb1b to c26d118 (Compare)
@xueqzhan: all tests passed! Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
@@ -280,6 +287,7 @@ func JobRunRiskAnalysis(dbc *db.DB, jobRun *models.ProwJobRun, jobRunTestCount i
			logger.WithError(err).Errorf("Failed to find matching jobIds for: %s", jobRun.ProwJob.Name)
		}
	}
	logger.Infof("Found %d unfilered matching jobs %d runs for: %s\njobs %+v", len(jobNames), totalJobRuns, jobRun.ProwJob.Name, jobNames)
s/unfilered/unfiltered
@@ -553,6 +572,55 @@ func runTestRunAnalysis(failedTest models.ProwJobRunTest, jobRun *models.ProwJob
	return analysis, nil
}

func isHighRiskInOtherPRs(bqc *bigquery.Client, failedTest models.ProwJobRunTest, jobRun *models.ProwJobRun) bool {
Have you tried running locally? I would be curious if we could find a case or fake one out to run through this logic.
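For faking a case locally, one approach (purely a sketch; `highRiskRowCount` and `isHighRiskElsewhere` are hypothetical names, assuming the BigQuery row-count lookup can be factored behind a small seam) would be a unit test that stubs the query out:

```go
package riskanalysis

import "testing"

// Hypothetical seam: the BigQuery row-count lookup is wrapped in a package-level
// function variable so a local run or unit test can stub it out.
var highRiskRowCount = func(testName string, prNumber int) (int64, error) {
	// the real implementation would run the BigQuery query here
	return 0, nil
}

// isHighRiskElsewhere is a simplified stand-in for isHighRiskInOtherPRs that
// only exercises the decision: any matching row in another PR means high risk.
func isHighRiskElsewhere(testName string, prNumber int) bool {
	count, err := highRiskRowCount(testName, prNumber)
	if err != nil {
		return false
	}
	return count > 0
}

func TestIsHighRiskElsewhere_FakedHit(t *testing.T) {
	orig := highRiskRowCount
	defer func() { highRiskRowCount = orig }()

	// Fake three high risk hits in other PRs.
	highRiskRowCount = func(string, int) (int64, error) { return 3, nil }

	if !isHighRiskElsewhere("[sig-example] some failing test", 1234) {
		t.Fatal("expected a faked hit in another PR to count as high risk")
	}
}
```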
@@ -161,7 +167,7 @@ func findReleaseMatchJobNames(dbc *db.DB, jobRun *models.ProwJobRun, compareRele
	}

	if len(jobs) > 0 {
		logger.Infof("Found %d matches with: %s", len(jobs), name)
		logger.Infof("Found %d matches with: %s\njobs %+v", len(jobs), name, jobs)
This is a lot of logging, isn't it? An array of jobs each time within a loop...
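One way to keep the detail without the volume (a sketch against the line above, assuming the existing logrus entry): keep the count at info and push the full slice down to debug, which only gets formatted when verbose logging is enabled:

```go
// Summary stays at info; the full job list only appears with debug logging on.
logger.Infof("Found %d matches with: %s", len(jobs), name)
logger.Debugf("matching jobs for %s: %+v", name, jobs)
```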
	}
	rowCount = values[0].(int64)
	if rowCount > 0 {
		log.Infof("High risk items found in other PRs for job %s test '%s'", jobRun.ProwJob.Name, failedTest.Test.Name)
How about adding the count to the logging? Maybe even returning the count so we could use it in a metric.
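A sketch of that shape, mirroring the excerpt above (the surrounding function body and the change of return signature are assumptions): include the count in the log line and hand it back to the caller so it can also drive a metric:

```go
rowCount = values[0].(int64)
if rowCount > 0 {
	log.Infof("%d high risk items found in other PRs for job %s test '%s'",
		rowCount, jobRun.ProwJob.Name, failedTest.Test.Name)
}
// Returning the count instead of a bool lets the caller both make the
// high risk decision (rowCount > 0) and publish the value as a metric.
return rowCount, nil
```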
analysis.Risk = apitype.TestFailureRisk{
	Level: apitype.FailureRiskLevelNone,
	Reasons: []string{
		"High risk was identified in other PRs first",
I'm wondering if this shouldn't go straight to None but instead be set to Medium, with the reason indicating that the test may have regressed external to the current PR: "Potential external regression detected for High Risk Test analysis".
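Something like this, reusing the struct from the excerpt above and assuming a FailureRiskLevelMedium constant exists alongside None and High (the reason string is just the wording suggested in this comment):

```go
analysis.Risk = apitype.TestFailureRisk{
	Level: apitype.FailureRiskLevelMedium,
	Reasons: []string{
		"Potential external regression detected for High Risk Test analysis: high risk was identified in other PRs first",
	},
}
```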
@@ -106,6 +111,7 @@ func FetchJobRun(dbc *db.DB, jobRunID int64, logger *log.Entry) (*models.ProwJob
	// Load the ProwJobRun, ProwJob, and failed tests:
	// TODO: we may want to expand to analyzing flakes here in the future
	res := dbc.DB.Joins("ProwJob").
		Preload("PullRequests").
Was this something that is already being used and gives a performance boost, or is it left over from previous work? (I didn't see a new reference to pull requests, but I might be missing it.)
With this there are still a few caveats: