fix: fix incorrect skip result evaluation causing false positives in PyPI malware reporting #1031
base: staging
Conversation
Force-pushed from 8af93d3 to 97bb593.
src/macaron/slsa_analyzer/checks/detect_malicious_metadata_check.py (outdated review comments, resolved)
{Confidence.MEDIUM.value}::result("medium_confidence_2") :-
    quickUndetailed,
    failed({Heuristics.ONE_RELEASE.value}),
    passed({Heuristics.WHEEL_ABSENCE.value}),
Why is Heuristics.WHEEL_ABSENCE.value passed here?
That's a result of the translation from these combinations:
(
HeuristicResult.FAIL, # Empty Project
HeuristicResult.SKIP, # Source Code Repo
HeuristicResult.FAIL, # One Release
HeuristicResult.SKIP, # High Release Frequency
HeuristicResult.SKIP, # Unchanged Release
HeuristicResult.FAIL, # Closer Release Join Date
HeuristicResult.PASS, # Suspicious Setup
HeuristicResult.PASS, # Wheel Absence
HeuristicResult.FAIL, # Anomalous Version
# No project link, only one release, and the maintainer released it shortly
# after account registration.
# The setup.py file has no effect and .whl file is present.
# The version number is anomalous.
): Confidence.MEDIUM,
(
HeuristicResult.FAIL, # Empty Project
HeuristicResult.SKIP, # Source Code Repo
HeuristicResult.FAIL, # One Release
HeuristicResult.SKIP, # High Release Frequency
HeuristicResult.SKIP, # Unchanged Release
HeuristicResult.FAIL, # Closer Release Join Date
HeuristicResult.FAIL, # Suspicious Setup
HeuristicResult.PASS, # Wheel Absence
HeuristicResult.FAIL, # Anomalous Version
# No project link, only one release, and the maintainer released it shortly
# after account registration.
# The setup.py file has no effect and .whl file is present.
# The version number is anomalous.
): Confidence.MEDIUM,
(
HeuristicResult.FAIL, # Empty Project
HeuristicResult.SKIP, # Source Code Repo
HeuristicResult.FAIL, # One Release
HeuristicResult.SKIP, # High Release Frequency
HeuristicResult.SKIP, # Unchanged Release
HeuristicResult.FAIL, # Closer Release Join Date
HeuristicResult.SKIP, # Suspicious Setup
HeuristicResult.PASS, # Wheel Absence
HeuristicResult.FAIL, # Anomalous Version
# No project link, only one release, and the maintainer released it shortly
# after account registration.
# The setup.py file has no effect and .whl file is present.
# The version number is anomalous.
): Confidence.MEDIUM,
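For illustration only: the three combinations above differ only in the Suspicious Setup position, so a declarative rule can drop that position and keep the conditions that are constant across the whole group. A minimal sketch of that collapse follows, using placeholder heuristic names taken from the comments (not Macaron's actual identifiers):

# Hypothetical sketch of collapsing a group of result combinations into the
# shared conditions of one declarative rule. HeuristicResult mirrors the enum
# used above; HEURISTIC_NAMES are placeholder atoms, not Macaron identifiers.
from enum import Enum


class HeuristicResult(Enum):
    PASS = "PASS"
    FAIL = "FAIL"
    SKIP = "SKIP"


HEURISTIC_NAMES = [
    "empty_project_link",
    "source_code_repo",
    "one_release",
    "high_release_frequency",
    "unchanged_release",
    "closer_release_join_date",
    "suspicious_setup",
    "wheel_absence",
    "anomalous_version",
]


def collapse(combinations: list[tuple[HeuristicResult, ...]]) -> list[str]:
    """Return the ProbLog-style conditions shared by every combination in the group."""
    conditions = []
    for index, name in enumerate(HEURISTIC_NAMES):
        values = {combo[index] for combo in combinations}
        if values == {HeuristicResult.FAIL}:
            conditions.append(f"failed({name})")
        elif values == {HeuristicResult.PASS}:
            conditions.append(f"passed({name})")
        # Positions that are SKIP-only, or that vary across the group, are
        # "don't care" and are omitted from the rule body.
    return conditions

Run over the three tuples above, only the Suspicious Setup position varies, so it is omitted; Wheel Absence is PASS in all three, which is why passed({Heuristics.WHEEL_ABSENCE.value}) appears in the translated rule.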
I think in the previous implementation, we wanted to make sure that if wheel absence does not fail, we still report a malicious pattern as long as the anomalous version heuristic fails, there is one release, and quickUndetailed matches. With the new declarative approach, do we still need to explicitly require passed({Heuristics.WHEEL_ABSENCE.value})?
src/macaron/slsa_analyzer/checks/detect_malicious_metadata_check.py (outdated review comments, resolved)
    passed({Heuristics.WHEEL_ABSENCE.value}),
    failed({Heuristics.ANOMALOUS_VERSION.value}).

% ----- Evaluation -----
Can you please add a comment in the code to explain what happens here? I.e., which rules will be selected and how they are combined: is it enough if one of them is True? What if more than one rule evaluates to True?
Commit 12b2c65 should now handle this. The ProbLog model aggregates the results of all of the rules, and a separate query gets the triggered rule IDs.
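As a rough illustration of that aggregation (a sketch only, with made-up facts and rule bodies, not the generated Macaron model): each rule derives result(rule_id) with its own confidence, an aggregate atom is true if any rule's result is true, and a separate non-ground query on result/1 reports the triggered rule IDs.

# Illustrative sketch of the aggregation idea, evaluated with the documented
# problog Python API. Facts and rule bodies are invented for the example.
from problog import get_evaluatable
from problog.program import PrologString

MODEL = """
% Facts that would be generated from the heuristic analysis (made up here).
failed(empty_project_link).
failed(one_release).
failed(anomalous_version).
passed(wheel_absence).

% Each rule has its own ID and confidence value.
0.8::result("high_confidence_1") :-
    failed(empty_project_link),
    failed(one_release),
    failed(anomalous_version).

0.7::result("medium_confidence_2") :-
    failed(one_release),
    passed(wheel_absence),
    failed(anomalous_version).

% Aggregate verdict: true if any rule triggers.
malicious :- result(_).

query(malicious).
query(result(_)).
"""

# evaluate() returns a mapping from each ground query term to its probability,
# so the triggered rule IDs can be read off the result(...) entries.
results = get_evaluatable().create_from(PrologString(MODEL)).evaluate()
for term, probability in results.items():
    print(term, probability)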
Force-pushed from 326e88f to 362d0b4.
Force-pushed from 362d0b4 to d46f7ed.
Addressing the issue identified in #1027, where skips were being evaluated as false. This PR introduces wrappers passed() and failed() into the ProbLog model that use try_call() statements. Skipped heuristics are no longer defined in the ProbLog model, which is why the try_call() statement is needed. This means that evaluating failed(heuristic) will be false if the heuristic passed, or if it was not defined (i.e. was skipped). Similarly, evaluating passed(heuristic) will be false if the heuristic failed, or if it was not defined. This handles situations where skips should not cause the rules they are part of to trigger. This approach was the easiest way to keep as much of the ProbLog model in a static string as possible, without having to perform extensive string operations.

Rule IDs have also been added for debugging purposes, along with a method to extract them, so that it is evident which rule was triggered.
, this will be false if the heuristic failed, or if it was not defined. This should handle situations where skips should not cause rules they are part of to trigger. This method was the easiest way to keep as much of the ProbLog model in a static string as possible, without having to perform extensive string operations.Rule IDs have also been added for debugging purposes, and a method to extract them, so that it is evident what rule was triggered.