Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 5 additions & 5 deletions src/MSDIAL5/MsdialCore/Algorithm/Alignment/SpotAction.cs
Original file line number Diff line number Diff line change
Expand Up @@ -30,7 +30,7 @@ public MatchResultAnnotationDeduplicator(IMatchResultEvaluator<MsScanMatchResult
}

public void Process(IEnumerable<AlignmentSpotProperty> spots) {
var spotList = spots.Where(n => n.IsReferenceMatched(_evaluator)).OrderByDescending(spot => spot.MatchResults.Representative.LibraryID).ToList();
var spotList = spots.Where(n => n.IsReferenceMatched(_evaluator) && !n.Name.StartsWith("Putative")).OrderByDescending(spot => spot.MatchResults.Representative.LibraryID).ToList();
var currentPeakId = 0;
var currentLibraryId = spotList[currentPeakId].MatchResults.Representative.LibraryID;

Comment on lines +33 to 36
Copy link

Copilot AI Feb 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test coverage: this change alters the alignment deduplication behavior (Putative prefix filtering + match flag behavior), but the existing RefineMspDuplicateTest in LcmsAlignmentRefinerTests is currently marked [Ignore], and there don’t appear to be active tests covering MatchResultAnnotationDeduplicator. Adding/enabling a test for the updated behavior would help prevent regressions.

Copilot uses AI. Check for mistakes.
Comment on lines +33 to 36
Copy link

Copilot AI Feb 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new filter !n.Name.StartsWith("Putative") can make spotList empty, but the code immediately indexes spotList[0] (line 35), which will throw. Add an early return when spotList.Count == 0 before accessing spotList[currentPeakId] (and consider using a more specific prefix like "Putative:" with an explicit StringComparison).

Copilot uses AI. Check for mistakes.
Expand All @@ -56,7 +56,7 @@ public void Process(IEnumerable<AlignmentSpotProperty> spots) {

// by InChIKey
spotList = spots
.Where(n => n.IsReferenceMatched(_evaluator))
.Where(n => n.IsReferenceMatched(_evaluator) && !n.Name.StartsWith("Putative"))
.Where(n => !string.IsNullOrEmpty(n.MatchResults?.Representative?.InChIKey) && n.MatchResults.Representative.InChIKey.Length > 1)
.OrderByDescending(spot => spot.MatchResults.Representative.InChIKey)
Comment on lines 58 to 61
Copy link

Copilot AI Feb 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

StartsWith("Putative") uses current-culture, case-sensitive comparison and will also exclude names like "PutativeX" (no colon). If this is meant to detect the prefix added by ChangeAnnotationToLowScore, consider checking for the exact prefix (e.g., "Putative:") with an explicit StringComparison (Ordinal/OrdinalIgnoreCase) to avoid unintended exclusions.

Copilot uses AI. Check for mistakes.
.ToList();
Expand All @@ -83,7 +83,7 @@ public void Process(IEnumerable<AlignmentSpotProperty> spots) {

// by Name
spotList = spots
.Where(n => n.IsReferenceMatched(_evaluator))
.Where(n => n.IsReferenceMatched(_evaluator) && !n.Name.StartsWith("Putative"))
.Where(n => !n.MatchResults.Representative.Name.IsEmptyOrNull())
.OrderByDescending(spot => spot.MatchResults.Representative.Name)
.ToList();
Expand All @@ -110,8 +110,8 @@ public void Process(IEnumerable<AlignmentSpotProperty> spots) {
}

static void ChangeAnnotationToLowScore(AlignmentSpotProperty spot) {
spot.MatchResults.Representative.IsReferenceMatched = false;
spot.Name = "putative: " + spot.Name;
//spot.MatchResults.Representative.IsReferenceMatched = false;
Copy link

Copilot AI Feb 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ChangeAnnotationToLowScore() no longer clears spot.MatchResults.Representative.IsReferenceMatched. That means (for spots where reference-matched status comes from MatchResults) the “deduplicated” low-score spots may still be treated as reference-matched throughout the pipeline (e.g., blank filtering keeps ref-matched features). If the intent is to demote duplicates to putative, this should still update match flags (and/or clear the representative result) rather than only prefixing the name.

Suggested change
//spot.MatchResults.Representative.IsReferenceMatched = false;
var representative = spot.MatchResults?.Representative;
if (representative != null) {
representative.IsReferenceMatched = false;
}

Copilot uses AI. Check for mistakes.
spot.Name = "Putative: " + spot.Name;
}
}

Expand Down
41 changes: 40 additions & 1 deletion src/MSDIAL5/MsdialGcMsApi/Algorithm/Annotation.cs
Original file line number Diff line number Diff line change
Expand Up @@ -52,7 +52,7 @@ public AnnotatedMSDecResult[] MainProcess(IReadOnlyList<MSDecResult> ms1DecResul

var features = new AnnotatedMSDecResult[ms1DecResults.Count];
if (_parameter.OnlyReportTopHitInMspSearch) {
var used = new HashSet<MoleculeMsReference>();
var used = new HashSet<MoleculeMsReference>(new MoleculeMsReferenceEqualityComparer());
foreach (var (container, i) in containers.WithIndex().OrderByDescending(p => p.Item1.Representative.TotalScore)) {
Comment on lines +55 to 56
Copy link

Copilot AI Feb 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new HashSet(new MoleculeMsReferenceEqualityComparer()) is unsafe because the comparer’s Equals()/GetHashCode() are inconsistent: Equals() can return true via Name/ScanID matching even when only one object has InChIKey/ScanID set, but GetHashCode() prefers the first available field on each object. This breaks HashSet invariants (same item may not be found / duplicates may slip through). Consider switching to a key-based approach (e.g., HashSet of normalized string keys and add all applicable keys), or redefine equality to use a single canonical key selection that GetHashCode mirrors.

Copilot uses AI. Check for mistakes.
if (container.Representative is MsScanMatchResult topHit && !topHit.IsUnknown) {
ms1DecResults[i].MspIDs.AddRange(containers[i].MatchResults.OrderByDescending(r => r.TotalScore).Select(r => r.LibraryID));
Expand Down Expand Up @@ -99,4 +99,43 @@ public AnnotatedMSDecResult[] MainProcess(IReadOnlyList<MSDecResult> ms1DecResul
return ms1DecResults.Select(r => new AnnotatedMSDecResult(r, new MsScanMatchResultContainer())).ToArray();
}
}

public class MoleculeMsReferenceEqualityComparer : IEqualityComparer<MoleculeMsReference> {
public bool Equals(MoleculeMsReference x, MoleculeMsReference y) {
if (ReferenceEquals(x, y)) return true;
if (x is null || y is null) return false;

if (!string.IsNullOrEmpty(x.InChIKey) && !string.IsNullOrEmpty(y.InChIKey)) {
if (x.InChIKey == y.InChIKey) return true;
}

if (x.ScanID != 0 && y.ScanID != 0 && x.ScanID == y.ScanID) {
return true;
}

if (!string.IsNullOrEmpty(x.Name) && !string.IsNullOrEmpty(y.Name)) {
if (x.Name.ToLower() == y.Name.ToLower()) return true;
}
Comment on lines +116 to +118
Copy link

Copilot AI Feb 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using Name.ToLower() for comparisons/hash codes is culture-sensitive and allocates. Prefer a case-insensitive ordinal comparison (e.g., string.Equals(..., StringComparison.OrdinalIgnoreCase)) and a matching hash strategy (e.g., StringComparer.OrdinalIgnoreCase.GetHashCode(name)) to avoid locale-dependent behavior.

Copilot uses AI. Check for mistakes.

return false;
}
Comment on lines +104 to +121
Copy link

Copilot AI Feb 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MoleculeMsReferenceEqualityComparer violates the requirement that if Equals(x,y) is true then GetHashCode(x) must equal GetHashCode(y). With the current “match on any of InChIKey/ScanID/Name” logic, two objects can be equal by Name while hashing by different fields, which will break HashSet/Dictionary behavior. The comparer should either (1) compute hash from a single canonical key that Equals also uses, or (2) avoid OR-equality and instead deduplicate by maintaining multiple key sets (InChIKey/ScanID/Name) separately.

Copilot uses AI. Check for mistakes.

public int GetHashCode(MoleculeMsReference obj) {
if (obj is null) return 0;

if (!string.IsNullOrEmpty(obj.InChIKey)) {
return obj.InChIKey.GetHashCode();
}

if (obj.ScanID != 0) {
return obj.ScanID.GetHashCode();
}

if (!string.IsNullOrEmpty(obj.Name)) {
return obj.Name.ToLower().GetHashCode();
}

return 0;
}
}
}
Loading