As @sdrogers mentioned: in the first preprint version, Fig. 5b shows all possible matches between peaks, but it might be nicer to display only the ones that get selected by the greedy algorithm used in ModifiedCosine.
Here, just quickly, is where you would have to look for that in matchms: the actual selection of matching peaks is done in score_best_matches from matchms.similarity.spectrum_similarity_functions. Currently this outputs only the score and the number of used matching pairs, so it needs to be modified to also output the used pairs themselves.
from typing import List, Tuple

import numpy


def score_best_matches_simile(matching_pairs: numpy.ndarray, spec1: numpy.ndarray,
                              spec2: numpy.ndarray, mz_power: float = 0.0,
                              intensity_power: float = 1.0) -> Tuple[float, List[int]]:
    """Calculate cosine-like score by multiplying matches. Does require a sorted
    list of matching peaks (sorted by intensity product)."""
    score = float(0.0)
    used_matches = []  # Row indices of matching_pairs that get selected
    used1 = set()
    used2 = set()
    for i in range(matching_pairs.shape[0]):
        if not matching_pairs[i, 0] in used1 and not matching_pairs[i, 1] in used2:
            score += matching_pairs[i, 2]
            used1.add(matching_pairs[i, 0])  # Every peak can only be paired once
            used2.add(matching_pairs[i, 1])  # Every peak can only be paired once
            used_matches.append(i)
    # Normalize score:
    spec1_power = spec1[:, 0] ** mz_power * spec1[:, 1] ** intensity_power
    spec2_power = spec2[:, 0] ** mz_power * spec2[:, 1] ** intensity_power
    score = score/(numpy.sum(spec1_power ** 2) ** 0.5 * numpy.sum(spec2_power ** 2) ** 0.5)
    return score, used_matches

Unfortunately, the other functions involved are built in as subfunctions of ModifiedCosine, so you would essentially have to write your own edited version of that class (or skip the class and build a function instead).
class ModifiedCosineSimile(BaseSimilarity):
    ...
    def pair(...):
        spec1 = get_peaks_array(reference)
        spec2 = get_peaks_array(query)
        matching_pairs = get_matching_pairs()
        if matching_pairs.shape[0] == 0:
            return None
        score, used_matches = score_best_matches_simile(matching_pairs, spec1, spec2,
                                                        self.mz_power, self.intensity_power)
        # Return the score together with the rows of matching_pairs that were used
        return score, [matching_pairs[i] for i in used_matches]
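
If you go for the "skip the class" route, a rough standalone sketch could look like the one below. It only uses numpy and does not call matchms at all. The names `find_candidate_pairs` and `greedy_modified_cosine` are made up for illustration, the spectra are assumed to be plain arrays with columns (m/z, intensity), and the shift is assumed to be the precursor m/z of the first spectrum minus that of the second. It also ignores `mz_power` and `intensity_power` (equivalent to 0.0 and 1.0), so treat it as an illustration of the greedy selection rather than a drop-in replacement for ModifiedCosine.

```python
import numpy as np


def find_candidate_pairs(spec1, spec2, tolerance, shift=0.0):
    """Return all candidate matches as rows (idx1, idx2, intensity product).

    A peak of spec2 is a candidate for a peak of spec1 if
    |mz1 - (mz2 + shift)| <= tolerance.
    """
    diffs = np.abs(spec1[:, 0][:, None] - (spec2[:, 0][None, :] + shift))
    idx1, idx2 = np.nonzero(diffs <= tolerance)
    products = spec1[idx1, 1] * spec2[idx2, 1]
    return np.column_stack([idx1, idx2, products])


def greedy_modified_cosine(spec1, spec2, precursor_shift, tolerance=0.1):
    """Greedy cosine-like score plus the (idx1, idx2) pairs that were used."""
    # Candidates come from direct matches and from precursor-shifted matches.
    candidates = np.vstack([
        find_candidate_pairs(spec1, spec2, tolerance),
        find_candidate_pairs(spec1, spec2, tolerance, shift=precursor_shift),
    ])
    # Sort by intensity product, highest first.
    candidates = candidates[np.argsort(candidates[:, 2])[::-1]]

    score = 0.0
    used1, used2, used_pairs = set(), set(), []
    for idx1, idx2, product in candidates:
        i, j = int(idx1), int(idx2)
        if i not in used1 and j not in used2:  # every peak is paired at most once
            score += product
            used1.add(i)
            used2.add(j)
            used_pairs.append((i, j))

    norm = np.sqrt(np.sum(spec1[:, 1] ** 2)) * np.sqrt(np.sum(spec2[:, 1] ** 2))
    return score / norm, used_pairs


# Toy spectra: peak 0 of spec1 has two candidates in spec2 (100.0 and 100.05),
# and peak 1 of spec1 only matches after applying the shift of -10.
spec1 = np.array([[100.0, 1.0], [150.0, 0.5]])
spec2 = np.array([[100.0, 0.8], [100.05, 0.3], [160.0, 0.6]])
score, used_pairs = greedy_modified_cosine(spec1, spec2, precursor_shift=-10.0)
# used_pairs -> [(0, 0), (1, 2)]; the weaker candidate (0, 1) is dropped
```

The returned `used_pairs` list is what you would plot instead of all possible matches: in the toy case the candidate (0, 1) exists but is never drawn, because peak 0 of the first spectrum is already taken by the stronger match.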