Accessing actual used peaks #6

As @sdrogers mentioned: in the first preprint version, Fig. 5b shows the possible matches between peaks, but it might be nicer to display only the ones that get selected by the greedy algorithm used in `ModifiedCosine`.

Here, just quickly, is where you would have to look for that in matchms: the actual selection of matching peaks is done in `score_best_matches` from `matchms.similarity.spectrum_similarity_functions`. Currently this only outputs the number of matching pairs used, not the pairs themselves, so it needs to be modified to also output the used pairs.

```python
from typing import List, Tuple

import numpy


def score_best_matches_simile(matching_pairs: numpy.ndarray, spec1: numpy.ndarray,
                              spec2: numpy.ndarray, mz_power: float = 0.0,
                              intensity_power: float = 1.0) -> Tuple[float, List[int]]:
    """Calculate cosine-like score by multiplying matches.

    Requires matching_pairs to be sorted by intensity product (descending).
    Each row holds (peak index in spec1, peak index in spec2, product).
    Returns the score and the row indices of the pairs that were actually used.
    """
    score = 0.0
    used_matches = []  # row indices into matching_pairs that were selected
    used1 = set()
    used2 = set()
    for i in range(matching_pairs.shape[0]):
        if matching_pairs[i, 0] not in used1 and matching_pairs[i, 1] not in used2:
            score += matching_pairs[i, 2]
            used1.add(matching_pairs[i, 0])  # Every peak can only be paired once
            used2.add(matching_pairs[i, 1])  # Every peak can only be paired once
            used_matches.append(i)

    # Normalize score:
    spec1_power = spec1[:, 0] ** mz_power * spec1[:, 1] ** intensity_power
    spec2_power = spec2[:, 0] ** mz_power * spec2[:, 1] ** intensity_power

    score = score/(numpy.sum(spec1_power ** 2) ** 0.5 * numpy.sum(spec2_power ** 2) ** 0.5)
    return score, used_matches
```
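
To make it clearer what "selected by the greedy algorithm" means, here is a toy sketch of just the selection step. `greedy_select` is a made-up name for illustration (it is not part of matchms), and the candidates are a plain list of `(idx1, idx2, product)` tuples rather than a numpy array, assumed to be already sorted by descending product.

```python
def greedy_select(matching_pairs):
    """Return the row indices a greedy pass would keep (hypothetical helper).

    Assumes rows are (idx1, idx2, product), sorted by descending product.
    """
    used1, used2, kept = set(), set(), []
    for row, (idx1, idx2, _product) in enumerate(matching_pairs):
        if idx1 not in used1 and idx2 not in used2:
            used1.add(idx1)  # every peak can only be paired once
            used2.add(idx2)
            kept.append(row)
    return kept


# Toy candidates: several possible matches compete for the same peaks.
candidates = [
    (0, 0, 0.9),  # kept: strongest match
    (0, 1, 0.5),  # skipped: peak 0 of spectrum 1 is already used
    (1, 1, 0.4),  # kept
    (2, 1, 0.1),  # skipped: peak 1 of spectrum 2 is already used
]
kept = greedy_select(candidates)
print(kept)  # [0, 2]
```

So of the four possible matches only two would end up in the figure.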

Unfortunately, the other functions involved are built in as subfunctions in `ModifiedCosine`, so you would essentially have to write your own edited version of that class (or skip the class and build a function instead).

```python
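from matchms.similarity.BaseSimilarity import BaseSimilarity


class ModifiedCosineSimile(BaseSimilarity):
    ...
    def pair(self, reference, query):
        # get_peaks_array() and get_matching_pairs() are defined here as
        # subfunctions, as in ModifiedCosine (omitted in this sketch)
        ...
        spec1 = get_peaks_array(reference)
        spec2 = get_peaks_array(query)
        matching_pairs = get_matching_pairs()
        if matching_pairs.shape[0] == 0:
            return None
        score, used_matches = score_best_matches_simile(matching_pairs, spec1, spec2,
                                                        self.mz_power, self.intensity_power)
        # Return the score together with the rows of the pairs actually used
        return score, [matching_pairs[i] for i in used_matches]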
```
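
If you go for the "skip the class" route, a rough standalone sketch could look like the one below. This is not the matchms implementation: `find_candidate_pairs` and `modified_cosine_with_pairs` are hypothetical names, spectra are plain lists of `(mz, intensity)` tuples, the score uses intensities only (i.e. `mz_power = 0`, `intensity_power = 1`), and I am assuming the mass shift is added to the m/z values of the second spectrum.

```python
import math


def find_candidate_pairs(spec1, spec2, tolerance, shift=0.0):
    """All (idx1, idx2, intensity product) with |mz1 - (mz2 + shift)| <= tolerance."""
    pairs = []
    for i, (mz1, int1) in enumerate(spec1):
        for j, (mz2, int2) in enumerate(spec2):
            if abs(mz1 - (mz2 + shift)) <= tolerance:
                pairs.append((i, j, int1 * int2))
    return pairs


def modified_cosine_with_pairs(spec1, spec2, mass_shift, tolerance=0.1):
    """Return (score, selected pairs); a sketch, not the matchms code."""
    # Candidates: direct matches plus matches shifted by the precursor difference
    pairs = find_candidate_pairs(spec1, spec2, tolerance)
    pairs += find_candidate_pairs(spec1, spec2, tolerance, shift=mass_shift)
    pairs.sort(key=lambda p: p[2], reverse=True)  # strongest products first

    used1, used2, selected, score = set(), set(), [], 0.0
    for i, j, product in pairs:
        if i not in used1 and j not in used2:  # every peak is paired at most once
            used1.add(i)
            used2.add(j)
            selected.append((i, j, product))
            score += product

    # Normalize by the intensity norms of both spectra
    norm = (math.sqrt(sum(inten ** 2 for _, inten in spec1))
            * math.sqrt(sum(inten ** 2 for _, inten in spec2)))
    return (score / norm if norm > 0 else 0.0), selected


# Toy spectra: the second fragment differs by the precursor difference of -10.
spec1 = [(100.0, 1.0), (200.0, 0.5)]
spec2 = [(100.0, 1.0), (210.0, 0.5)]
score, selected = modified_cosine_with_pairs(spec1, spec2, mass_shift=-10.0)
print(selected)  # [(0, 0, 1.0), (1, 1, 0.25)]
```

`selected` is then exactly the list of peak pairs you would want to draw, instead of all possible matches.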
