
Minimal changes to get_searchlight_RDMs to make it more convenient #329

Closed · mathias-sm opened this issue Jun 26, 2023 · 4 comments

@mathias-sm

get_searchlight_RDMs is very convenient when working with fMRI datasets, but a few small changes would make it easier for me to use, and might benefit others. As things stand, I have basically written my own version of it, which is almost 100% identical to upstream. I am aware of the efforts in PR #253, hence this issue instead of a PR with my small changes.

There are two things that I would happily add because I use/need them:

  • calc_rdm takes more arguments than get_searchlight_RDMs does, making the latter less flexible than the former. For example, I needed to pass a cv_descriptor when using the crossnobis distance, but get_searchlight_RDMs cannot handle that; the same goes for noise, prior_lambda and prior_weight. Adding them to get_searchlight_RDMs could be useful, possibly via **kwargs (see the sketch after this list).
  • Parallelizing over chunks seems easy and reasonable, especially as the searchlight util functions already make use of joblib's Parallel. I did it by turning the loop for chunks in tqdm(chunked_center, desc='Calculating RDMs...'): into a function def chunk_to_rdm(chunk):, called with RDM_corrs = Parallel(n_jobs=-1)(delayed(chunk_to_rdm)(chunk) for chunk in track(chunked_center)), followed by:

for chunks, RDM_corr in zip(chunked_center, RDM_corrs):
    RDM[chunks, :] = RDM_corr.dissimilarities
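
For context, here is a minimal sketch of the kind of calc_rdm call the first point is about, run directly on a single searchlight dataset; the data, descriptor names, and residuals below are hypothetical placeholders:

import numpy as np
from rsatoolbox.data import Dataset
from rsatoolbox.data.noise import prec_from_residuals
from rsatoolbox.rdm import calc_rdm

# hypothetical dataset for one searchlight: 40 observations x 50 voxels,
# with 'events' labelling conditions and 'runs' labelling CV folds
ds = Dataset(
    np.random.randn(40, 50),
    obs_descriptors={'events': np.tile(np.arange(8), 5),
                     'runs': np.repeat(np.arange(5), 8)})

# noise precision estimated from (hypothetical) GLM residuals
noise_prec = prec_from_residuals(np.random.randn(100, 50))

# calc_rdm accepts all of these, but get_searchlight_RDMs cannot forward them
rdm = calc_rdm(ds, method='crossnobis', descriptor='events',
               cv_descriptor='runs', noise=noise_prec)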

I'd be happy to add these two changes (I have already applied them locally to what I'm running, but this makes sharing my code with others harder), and to upstream them if they're considered useful.

@CompSocNeuroGroup

Hi @mathias-sm, did you get any response here? I've been struggling to get a crossnobis searchlight working, and was wondering if you'd be willing to share your version of get_searchlight_RDMs that you can pass noise to? Thanks!

@JasperVanDenBosch
Contributor

Just an update from our side: both of these features are planned as part of a larger rework of searchlights, but we don't have time in the next month or two, as we're prioritising other features. So if @mathias-sm can share his code, that may be the best approach in the short run.

@hippocampeli

I just followed @mathias-sm's proposal. The code worked for me and decreased computation time significantly. Thanks!

from joblib import Parallel, delayed
from tqdm import tqdm
from rsatoolbox.data import Dataset
from rsatoolbox.rdm import calc_rdm

def chunk2rdm(data_2d, events, chunk, centers, neighbors, method):
    center_data = []
    for c in chunk:
        # grab this center and its neighbors
        center = centers[c]
        center_neighbors = neighbors[c]
        # create a dataset object with this data
        ds = Dataset(data_2d[:, center_neighbors],
                     descriptors={'center': center},
                     obs_descriptors={'events': events},
                     channel_descriptors={'voxels': center_neighbors})
        center_data.append(ds)

    RDM_corr = calc_rdm(center_data, method=method,
                        descriptor='events')
    return RDM_corr

Then I replaced the "for chunks in ..." loop in the original function with parallel execution of the chunk2rdm function, like so:

RDM_corrs = Parallel(n_jobs=n_jobs)(
    delayed(chunk2rdm)(data_2d, events, chunk, centers, neighbors, method)
    for chunk in tqdm(chunked_center))

for chunks, RDM_corr in zip(chunked_center, RDM_corrs):
    RDM[chunks, :] = RDM_corr.dissimilarities
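
To get back the same return type as the original function, the stitched-together array can be wrapped in an RDMs object, roughly mirroring what get_searchlight_RDMs itself does at the end (the exact descriptors upstream may differ slightly):

from rsatoolbox.rdm import RDMs

# wrap the assembled dissimilarities so downstream code that expects the
# return value of get_searchlight_RDMs keeps working
SL_rdms = RDMs(RDM,
               rdm_descriptors={'voxel_index': centers},
               dissimilarity_measure=method)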

@mathias-sm
Author

mathias-sm commented Jan 11, 2024

Hi,

Thanks for the replies. Somehow I missed notifications about this issue. @hippocampeli effectively ended up writing almost exactly the same code as mine — which, just in case someone finds it useful, I'm putting below.

import numpy as np
from joblib import Parallel, delayed
from rich.progress import track
from rsatoolbox.data import Dataset
from rsatoolbox.rdm import calc_rdm

n_centers = centers.shape[0]
# split the centers into ~100 roughly equal chunks
chunked_center = np.split(np.arange(n_centers),
                          np.linspace(0, n_centers, 101, dtype=int)[1:-1])

def map_one(chunks, method, descriptor, cv_descriptor):
    center_data = []
    for c in chunks:
        center = centers[c]
        center_neighbors = neighbors[c]
        ds = Dataset(
            data_2d[:, center_neighbors],
            descriptors={"center": center},
            obs_descriptors=obs_descriptors,
            channel_descriptors={"voxels": center_neighbors},
        )
        center_data.append(ds)

    return calc_rdm(center_data, method=method, descriptor=descriptor,
                    cv_descriptor=cv_descriptor)

# map; send the parallel computation over several jobs
RDM_corrs = Parallel(n_jobs=10)(
    delayed(map_one)(c, method, descriptor, cv_descriptor)
    for c in track(chunked_center))

# reduce; take the results of the parallel computations and stitch them back together
n_conds = len(np.unique(obs_descriptors[descriptor]))
RDM = np.zeros((n_centers, n_conds * (n_conds - 1) // 2))
for chunks, RDM_corr in zip(chunked_center, RDM_corrs):
    RDM[chunks, :] = RDM_corr.dissimilarities

There's still a more general solution based on **kwargs lurking somewhere, but this works.
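
For completeness, one possible shape of that **kwargs solution; this is a sketch of a hypothetical variant, not upstream code, and note that an argument like cv_descriptor only works if the matching observation descriptor is actually attached to the per-searchlight Datasets (which is why obs_descriptors is taken as a dict here):

import numpy as np
from rsatoolbox.data import Dataset
from rsatoolbox.rdm import calc_rdm

def get_searchlight_rdms_kw(data_2d, obs_descriptors, centers, neighbors,
                            method='correlation', descriptor='events',
                            n_blocks=100, **calc_rdm_kwargs):
    # hypothetical variant: any extra keyword arguments (noise,
    # cv_descriptor, prior_lambda, ...) are forwarded to calc_rdm
    n_centers = centers.shape[0]
    chunked_center = np.split(
        np.arange(n_centers),
        np.linspace(0, n_centers, n_blocks + 1, dtype=int)[1:-1])
    n_conds = len(np.unique(obs_descriptors[descriptor]))
    RDM = np.zeros((n_centers, n_conds * (n_conds - 1) // 2))
    for chunk in chunked_center:
        center_data = [
            Dataset(data_2d[:, neighbors[c]],
                    descriptors={'center': centers[c]},
                    obs_descriptors=obs_descriptors,
                    channel_descriptors={'voxels': neighbors[c]})
            for c in chunk]
        rdm_block = calc_rdm(center_data, method=method,
                             descriptor=descriptor, **calc_rdm_kwargs)
        RDM[chunk, :] = rdm_block.dissimilarities
    return RDM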
