@jmorlock jmorlock commented Jan 25, 2025

For some model parameters, the matrix factorization of BayesianPersonalizedRanking fails.
When this happens, some (or all) entries of the user and item matrices become NaN.

This affects both the CPU and the GPU version, but only the CPU version already checks for it. This pull request adds an equivalent check to the GPU version and consolidates the source code.
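
To illustrate the kind of check meant here, the following is a minimal NumPy-only sketch and not the code from this pull request. The names `ModelFitError` and `check_fit_errors` are hypothetical, and the sketch assumes the factors are already available as host arrays; a GPU matrix would first have to be copied to the host, which is not shown.

```python
import numpy as np


class ModelFitError(Exception):
    """Hypothetical error type for a fit that produced NaN factors."""


def check_fit_errors(user_factors, item_factors):
    """Raise ModelFitError if either factor matrix contains a NaN entry."""
    named = (("user_factors", user_factors), ("item_factors", item_factors))
    for name, factors in named:
        if np.isnan(factors).any():
            raise ModelFitError(f"NaN encountered in {name}")


good = np.ones((2, 3), dtype=np.float32)
bad = np.full((4, 3), np.nan, dtype=np.float32)

check_fit_errors(good, good)  # finite factors: returns without raising

try:
    check_fit_errors(good, bad)  # NaN item factors: raises
except ModelFitError as err:
    print(err)
```

Failing loudly right after fitting means the problem surfaces at training time instead of later as confusing recommendations.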

Side note: without this check the behavior can be quite misleading. No error is raised, but recommend returns items the user already liked, even with filter_already_liked_items set to True. The following test demonstrates this:

import implicit
import numpy as np
import scipy.sparse as sparse

def test_matrix_nan():
    num_users = 2
    num_items = 4
    factors = 3

    # customer 0 liked items 0 and 1, customer 1 liked items 2 and 3
    customers = np.array([0, 0, 1, 1])
    items = np.array([0, 1, 2, 3])
    quantity = np.ones(len(items))

    user_items = sparse.csr_matrix((quantity, (customers, items)))

    user_factors = implicit.gpu._cuda.Matrix(
        np.full((num_users, factors), np.nan, dtype=np.float32)
    )

    item_factors = implicit.gpu._cuda.Matrix(
        np.full((num_items, factors), np.nan, dtype=np.float32)
    )

    # simulate a failed fit by setting both matrices to NaN
    model = implicit.gpu.bpr.BayesianPersonalizedRanking()
    model.user_factors = user_factors
    model.item_factors = item_factors

    (ids, scores) = model.recommend(
        userid=0,
        user_items=user_items[0],
        N=1,
        filter_already_liked_items=True,
        filter_items=None,
        recalculate_user=False,
        items=None,
    )
    assert ids[0] not in {0, 1}   # FAILS: an already liked item is returned
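
One plausible explanation for this behavior, which I have not verified against the library's GPU code, is that every comparison involving NaN evaluates to False. The sketch below is a hypothetical comparison-based top-1 selection (`naive_top1` is an illustrative name, not a library function) in which liked items are masked with negative infinity and all other scores are NaN.

```python
import numpy as np


def naive_top1(scores):
    """Return the index of the best score using plain '>' comparisons."""
    best = 0
    for i in range(1, len(scores)):
        # Any comparison with NaN is False, so a NaN score never wins.
        if scores[i] > scores[best]:
            best = i
    return best


# Four items: all scores are NaN, as after a failed fit.
scores = np.full(4, np.nan, dtype=np.float32)

# Mask the already liked items 0 and 1 so they should never be selected.
scores[[0, 1]] = -np.inf

print(naive_top1(scores))  # prints 0, a masked (already liked) item
```

Since neither NaN candidate compares greater than the masked score, the selection never leaves the first item, which matches the symptom seen in the test above.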
