'Word2Vec' object has no attribute 'infer_vector' #18

Open
crowoy opened this issue Apr 15, 2018 · 12 comments

@crowoy

crowoy commented Apr 15, 2018

I've cloned the repo, downloaded the pre-trained Wikipedia model, and installed Gensim via pip install git+https://github.com/jhlau/gensim.

Then I pasted the downloaded model files into the toy_data directory and changed the model line in the file to: model="toy_data/word2vec.bin".

However, when I run infer_test.py I get the following error:

Traceback (most recent call last):
  File "infer_test.py", line 25, in <module>
    output.write( " ".join([str(x) for x in m.infer_vector(d, alpha=start_alpha, steps=infer_epoch)]) + "\n" )
AttributeError: 'Word2Vec' object has no attribute 'infer_vector'
@jhlau
Owner

jhlau commented Apr 15, 2018

Looks like it's probably not reading from my forked version of gensim. Can you test if you can call infer_test manually with an interactive python session?
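
One way to check which gensim installation the interpreter actually picks up is to ask the import system where it would load the package from. The sketch below is only illustrative; the helper name module_origin is made up for this example and is not part of gensim.

```python
import importlib.util


def module_origin(name):
    """Return the file path Python would import `name` from, or None if it cannot be found."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    # Hypothetical check: a path outside the active virtualenv would suggest
    # that a different gensim than the one just installed is being imported.
    print("gensim is loaded from:", module_origin("gensim"))
```

If the printed path does not point into the environment where the fork was installed, the interpreter is likely importing another copy.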

@crowoy
Author

crowoy commented Apr 15, 2018

Thank you for the prompt reply!

What do you mean by calling infer_test with an interactive python session?

I've tried doing the following:

python
import gensim.models as g
model = g.Doc2Vec.load("toy_data/word2vec.bin")
model.infer_vector("this is a test".split(), alpha=0.01, steps=1000)

Which generates the same issue.
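
The error message names the class of the object that came back from load: it is a Word2Vec instance, not a Doc2Vec one. One possible explanation, which is an assumption and not something confirmed in this thread, is that the file itself holds a saved Word2Vec model. A minimal sketch for inspecting a loaded object before calling infer_vector follows; describe_loaded and the Fake* classes are hypothetical names used only for this example.

```python
def describe_loaded(obj):
    """Return (class name, whether the object exposes a callable infer_vector)."""
    return type(obj).__name__, callable(getattr(obj, "infer_vector", None))


# Stand-in classes for illustration only. With gensim installed, pass the
# object returned by g.Doc2Vec.load(...) to describe_loaded instead.
class FakeWord2Vec(object):
    pass


class FakeDoc2Vec(FakeWord2Vec):
    def infer_vector(self, doc_words, alpha=0.1, steps=5):
        return []


if __name__ == "__main__":
    print(describe_loaded(FakeWord2Vec()))
    print(describe_loaded(FakeDoc2Vec()))
```

If the first element is not a Doc2Vec class name, the problem would be with the file being loaded rather than with the installed package.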

@crowoy
Author

crowoy commented Apr 15, 2018

Could it have to do with Gensim's dependencies?

(env) $ pip list
Package         Version
--------------- ---------
boto            2.48.0
boto3           1.7.4
botocore        1.10.4
bz2file         0.98
certifi         2018.1.18
chardet         3.0.4
docutils        0.14
futures         3.2.0
gensim          0.12.4
idna            2.6
jmespath        0.9.3
numpy           1.14.2
pip             10.0.0
python-dateutil 2.6.1
requests        2.18.4
s3transfer      0.1.13
scipy           1.0.1
setuptools      39.0.1
six             1.11.0
smart-open      1.5.7
urllib3         1.22
wheel           0.31.0

@jhlau
Owner

jhlau commented Apr 16, 2018

Weird, I just tried doing a fresh install (with virtualenv) and have no problems. Your gensim version seems to be right too (0.12.4), so I am not sure why this is happening.

Can you try creating a new virtualenv, installing gensim as you did before, and running it again?
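
A complementary check that does not depend on the version string is to ask whether the installed Doc2Vec class defines infer_vector at all. If it does, the install is probably fine and the loaded file becomes the more likely suspect. This is a rough sketch; class_has_method is a hypothetical helper, not a gensim function.

```python
import importlib


def class_has_method(module_name, class_name, method_name):
    """Return True/False if the class defines the method, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    cls = getattr(module, class_name, None)
    if cls is None:
        return None
    return callable(getattr(cls, method_name, None))


if __name__ == "__main__":
    # Assumes gensim is importable in the current environment; prints None otherwise.
    print(class_has_method("gensim.models", "Doc2Vec", "infer_vector"))
```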

@crowoy
Author

crowoy commented Apr 16, 2018

So I deleted the env and ran the following:

$ virtualenv env
$ source env/bin/activate
(env) $ pip install git+https://github.com/jhlau/gensim
(env) $ python
>>> import gensim.models as g
>>> model = g.Doc2Vec.load("toy_data/word2vec.bin")
>>> model.infer_vector("this is a test".split(), alpha=0.01, steps=1000)

And it still outputs:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'Word2Vec' object has no attribute 'infer_vector'

@jhlau
Owner

jhlau commented Apr 16, 2018

Does the train script (train_model.py) work?

@crowoy
Author

crowoy commented Apr 16, 2018

Looks to be working fine:

(env) $ python train_model.py
2018-04-16 11:22:21,169 : INFO : collecting all words and their counts
2018-04-16 11:22:21,169 : INFO : PROGRESS: at example #0, processed 0 words (0/s), 0 word types, 0 tags
2018-04-16 11:22:21,223 : INFO : collected 11097 word types and 1000 unique tags from a corpus of 1000 examples and 84408 words
2018-04-16 11:22:21,272 : INFO : min_count=1 retains 11097 unique words (drops 0)
2018-04-16 11:22:21,272 : INFO : min_count leaves 84408 word corpus (100% of original 84408)
2018-04-16 11:22:21,325 : INFO : deleting the raw counts dictionary of 11097 items
2018-04-16 11:22:21,325 : INFO : sample=1e-05 downsamples 3599 most-common words
2018-04-16 11:22:21,325 : INFO : downsampling leaves estimated 22704 word corpus (26.9% of prior 84408)
2018-04-16 11:22:21,326 : INFO : estimated required memory for 11097 words and 300 dimensions: 33381300 bytes
2018-04-16 11:22:21,377 : INFO : resetting layer weights
2018-04-16 11:22:21,377 : INFO : loading pre-trained embeddings
2018-04-16 11:22:21,819 : INFO : 1000 lines processed (0.441607952118s); 969 embeddings collected
2018-04-16 11:22:22,129 : INFO : training model with 1 workers on 11129 vocabulary and 300 features, using sg=1 hs=0 sample=1e-05 negative=5
2018-04-16 11:22:22,129 : INFO : expecting 1000 sentences, matching count from corpus used for vocabulary survey
2018-04-16 11:22:23,205 : INFO : PROGRESS: at 1.29% examples, 28676 words/s, in_qsize 2, out_qsize 0
2018-04-16 11:22:24,212 : INFO : PROGRESS: at 2.48% examples, 28295 words/s, in_qsize 1, out_qsize 0
2018-04-16 11:22:25,259 : INFO : PROGRESS: at 3.76% examples, 28426 words/s, in_qsize 1, out_qsize 0
2018-04-16 11:22:26,349 : INFO : PROGRESS: at 5.04% examples, 28352 words/s, in_qsize 1, out_qsize 0
2018-04-16 11:22:27,436 : INFO : PROGRESS: at 6.36% examples, 28413 words/s, in_qsize 1, out_qsize 0
2018-04-16 11:22:28,470 : INFO : PROGRESS: at 7.54% examples, 28139 words/s, in_qsize 1, out_qsize 0
2018-04-16 11:22:29,493 : INFO : PROGRESS: at 8.71% examples, 27971 words/s, in_qsize 1, out_qsize 0

@jhlau
Owner

jhlau commented Apr 16, 2018

Yeah, this really beats me. train_model.py loads pre-trained embeddings, which won't work with the canonical gensim, so your gensim version looks right, but somehow it doesn't see infer_vector...

@rafiqhasan

Any fix for this yet? We are facing a similar issue.

@Abhimanyu100

I am facing the same issue with doc2vec. My code runs fine on Google Colab, but I get an error when running it locally on my own system:
AttributeError: 'NumpyArrayWrapper' object has no attribute 'infer_vector'

@Saatvik-droid

Saatvik-droid commented May 29, 2023

Did anyone solve this?
I get the following error:

model.infer_vector(["my", "input"]))
          ^^^^^^^^^^^^^^^^^^
AttributeError: 'Word2Vec' object has no attribute 'infer_vector'

Pip freeze:

gensim @ file:///home/saat/Projects/gensim
numpy==1.24.3
scipy==1.10.1
six==1.16.0
smart-open==6.3.0

@Saatvik-droid

After training and saving my own model with train_model.py, infer_vector works on it.
