ERROR import pymssa #7

Open
fspaolo opened this issue May 22, 2019 · 10 comments

fspaolo commented May 22, 2019

Installing through python setup.py install went just fine. The very first attempt to import the module, however, gave me the following error:

$ python -c 'import pymssa'

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/paolofer/anaconda2/lib/python2.7/site-packages/pymssa-0.1.0-py2.7.egg/pymssa/__init__.py", line 1, in <module>
    from .mssa import MSSA
  File "/Users/paolofer/anaconda2/lib/python2.7/site-packages/pymssa-0.1.0-py2.7.egg/pymssa/mssa.py", line 229
    U = left_singular_vectors @ T
                              ^
SyntaxError: invalid syntax
@EricKenjiLee

This is because the @ operator for matrix multiplication was only introduced in Python 3.5 (PEP 465). Your traceback shows Python 2.7, which predates it.

fspaolo commented May 22, 2019

No, I need to stick to 2.7+, which is stable with the amount of legacy code I have. So I'm afraid this package is not an option then.

EricKenjiLee commented May 22, 2019

The package isn't very large, so I'd suggest forking it and changing every @ to np.matmul(). I'm not sure if there's a safe way to have multiple versions of Python play nicely together.
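
For 2-D arrays, np.matmul(a, b) computes the same product as a @ b, so the substitution should be mechanical. A minimal check, assuming NumPy is installed and the snippet is run on Python 3.5+ so that @ parses (the shapes and random data are made up for illustration and are not taken from pymssa):

```python
import numpy as np

# Hypothetical stand-ins for the operands on line 229 of mssa.py;
# the shapes are arbitrary.
rng = np.random.RandomState(0)
left_singular_vectors = rng.rand(5, 3)
T = rng.rand(3, 3)

# Python 3.5+ spelling (PEP 465).
U_operator = left_singular_vectors @ T

# Spelling that also parses on Python 2.7.
U_function = np.matmul(left_singular_vectors, T)

# True when both spellings give the same result.
same = np.allclose(U_operator, U_function)
```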

fspaolo commented May 22, 2019

I can try replacing the @'s and see if that's the only issue. For multiple versions of Python I would have to modify working code to fit the Py3 standards. I'll post an update. Thanks

fspaolo commented May 22, 2019

Yes, there are other issues related to Py2+ vs Py3+... when I replace the @'s other errors come up:

ImportError                               Traceback (most recent call last)
<ipython-input-2-73632525add2> in <module>()
----> 1 import pymssa

/Users/paolofer/anaconda2/lib/python2.7/site-packages/pymssa-0.1.0-py2.7.egg/pymssa/__init__.py in <module>()
----> 1 from .mssa import MSSA

/Users/paolofer/anaconda2/lib/python2.7/site-packages/pymssa-0.1.0-py2.7.egg/pymssa/mssa.py in <module>()
      7 from scipy.linalg import hankel
      8
----> 9 from functools import partial, lru_cache, reduce
     10 from tqdm.autonotebook import tqdm
     11

ImportError: cannot import name lru_cache

I can fix that as well by using (for Py2+):

from backports.functools_lru_cache import lru_cache

But probably there will be other dependencies/errors showing up...

fspaolo commented May 22, 2019

I managed to replace the Py2+ dependencies and run the example Notebook (you have to provide the wine.csv file; even after finding it online I had to manually modify the header).

Tomorrow I'll test it on a real use case... forecasting applied to a data cube with (t, x, y) = (1452, 1836, 104)

Hopefully it can handle it?

fspaolo commented May 22, 2019

BTW, a simple

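# np.matmul() parses on both Python 2 and 3, unlike the @ operator
np.matmul(a, b)

try:
    from functools import lru_cache
except ImportError:
    # Python 2: needs the backports.functools_lru_cache package
    from backports.functools_lru_cache import lru_cache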

would ensure backward compatibility.
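
Note that guarding a @ b itself with try/except cannot work on Python 2.7: the @ token is rejected when the module is compiled, before any try block runs, so an except clause in the same file never gets the chance to catch it. A small sketch of that behavior using the built-in compile() on a source string (the source string and variable names are illustrative only):

```python
import sys

# Source text that tries to guard `@` with try/except.
source = (
    "try:\n"
    "    c = a @ b\n"
    "except Exception:\n"
    "    c = None\n"
)

try:
    # Compilation happens before execution, so the try/except inside
    # `source` cannot help if the parser rejects `@`.
    compile(source, "<demo>", "exec")
    parses = True
except SyntaxError:
    parses = False

# `@` is valid syntax only on Python 3.5 and later.
supports_matmul_operator = sys.version_info >= (3, 5)
```

The lru_cache fallback is different: ImportError is raised at run time, so a try/except around the import does work.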

fspaolo commented May 22, 2019

This package can't handle such data sets...

Limiting n_components=4 and window_size=25 (out of >300 obs) gives me:

('Trajectory matrix shape:', (1572000, 282))
Decomposing trajectory covariance matrix with SVD
Killed: 9

Reducing the data set to ~1/4 still gives me:

('Trajectory matrix shape:', (393000, 282))
Decomposing trajectory covariance matrix with SVD
Killed: 9

These are likely memory-related issues. I also noticed that you use np.dot() for the SVD, which is not very efficient.

This is a typical use case for EOF analysis, so it should be doable with MSSA.
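
As a rough back-of-the-envelope check on the memory claim, assuming the trajectory matrix is stored as dense float64 (an assumption; the dtype is not shown in the log), the matrix alone needs several gigabytes before any SVD workspace is counted:

```python
# Shapes copied from the log output above.
shapes = [(1572000, 282), (393000, 282)]

BYTES_PER_FLOAT64 = 8  # assumed dtype; not confirmed by the log


def dense_size_gb(shape, itemsize=BYTES_PER_FLOAT64):
    """Return the size in GB (10**9 bytes) of a dense array of `shape`."""
    rows, cols = shape
    return rows * cols * itemsize / 1e9


sizes_gb = [dense_size_gb(s) for s in shapes]
# Roughly 3.5 GB for the full matrix and 0.9 GB for the quarter-size one.
```

Any intermediate copies made during the decomposition would multiply that figure, which would be consistent with the process being killed.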

@EricKenjiLee

I’d like to first point out this isn’t my package :)

So try-except is useful, but it depends on the approach to coding. Sometimes it’s better to have things “fail loudly”, so in this case I don’t mind that the author hasn’t written this in; generally, Python 2 isn’t supported anymore and all code should be ported over if possible. Also, if you want to write try-excepts, it’s better to have a specific exception in mind, like “except ValueError:”, so you catch only a very specific failure mode.

EricKenjiLee commented May 22, 2019

It also looks like you’re not going to be able to do computations with a matrix that large, since you’re probably running out of memory. Consider doing sparse matrix operations if possible; I know there are also some packages for out-of-core (larger-than-memory) operations. I think PyTables is made for large amounts of data.
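
One possible way to keep memory bounded for a tall, narrow matrix like the one above is to accumulate the small column Gram matrix (X^T X, here 282 x 282) in row chunks and decompose that instead. This is only a sketch of the general technique with NumPy; it is not known to be how pymssa works, and the function name, chunk size, and random test data are hypothetical:

```python
import numpy as np


def gram_in_chunks(row_chunks, n_cols):
    """Accumulate X^T X from an iterable of row blocks of X."""
    gram = np.zeros((n_cols, n_cols))
    for block in row_chunks:
        # Each block is (chunk_rows, n_cols); only one is held at a time.
        gram += np.dot(block.T, block)
    return gram


# Small random stand-in for a trajectory matrix. Real data would be
# streamed from disk rather than sliced from an in-memory array.
rng = np.random.RandomState(0)
X = rng.rand(1000, 8)
chunks = (X[i:i + 100] for i in range(0, X.shape[0], 100))

gram = gram_in_chunks(chunks, n_cols=X.shape[1])

# Eigendecomposition of the small symmetric matrix: the eigenvalues are
# the squared singular values of X, and the eigenvectors are its right
# singular vectors. eigh returns ascending order, so reverse it.
eigenvalues, right_vectors = np.linalg.eigh(gram)
singular_values = np.sqrt(np.clip(eigenvalues[::-1], 0, None))
```

Forming the Gram matrix squares the condition number, so the smallest singular values lose accuracy compared with a direct SVD.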
