Skip to content

Make it easy to create a class that behaves like a Session #832

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
alixdamman opened this issue Nov 29, 2019 · 29 comments
Closed

Make it easy to create a class that behaves like a Session #832

alixdamman opened this issue Nov 29, 2019 · 29 comments
Assignees
Milestone

Comments

@alixdamman
Copy link
Collaborator

alixdamman commented Nov 29, 2019

The main idea of the present issue is to benefit from both auto-completion of PyCharm (by defining user classes) and all (nice) features offered by the Session class.

This implies to develop a mechanism to:

  • inherit from Session class,
  • automatically push all attributes defined in the sub-class to the self._objects attribute of Session.

Another (big) advantage of using user-defined classes is to reduce significantly the number of global variables and function parameters in the models that are developed in house.

@alixdamman
Copy link
Collaborator Author

@gdementen OK to "try" to implement this for the next release (0.33)?
I said "try".

@gdementen
Copy link
Contributor

I had in mind to do a quicker-than-usual release to fix the issues we found during the training (matplotlib import in the viewer, small tutorial/doc changes and the load data >2d without axes names problem), probably as 0.32.1 and then move on to liam2. So I am basically ok with the 0.33 milestone but that will depend when you want to release it.

@alixdamman
Copy link
Collaborator Author

If we plan to do a 0.32.1 release, I think there is no reason to make 0.33 release soon.

@gdementen
Copy link
Contributor

Or... If you want to work on it before I am able to (probably somewhere in January), I give you the experimental code I have so far and you do it. So far, I have always linked this "static session" thing to the lazy session feature (see #727), but maybe we could dissociate the two. In that case it's pretty easy. The lazy part is a lot more tricky (when I say tricky, it's not necessarily hard to implement, but rather hard to figure out which behaviour is best for users).

@alixdamman alixdamman self-assigned this Dec 2, 2019
alixdamman added a commit to alixdamman/larray that referenced this issue Dec 11, 2019
alixdamman added a commit to alixdamman/larray that referenced this issue Dec 12, 2019
alixdamman added a commit to alixdamman/larray that referenced this issue Dec 12, 2019
alixdamman added a commit to alixdamman/larray that referenced this issue Dec 16, 2019
alixdamman added a commit to alixdamman/larray that referenced this issue Dec 17, 2019
alixdamman added a commit to alixdamman/larray that referenced this issue Dec 18, 2019
alixdamman added a commit to alixdamman/larray that referenced this issue Dec 18, 2019
alixdamman added a commit to alixdamman/larray that referenced this issue Dec 18, 2019
alixdamman added a commit to alixdamman/larray that referenced this issue Dec 18, 2019
@alixdamman
Copy link
Collaborator Author

Related to issue #727

@alixdamman
Copy link
Collaborator Author

See comment #840 (comment) for an example of how the StaticSession class should be used.

@gdementen
Copy link
Contributor

This is at least how I saw things. There are probably other options. I just don't think restricting values based on values defined at the class level is a good idea. It is just too weird in the Python world, while there are alternatives which are more usual (restrict on types defined at the class level) or restrict on values defined at the instance level (i.e. values defined in __init__)

@alixdamman
Copy link
Collaborator Author

In the Demo model, they have a Parameters class in which they define all axes, groups and scalars of the model. The other classes contains only arrays but inherits from the Session class to mainly take benefit of its I/O methods. I tried to somewhat "reproduce" what is done in that model (consider all axes and groups defined in a class as frozen) plus to include the idea of ArrayDef but I see now that implementing a class just based on one model is wrong.

The only thing that bothered me with the proposed "specifications" of the StaticSession is that all axes used in statements like array_var = ArrayDef(axes) must be defined before the definition of the class and they cannot depend on any variable with value defined at runtime --> imagine a TIME axis defined as TIME = Axis(f'time={first_proj_year}..{last_proj_year}') where first_proj_year and last_proj_year are defined at runtime.

I wonder if there an easy way to also "freeze" the axes of an Array variables defined in the __init__ ? method?

alixdamman added a commit to alixdamman/larray that referenced this issue Dec 20, 2019
@gdementen
Copy link
Contributor

gdementen commented Dec 20, 2019 via email

alixdamman added a commit to alixdamman/larray that referenced this issue Jan 3, 2020
@alixdamman alixdamman changed the title Make it easy to create a class that behaves like a Session Make it easy to create a class that behaves like a Session (implement ConstrainedSession class) Jan 8, 2020
alixdamman added a commit to alixdamman/larray that referenced this issue Feb 18, 2020
…on preliminary code written by gdementen)
@alixdamman
Copy link
Collaborator Author

I wonder if we couldn't try to use the pydantic to fix this issue ?
This may help us to make classes inheriting from ConstrainedSession more compatible with the broader Python ecosystem. FWIW, a PyCharm plugin exists for pydantic.

@alixdamman
Copy link
Collaborator Author

alixdamman commented Apr 8, 2020

For me this question is crucial. If pydantic allows us to do "the right thing" (i.e. annotate classes -- and possible other code -- with arrays with defined axes) it is worth adding it as a dependency. If it does not, I am unsure it is worth the added trouble.

What you are saying is that you want something like:

class MyDemo(ConstrainedSession):
    pop: ArrayDef("age=0..120; gender=female,male; time=1991..2020")

?

It sounds like it is not possible with the typing module nor with pydantic.

But:

  1. There is a little problem with the current implementation of ArrayDef. This class has only an __init__() method taking axes as argument. So when instantiating MyDemo class as demo = MyClass(pop=zeros("age=0..120; gender=female,male; time=1991..2020")), writing pop. later will not popup the methods of the Array. This is because PyCharm thinks that pop is still an ArrayDef object. This is not an argument to not use the var = type notation but the problem is still there and needs to be fixed one way or another.
  2. The operator = is commonly used to define a default value and : to define a type. That is what is used in functions (and traditional dataclasses) and that is what PyCharm relies on to propose auto-completion. So, I would prefer if we if we stick to that (after all).

So, what we may do is to introduce a subclass ArrayDef implemented as something like:

class ArrayDef(Array):
   def __inti__(data=None, axes=None, dtype=None, meta=None):
         if axes is not None and not isinstance(axes, AxisCollection):
             axes = AxisCollection(axes)
         if data is None:
             data = np.empty(axes.shape, dtype)
         Array.__init__(self, data, axes, dtype, meta)

and then:

from pydantic import BaseModel

class ConstrainedSession(Session, BaseModel):
        ...
        @pydantic.validator('*')
        def validate_variables(cls, v, field):
            key = field.name
            # --- array with constant axes
            if isinstance(field.type_, ArrayDef):
                # no axes given in class definition
                if not field.value:
                    # first initialization
                    if key is not in self.keys()
                        return v
                    else:
                       # compare with current value
                       try:
                            self[key].axes.check_compatible(v.axes)
                       except ValueError as error:
                            msg = str(error).replace("incompatible axes:", f"incompatible axes for array '{key}':")\
                                .replace("vs", "was declared as")
                            raise ValueError(msg)
                       return v
                # axes given in class definition
                else:
                    try:
                        field.value.axes.check_compatible(v.axes)
                    except ValueError as error:
                        msg = str(error).replace("incompatible axes:", f"incompatible axes for array '{key}':")\
                            .replace("vs", "was declared as")
                        raise ValueError(msg)
                    return v

and then:

class MyDemo(ConstrainedSession):
    # axes given in class definition
    pop = ArrayDef("age=0..120; gender=female,male; time=1991..2020")
    # axes given at instantiation
    births: ArrayDef

demo = MyDemo(pop=zeros("age=0..120; gender=female,male; time=1991..2020"),
                             births=zeros("age=0..120; gender=female,male; time=1991..2020"))

@gdementen
Copy link
Contributor

For me this question is crucial. If pydantic allows us to do "the right thing" (i.e. annotate classes -- and possible other code -- with arrays with defined axes) it is worth adding it as a dependency. If it does not, I am unsure it is worth the added trouble.

What you are saying is that you want something like:

class MyDemo(ConstrainedSession):
    pop: ArrayDef("age=0..120; gender=female,male; time=1991..2020")

Indeed. Or, in fact, I would even prefer this (the spelling does not matter that much but the structure does):

class MyDemo(ConstrainedSession):
     gender = Axis("gender=female,male")
     time: Axis
     pop: ArrayDef([gender, time])

def init_demo():
    time = Axis('time=1991..2020')
    return MyDemo(time=time, pop=zeros([MyDemo.gender, time])

It sounds like it is not possible with the typing module nor with pydantic.

I knew about typing but I had hopes for pydantic... Too bad.

But:

1. There is a little problem with the current implementation of `ArrayDef`. This class has only an `__init__()` method taking axes as argument. So when instantiating `MyDemo` class as `demo = MyClass(pop=zeros("age=0..120; gender=female,male; time=1991..2020"))`, writing `pop.` later will not popup the methods of the `Array`. This is because PyCharm thinks that `pop` is still an ArrayDef object. This is not an argument to not use the `var = type` notation but the problem is still there and needs to be fixed one way or another.

2. The operator `=` is commonly used to define a default and `:` to define a type. That is what is used in functions (and traditional dataclasses) and that is what PyCharm relies on to propose auto-completion. So, I would prefer if we if we stick to that (after all).

Indeed. And I would prefer that too (that's the whole point of my comments), but then what you show do not respect that, so I am confused 😕 ...

So, what we may do is to introduce a subclass ArrayDef implemented as something like:

class ArrayDef(Array):
   def __inti__(data=None, axes=None, dtype=None, meta=None):
         if axes is not None and not isinstance(axes, AxisCollection):
             axes = AxisCollection(axes)
         if data is None:
             data = np.empty(axes.shape, dtype)

Making ArrayDef a subclass of Array should be enough. You do not actually need any data. To trick PyCharm/static analyzers, we do not need the object to be functional. We will not use them anyway.

     Array.__init__(self, data, axes, dtype, meta)

This is not needed either AFAICT.

and then:

class MyDemo(ConstrainedSession):
    # axes given in class definition
    pop = ArrayDef("age=0..120; gender=female,male; time=1991..2020")

Here you use = for a type => I am confused...

@gdementen
Copy link
Contributor

Alright.

Either we choose to use pydantic or not, I have a few more suggestions and questions:

Suggestions:

1. I would to add a boolean keyword argument `allow_extra_vars` to whether or not allow extra variables during and after model initialization. Defaults to True.

I fear an option nobody will use. The option could come later if we have demand for it.

2. I would to add a boolean keyword argument `allow_mutation` to whether or not tell the constrained class is faux-immutable, i.e. whether **setattr** and **setitem** are allowed. Defaults to True. This argument could be interesting to create "Parameters" class such as the one defined in the DEMO model.

I fear an option nobody will use. The option could come later if we have demand for it OR we could provide a built-in subclass (e.g. Parameters) with that option enabled.

Questions:

1. The `filter()`, `compact()` and `apply()` methods of `ConstrainedSession` will return a normal `Session` object. Is putting a Warning section in the doctsring of ConstrainedSession enough or should I print a warning when one of these methods is called. I guess a Warning section is enough but do you think our users will read that section?

They will not read that section, but a dynamic warning would be annoying, so I would only warn in the doc and wait for users to discover it (or we might show one such example of Constrainted-> normal session in the tutorial section about ConstraintedSession)

2. What to do with binary operations (**eq** and **neq** excluded)?
   
   * a) Always return a normal `Session` object silently
   * b) Always return a normal `Session` object and print a warning
   * c) Always return a `ConstrainedSession` object and raise an error in case of any incompatibility (extra arrays, ...)
   * d) Return a `ConstrainedSession` object if possible. If not, return a normal `Session` object and print a warning.

I don't really know. I don't like option c) because this severely limits what you can do, but I am unsure about the other options.

3. Same question for unary operations

Personally, I would go for option d) or b) for the questions 2 and 3. I think binary operations is rarely used by our users but IMO returning a normal Session object should not be done silently in this case.

@alixdamman
Copy link
Collaborator Author

alixdamman commented Apr 17, 2020

The PandaSDMX (there is no second s) API is awful. I just hope you will not mimic it.

No, that's not what I have in mind. I only wanted to be able to import the model.py script from pandasSDMX (which implements the so-called SDMX Information-Model (https://sdmx.org/?page_id=5008 > SDMX 2.1 > Section 2 - Information Model)) but after a second thought I think this is silly to add a dependency just for one script. I will write my own implementation of the SDMX-IM but also using pydantic because it has an added value in this case.

I have started to read the official technical documentation of the SDMX-IM. The document is messy. There discrepancies between the class diagrams and the definitions sections. Testing the code is gonna be fun...

@alixdamman
Copy link
Collaborator Author

Suggestions:

  1. I would to add a boolean keyword argument allow_extra_vars to whether or not allow extra variables during and after model initialization. Defaults to True.

I fear an option nobody will use. The option could come later if we have demand for it.

  1. I would to add a boolean keyword argument allow_mutation to whether or not tell the constrained class is faux-immutable, i.e. whether setattr and setitem are allowed. Defaults to True. This argument could be interesting to create "Parameters" class such as the one defined in the DEMO model.

I fear an option nobody will use. The option could come later if we have demand for it OR we could provide a built-in subclass (e.g. Parameters) with that option enabled.

Providing a built-in subclass Parameters sounds a good idea to me. In this case, Parameters should simulate a faux-immutable class and also forbid to add undeclared variables (and to delete declared variables) IMO.

1. The `filter()`, `compact()` and `apply()` methods of `ConstrainedSession` will return a normal `Session` object. Is putting a Warning section in the doctsring of ConstrainedSession enough or should I print a warning when one of these methods is called. I guess a Warning section is enough but do you think our users will read that section?

They will not read that section, but a dynamic warning would be annoying, so I would only warn in the doc and wait for users to discover it (or we might show one such example of Constrainted-> normal session in the tutorial section about ConstraintedSession)

OK

  1. What to do with binary operations (eq and neq excluded)?
  * a) Always return a normal `Session` object silently
  * b) Always return a normal `Session` object and print a warning
  * c) Always return a `ConstrainedSession` object and raise an error in case of any incompatibility (extra arrays, ...)
  * d) Return a `ConstrainedSession` object if possible. If not, return a normal `Session` object and print a warning.

I don't really know. I don't like option c) because this severely limits what you can do, but I am unsure about the other options.

OK to exclude option c) which is indeed too restrictive.
I'll go for option d) (but without warnings?).

@gdementen
Copy link
Contributor

  1. What to do with binary operations (eq and neq excluded)?
  * a) Always return a normal `Session` object silently
  * b) Always return a normal `Session` object and print a warning
  * c) Always return a `ConstrainedSession` object and raise an error in case of any incompatibility (extra arrays, ...)
  * d) Return a `ConstrainedSession` object if possible. If not, return a normal `Session` object and print a warning.

I don't really know. I don't like option c) because this severely limits what you can do, but I am unsure about the other options.

OK to exclude option c) which is indeed too restrictive.
I'll go for option d) (but without warnings?).

In fact, even option c might make sense as long as we have both 1) a way to easily convert a ConstrainedSession to a normal Session and 2) we support binary ops between a ConstrainedSession and a normal Session without error even if they differ (either returning a ConstrainedSession or Session).

In other words, one more option (I am not saying it is the best option -- I don't know) is to behave differently depending on if we are doing ConstrainedSession + Session or ConstrainedSession + ConstrainedSession. We could be more restrictive in the second case.

In short, I still don't know what to do... Sorry, I really don't have a clear mind these days... I guess you will have to pick one an see how it goes in practice. I usually don't like working like that but...

@alixdamman
Copy link
Collaborator Author

In fact, even option c might make sense as long as we have both 1) a way to easily convert a ConstrainedSession to a normal Session and 2) we support binary ops between a ConstrainedSession and a normal Session without error even if they differ (either returning a ConstrainedSession or Session).

In other words, one more option (I am not saying it is the best option -- I don't know) is to behave differently depending on if we are doing ConstrainedSession + Session or ConstrainedSession + ConstrainedSession. We could be more restrictive in the second case.

In short, I still don't know what to do... Sorry, I really don't have a clear mind these days... I guess you will have to pick one an see how it goes in practice. I usually don't like working like that but...

What I have implemented now (but didn't push yet) is:

        def opmethod(self, other):
                    (...)
                    res.append((name, res_item))
            try:
                # XXX: print a warning?
                ses = self.__class__(res)
            except Exception:
                ses = Session(res)
            return ses

in both _bynaryop and _unaryop.

@alixdamman
Copy link
Collaborator Author

I really don't have a clear mind these days...

Me neither...

@gdementen
Copy link
Contributor

What I have implemented now (but didn't push yet) is:

        def opmethod(self, other):
                    (...)
                    res.append((name, res_item))
            try:
                # XXX: print a warning?
                ses = self.__class__(res)
            except Exception:
                ses = Session(res)
            return ses

in both _bynaryop and _unaryop.

Seems good enough for now. We can always refine this later.

@alixdamman
Copy link
Collaborator Author

Or, in fact, I would even prefer this (the spelling does not matter that much but the structure does):

class MyDemo(ConstrainedSession):
     gender = Axis("gender=female,male")
     time: Axis
     pop: ArrayDef([gender, time])

def init_demo():
    time = Axis('time=1991..2020')
    return MyDemo(time=time, pop=zeros([MyDemo.gender, time])

It sounds like it is not possible with the typing module nor with pydantic.

I knew about typing but I had hopes for pydantic... Too bad.

I was speaking too fast. I finally found some time to take a deeper look inside the pydantic library.
I made some quick tests but it seems like that implementing:

AGE = Axis("age=0..120")
GENDER = Axis("gender=female,male")

class MyDemo(ConstrainedSession):
     pop: ArrayDef((AGE, GENDER))

def init_demo():
    return MyDemo(pop=zeros((AGE, GENDER))

is actually possible with pydantic.

Maybe implementing something like pop: ArrayDef((GENDER, 'time')) where the axis time is defined at runtime is possible.

@gdementen
Copy link
Contributor

🎉Yipie!🎉 Now pydantic looks a lot more interesting to me. How does PyCharm behave with such annotations?

@alixdamman
Copy link
Collaborator Author

How does PyCharm behave with such annotations?

Quiet well actually but I have to say that I recently installed the 2020.1 version of PyCharm.

alixdamman added a commit to alixdamman/larray that referenced this issue Jun 15, 2021
…on preliminary code written by gdementen)
alixdamman added a commit to alixdamman/larray that referenced this issue Aug 3, 2021
…on preliminary code written by gdementen)
alixdamman added a commit to alixdamman/larray that referenced this issue Aug 3, 2021
…s and CheckedArray classes (based on preliminary code written by gdementen)
alixdamman added a commit to alixdamman/larray that referenced this issue Aug 4, 2021
…s and CheckedArray classes (based on preliminary code written by gdementen)
alixdamman added a commit to alixdamman/larray that referenced this issue Aug 4, 2021
…s and CheckedArray classes (based on preliminary code written by gdementen)
@alixdamman alixdamman changed the title Make it easy to create a class that behaves like a Session (implement ConstrainedSession class) Make it easy to create a class that behaves like a Session Aug 4, 2021
alixdamman added a commit to alixdamman/larray that referenced this issue Aug 6, 2021
…s and ConsArray classes (based on preliminary code written by gdementen)
alixdamman added a commit to alixdamman/larray that referenced this issue Aug 6, 2021
…s and ArrayConstraint classes (based on preliminary code written by gdementen)
alixdamman added a commit to alixdamman/larray that referenced this issue Aug 6, 2021
…s and CheckedArray classes (based on preliminary code written by gdementen)
alixdamman added a commit to alixdamman/larray that referenced this issue Aug 10, 2021
…s and CheckedArray classes (based on preliminary code written by gdementen)
alixdamman added a commit to alixdamman/larray that referenced this issue Aug 17, 2021
…s and CheckedArray classes (based on preliminary code written by gdementen)
gdementen pushed a commit to gdementen/larray that referenced this issue Nov 30, 2021
…s and CheckedArray classes (based on preliminary code written by gdementen)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants