
Incorporating AePPL into GaussianRandomWalk #5814

Closed
larryshamalama wants to merge 1 commit into main from grw_aeppl

Conversation

@larryshamalama (Member)

Closes #5762.

This PR removes GaussianRandomWalkRV altogether by defining an rv_op as a cumulative sum of distributions. The most recent version of AePPL, i.e. 0.0.31, is now able to retrieve the appropriate logp graph from a CumOp.

This is a WIP for now as I figure out how to use AePPL...
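A minimal sketch of the construction (assumed, not the PR's actual code; the variable names are illustrative, and pm.logp is used on the assumption that it dispatches to AePPL as this PR wires up):

import aesara.tensor as at
import pymc as pm

# A Gaussian random walk is the cumulative sum of iid normal innovations,
# so the RV graph can be a CumOp applied to a Normal RV; AePPL >= 0.0.31
# can then derive the logp graph for that CumOp.
innovations = pm.Normal.dist(mu=0.0, sigma=1.0, size=10)  # iid steps
grw = at.cumsum(innovations)  # the walk, expressed as a CumOp graph

# AePPL inverts the cumsum (by differencing the value variable) to
# produce the logp of `grw` evaluated at `value`.
value = at.vector("value")
logp = pm.logp(grw, value)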

@codecov (bot) commented May 28, 2022

Codecov Report

Merging #5814 (9a7badd) into main (906fcdc) will decrease coverage by 6.24%.
The diff coverage is 93.18%.

❗ Current head 9a7badd differs from pull request most recent head dbeb801. Consider uploading reports for the commit dbeb801 to get more accurate results

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #5814      +/-   ##
==========================================
- Coverage   89.26%   83.01%   -6.25%     
==========================================
  Files          72       73       +1     
  Lines       12890    13261     +371     
==========================================
- Hits        11506    11009     -497     
- Misses       1384     2252     +868     
Impacted Files Coverage Δ
pymc/distributions/timeseries.py 77.55% <92.85%> (-1.09%) ⬇️
pymc/distributions/logprob.py 89.55% <100.00%> (-8.18%) ⬇️
pymc/func_utils.py 22.50% <0.00%> (-65.88%) ⬇️
pymc/distributions/bound.py 45.54% <0.00%> (-54.46%) ⬇️
pymc/distributions/mixture.py 59.35% <0.00%> (-36.37%) ⬇️
pymc/exceptions.py 68.96% <0.00%> (-31.04%) ⬇️
pymc/distributions/discrete.py 71.35% <0.00%> (-27.87%) ⬇️
pymc/tuning/scaling.py 35.18% <0.00%> (-24.08%) ⬇️
pymc/math.py 47.34% <0.00%> (-22.71%) ⬇️
pymc/distributions/continuous.py 76.31% <0.00%> (-21.17%) ⬇️
... and 22 more

@twiecki (Member) commented May 28, 2022

Nice! However, this isn't the same thing, it's a different parameterization of the same thing. Like centered vs non-centered. So I don't think we should remove the previous one as people might be using it and then changing the parameterization might screw with the inference.

However, we could add it either as an option or a separate dist.

@ricardoV94 (Member)

> Nice! However, this isn't the same thing, it's a different parameterization of the same thing. Like centered vs non-centered. So I don't think we should remove the previous one as people might be using it and then changing the parameterization might screw with the inference.
>
> However, we could add it either as an option or a separate dist.

What do you mean? This should be completely equivalent to what we had before.

@ricardoV94 (Member) commented May 28, 2022

You might be thinking of the latent parametrization, but that was not what GRW did before (that wouldn't have been a vanilla distribution).

Something like that would just require implementing a default (diff) transform, which would definitely be helpful for random walks.

Alternatively, it would be useful to allow different implementations for unobserved and observed variables: either eagerly at creation time or, even better, lazily when compiling the logp, but the latter requires some more work (making sure we have the right deterministics).
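A hypothetical sketch of such a diff transform, assuming AePPL's RVTransform base class (the class name DiffTransform and this exact formulation are made up for illustration):

import aesara.tensor as at
from aeppl.transforms import RVTransform

class DiffTransform(RVTransform):
    name = "diff"

    def forward(self, value, *inputs):
        # walk -> innovations: keep the first entry, difference the rest
        shifted = at.concatenate(
            [at.zeros_like(value[..., :1]), value[..., :-1]], axis=-1
        )
        return value - shifted

    def backward(self, value, *inputs):
        # innovations -> walk
        return at.cumsum(value, axis=-1)

    def log_jac_det(self, value, *inputs):
        # the cumsum Jacobian is unit triangular, so log|det J| = 0
        return at.zeros_like(value[..., 0])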

@twiecki (Member) commented May 31, 2022

Isn't this changing the previous parameterization $x_t \sim \mathcal{N}(x_{t-1}, \sigma)$ to $x_t \sim x_{t-1} + \mathcal{N}(0, \sigma)$?

@ricardoV94 (Member) commented May 31, 2022

> Isn't this changing the previous parameterization $x_t \sim \mathcal{N}(x_{t-1}, \sigma)$ to $x_t \sim x_{t-1} + \mathcal{N}(0, \sigma)$?

Where would the parametrization be changed?

The random generator method is exactly the same as before if you check what was going on in the old rng_fn. The logp is exactly the same as before (otherwise this test would fail):

def test_gaussianrandomwalk(self):
    def ref_logp(value, mu, sigma, steps):
        # Relying on fact that init will be normal by default
        return (
            scipy.stats.norm.logpdf(value[0])
            + scipy.stats.norm.logpdf(np.diff(value), mu, sigma).sum()
        )

    check_logp(
        pm.GaussianRandomWalk,
        Vector(R, 4),
        {"mu": R, "sigma": Rplus, "steps": Nat},
        ref_logp,
        decimal=select_by_precision(float64=6, float32=1),
    )

The expression $x_{t-1} + \mathcal{N}(0, \sigma)$ "does not have a logp"; it always has to be converted to $\mathcal{N}(x_{t-1}, \sigma)$, which AePPL does when it sees it, or, in this case, when it sees $x \sim \text{cumsum}(\mathcal{N}(0, \sigma))$.
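As a quick numeric sanity check (not from the PR), the two factorizations agree; no Jacobian term appears because the cumsum map is unit triangular:

import numpy as np
from scipy import stats

rng = np.random.default_rng(123)
mu, sigma, steps = 0.5, 1.2, 10
x = np.cumsum(rng.normal(mu, sigma, size=steps))  # a GRW draw

# Markov form: N(x_0 | mu, sigma) * prod_t N(x_t | x_{t-1} + mu, sigma)
logp_markov = stats.norm.logpdf(x[0], mu, sigma) + stats.norm.logpdf(
    x[1:], x[:-1] + mu, sigma
).sum()

# Differenced form, as AePPL derives it from the cumsum graph
logp_diff = stats.norm.logpdf(np.diff(x, prepend=0.0), mu, sigma).sum()

np.testing.assert_allclose(logp_markov, logp_diff)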

Now, for NUTS sampling (when we have an unobserved GRW) it would probably be better if we were sampling in the latent space $x_{\text{raw}} \sim \mathcal{N}(0, \sigma)$ and only then did the deterministic mapping $x = \text{cumsum}(x_{\text{raw}})$, but this was not done before, nor is it now.

For that we would need to either: 1) add a new type of transform, or 2) have a way to create different graphs depending on whether the variables are observed or not.
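For reference, a hand-written sketch of the latent parametrization that either option would automate (the model and names are illustrative, not part of this PR):

import aesara.tensor as at
import pymc as pm

with pm.Model() as model:
    sigma = pm.HalfNormal("sigma", 1.0)
    x_raw = pm.Normal("x_raw", 0.0, sigma, size=10)  # NUTS samples this
    x = pm.Deterministic("x", at.cumsum(x_raw))      # the walk itself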

@twiecki (Member) commented May 31, 2022

I see, so there is no change from this implementation to the previous one in v4, but it is different compared to v3, right?

@ricardoV94 (Member) commented May 31, 2022

> I see, so there is no change from this implementation to the previous one in v4, but it is different compared to v3, right?

No, V3 did the same as well. The logp is written a bit differently, but it's equivalent to the manual logp we had here in V4 up to this PR:

def logp(self, x):
    """
    Calculate log-probability of Gaussian Random Walk distribution at specified value.

    Parameters
    ----------
    x: numeric
        Value for which log-probability is calculated.

    Returns
    -------
    TensorVariable
    """
    if x.ndim > 0:
        x_im1 = x[:-1]
        x_i = x[1:]
        mu, sigma = self._mu_and_sigma(self.mu, self.sigma)
        innov_like = Normal.dist(mu=x_im1 + mu, sigma=sigma).logp(x_i)
        return self.init.logp(x[0]) + tt.sum(innov_like)
    return self.init.logp(x)

And there was no special transformation either.

@twiecki (Member) commented May 31, 2022

Oh right, I read your previous comment again. It's always been $x_t \sim \mathcal{N}(x_{t-1}, \sigma)$, so this makes sense 👍

@larryshamalama (Member, Author) left a comment

cc @ricardoV94 about a question regarding retrieving dist_params from the random graph

pymc/distributions/timeseries.py
@canyon289 (Member) commented Jun 5, 2022

cc @nicospinu for GSoC reference.

This PR will end up being a good reference for a side-by-side comparison of how time series distributions work as currently implemented in PyMC and how they'll be implemented if more closely integrated with AePPL.

@larryshamalama (Member, Author)

To my knowledge, three sets of tests are failing, and they are definitely worth looking into:

  • (Static?) shape checks
  • pm.logp yields the component-wise log-probabilities rather than their sum, which seems to be what the previous API returned
  • Inferring the number of steps from shapes. Would this be possible with AePPL?

Would these issues be important to address? To my knowledge, some of these, especially static shape inference, would take substantial work to be included in Aesara. Happy to hear thoughts about this.

Uncommenting the error pertaining to unexpected RV nodes here still yields many errors, so this would be important to look into.

@ricardoV94 (Member) commented Jun 5, 2022

> To my knowledge, three sets of tests are failing, and they are definitely worth looking into:
>
>   • (Static?) shape checks

Can you expand?

>   • pm.logp yields the component-wise log-probabilities rather than their sum, which seems to be what the previous API returned

Yes, that behaves differently, but I don't think it poses a significant problem. We can just change the expectation of the tests.

>   • Inferring the number of steps from shapes. Would this be possible with AePPL?

That should work the same way as before. We do that in AR, which is also a symbolic dist. It happens before any call to rv_op; it does not interfere with or depend on AePPL.
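As a hypothetical illustration of that convention (infer_steps_from_shape is not PyMC API): the walk holds the init value plus `steps` innovations along its last axis, so a user-provided shape pins down steps before rv_op is ever called:

def infer_steps_from_shape(shape):
    # the last axis holds init + steps innovations, i.e. steps + 1 entries
    return shape[-1] - 1

assert infer_steps_from_shape((3, 6)) == 5  # cf. test_shape_ellipsis below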

@larryshamalama (Member, Author)

> > To my knowledge, three sets of tests are failing, and they are definitely worth looking into:
> >
> >   • (Static?) shape checks
>
> Can you expand?

Some tests that checked for shape inference were failing. They seem to have gone away, though I'm not sure whether that is related to the implemented change_size. However, it is incorrectly implemented. Can you have a look at this? Notably, test_shape_ellipsis is failing. I will tag you in a review.

> >   • pm.logp yields the component-wise log-probabilities rather than their sum, which seems to be what the previous API returned
>
> Yes, that behaves differently, but I don't think it poses a significant problem. We can just change the expectation of the tests.

In the tests, I summed up the logp terms for now.
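For instance, a hedged sketch of that adjustment (the exact test code differs):

import aesara.tensor as at
import pymc as pm

grw = pm.GaussianRandomWalk.dist(
    mu=0.0, sigma=1.0, init_dist=pm.Normal.dist(), steps=5
)
value = at.vector("value")
# pm.logp now yields one term per time point for the cumsum-based GRW;
# summing recovers the scalar joint log-probability the old API returned.
total_logp = pm.logp(grw, value).sum()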

@larryshamalama (Member, Author) left a comment

I believe that I'm getting closer, minus the implementation of change_size.

pymc/distributions/timeseries.py
pymc/tests/test_distributions_timeseries.py
@larryshamalama larryshamalama requested a review from ricardoV94 June 7, 2022 01:24
pymc/distributions/timeseries.py
@larryshamalama larryshamalama marked this pull request as ready for review June 7, 2022 13:09
@larryshamalama larryshamalama marked this pull request as draft June 7, 2022 16:02
@larryshamalama (Member, Author)

Converted this PR back to draft. These lines probably warrant discussion, as the implementation of GaussianRandomWalk wouldn't work otherwise. Other things to be done:

  • Moment tests
  • Test to catch NotImplementedError in moment dispatching of CumOp
  • Investigating test_gaussianrandomwalk in test_distributions.py

@larryshamalama (Member, Author) left a comment

Some notes from the meeting on Friday, June 10, 2022

pymc/distributions/timeseries.py
@larryshamalama (Member, Author) left a comment

Revisiting this PR after a hiatus. I'm not sure if the presence of random variables in the logp graph is due to retrieving distribution/rv static shapes.

pymc/distributions/timeseries.py
@larryshamalama (Member, Author) left a comment

Would this be the appropriate hackish fix? 😅 It seems that the presence of random variables in the logp graph is not an AePPL problem, but rather PyMC-related.

pymc/distributions/logprob.py
@larryshamalama larryshamalama marked this pull request as ready for review June 29, 2022 11:49
@larryshamalama (Member, Author)

Still needs a test for moments.

@larryshamalama (Member, Author)

Actually, I'm realizing that we need a dispatch for the DimShuffle Op because moment(mu[..., None]) raises a NotImplementedError. @ricardoV94 Does this sound right? This would be useful for the following model:

with pm.Model() as model:
    mu = pm.Normal("mu", 2, 3)
    sigma = pm.Gamma("sigma", 1, 1)
    grw = pm.GaussianRandomWalk(name="grw", mu=mu, sigma=sigma, init_dist=pm.StudentT.dist(5), steps=10)

@ricardoV94 (Member)

> Actually, I'm realizing that we need a dispatch for the DimShuffle Op because moment(mu[..., None]) raises a NotImplementedError. @ricardoV94 Does this sound right? [...]

We can handle it in the same dispatched moment function, just have an if branch for that. It's the same discussion we had before of whether we specialize or create more generalized moment functions.

@larryshamalama (Member, Author)

> Actually, I'm realizing that we need a dispatch for the DimShuffle Op because moment(mu[..., None]) raises a NotImplementedError. [...]

I realized that I could just do moment(mu)[..., None] instead 😅.

I just added tests for moments. I believe that this PR is ready to be merged, but the moment tests are only passing locally so far; they will pass once PR 151 in AePPL is merged.
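For reference, a hedged numpy sketch of the moment logic those tests encode (grw_moment is a made-up name, not PyMC API): the walk's moment at time t is moment(init) + t * moment(mu), broadcast over batch dimensions:

import numpy as np

def grw_moment(init_moment, mu_moment, steps):
    # moment at time t: moment(init) + t * moment(mu), for t = 0..steps
    return init_moment[..., None] + mu_moment[..., None] * np.arange(steps + 1)

# Matches the (4, 2, 11) expectation from the test parametrization quoted
# in the review below, with zero init moment and mu = (1, -1) per batch row.
out = grw_moment(np.zeros((4, 2)), np.tile([1.0, -1.0], (4, 1)), 10)
expected = np.full((4, 2, 11), np.vstack((np.arange(11), -np.arange(11))))
assert np.allclose(out, expected)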

@larryshamalama (Member, Author)

With #5955 soon to be merged, I believe that this PR would be good to go? This PR adds a hackish tweak whereby we don't check for random variables in the ancestors of a Join Op, due to issue 149 in AePPL; we can perhaps think of a more sophisticated way of dealing with this later.

@ricardoV94 (Member)

> With #5955 soon to be merged, I believe that this PR would be good to go? This PR adds a hackish tweak whereby we don't check for random variables in the ancestors of a Join Op, due to issue 149 in AePPL; we can perhaps think of a more sophisticated way of dealing with this later.

I don't think we should proceed with the hack at all; it's mixing logic that should be separated. We should override AePPL's Join logprob instead, as we have stricter requirements than they do, and do the constant folding of the shapes there.

@larryshamalama (Member, Author) commented Jul 8, 2022

Oh, my bad, and sounds good. @brandonwillard recommended editing a graph rewrite or registering another one in the database for shape inference. A ShapeFeature can also be added as a listener on FunctionGraph if it is not included automatically. This would be an AePPL-centred solution; an override within PyMC would also work, I suppose.
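A hedged sketch of attaching a ShapeFeature to a FunctionGraph (the import path of ShapeFeature has moved between Aesara versions, so the one below is an assumption for the version pinned at the time):

import aesara.tensor as at
from aesara.graph.fg import FunctionGraph
from aesara.tensor.basic_opt import ShapeFeature  # location varies by version

x = at.vector("x")
fg = FunctionGraph([x], [at.cumsum(x)], features=[ShapeFeature()], clone=True)
# fg.shape_feature now tracks symbolic shape information that rewrites can
# use to constant-fold shapes during logp derivation.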

@larryshamalama larryshamalama marked this pull request as draft July 12, 2022 11:11
def test_gaussianrandomwalk(mu, sigma, init_dist, steps, size, expected):
Review comment (Member):

There is already a test of moments in test_distributions_timeseries

grw = pm.GaussianRandomWalk.dist(mu=0, sigma=1, steps=1, init_dist=init)
assert tuple(grw.owner.inputs[-2].shape.eval()) == ()
assert tuple(get_init_dist(grw).shape.eval()) == ()
Review comment (Member):

Use tag


def test_shape_ellipsis(self):
    grw = pm.GaussianRandomWalk.dist(
        mu=0, sigma=1, steps=5, init_dist=pm.Normal.dist(), shape=(3, ...)
    )
    assert tuple(grw.shape.eval()) == (3, 6)
    assert tuple(grw.owner.inputs[-2].shape.eval()) == (3,)
    assert tuple(grw.owner.inputs[0].owner.inputs[1].owner.inputs[0].shape.eval()) == (3,)
Review comment (Member):

Use tag

        10,
        (4, 2),
        np.full((4, 2, 11), np.vstack((np.arange(11), -np.arange(11)))),
    ),
],
)
-def test_moment(self, mu, sigma, init_dist, steps, size, expected):
+def test_moment(self, mu, sigma, init, steps, size, expected):
Review comment (Member):

No reason to rename init_dist to init

@ricardoV94 (Member)

@larryshamalama do you feel like picking this up after #6072? I think that makes our lives easier here

@larryshamalama (Member, Author)

Absolutely

@ricardoV94 (Member)

Actually, it seems like #6072 will have to include this one by necessity.

@ricardoV94 (Member)

Done in #6072

@ricardoV94 ricardoV94 closed this Sep 5, 2022
@larryshamalama larryshamalama deleted the grw_aeppl branch September 5, 2022 16:57