Normalize #192


Open · wants to merge 89 commits into main

Conversation

JoeyT1994 (Contributor)

This PR adds support for normalizing tensor networks with either a BP backend or an exact backend.

Specifically, given an ITensorNetwork tn, we can call tn_normalized = normalize(tn; alg) to enforce tn_normalized * dag(tn_normalized) == 1 within the framework of the desired algorithm.

This is particularly useful in the context of alg = "bp", as it stabilizes the fixed point of belief propagation so that the norm of the message tensors remains stable when running subsequent BP iterations on tn_normalized.

@mtfishman this is a routine that I am calling frequently in bp_alternating_update, so I thought I would add it. I also think it is generally useful when doing things like TEBD to keep the bp_norm more stable.
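A rough usage sketch of the interface described above (the constructors used here, e.g. named_grid, siteinds, random_tensornetwork, and the link_space keyword, are assumed from the surrounding NamedGraphs.jl / ITensorNetworks.jl ecosystem; exact names and module paths may differ by version):

using ITensorNetworks
using LinearAlgebra: normalize  # assuming the PR extends LinearAlgebra.normalize
using NamedGraphs.NamedGraphGenerators: named_grid

# A random spin-1/2 tensor network on a 3x3 grid with bond dimension 2.
s = siteinds("S=1/2", named_grid((3, 3)))
tn = random_tensornetwork(s; link_space=2)

# Normalize with the belief propagation backend added in this PR ...
tn_bp = normalize(tn; alg="bp")

# ... or with exact contraction.
tn_exact = normalize(tn; alg="exact")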

src/normalize.jl (outdated)

function rescale(
  alg::Algorithm"exact",
  tn::AbstractITensorNetwork;
  vs_to_rescale=collect(vertices(tn)),

mtfishman (Member) commented Apr 21, 2025

It feels a little bit "off" to me to have this. I guess the idea is that you might only want to normalize a region of the network?

In that case maybe I would say that you should slice out that subgraph of the network, i.e. tn_sub = subgraph(tn, subverts), and then slice it back into the original network.

I'm kind of on the fence about that but I would bias towards making functions simpler and composable with each other.

mtfishman (Member)

I see. One reason for this is that vs_to_rescale is then passed along to the cache versions of these functions, and updating a subcache may be a bit trickier, but I think we should discuss whether we can avoid this keyword argument at both levels in a nice way.

mtfishman (Member) commented Apr 21, 2025

Also, it may be a bit tricky since the normalization/rescaling would be based on the messages/environment tensors coming into the region being normalized/rescaled. But anyway, I think this is a subtle thing whose design we should discuss; maybe there is another way to go with this.

JoeyT1994 (Contributor, Author)

It's actually being used here because sometimes you might be passing it a cache representing <psi|I|psi> but only want to normalize the network |psi> and not put the scale factors on the identity matrices (which rescale would do if it were to treat every vertex equally).

I see what you mean that it's a bit overly finicky. Another option would be just to expect the user not to pass a three-layer QuadraticFormNetwork-style cache in this case and only ever pass a two-layer one.
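As a rough illustration of the three-layer situation being described (the QuadraticFormNetwork constructor and the vertex labels (v, "ket"), (v, "bra"), (v, "operator") are assumptions about the surrounding ecosystem, not something this PR specifies):

using ITensorNetworks
using ITensorNetworks: QuadraticFormNetwork
using NamedGraphs.NamedGraphGenerators: named_grid

# A state psi and the three-layer network representing <psi|I|psi>.
s = siteinds("S=1/2", named_grid((2, 2)))
psi = random_tensornetwork(s; link_space=2)
qf = QuadraticFormNetwork(psi)

# Restrict rescaling to the bra/ket copies of psi, leaving the identity
# ("operator") layer untouched.
vs_to_rescale = filter(v -> last(v) != "operator", collect(vertices(qf)))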

mtfishman (Member)

Ok, that's an interesting example; let's think about how to handle that kind of situation. More generally, there definitely is some freedom in how the rescaling of a factor/partition might get distributed across the tensors in the factor/partition, and the code should be pretty agnostic about that.

One thought would be to introduce an interface/callback function scale_factor(bp_cache::BeliefPropagationCache, pv::PartitionVertex, scale::Number), where scale is inv(region_scalar(bp_cache, pv)), that defaults to distributing the scale factor scale across each tensor in pv equally but could be specialized in certain cases to distribute it unequally. That is kind of related to the discussion in #192 (comment), in that there are different choices for how tensors might be scaled and it isn't obvious what the best choice is in general.
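A minimal sketch of what such a default might look like (the dictionary return type and the vertices(bp_cache, pv) helper for listing the vertices grouped under a partition vertex are assumptions, and module paths may differ by version):

using Graphs: vertices
using ITensorNetworks: BeliefPropagationCache
using NamedGraphs.PartitionedGraphs: PartitionVertex

# Default behavior: distribute `scale` equally, so each of the n tensors in
# the partition is multiplied by scale^(1/n). Specialized methods could
# distribute it unequally instead.
function scale_factor(
  bp_cache::BeliefPropagationCache, pv::PartitionVertex, scale::Number
)
  vs = vertices(bp_cache, pv)  # assumed helper: vertices in partition pv
  return Dict(v => scale^(1 / length(vs)) for v in vs)
end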

JoeyT1994 (Contributor, Author)

Yeah, a basic version of that behavior is also happening for the bpc::BeliefPropagationCache, in that rescale_partitions(bpc, ...) takes a keyword argument telling it which tensors to do the rescaling on (and then it is done equally across all of those).

These could be made more sophisticated by doing it based on weights etc but I guess that comes with more code complexity?

mtfishman (Member)

Right, though I'm wondering if there is just a different way to think about the design where we get both more flexibility and a simpler code design. I have a feeling there is if we step back from the problem a bit.

JoeyT1994 (Contributor, Author)

@mtfishman I updated the code a bit now. Basically, all the rescaling (whether "exact" or "bp") now figures out a dictionary mapping the vertices of the tensor network to the weights they are going to be scaled by, and then passes that down to a new function in abstractitensornetwork.jl, defined as scale(tn::AbstractITensorNetwork, vertices_weights), which takes an ITensorNetwork and a dictionary of weights to scale the vertices of the network by. Let me know what you think.
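A rough usage sketch of that entry point as described (the network construction and the uniform weight value are illustrative assumptions):

using ITensorNetworks
using ITensorNetworks: scale  # the new function added in abstractitensornetwork.jl
using NamedGraphs.NamedGraphGenerators: named_grid

s = siteinds("S=1/2", named_grid((2, 2)))
tn = random_tensornetwork(s; link_space=2)

# Dictionary mapping each vertex to the factor its tensor is multiplied by.
vertices_weights = Dict(v => 0.5 for v in vertices(tn))

tn_scaled = scale(tn, vertices_weights)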

mtfishman (Member)

@JoeyT1994 it looks like the tests are crashing in test_contraction_sequence.jl (since it shows test_contract_deltas.jl passing as the last thing before it crashes). Any idea what's going on there? Maybe we should just disable those tests for now and bring them back once we figure out what's going on.

JoeyT1994 (Contributor, Author) commented Apr 21, 2025

@mtfishman Yeah, you're right, it is in contraction_sequences.jl, because when I comment that all out the tests pass. Let me try to see which backend it is. It might be EinExprs.jl, so I am going to uncomment the rest and see if things pass that way.

JoeyT1994 (Contributor, Author)

@mtfishman Yes, the source of the failing tests is EinExprs.jl: when I comment out those specific tests and the loading of that extension, things pass. I will keep them commented out for now.
