Normalize #192
base: main
Conversation
src/normalize.jl
function rescale(
  alg::Algorithm"exact",
  tn::AbstractITensorNetwork;
  vs_to_rescale=collect(vertices(tn)),
It feels a little bit "off" to me to have this. I guess the idea is that you might only want to normalize a region of the network?
In that case maybe I would say that you should slice out that subgraph of the network, i.e. tn_sub = subgraph(tn, subverts), and then slice it back into the original network.
I'm kind of on the fence about that, but I would bias towards making functions simpler and composable with each other.
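For concreteness, here is a rough sketch of that "slice out, normalize, slice back in" pattern. The helper name normalize_region is hypothetical and not part of this PR; it assumes subgraph(tn, verts), the normalize(tn; alg) added here, and vertex assignment tn[v] = ... behave as in ITensorNetworks.jl.

```julia
# Rough sketch only: `normalize_region` is a hypothetical helper, and the vertex
# assignment `tn[v] = ...` is assumed to work as in ITensorNetworks.jl.
using ITensorNetworks

function normalize_region(tn::AbstractITensorNetwork, subverts; alg="exact")
  tn = copy(tn)
  # Normalize the region as a standalone network...
  tn_sub = normalize(subgraph(tn, subverts); alg)
  # ...then slice the rescaled tensors back into the original network.
  for v in subverts
    tn[v] = tn_sub[v]
  end
  return tn
end
```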
I see one reason for this is that vs_to_rescale is then passed along to the cache versions of these functions, and updating a subcache may be a bit trickier, but I think we should discuss if we can avoid this keyword argument at both levels in a nice way.
Also it may be a bit tricky since the normalization/rescaling would be based on messages/environment tensors coming into the region being normalized/rescaled. But anyway, I think this is a subtle thing that we should discuss the design of; maybe there is another way to go with this.
It's being used here actually because sometimes you might be passing it a cache representing <psi|I|psi> but only want to normalize the network |psi> and not put the scale factors on the identity matrices (which rescale will do if it were to treat every vertex equally).
I see what you mean that it's a bit overly finicky. Another option would be just to expect the user to not pass a QuadraticFormNetwork-style three-layer cache in this case and only ever pass a two-layer one.
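To illustrate the situation being described, a hedged sketch of restricting the rescaling to the ket layer of a three-layer network: psi is an existing ITensorNetwork, and the accessor ket_vertices is an assumption (the exact name may differ in ITensorNetworks.jl), as is the exact shape of the rescale call.

```julia
# Sketch only: `ket_vertices` and the exact `rescale` keyword interface are
# assumptions, not confirmed API. `psi` is an existing ITensorNetwork.
qf = QuadraticFormNetwork(psi)  # three-layer <psi|I|psi>-style network
qf_rescaled = rescale(qf; vs_to_rescale=ket_vertices(qf))
# Scale factors land only on the |psi> tensors; the identity layer is untouched.
```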
Ok, that's an interesting example; let's think about how to handle that kind of situation. More generally, there definitely is some freedom in how the rescaling of a factor/partition might get distributed across tensors in the factor/partition, and the code should be pretty agnostic about that.
One thought would be to introduce an interface function/callback scale_factor(bp_cache::BeliefPropagationCache, pv::PartitionVertex, scale::Number), where scale is inv(region_scalar(bp_cache, pv)), that defaults to distributing the scale factor scale across each tensor in pv equally, but could then be specialized in certain cases to distribute it unequally. That is kind of related to the discussion in #192 (comment) in that there are different choices for how tensors might be scaled and it isn't obvious what the best choice is in general.
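A sketch of what that default could look like: distributing scale equally over the n tensors of a partition means multiplying each one by scale^(1/n). The call vertices(bp_cache, pv) is assumed to return the vertices inside the partition, and BeliefPropagationCache / PartitionVertex are assumed to be in scope from ITensorNetworks.jl / NamedGraphs.jl.

```julia
# Sketch of the proposed default `scale_factor` callback (not an existing function).
# Assumes `vertices(bp_cache, pv)` returns the vertices inside partition `pv`.
function scale_factor(bp_cache::BeliefPropagationCache, pv::PartitionVertex, scale::Number)
  vs = vertices(bp_cache, pv)
  n = length(vs)
  # Each tensor gets an equal share of the overall scale factor.
  return Dict(v => scale^(1 / n) for v in vs)
end
```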
Yeah, a basic version of that behavior is also happening for bpc::BeliefPropagationCache, in that rescale_partitions(bpc, ...) takes a keyword argument telling it which tensors to do the rescaling on (and then it is done equally across all of those).
These could be made more sophisticated by doing it based on weights etc., but I guess that comes with more code complexity?
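For reference, a weighted generalization would only be a few lines on top of the equal-split default sketched above; this is a hypothetical variant, not code from the PR.

```julia
# Hypothetical weighted variant: vertex v gets a factor of scale^weights[v], with
# the weights summing to 1 so the partition as a whole is still multiplied by `scale`.
function scale_factor(bp_cache::BeliefPropagationCache, pv::PartitionVertex,
                      scale::Number, weights::Dict)
  @assert sum(values(weights)) ≈ 1
  return Dict(v => scale^w for (v, w) in weights)
end
```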
Right, though I'm wondering if there is just a different way to think about the design where we get both more flexibility and a simpler code design. I have a feeling there is if we step back from the problem a bit.
@mtfishman I updated the code a bit now. Basically, all the rescaling (whether "exact" or "bp") now figures out a dictionary mapping the vertices of the tensor network to the weights they are going to be scaled by, and then passes that down to a new function in abstractitensornetwork.jl defined as scale(tn::AbstractITensorNetwork, vertices_weights), which takes an ITensorNetwork and a dictionary of weights to scale the vertices of the network by.
Let me know what you think.
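A minimal sketch of what that scale function does as described; the actual implementation in abstractitensornetwork.jl may differ in details (e.g. how it preserves the graph structure when assigning tensors).

```julia
# Minimal sketch of the described `scale(tn, vertices_weights)`; not the PR's
# actual implementation.
using ITensorNetworks

function scale(tn::AbstractITensorNetwork, vertices_weights)
  tn = copy(tn)
  for (v, w) in vertices_weights
    tn[v] = w * tn[v]  # multiply the tensor at vertex v by its weight
  end
  return tn
end
```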
Co-authored-by: Matt Fishman <[email protected]>
Co-authored-by: Matt Fishman <[email protected]>
@JoeyT1994 it looks like the tests are crashing in …
@mtfishman Yeah, you're right, it is in …
@mtfishman Yes, the source of the failing tests is …
Co-authored-by: Matt Fishman <[email protected]>
This PR adds support for normalizing tensor networks with either a BP backend or an exact backend.
Specifically, given an ITensorNetwork tn, we can call tn_normalized = normalize(tn; alg) to enforce tn_normalized * dag(tn_normalized) == 1 within the framework of the desired algorithm. This is particularly useful in the context of alg = "bp", as it stabilizes the fixed point of belief propagation such that the norm of the message tensors is stable when running subsequent BP iterations on tn_normalized.
@mtfishman this is a routine that I am calling frequently in bp_alternating_update and so I thought I would add it. I also think it is generally useful when doing things like TEBD to keep the bp_norm more stable.
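A usage sketch based on the description above: the constructors are assumed from ITensorNetworks.jl / NamedGraphs.jl (exact names and import paths may vary between versions); the point is just the normalize(tn; alg) call added by this PR.

```julia
# Usage sketch; constructor names/import paths are assumptions and may differ
# between package versions.
using ITensorNetworks
using NamedGraphs: named_grid

g = named_grid((3, 3))
s = siteinds("S=1/2", g)
tn = random_tensornetwork(s; link_space=2)

tn_bp = normalize(tn; alg="bp")        # norm enforced at the BP fixed point
tn_exact = normalize(tn; alg="exact")  # norm enforced via exact contraction
# In either case, tn_* * dag(tn_*) ≈ 1 within the chosen algorithm's approximation.
```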