Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support and testing for equality-atom and if-then-else based RDDL-style boolean interoperability #120

Draft
wants to merge 3 commits into
base: devel
Choose a base branch
from

Conversation

mejrpete
Copy link
Collaborator

@mejrpete mejrpete commented Aug 2, 2021

Note that this PR is submitted as a draft, since the true PR should probably be opened against a separate feature branch, rather than devel.

Boolean interop for RDDL requires the ability to move from Term -> Formula and from Formula -> Term within the AST built in Tarski as discussed in #91. In general, we need the ability to construct nested formulae which include both logical operations on expressions with atoms based on the numerically-derived Boolean sort, and direct mathematical operations on these items.

To accomplish this in practice, two patterns can be used.

1) Term -> Formula:

Given a Term with sort Boolean, we can use the typical equality predicate and a constant True/1 value to wrap to a Formula (for a language which has both the Boolean and Equality theories attached)

a = lang.function('a', lang.Boolean)
b = lang.function('b', lang.Boolean)
bool_t = lang.constant(1, lang.Boolean)
logical_formula = (a == bool_t) | (b == bool_t)

2) Formula -> Term:

Given a Formula, we can wrap out to a Term with sort Boolean using the IfThenElse term, and corresponding boolean constants.

a = lang.function('a', lang.Boolean)
b = lang.function('b', lang.Boolean)
bool_t = lang.constant(1, lang.Boolean)
logical_formula = (a == bool_t) | (b == bool_t)
math_included = ite(logical_formula, bool_t, lang.constant(0, lang.Boolean)) + 1 

With these two patterns, we can arbitrarily move back and forth between Formula and Term objects depending on the operations needed when building the AST in Tarski. Additionally, because these patterns are easily identifiable with simple pattern matching, when performing write-side IO for an RDDL domain which includes these wrappers within the Tarski representation, we can remove the wrappers accordingly, and generate legal RDDL which does not include the additional representational overhead involved with the wrapper patterns discussed above.

The Tarski modeling approach for RDDL using these patterns involves representing Boolean RDDL values as Boolean (sort) codomain functions, with equality-based atoms as the basic logical type (e.g. (a == lang.constant(1, lang.Boolean)). This is in contrast to the approach explored in the rddl-support branch (#96), which assumed the use of Predicate for representation of Boolean RDDL values. One additional advantage of this change is that while the closed-world assumption prohibits the declaration of false-valued non-fluents and initial-state state-fluents (necessary in RDDL, since Boolean valued fluents can be declared with default values of true) in an implementation based on Predicate representations for these fluents (as in #96), the use of Boolean codomain functions in this alternate approach trivially avoids this issue.

Included changes/additions:

  1. Incorporating a Boolean sort descended from Natural when the Boolean theory is attached to a FirstOrderLanguage
  2. Addition of Bernoulli as a builtin for the Random theory
  3. Implementation additions in RDDL write IO in order to "unwrap" the two major patterns used to translate back and forth between Formula and Term items when using the Boolean sort as described.
  4. Minor changes to write true and false literals rather than 0/1 when writing RDDL from our representation (needed as far as I know for appropriate behavior with both the PROST and rddlsim RDDL parsers)
  5. Test additions & modifications to test RDDL write-side IO support, along with tests for the Boolean theory changes

Omissions/Necessary future discussion

  1. Current tests are focused on write-side RDDL IO. Neither the read-side tests, nor the read-side RDDL IO code has been modified to automatically wrap back and forth using the two patterns above.
  2. From a user perspective (when using Tarski as a tool to directly construct the AST and domain/instance models for RDDL-specified problems) requiring the explicit use of these wrappers puts significant onus on the user to understand the distinction between Formula and Term within Tarski in order to appropriately use these wrappers and correctly construct their domain/instance. While this is potentially workable, it would be preferable from the user's perspective to have some level of "automatic" injection of wrappers in instances where there is a mismatch between input to an operator and the expected object type (e.g. when dispatching a lor called from a Formula where the argument is a Term with sort Boolean). This would involve building recovery logic with knowledge of the Boolean sort into Tarski's operator dispatching implementation, which may or may not be acceptable from a design standpoint. This may be a worthwhile point of discussion!

mejrpete added 3 commits July 22, 2021 11:20
…ort equality-atom based boolean interop development

Tests are focused on supporting Boolean/numeric interoperability for RDDL as discussed in aig-upf#91.
Currently passing all relevent tests. Read-side IO is untested. Write-side IO does not test necessary simplifications
from the equality-atom, if-then-else wrapper, and sumterm-based quantifier replacements to typical RDDL patterns.
…ls rather than 1/0 in written files for bool vars
@miquelramirez
Copy link
Member

miquelramirez commented Oct 28, 2021

Hi @mejrpete,

  1. Current tests are focused on write-side RDDL IO. Neither the read-side tests, nor the read-side RDDL IO code has been modified to automatically wrap back and forth using the two patterns above.

That is okay, will check, but the problem is that I don't really have very strong "constraints" on this. Basically because I haven't any planners using the RDDL -> AST features.

  1. From a user perspective (when using Tarski as a tool to directly construct the AST and domain/instance models for RDDL-specified problems) requiring the explicit use of these wrappers puts significant onus on the user to understand the distinction between Formula and Term within Tarski in order to appropriately use these wrappers and correctly construct their domain/instance. While this is potentially workable, it would be preferable from the user's perspective to have some level of "automatic" injection of wrappers in instances where there is a mismatch between input to an operator and the expected object type (e.g. when dispatching a lor called from a Formula where the argument is a Term with sort Boolean). This would involve building recovery logic with knowledge of the Boolean sort into Tarski's operator dispatching implementation, which may or may not be acceptable from a design standpoint. This may be a worthwhile point of discussion!

The "significant onus on the user" is a bit fuzzy for me, I need to see specific examples where this becomes an issue.

Python being Python (and not C++) it is not clear to me what degree of "automation" is possible, analogous to what would be the approach in C++ (e.g. generic programming, traits, concepts, etc.) to essentially implement what is code generation. Overloading operators in Python was a massive pain in the bum and the best we could come up with were the frankly, quite bothersome, concept of TermReference to wrap terms so we could use them in standard containers.

If you could come with a concrete example that illustrates this issue - on a gist perhaps - I would appreciate it.

Regards,

Miquel.

Copy link
Member

@miquelramirez miquelramirez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @mejrpete,

it is looking good other than the comment regarding the ValueError being raised without data. Get back to me and I am happy to approve.

@@ -610,7 +704,9 @@ def rewrite(self, expr):
if len(re_st) > 0:
# MRJ: Random variables need parenthesis, other functions need
# brackets...
if expr.symbol.symbol in builtins.get_random_binary_functions():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, this one was a bit of a puzzle back in the day. Nice improvement.

@@ -14,6 +14,14 @@ def normal(mu, sigma):
return np.random.normal(mu, sigma)
return normal_func(mu, sigma)

def bernoulli(p):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cheers for adding the "master" distribution

@@ -29,7 +29,7 @@ def __hash__(self):
return hash(self.name)

def __eq__(self, other):
return self.name == other.name
return self.name == other.name and self.language == other.language
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice one

@@ -191,25 +191,14 @@ def children(s: Sort) -> Set[Sort]:

def int_encode_fn(x):
if isinstance(x, float) and not x.is_integer():
raise ValueError(x) # We don't want 1.2 to get encoded as an int
raise ValueError() # We don't want 1.2 to get encoded as an int
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @mejrpete,

is there any reason for raising the exception with no data?

@miquelramirez
Copy link
Member

I have been thinking a bit about the issues regarding modelling support @mejrpete you mentioned earlier.

Thinking a bit about it I see the following as possible "helpers":

  1. Add literal or lt global function

This would be a global function much like symref is, and with a similar purpose. The function would be something along the lines of:

def lt(t: CompoundTerm, v: Constant) -> Atom:
    if t.symbol.builtin: 
        raise SyntaxError("Cannot define literals for built-in symbol: {}".format(str(t.symbol)))
    return t == v

So one could write:

condition = land( lt(f(a), 1), lt(g(b), 1)) # f(a) = 1 /\ g(b) = 1
condition2  = lor( lt(f(a), 0), lt(g(c), 1)) # f(a) = 0 -> g(c) = 1
  1. Overload operators of CompoundTerm

This would require to overwrite a binary operator like @ for CompoundTerm, encapsulating the same as the proposed function lt above. Then we would end up with something like:

condition = land( f(a)@1, g(b)@1) # for  f(a) = 1 /\ g(b) = 1
condition2  = lor( f(a)@0, g(c)@1) # for f(a) = 0 -> g(c) = 1

What do you think? any other ideas?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
contribution Contributions to Tarski enhancement Enhances existing features
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Boolean-valued fluents and RDDL-style boolean & numeric interoperability
2 participants