
7 implement ot #47

Merged 28 commits into develop from 7-implement-ot on May 9, 2023

Conversation

@joannacknight (Contributor) commented Apr 21, 2023

The main features of the PR are:

  • Introduced a new DistanceMetric class
  • Removed the mmd.py script and created MMD as a child class of DistanceMetric
  • Introduced a new OTDD class that calculates the OTDD (optimal transport dataset distance) between two datasets; a rough sketch of the intended class structure follows this list
  • Updated the compute_similarity method of DMPair so that it returns labels as well as data; it also now returns all data as NumPy arrays (previously it could return either NumPy arrays or a Tensor, depending on whether it was only using train data or not)
  • Introduced a test_otdd.py script for testing the OTDD class
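
For orientation, here is a minimal, purely illustrative sketch of the class structure described above; the method name distance and the exact signatures are assumptions for illustration, not the actual implementation:

    import numpy as np

    class DistanceMetric:
        """Base class for dataset distance metrics (illustrative sketch only)."""

        def __init__(self, seed: int):
            # store the seed so stochastic metrics can be made reproducible
            self.seed = seed

        def distance(
            self,
            data_a: np.ndarray,
            labels_a: np.ndarray,
            data_b: np.ndarray,
            labels_b: np.ndarray,
        ) -> float:
            raise NotImplementedError

    class MMD(DistanceMetric):
        """Maximum mean discrepancy, previously a standalone mmd.py script."""

    class OTDD(DistanceMetric):
        """Optimal transport dataset distance between two labelled datasets."""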

Some thoughts on testing:

  • The PyTest tests can be run as they are, but further otdd configurations could be added to the metrics.yaml file; currently there are only two options there and there are a lot of parameters to experiment with
  • As the mmd code has changed, it is necessary to verify that it still works and returns the expected results
  • While testing, I saw the otdd tests fail inconsistently for the otdd_naive_upperbound test parameters. After setting a random seed when initialising the OTDD object the tests no longer failed; however, I am not 100% sure this is a definite fix (see the seeding sketch after this list)
  • The calculate_metrics.py script also needs testing
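
For context, "setting a random seed when initialising the OTDD object" presumably means something along these lines; this helper and its name are hypothetical, shown only to illustrate the idea:

    import random

    import numpy as np
    import torch

    def seed_everything(seed: int) -> None:
        # seed all sources of randomness the OTDD computation might touch
        random.seed(seed)
        np.random.seed(seed)
        torch.manual_seed(seed)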

Some further work that needs considering after this:

  • Further safeguarding to prevent metrics being calculated on Apple M1 cpu
  • The parameters to be set for OTDD - will the exact calculation be too computationally intensive?
  • Do we want any other implementations of OT? If so, I envisage them as being set up as their own class rather than an option in the OTDD class

@lannelin (Contributor) left a comment:

Nice work! I've left a few comments and am happy to discuss further.
Re: the points you made in the opening comment:

> The PyTest tests can be run as they are, but further otdd configurations could be added to the metrics.yaml file; currently there are only two options there and there are a lot of parameters to experiment with

I don't think we should worry too much about testing all param combinations; we should be able to assume that otdd itself has been tested. We just need to sanity check that we're calling it correctly, and I think we've done that.

> While testing, I saw the otdd tests fail inconsistently for the otdd_naive_upperbound test parameters. After setting a random seed when initialising the OTDD object the tests no longer failed; however, I am not 100% sure this is a definite fix.

OK, seems reasonable. I have made a specific comment on the seeding.

> The calculate_metrics.py script also needs testing

Agreed; it's up to you whether you want to create a dedicated test_calculate_metrics.py or test locally.

> Further safeguarding to prevent metrics being calculated on Apple M1 cpu

Agreed, I have made some comments.

> The parameters to be set for OTDD - will the exact calculation be too computationally intensive?

Very good point. Let's try it out on Baskerville and make some adjustments if necessary; this can be done outside of this PR.

> Do we want any other implementations of OT? If so, I envisage them as being set up as their own class rather than an option in the OTDD class

Maybe, and yes, I think a separate class would be appropriate.

@philswatton (Contributor) left a comment:

Overall this looks really good and is a huge improvement on the previous approach to the metrics. I've suggested some fairly minor changes.

.gitignore Outdated
Comment on lines 153 to 156
# Slurm outputs
slurm_logs/

.DS_Store

Contributor:

With #46 merged into develop there's going to be a merge conflict on this file (and maybe some others); it's worth doing a git merge and resolving it as part of this PR.

@philswatton (Contributor):

Also, agreed re OT requiring a separate implementation to OTDD

@joannacknight joannacknight linked an issue Apr 27, 2023 that may be closed by this pull request

@philswatton (Contributor) left a comment:

Looks good. The merge needs to be done and resolved, and I have some additional comments (one of which I really should have thought of last time!). Overall looks better again!

# When exact calculations are used then this will be zero
# Approximate methods may be non-zero and these are checked against the known value
# for the seed
def test_cifar_otdd_same(metrics_config: dict):

Contributor:

I'm still getting a failure on both tests even with a fresh installation of everything, so we may need to settle for the 'within a range' test for the OTDD tests!

Contributor (Author):

As discussed, the next update will change the tests to check for 'closeness' rather than equality
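
As an illustration of the intended change (not the final code), a closeness check could use NumPy's isclose with its default tolerances, which are the ones mentioned later in the thread:

    import numpy as np

    def assert_close(actual_result: float, expected_result: float) -> None:
        # np.isclose uses rtol=1e-05 and atol=1e-08 by default
        assert np.isclose(actual_result, expected_result)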

Comment on lines 15 to 20
def __init__(self, seed: int):
    # Kernel dictionary
    self.__MMD_KERNEL_DICT = {
        "rbf": metrics.pairwise.rbf_kernel,
        "laplace": metrics.pairwise.laplacian_kernel,
    }

Contributor:

Missing super().__init__(seed) call
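
For clarity, the fix being asked for would look roughly like this; it assumes the DistanceMetric base class introduced in this PR takes the seed in its __init__, and that metrics here is sklearn.metrics (as the kernel functions suggest):

    from sklearn import metrics  # assumption: the kernels come from sklearn.metrics.pairwise

    class DistanceMetric:  # minimal stand-in for the base class introduced in this PR
        def __init__(self, seed: int):
            self.seed = seed

    class MMD(DistanceMetric):
        def __init__(self, seed: int):
            super().__init__(seed)  # the call the review points out was missing
            # Kernel dictionary
            self.__MMD_KERNEL_DICT = {
                "rbf": metrics.pairwise.rbf_kernel,
                "laplace": metrics.pairwise.laplacian_kernel,
            }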

Contributor (Author):

I've added this in, though I thought the mmd calculations were deterministic.

Contributor:

Oh true! My bad!

@lannelin (Contributor) left a comment:

Looks good, nice work. Just very minor points, and then the issue (as discussed on Slack) with checking that values are close rather than exact matches.

@@ -367,11 +367,19 @@ def test_get_AB_data():
     train_data_b, val_data_b = dmpair.get_B_data()

     # a train
-    _compare_dataloader_to_tensor(dl=dmpair.A.train_dataloader(), data=train_data_a)
+    _compare_dataloader_to_tensor(
+        dl=dmpair.A.train_dataloader(), data=torch.tensor(train_data_a)

Contributor:

torch.from_numpy rather than torch.tensor - from_numpy ensures the tensor uses the same memory as the NumPy array: https://pytorch.org/docs/stable/generated/torch.from_numpy.html
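
A small, self-contained illustration of the difference (not taken from the PR): torch.from_numpy wraps the existing array's memory, whereas torch.tensor copies the data.

    import numpy as np
    import torch

    arr = np.zeros((2, 3), dtype=np.float32)

    shared = torch.from_numpy(arr)  # shares memory with arr, no copy
    copied = torch.tensor(arr)      # copies the data into a new tensor

    arr[0, 0] = 1.0
    assert shared[0, 0] == 1.0  # the change is visible through the shared tensor
    assert copied[0, 0] == 0.0  # the copy is unaffected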

    platform.processor() == "arm",
    reason="These tests should not be run on Apple M1 devices",
)
def test_cifar_otdd_different(metrics_config: dict):

Contributor:

Similar to Phil's comment, and as discussed on Slack, I get different values on my laptop vs Azure, so checking closeness is probably best here:
FAILED tests/test_otdd.py::test_cifar_otdd_different - assert 98.62165832519531 == 98.62129211425781

@joannacknight (Contributor, Author):

The updates to the otdd tests are now ready to be tested.

I wanted to write the tests so that each test makes only one assert statement - it would be best practice to implement them that way so that all the metric configs get tested. Currently, if the first assert fails, the remaining ones don't run, and when debugging it would be useful to know whether only one config has failed or all of them, for example.

I thought this should be possible by parametrising the tests (https://docs.pytest.org/en/7.1.x/example/parametrize.html). I could write the tests in pseudo-code, and then spent an L&D day on Friday trying to get this working in pytest; however, it doesn't seem possible to use fixtures directly as arguments. For example, I wanted to write something like the following, but it throws an error because metrics_config is a fixture:
@pytest.mark.parametrize('metric_config', [metric_config for metric_config in metrics_config])

I couldn't find a way around it, so I've implemented a work-around that still runs multiple checks in each test, but it runs all of the checks and then prints out the details of any failures (a rough sketch of this pattern is shown below). You need to run pytest in verbose mode (-v) in order to see the details of the failures if there are many. My preference is to run pytest --disable-warnings -v tests to get rid of all the warning messages too.
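
A simplified sketch of this collect-then-report pattern; the dictionary names mirror the snippets quoted later in the thread, but the function itself is illustrative rather than the exact implementation:

    import numpy as np
    import pytest

    def check_results_close(test_scenarios: dict, metrics_config: dict) -> None:
        """Compare every scenario/config pair and report all mismatches at once."""
        failures = []
        for scenario, results in test_scenarios.items():
            for k in metrics_config:
                expected_result = metrics_config[k]["expected_results"][scenario]
                actual_result = results[k]
                if not np.isclose(actual_result, expected_result):
                    failures.append(
                        f"test:{k}/{scenario}, "
                        f"expected result: {expected_result}, "
                        f"actual result: {actual_result}"
                    )
        if failures:
            # fail once with every mismatch listed, rather than stopping at the first
            pytest.fail("\n".join(failures))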

I don't mind changing the code back to the multiple assert statements in one test if the preference is to keep them consistent with the mmd tests.

A few notes:

  • I have set the expected values for the otdd tests in the metrics.yaml file to be the values that James posted in Slack
  • With the expected values set as they are:
    • Two otdd tests should be skipped for Phil, and the third (checking that an error is thrown) should pass
    • Two otdd tests should pass for James, and the third should be skipped
  • I have used the default tolerance values in the isclose calculation - do these need changing? This may depend on James' results
  • It is obviously possible to amend the expected values in the metrics.yaml file and the code in the skip statements to test the tests under different scenarios.

@philswatton (Contributor) left a comment:

One minor comment, but looks good to me and happy for this to be merged in without further review. I can confirm that 23 tests pass and 2 tests are skipped for me!

@@ -151,5 +151,8 @@ attack_scripts/
results/

# Slurm outputs
slurm_logs/

Contributor:

I think this line was removed in a previous PR and may have been re-added due to the merge conflict? It's no longer needed

@lannelin (Contributor) commented May 5, 2023:

Tests are failing for me.
MacBook:

E           Failed: test:otdd_exact/same_result, expected result: 0.0, actual result: 98.62165069580078
E           test:otdd_naive_upperbound/same_result, expected result: 256.1785888671875, actual result: 306.5455017089844
E           test:otdd_exact/same_result_only_train, expected result: 0.0, actual result: 98.76387786865234
E           test:otdd_naive_upperbound/same_result_only_train, expected result: 256.1785888671875, actual result: 306.96649169921875

Azure:

E           Failed: test:otdd_exact/same_result, expected result: 0.0, actual result: 98.62165069580078
E           test:otdd_naive_upperbound/same_result, expected result: 256.1785888671875, actual result: 306.5455017089844
E           test:otdd_exact/same_result_only_train, expected result: 0.0, actual result: 98.76387023925781
E           test:otdd_naive_upperbound/same_result_only_train, expected result: 256.1785888671875, actual result: 306.96649169921875

Slight difference there in otdd_exact/same_result_only_train!

As an aside, is the expected value for same_result_only_train worryingly high? Perhaps not.

@lannelin (Contributor) left a comment:

Looks great and passes on my Mac and on Azure.

I have made a suggestion on the test structure. Happy with whatever you choose, but it should be consistent across metrics.

Nice work.

"diff_result": similarity_dict,
"diff_result_only_train": similarity_dict_only_train,
}
failures = []

Contributor:

I like the reporting that we get out of this approach.
You might want to consider using something like pytest-check, which achieves a similar outcome.

Suggested change
-   failures = []
+   from pytest_check import check  # move this up
+   for scenario, results in test_scenarios.items():
+       for k in metrics_config:
+           expected_result = metrics_config[k]["expected_results"][scenario]
+           actual_result = results[k]
+           with check:
+               assert np.isclose(actual_result, expected_result, rtol=1e-5, atol=1e-8)

If you run pytest in verbose mode you still get the detailed output of mismatched values, as you've captured here.


Contributor:

Happy to keep this if you'd prefer not to introduce another dependency.
Whichever approach is taken, we should do the same in the mmd tests.

@joannacknight joannacknight merged commit 0d205d5 into develop May 9, 2023
@joannacknight joannacknight deleted the 7-implement-ot branch May 9, 2023 09:00
@philswatton philswatton mentioned this pull request May 9, 2023
Linked issue that may be closed by this pull request: Implement Optimal Transport Metrics