
Conversation

@sarakodeiri
Collaborator

@sarakodeiri sarakodeiri commented Oct 16, 2025

PR Type

Feature

Short Description

Clickup Ticket(s): https://app.clickup.com/t/868fatfqq

Using the trained shadow models and their respective generated synthetic data, this PR adds the RMIA attack to the meta-classifier training pipeline.

Main changes:

  • Refactored the BlendingPlusPlus class in blending.py to accept attack_data_collection (result of shadow model training) and target_data_collection, and to calculate RMIA signals as part of meta-feature preparation. The _prepare_meta_features, fit, and predict methods now accept and propagate ID column data and names, ensuring correct alignment and integration of RMIA features (a rough sketch of the resulting interface appears after this list).

  • Updated the meta-classifier training pipeline (run_metaclassifier_training.py) to load and pass shadow and target model data, extract and use trans_id columns, and ensure the attack results directory is created if missing. The function signature now requires explicit paths to attack and target data, and the pipeline is adjusted to handle these inputs.

  • Added id_column_name to data_types.json to support flexible and explicit ID column handling throughout the pipeline.

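As a rough, hypothetical sketch of the interface described above (signatures are assumptions inferred from this description, not the merged code):

```python
from typing import Any

import pandas as pd


class BlendingPlusPlus:
    """Sketch of the interface changes only; method bodies are elided."""

    def __init__(
        self,
        config: Any,
        attack_data_collection: list[dict[str, Any]],  # output of shadow model training
        target_data: dict[str, Any],  # output of target model training
    ) -> None:
        self.config = config
        self.attack_data_collection = attack_data_collection
        self.target_data = target_data

    def _prepare_meta_features(
        self,
        df_input: pd.DataFrame,
        id_column_data: pd.Series,
        id_column_name: str,
    ) -> pd.DataFrame:
        # RMIA signals are computed here and concatenated with the other
        # meta-features, keyed on the ID column so rows stay aligned.
        raise NotImplementedError

    def fit(self, df_meta_train: pd.DataFrame, id_column_data: pd.Series) -> None:
        raise NotImplementedError

    def predict(self, df_meta_test: pd.DataFrame, id_column_data: pd.Series) -> pd.DataFrame:
        raise NotImplementedError
```
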
Tests Added

Added tests to tests/unit/attacks/ensemble/test_meta_classifier.py

Summary by CodeRabbit

  • New Features

    • Integrated RMIA signal calculation into the metaclassifier feature pipeline.
  • Configuration

    • Disabled shadow-model training by default and enabled metaclassifier training.
    • Added output path settings for shadow/target artifacts and an id_column_name config.
    • Added metaclassifier hyperparameter options (optuna trials, k-folds) and changed the GPU toggle.
  • Improvements

    • Training flows now produce and consume explicit model/data paths so the metaclassifier can run without re-training shadows.
  • Tests

    • Updated unit tests to validate id-based RMIA flow and new training invocation patterns.

@sarakodeiri sarakodeiri marked this pull request as ready for review October 16, 2025 17:44
Collaborator

@emersodb emersodb left a comment

A good start to the final piece for the attack. There are some things I think we can clean up and be a bit more explicit on. Let me know if you have questions regarding my comments!

attack_data_paths = [
"initial_model_rmia_1/shadow_workspace/pre_trained_model/rmia_shadows.pkl",
"initial_model_rmia_2/shadow_workspace/pre_trained_model/rmia_shadows.pkl",
"shadow_model_rmia_third_set/shadow_workspace/trained_model/rmia_shadows_third_set.pkl",
Collaborator

Why is the third path here so different from the first two? Is there a way for us to unify these so it's not the odd one out?

Collaborator

Based on reading below, it looks like we also expect this list to be exactly three model sets? So we should assert it is of the correct length here and throw an exception if not.

Collaborator Author

In the original attack repository, they fine-tuned 4 shadow models, then fine-tuned another set of 4 shadow models, and then trained 8 models from scratch. That's why we have three sets and why the last set is the odd one out.
So far, Fatemeh and I decided to keep everything identical to the original repository, make sure it works within the context of this toolkit, and then do a major refactoring at the end. Everything else is also handled with the "three sets" format.

Collaborator

I think that's fine, but as of now, we've "hard-coded" that there should be three. So we want to ensure it is actually that way, in case we never do the refactor or decide that the implementation they performed is sufficient.

W.r.t. the path being largely different, I may be missing something here, but I don't see why those paths are as complex and/or different. Can we not just group them together somehow? If they only live on the cluster and need to be hardcoded, I'd say we have them as part of the config?

Collaborator

I don't think my final question w.r.t the paths here is answered?

@emersodb emersodb self-requested a review October 22, 2025 20:33
Collaborator

@emersodb emersodb left a comment

Thanks for all of the changes and improvements. Some additional comments or responses to consider. Getting closer to ready in my view 🙂

attack_data_paths = shadow_pipeline.run_shadow_model_training(config)
attack_data_paths = [Path(path) for path in attack_data_paths]

target_data_path = shadow_pipeline.run_target_model_training(config)
Collaborator

This tightly couples shadow model training with target model training. Based on our discussion, the target model is the one we're attacking, right? Theoretically, this model may already exist and we just want to attack it, i.e., we may not always want to, or be able to, train it? Again, I may still be misunderstanding our vocabulary here.

Collaborator Author

From my understanding, the target model is the model we're attacking in the simulated setting, and is a shadow model. The main difference between the target model and the other shadow models is the data it's being trained on. I might be wrong, but I think the target model is being trained on the entire "real data" while the other shadow models are being trained/fine-tuned on different combinations and subsets of the population data. The docstring on train_three_sets_of_shadow_models in midst_toolkit.attacks.ensemble.rmia.shadow_model_training explains it in more detail.
Our vocabulary and the original implementation's vocabulary aren't the simplest, and I'm still not sure whether I've done the right thing here. I want to keep the PR open a bit longer to get more eyes on it and the concepts.

Collaborator

Sure sounds good. Like I said, I'm not as deeply integrated in the vocabulary. So this may be perfectly reasonable. Just wanted to ask the question.

attack_data_paths = [
"initial_model_rmia_1/shadow_workspace/pre_trained_model/rmia_shadows.pkl",
"initial_model_rmia_2/shadow_workspace/pre_trained_model/rmia_shadows.pkl",
"shadow_model_rmia_third_set/shadow_workspace/trained_model/rmia_shadows_third_set.pkl",
Collaborator

I don't think my final question w.r.t the paths here is answered?


def calculate_rmia_signals(
df_input: pd.DataFrame,
attack_data_collection: list[dict[str, list]],
Collaborator

If you can be more specific about the inner list type, that would be great. If you can't, then list[Any] closes it off.
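
For instance, something along these lines (the element type here is purely a guess for illustration):

```python
from typing import Any

import pandas as pd

# Hypothetical: narrow the inner list's element type if it is known...
AttackDataCollection = list[dict[str, list[pd.DataFrame]]]

# ...or close it off explicitly when it genuinely can't be narrowed.
AttackDataCollectionLoose = list[dict[str, list[Any]]]
```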

id_column_name: str,
id_column_data: pd.Series,
k: int = 5,
random_seed: int = 42,
Collaborator

As in other comments, we don't want to make determinism the default if no random seed is provided.
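
A minimal sketch of the pattern being requested; downsample is a hypothetical helper, not code from this PR:

```python
import numpy as np


def downsample(values: np.ndarray, n: int, random_seed: int | None = None) -> np.ndarray:
    """Deterministic only when the caller explicitly passes a seed."""
    # default_rng(None) draws fresh OS entropy, so unseeded runs differ.
    rng = np.random.default_rng(random_seed)
    return rng.choice(values, size=min(n, len(values)), replace=False)
```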

[len(data.synthetic_data) for data in attack_data_collection[1]["fine_tuned_results"]],
[len(data.synthetic_data) for data in attack_data_collection[2]["trained_results"]],
[len(data) for data in attack_data_collection[2]["selected_sets"]],
[len(data) for data in attack_data_collection[0]["fine_tuning_sets"]],
Collaborator

@emersodb emersodb Oct 23, 2025

Very minor, but having the order not go [0], [1], [2] here ([2], [0], [1]) made me worried for a second until I realized it didn't matter for what you intended. However, that kind of "trip up" will likely get others. So I'd order these consistently.

)
shadow_gower_list.append(np.array(shadow_model_gower_2))

gower_shadows = np.vstack(shadow_gower_list)
Collaborator

I may be missing something here, but why bother with all the appending you're doing? You can condense it into just

gower_shadows = np.vstack([np.array(shadow_model_gower_0), np.array(shadow_model_gower_1), np.array(shadow_model_gower_2)])

can't you?

shadow_gower_list = []

all_lengths = [
[len(data.synthetic_data) for data in attack_data_collection[0]["fine_tuned_results"]],
Collaborator

You extract attack_data_collection[i] a bunch of times in this function. Pulling each of them into a named variable and just using that seems like a cleaner thing to do and you can give each an informative name.
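
Illustratively (names are suggestions only, and this assumes the three-set structure used elsewhere in the function):

```python
# Unpack once with informative names instead of repeatedly indexing.
first_fine_tuned_set, second_fine_tuned_set, scratch_trained_set = attack_data_collection

all_lengths = [
    [len(data.synthetic_data) for data in first_fine_tuned_set["fine_tuned_results"]],
    [len(data.synthetic_data) for data in second_fine_tuned_set["fine_tuned_results"]],
    [len(data.synthetic_data) for data in scratch_trained_set["trained_results"]],
]
```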

@coderabbitai

coderabbitai bot commented Oct 24, 2025

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

📝 Walkthrough

The ensemble attack pipeline was extended to add RMIA signal computation, a new target-model training workflow, updated data/config paths and toggles, and propagated id-column-aware data flow and signatures through metaclassifier training and blending logic.

Changes

  • Gitignore header (.gitignore): Renamed comment header from # Dataset files to # Data files; existing path examples/**/data/ unchanged.
  • Configuration and data types (examples/ensemble_attack/config.yaml, examples/ensemble_attack/data_configs/data_types.json): In config.yaml, removed model_paths.shadow_models_path; swapped pipeline toggles (pipeline.run_shadow_model_training: true→false, pipeline.run_metaclassifier_training: false→true); added shadow_training.shadow_models_output_path, shadow_training.target_model_output_path, shadow_training.final_shadow_models_path (list), shadow_training.final_target_model_path, and shadow_training.pre_train_data_size; added metaclassifier settings num_optuna_trials and num_kfolds; set use_gpu to false. In data_types.json, added top-level id_column_name: "trans_id".
  • Top-level orchestration (examples/ensemble_attack/run_attack.py): Captures and converts return values from shadow/target training to Path objects; calls run_target_model_training; when shadow training is skipped, reconstructs attack_data_paths and target_data_path from the shadow_training config; passes attack_data_paths and target_data_path into metaclassifier training.
  • Shadow / Target training (examples/ensemble_attack/run_shadow_model_training.py): run_shadow_model_training now returns list[Path] (three shadow result paths). New function run_target_model_training(config) -> Path trains the target model, synthesizes attack data, saves artifacts, and returns the result path. Added synthesis/config utility imports and minor file orchestration.
  • Metaclassifier training flow (examples/ensemble_attack/run_metaclassifier_training.py): Signature is now run_metaclassifier_training(config, attack_data_paths: list[Path], target_data_path: Path); loads three shadow attack pickles and the target pickle instead of a single synth.csv; extracts and uses trans_id lists for id alignment; instantiates BlendingPlusPlus with attack_data_collection and target_data; passes id_column_data into fit and predict; ensures output dirs exist and saves timestamped predictions.
  • RMIA implementation (src/midst_toolkit/attacks/ensemble/rmia/rmia_calculation.py): New module implementing RMIA signals: a Key enum, get_rmia_gower(...) (Gower matrices), conditional_average(...), and calculate_rmia_signals(...), which computes multiple RMIA-derived signals (k=1 and k=k) and returns a DataFrame including id_column_name.
  • Shadow training docstring fix (src/midst_toolkit/attacks/ensemble/rmia/shadow_model_training.py): Corrected a docstring typo ("pre-traine" → "pre-trained"); no behavioral changes.
  • Metaclassifier core, Blending (src/midst_toolkit/attacks/ensemble/blending.py): BlendingPlusPlus.__init__ now accepts attack_data_collection and target_data; added a training_config member. _prepare_meta_features now accepts id_column_data and id_column_name; replaced placeholder RMIA logic with calculate_rmia_signals(...); feature assembly concatenates numerical, Gower, DOMIAS, and RMIA signals. fit/predict signatures updated to accept id_column_data, and fit supports optional epochs. Hyperparameter tuning uses self.training_config.num_optuna_trials/num_kfolds.
  • Unit tests (tests/unit/attacks/ensemble/test_meta_classifier.py): Updated fixtures to include id_column_name and id data; adjusted BlendingPlusPlus constructor calls to pass attack_data_collection and MOCK_TARGET_DATA; replaced CSV mocking with calculate_rmia_signals mocking and added test_prepare_meta_features_rmia_calculation; validated hyperparameter usage from config values.

Sequence Diagram(s)

sequenceDiagram
    actor User
    participant run_attack as run_attack.py
    participant shadow_train as run_shadow_model_training.py
    participant target_train as run_target_model_training
    participant meta_train as run_metaclassifier_training.py
    participant rmia as rmia_calculation.py
    participant blend as BlendingPlusPlus

    User->>run_attack: start pipeline

    alt Shadow Training run
        run_attack->>shadow_train: run_shadow_model_training(config)
        shadow_train-->>run_attack: attack_data_paths (list[Path])
    else Shadow Training skipped
        run_attack->>run_attack: build attack_data_paths from config
    end

    run_attack->>target_train: run_target_model_training(config)
    target_train-->>run_attack: target_data_path (Path)

    alt Metaclassifier run
        run_attack->>meta_train: run_metaclassifier_training(config, attack_data_paths, target_data_path)
        meta_train->>meta_train: load attack pickles -> attack_data_collection
        meta_train->>meta_train: load target pickle -> target_data
        meta_train->>blend: instantiate BlendingPlusPlus(config, attack_data_collection, target_data)
        meta_train->>blend: fit(df_meta_train, id_column_data)

        rect rgb(220,240,255)
            Note over blend,rmia: RMIA computation during feature prep
            blend->>rmia: calculate_rmia_signals(df_input, attack_data_collection, target_data, id_column_data)
            rmia-->>blend: rmia_signals_df
        end

        blend->>blend: assemble features (num + Gower + DOMIAS + RMIA)
        blend->>blend: train metaclassifier
        meta_train->>blend: predict(df_synthetic, id_column_data)
        blend-->>meta_train: predictions
        meta_train-->>User: save attack results
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

  • Areas needing focused review:
    • RMIA math and correctness in rmia_calculation.py (Gower distances, downsampling, conditioning).
    • Signature and data-flow changes across run_* scripts and propagation of Path/pickle handling.
    • Integration points in BlendingPlusPlus (_prepare_meta_features, fit/predict) ensuring id alignment and feature concatenation order.
    • Tests updates verifying the RMIA invocation and hyperparameter sourcing.

Poem

🐰🥕 I hopped through configs, paths, and scores,
I measured Gower, opened new doors,
Shadows and target now sing in line,
RMIA whispers: "these traces are mine."
A rabbit's cheer — ensembles align!

Pre-merge checks and finishing touches

❌ Failed checks (1 inconclusive)

  • Title Check: ❓ Inconclusive. The title "RMIA attack" is extremely minimal and lacks context about the scope of changes. While it does reference a real component of the changeset (the RMIA attack feature being added), it fails to convey what was actually done with RMIA or where it's being integrated. A teammate scanning PR history would struggle to understand that this PR refactors the meta-classifier training pipeline, updates multiple functions and classes, and integrates RMIA signals into the feature preparation workflow. The title reads more like a placeholder than a complete description of the main change.

✅ Passed checks (2 passed)

  • Description Check: ✅ Passed. The PR description fully addresses all required template sections: PR Type is clearly marked as "Feature," a comprehensive short description explains the purpose and scope of the changes with relevant context, a Clickup ticket link is provided, main changes are detailed with specific file and class modifications, and tests-added information is provided with the file path. The description goes beyond template minimums by explaining the refactoring rationale, listing specific method signature changes, and documenting data flow modifications across the pipeline.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%.

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
examples/ensemble_attack/run_shadow_model_training.py (1)

119-124: Remove duplicate assertion

"trans_id" is asserted twice on df_master_challenge_train; drop one.

Apply:

-assert "trans_id" in df_master_challenge_train.columns
examples/ensemble_attack/run_metaclassifier_training.py (1)

1-147: Replace hardcoded "trans_id" assignments with config-based id_column_name

Hardcoded "trans_id" occurrences found across multiple files. The most practical migrations are:

  • run_shadow_model_training.py lines 48 and 134: Replace id_column_name = "trans_id" assignments with values from config (already defined in data_types.json).
  • run_metaclassifier_training.py lines 78-85: Consider extracting id_column_name to a config variable to reduce hardcoding in assertions and column operations.
  • real_data_collection.py lines 168, 186, 189, 200: Drop operations targeting "trans_id" are tightly coupled to data format; verify if these should remain hardcoded or become configurable.
src/midst_toolkit/attacks/ensemble/blending.py (1)

103-145: Exclude ID column from Gower and DOMIAS feature calculations.

As written, df_input/df_synthetic/df_reference still include the ID column, which will leak into Gower/DOMIAS and distort distances/densities. Keep RMIA call on full df_input, but drop ID for Gower/DOMIAS.

-        df_synthetic = df_synthetic.reset_index(drop=True)[df_input.columns]
+        df_synthetic = df_synthetic.reset_index(drop=True)[df_input.columns]
@@
-        rmia_signals = calculate_rmia_signals(
+        rmia_signals = calculate_rmia_signals(
             df_input=df_input,
             attack_data_collection=self.attack_data_collection,
             target_data=self.target_data,
             categorical_column_names=categorical_cols,
             id_column_name=id_column_name,
             id_column_data=id_column_data,
             random_seed=self.random_seed,
         )
 
-        # 2. Get Gower distance features
+        # 2. Get Gower distance features (exclude ID)
 
         log(INFO, "Calculating Gower features...")
 
-        gower_features = calculate_gower_features(
-            df_input=df_input, df_synthetic=df_synthetic, categorical_column_names=categorical_cols
-        )
+        df_input_no_id = df_input.drop(columns=[id_column_name])
+        df_synth_no_id = df_synthetic.drop(columns=[id_column_name])
+        gower_features = calculate_gower_features(
+            df_input=df_input_no_id, df_synthetic=df_synth_no_id, categorical_column_names=categorical_cols
+        )
 
-        # 3. Get DOMIAS predictions
+        # 3. Get DOMIAS predictions (exclude ID)
 
         log(INFO, "Calculating DOMIAS scores...")
 
-        domias_features = calculate_domias_score(
-            df_input=df_input, df_synthetic=df_synthetic, df_reference=df_reference
-        )
+        df_ref_no_id = df_reference.drop(columns=[id_column_name]) if id_column_name in df_reference.columns else df_reference
+        domias_features = calculate_domias_score(
+            df_input=df_input_no_id, df_synthetic=df_synth_no_id, df_reference=df_ref_no_id
+        )
♻️ Duplicate comments (1)
examples/ensemble_attack/run_shadow_model_training.py (1)

142-144: Nice: clearer multi‑line log formatting

Matches prior review suggestion and improves readability.

🧹 Nitpick comments (17)
examples/ensemble_attack/data_configs/data_types.json (1)

4-6: Good: explicit id_column_name added

Please consume this in code instead of hardcoding "trans_id" (e.g., in run_shadow_model_training.py and run_metaclassifier_training.py).

examples/ensemble_attack/run_shadow_model_training.py (4)

46-49: Avoid hardcoding id/table names; read from config/data_types.json

Load id_column_name from examples/ensemble_attack/data_configs/data_types.json (already configured) and move table_name to config to prevent drift.

Apply:

+import json
@@
-    # TODO: Add this to config or .json files
-    table_name = "trans"
-    id_column_name = "trans_id"
+    # Read identifiers from config
+    with open(config.metaclassifier.data_types_file_path, "r") as f:
+        _types = json.load(f)
+    id_column_name = _types["id_column_name"]
+    table_name = config.shadow_training.get("table_name", "trans")

32-37: Add explicit check for id column in real_train.csv

Fail fast with a clear error if the id column is missing.

Apply:

     df_real_data = load_dataframe(
         Path(config.data_paths.processed_attack_data_path),
         "real_train.csv",
     )
+    assert id_column_name in df_real_data.columns, (
+        f"'{id_column_name}' must exist in real_train.csv for target model training."
+    )

75-77: Clarify selected_id_lists shape

Your current construction yields a list-of-list as expected by RMIA ([[...]]) — good. Consider a brief comment to document the contract for target_data["selected_sets"].


127-140: Parameterize model counts via config instead of literals

Expose n_models_per_set and n_reps under config.shadow_training to keep logic configurable and aligned with docs.

Apply:

-        n_models_per_set=4,
-        n_reps=12,
+        n_models_per_set=config.shadow_training.get("n_models_per_set", 4),
+        n_reps=config.shadow_training.get("n_reps", 12),

And add these keys in config.yaml (see separate comment).

examples/ensemble_attack/config.yaml (2)

16-21: Add explicit flag to control target model training

Decouple target model training from shadow training with a pipeline flag.

Apply:

 pipeline:
   run_data_processing: false # Set this to false if you have already saved the processed data
-  run_shadow_model_training: false # Set this to false if shadow models are already trained and saved
+  run_shadow_model_training: false # Set this to false if shadow models are already trained and saved
+  run_target_model_training: false # Set this to false if target model is already trained and saved
   run_metaclassifier_training: true

56-66: Expose counts in config; optional table name

Add n_models_per_set/n_reps here and (optionally) table_name to avoid literals in code.

Apply:

 shadow_training:
@@
   shadow_models_output_path: ${base_data_dir}/shadow_models_and_data
   target_model_output_path: ${base_data_dir}/target_model_and_data
+  # Optional: main table name for TabDDPM training
+  table_name: trans
+  # Shadow training sizes
+  n_models_per_set: 4  # must be even
+  n_reps: 12           # repetitions per challenge point
   final_shadow_models_path: [
           "initial_model_rmia_1/shadow_workspace/pre_trained_model/rmia_shadows.pkl",
           "initial_model_rmia_2/shadow_workspace/pre_trained_model/rmia_shadows.pkl",
           "shadow_model_rmia_third_set/shadow_workspace/trained_model/rmia_shadows_third_set.pkl",
       ] # Paths to final shadow models used for metaclassifier training (relative to shadow_models_output_path)
   final_target_model_path: "target_model/shadow_workspace/trained_target_model/target_model.pkl"
examples/ensemble_attack/run_attack.py (2)

61-67: Gate target model training behind a dedicated flag

Avoid coupling target training with shadow training.

Apply:

-    if config.pipeline.run_shadow_model_training:
+    if config.pipeline.run_shadow_model_training:
         shadow_pipeline = importlib.import_module("examples.ensemble_attack.run_shadow_model_training")
         attack_data_paths = shadow_pipeline.run_shadow_model_training(config)
-        attack_data_paths = [Path(path) for path in attack_data_paths]
-
-        target_data_path = shadow_pipeline.run_target_model_training(config)
-        target_data_path = Path(target_data_path)
+        # Results are already Paths; no reconversion needed
+        if config.pipeline.get("run_target_model_training", False):
+            target_data_path = shadow_pipeline.run_target_model_training(config)
+            target_data_path = Path(target_data_path)

63-64: Remove redundant Path conversion

run_shadow_model_training already returns list[Path].

Apply:

-        attack_data_paths = [Path(path) for path in attack_data_paths]
+        # attack_data_paths already Path list; no conversion required
examples/ensemble_attack/run_metaclassifier_training.py (3)

77-85: Use id_column_name from data_types.json; avoid hardcoded "trans_id"

Keeps alignment with config and shadow/target artifacts.

Apply:

+    # Load id column name from data_types config
+    import json
+    with open(config.metaclassifier.data_types_file_path, "r") as f:
+        _types = json.load(f)
+    id_col = _types["id_column_name"]
@@
-    assert "trans_id" in df_meta_train.columns, "Meta train data must have trans_id column"
-    train_trans_ids = df_meta_train["trans_id"]
+    assert id_col in df_meta_train.columns, f"Meta train data must have {id_col} column"
+    train_trans_ids = df_meta_train[id_col]
@@
-    assert "trans_id" in df_meta_test.columns, "Meta test data must have trans_id column"
-    test_trans_ids = df_meta_test["trans_id"]
+    assert id_col in df_meta_test.columns, f"Meta test data must have {id_col} column"
+    test_trans_ids = df_meta_test[id_col]
@@
-    df_meta_train = df_meta_train.drop(columns=["trans_id", "account_id"])
-    df_meta_test = df_meta_test.drop(columns=["trans_id", "account_id"])
+    df_meta_train = df_meta_train.drop(columns=[id_col, "account_id"], errors="ignore")
+    df_meta_test = df_meta_test.drop(columns=[id_col, "account_id"], errors="ignore")

120-126: Confirm correct synthetic set for evaluation, or document

You switch to challenge-provided synth.csv here (vs. target synth for training). If intentional, add a brief comment explaining why; otherwise, consider using the same target synthetic set for both phases for consistency.


14-27: Public API change: signature now requires paths

Good change. Document that paths must exist and point to pickles created by this toolkit; consider accepting strings and coercing to Path for robustness.
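
For example, a small normalization at the top of the function (an illustrative sketch, not the PR's code):

```python
from pathlib import Path

# Accept str or Path inputs and normalize early so downstream code can
# rely on Path semantics (.exists(), .resolve(), etc.).
attack_data_paths = [Path(p) for p in attack_data_paths]
target_data_path = Path(target_data_path)
```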

src/midst_toolkit/attacks/ensemble/rmia/rmia_calculation.py (2)

56-63: Use WARNING level for missing categorical columns.

Message is a warning but logged at INFO. Switch to WARNING for visibility.

-from logging import INFO
+from logging import INFO, WARNING
@@
-        log(
-            INFO,
+        log(
+            WARNING,

37-49: Docstring mismatch for model_data keys.

Function is used with both FINE_TUNED_RESULTS and TRAINED_RESULTS; update Args to reflect either key rather than only fine-tune keys.

src/midst_toolkit/attacks/ensemble/blending.py (1)

147-157: Remove stale TODO; epochs already handled via None default.

The comment is outdated; delete to avoid confusion.

-    # TODO: Handle epochs parameter
     def fit(
tests/unit/attacks/ensemble/test_meta_classifier.py (2)

140-181: Assert that ID is excluded from Gower/DOMIAS inputs.

Add checks on mocked call args to ensure _prepare_meta_features drops the ID column for these calculators.

         meta_features = bpp._prepare_meta_features(
@@
         mock_gower.assert_called_once()
         mock_domias.assert_called_once()
         mock_rmia.assert_called_once()
 
+        # Verify ID column is excluded from Gower/DOMIAS inputs
+        _, gower_kwargs = mock_gower.call_args
+        assert "id_col" not in gower_kwargs["df_input"].columns
+        assert "id_col" not in gower_kwargs["df_synthetic"].columns
+
+        _, domias_kwargs = mock_domias.call_args
+        assert "id_col" not in domias_kwargs["df_input"].columns
+        assert "id_col" not in domias_kwargs["df_synthetic"].columns
+        assert "id_col" not in domias_kwargs["df_reference"].columns

25-33: Remove unused cfg fixture (hydra compose).

Fixture isn’t referenced; drop to speed tests and avoid unnecessary Hydra initialization.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 75b33f1 and 6cb82f0.

📒 Files selected for processing (10)
  • .gitignore (1 hunks)
  • examples/ensemble_attack/config.yaml (3 hunks)
  • examples/ensemble_attack/data_configs/data_types.json (1 hunks)
  • examples/ensemble_attack/run_attack.py (1 hunks)
  • examples/ensemble_attack/run_metaclassifier_training.py (3 hunks)
  • examples/ensemble_attack/run_shadow_model_training.py (2 hunks)
  • src/midst_toolkit/attacks/ensemble/blending.py (11 hunks)
  • src/midst_toolkit/attacks/ensemble/rmia/rmia_calculation.py (1 hunks)
  • src/midst_toolkit/attacks/ensemble/rmia/shadow_model_training.py (1 hunks)
  • tests/unit/attacks/ensemble/test_meta_classifier.py (11 hunks)
🧰 Additional context used
🧬 Code graph analysis (5)
examples/ensemble_attack/run_metaclassifier_training.py (2)
src/midst_toolkit/attacks/ensemble/data_utils.py (1)
  • load_dataframe (31-52)
src/midst_toolkit/attacks/ensemble/blending.py (3)
  • MetaClassifierType (20-22)
  • BlendingPlusPlus (25-260)
  • predict (209-260)
examples/ensemble_attack/run_attack.py (2)
examples/ensemble_attack/run_shadow_model_training.py (2)
  • run_shadow_model_training (93-146)
  • run_target_model_training (20-90)
examples/ensemble_attack/run_metaclassifier_training.py (1)
  • run_metaclassifier_training (14-147)
tests/unit/attacks/ensemble/test_meta_classifier.py (2)
src/midst_toolkit/attacks/ensemble/blending.py (5)
  • BlendingPlusPlus (25-260)
  • MetaClassifierType (20-22)
  • _prepare_meta_features (75-145)
  • fit (148-207)
  • predict (209-260)
src/midst_toolkit/attacks/ensemble/xgboost_tuner.py (1)
  • tune_hyperparameters (117-146)
src/midst_toolkit/attacks/ensemble/blending.py (4)
src/midst_toolkit/attacks/ensemble/distance_features.py (2)
  • calculate_domias_score (62-124)
  • calculate_gower_features (10-59)
src/midst_toolkit/attacks/ensemble/rmia/rmia_calculation.py (1)
  • calculate_rmia_signals (114-323)
src/midst_toolkit/attacks/ensemble/train_utils.py (1)
  • get_tpr_at_fpr (7-25)
src/midst_toolkit/attacks/ensemble/xgboost_tuner.py (1)
  • XgBoostHyperparameterTuner (20-146)
examples/ensemble_attack/run_shadow_model_training.py (3)
src/midst_toolkit/attacks/ensemble/data_utils.py (1)
  • load_dataframe (31-52)
src/midst_toolkit/attacks/ensemble/rmia/shadow_model_training.py (1)
  • train_three_sets_of_shadow_models (288-414)
src/midst_toolkit/attacks/ensemble/shadow_model_utils.py (2)
  • save_additional_tabddpm_config (37-77)
  • train_tabddpm_and_synthesize (81-156)
🪛 Ruff (0.14.1)
examples/ensemble_attack/run_metaclassifier_training.py

60-60: pickle and modules that wrap it can be unsafe when used to deserialize untrusted data, possible security issue

(S301)


68-68: pickle and modules that wrap it can be unsafe when used to deserialize untrusted data, possible security issue

(S301)

🔇 Additional comments (1)
src/midst_toolkit/attacks/ensemble/rmia/shadow_model_training.py (1)

312-316: Docstring consistency looks good

The numbers (16 total; 8 fine‑tuned + 8 scratch; pre‑train size drawn from config) match the implementation. LGTM.

.gitignore Outdated
Comment on lines 39 to 41
# Shadow models
examples/ensemble_attack/shadow_models_data/


⚠️ Potential issue | 🟠 Major

Fix path mismatch with config; also ignore target model artifacts

Config uses shadow_models_and_data and target_model_and_data. Current ignore points to shadow_models_data, so artifacts may be committed.

Apply:

-# Shadow models
-examples/ensemble_attack/shadow_models_data/
+# Shadow/target model artifacts
+examples/ensemble_attack/shadow_models_and_data/
+examples/ensemble_attack/target_model_and_data/
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# Shadow models
examples/ensemble_attack/shadow_models_data/
# Shadow/target model artifacts
examples/ensemble_attack/shadow_models_and_data/
examples/ensemble_attack/target_model_and_data/
🤖 Prompt for AI Agents
In .gitignore around lines 39 to 41, the current entry
"examples/ensemble_attack/shadow_models_data/" does not match the config which
expects "shadow_models_and_data" and misses ignoring target model artifacts;
update the ignore paths to "examples/ensemble_attack/shadow_models_and_data/"
(matching config exactly) and add an entry
"examples/ensemble_attack/target_model_and_data/" (or the correct target
directory name used in config) so both shadow and target model artifacts are
ignored and cannot be accidentally committed.

Comment on lines +48 to +62
assert len(attack_data_paths) == 3, (
"At this point of development, the attack_data_paths list must contain exactly three elements."
)

attack_data_collection = []

for model_path in attack_data_paths:
assert model_path.exists(), (
f"No file found at {model_path}. Make sure the path is correct, or run shadow model training first."
)

with open(model_path, "rb") as f:
shadow_data_and_result = pickle.load(f)
attack_data_collection.append(shadow_data_and_result)


⚠️ Potential issue | 🟠 Major

Validate and constrain loaded pickle inputs (security posture)

pickle.load is unsafe on untrusted files. These are internal artifacts, but add basic guardrails: ensure paths resolve under expected base dirs before loading.

Apply:

+    # Basic safety: ensure attack_data_paths live under configured shadow_models_output_path
+    base_shadow_dir = Path(config.shadow_training.shadow_models_output_path).resolve()
+    for p in attack_data_paths:
+        resolved = Path(p).resolve()
+        assert str(resolved).startswith(str(base_shadow_dir)), f"Refusing to load pickle outside {base_shadow_dir}: {resolved}"
@@
-    assert target_data_path.exists(), (
+    assert target_data_path.exists(), (
         f"No file found at {target_data_path}. Make sure the path is correct and that you have trained the target model."
     )
+    # Ensure target path under expected base dir
+    base_target_dir = Path(config.shadow_training.target_model_output_path).resolve()
+    resolved_target = Path(target_data_path).resolve()
+    assert str(resolved_target).startswith(str(base_target_dir)), f"Refusing to load pickle outside {base_target_dir}: {resolved_target}"

Committable suggestion skipped: line range outside the PR's diff.

🧰 Tools
🪛 Ruff (0.14.1)

60-60: pickle and modules that wrap it can be unsafe when used to deserialize untrusted data, possible security issue

(S301)

🤖 Prompt for AI Agents
In examples/ensemble_attack/run_metaclassifier_training.py around lines 48 to
62, add guardrails before unpickling: define the expected base directory (e.g.,
SHADOW_MODELS_DIR or a configured Path), verify each model_path.resolve() is
under that base (use Path.resolve() + Path.is_relative_to(base) or compare
str(Path(commonpath))), ensure the filename has an expected suffix (like .pkl),
and wrap the pickle.load call in a try/except that raises a clear error if
validation fails or unpickling raises an exception; this prevents loading files
outside the intended directory and surfaces failures safely.

Comment on lines +69 to +86
numerical_columns = [col for col in df_input.columns if col not in categorical_column_names]

for i in range(len(model_data[key.value])):
df_synthetic = model_data[key.value][i].synthetic_data.copy()

# Convert numerical columns to float (otherwise error in the numpy divide)
df_synthetic[numerical_columns] = df_synthetic[numerical_columns].astype(float)
df_input[numerical_columns] = df_input[numerical_columns].astype(float)

# Sample synthetic data points if there's too many
if len(df_synthetic) > min_length:
df_synthetic = df_synthetic.sample(n=min_length, random_state=random_seed)

df_synthetic = df_synthetic.drop(columns=[id_column_name])

gower_matrix = gower.gower_matrix(data_x=df_input, data_y=df_synthetic, cat_features=categorical_features)

assert np.all((gower_matrix >= 0) & (gower_matrix <= 1)), "Distances are falling outside of range [0, 1]."

🛠️ Refactor suggestion | 🟠 Major

Fix Gower inputs: drop ID from df_input and avoid in-place mutation.

Currently df_synthetic drops the ID column but df_input keeps it; this mismatches columns and can break gower_matrix. Also, casting df_input in-place mutates caller state.

  • Drop ID from df_input as well for Gower.
  • Build cat mask after removing ID.
  • Cast on copies only.

Apply:

-    categorical_features = [column in categorical_column_names for column in df_input.columns]
-
-    gower_matrices = []
-
-    numerical_columns = [col for col in df_input.columns if col not in categorical_column_names]
+    # Work on copies and exclude the ID column from distance computations
+    df_input_no_id = df_input.drop(columns=[id_column_name]).copy()
+    gower_matrices = []
+    categorical_features = [c in categorical_column_names for c in df_input_no_id.columns]
+    numerical_columns = [c for c in df_input_no_id.columns if c not in categorical_column_names]
@@
-        # Convert numerical columns to float (otherwise error in the numpy divide)
-        df_synthetic[numerical_columns] = df_synthetic[numerical_columns].astype(float)
-        df_input[numerical_columns] = df_input[numerical_columns].astype(float)
+        # Convert numerical columns to float (otherwise error in the numpy divide)
+        df_synthetic[numerical_columns] = df_synthetic[numerical_columns].astype(float)
+        df_input_no_id[numerical_columns] = df_input_no_id[numerical_columns].astype(float)
@@
-        df_synthetic = df_synthetic.drop(columns=[id_column_name])
-
-        gower_matrix = gower.gower_matrix(data_x=df_input, data_y=df_synthetic, cat_features=categorical_features)
+        df_synthetic = df_synthetic.drop(columns=[id_column_name])
+        gower_matrix = gower.gower_matrix(
+            data_x=df_input_no_id, data_y=df_synthetic, cat_features=categorical_features
+        )

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In src/midst_toolkit/attacks/ensemble/rmia/rmia_calculation.py around lines 69
to 86, the code drops the ID column only from df_synthetic and casts df_input
in-place which leads to mismatched columns for gower and mutates the caller
state; fix by creating local copies of df_input and df_synthetic, drop the
id_column_name from both copies before computing gower_matrix, rebuild the
categorical_features/categorical_column_names mask after the ID is removed, and
perform numeric astype(float) conversions on those local copies only so the
original data and model_data remain unchanged and columns align for
gower.gower_matrix.

Comment on lines +292 to +306
# Create masks for records in/out of training sets
mask_in_training = np.array(
[
[results_df[id_column_name][j] in shadow_training_sets[i] for j in range(len(results_df))]
for i in range(len(shadow_training_sets))
]
)

mask_not_in_training = np.array(
[
[results_df[id_column_name][j] not in shadow_training_sets[i] for j in range(len(results_df))]
for i in range(len(shadow_training_sets))
]
)


⚠️ Potential issue | 🔴 Critical

Membership mask uses value in DataFrame (checks columns), producing wrong masks.

Compute membership against each training set’s ID column values. Also derive the OUT mask as negation to keep shapes aligned.

-    mask_in_training = np.array(
-        [
-            [results_df[id_column_name][j] in shadow_training_sets[i] for j in range(len(results_df))]
-            for i in range(len(shadow_training_sets))
-        ]
-    )
-
-    mask_not_in_training = np.array(
-        [
-            [results_df[id_column_name][j] not in shadow_training_sets[i] for j in range(len(results_df))]
-            for i in range(len(shadow_training_sets))
-        ]
-    )
+    # Build membership masks per shadow training DF, using its ID column
+    shadow_training_id_sets = [
+        set(df[id_column_name].tolist()) for df in shadow_training_sets  # type: ignore[index]
+    ]
+    mask_in_training = np.array(
+        [results_df[id_column_name].isin(id_set).to_numpy() for id_set in shadow_training_id_sets]
+    )
+    mask_not_in_training = ~mask_in_training
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# Create masks for records in/out of training sets
mask_in_training = np.array(
[
[results_df[id_column_name][j] in shadow_training_sets[i] for j in range(len(results_df))]
for i in range(len(shadow_training_sets))
]
)
mask_not_in_training = np.array(
[
[results_df[id_column_name][j] not in shadow_training_sets[i] for j in range(len(results_df))]
for i in range(len(shadow_training_sets))
]
)
# Create masks for records in/out of training sets
# Build membership masks per shadow training DF, using its ID column
shadow_training_id_sets = [
set(df[id_column_name].tolist()) for df in shadow_training_sets # type: ignore[index]
]
mask_in_training = np.array(
[results_df[id_column_name].isin(id_set).to_numpy() for id_set in shadow_training_id_sets]
)
mask_not_in_training = ~mask_in_training

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (2)
src/midst_toolkit/attacks/ensemble/rmia/rmia_calculation.py (2)

298-312: Critical: Membership masks check column names instead of ID values.

Lines 301 and 308 use in shadow_training_sets[i] where shadow_training_sets[i] is a DataFrame. The in operator on a DataFrame checks if the value exists as a column name, not within the data itself. This produces completely incorrect membership masks, corrupting all downstream RMIA signals.

Apply this diff to fix the membership checks:

-    # Create masks for records in/out of training sets
+    # Create masks for records in/out of training sets by checking ID column values
+    shadow_training_id_sets = [
+        set(df[id_column_name].tolist()) for df in shadow_training_sets
+    ]
+    
     mask_in_training = np.array(
-        [
-            [results_df[id_column_name][j] in shadow_training_sets[i] for j in range(len(results_df))]
-            for i in range(len(shadow_training_sets))
-        ]
+        [results_df[id_column_name].isin(id_set).to_numpy() for id_set in shadow_training_id_sets]
     )
 
-    mask_not_in_training = np.array(
-        [
-            [results_df[id_column_name][j] not in shadow_training_sets[i] for j in range(len(results_df))]
-            for i in range(len(shadow_training_sets))
-        ]
-    )
+    mask_not_in_training = ~mask_in_training

65-84: Critical: Fix ID column mismatch and in-place mutation in Gower computation.

The current implementation has two critical issues:

  1. Column mismatch: Line 65 builds categorical_features mask based on df_input.columns (which includes the ID column), but line 82 drops the ID from df_synthetic only. When gower_matrix is called at line 84, df_input still contains the ID column while df_synthetic doesn't, causing misaligned columns and an incorrect boolean mask.

  2. In-place mutation: Line 76 modifies df_input[numerical_columns] in-place, corrupting the caller's DataFrame.

Apply this diff to fix both issues:

+    # Work on a copy and exclude ID from distance computations
+    df_input_no_id = df_input.drop(columns=[id_column_name]).copy()
+    
-    categorical_features = [column in categorical_column_names for column in df_input.columns]
+    categorical_features = [column in categorical_column_names for column in df_input_no_id.columns]
 
     gower_matrices = []
 
-    numerical_columns = [col for col in df_input.columns if col not in categorical_column_names]
+    numerical_columns = [col for col in df_input_no_id.columns if col not in categorical_column_names]
 
     for i in range(len(model_data[key.value])):
         df_synthetic = model_data[key.value][i].synthetic_data.copy()
 
         # Convert numerical columns to float (otherwise error in the numpy divide)
         df_synthetic[numerical_columns] = df_synthetic[numerical_columns].astype(float)
-        df_input[numerical_columns] = df_input[numerical_columns].astype(float)
+        df_input_no_id[numerical_columns] = df_input_no_id[numerical_columns].astype(float)
 
         # Sample synthetic data points if there's too many
         if len(df_synthetic) > min_length:
             df_synthetic = df_synthetic.sample(n=min_length, random_state=random_seed)
 
         df_synthetic = df_synthetic.drop(columns=[id_column_name])
 
-        gower_matrix = gower.gower_matrix(data_x=df_input, data_y=df_synthetic, cat_features=categorical_features)
+        gower_matrix = gower.gower_matrix(data_x=df_input_no_id, data_y=df_synthetic, cat_features=categorical_features)
 
         assert np.all((gower_matrix >= 0) & (gower_matrix <= 1)), "Distances are falling outside of range [0, 1]."
🧹 Nitpick comments (3)
examples/ensemble_attack/run_metaclassifier_training.py (3)

14-27: Consider clarifying why three shadow model paths are required.

The docstring mentions "three paths" but doesn't explain the rationale. A brief note about why exactly three sets of shadow models are needed for RMIA would help future maintainers understand this constraint.

Apply this diff to enhance the docstring:

     Args:
         config: Configuration object set in config.yaml.
         attack_data_paths: List of paths to the trained shadow models and all their attributes and synthetic data.
-            The list should contain three paths, one for each set of shadow models.
+            The list must contain exactly three paths, one for each set of shadow models. The RMIA attack
+            requires three distinct shadow model sets to compute membership inference signals.
         target_data_path: Path to the target model and all its attributes and synthetic data.

46-50: Clarify whether the three-path requirement is temporary or permanent.

The assertion message says "at this point of development," which suggests this might be a temporary limitation. If the requirement for exactly three shadow model sets is fundamental to RMIA (as indicated by the algorithm), the message should reflect that it's a permanent constraint rather than a development-phase restriction.

If this is a permanent requirement, apply this diff:

     assert len(attack_data_paths) == 3, (
-        "At this point of development, the attack_data_paths list must contain exactly three elements."
+        "The RMIA attack requires exactly three shadow model sets. Received {len(attack_data_paths)} paths."
     )

59-68: Consider more descriptive variable names for loaded data structures.

Previous reviews noted that shadow_data_and_result and target_data_and_result are somewhat misleading since they contain complete training artifacts (datasets, models, synthetic data, results) rather than just models. More descriptive names like shadow_training_artifacts or shadow_model_bundle would better convey their comprehensive content.

Apply this diff:

         with open(model_path, "rb") as f:
-            shadow_data_and_result = pickle.load(f)
-            attack_data_collection.append(shadow_data_and_result)
+            shadow_training_artifacts = pickle.load(f)
+            attack_data_collection.append(shadow_training_artifacts)

     assert target_data_path.exists(), (
         f"No file found at {target_data_path}. Make sure the path is correct and that you have trained the target model."
     )

     with open(target_data_path, "rb") as f:
-        target_data_and_result = pickle.load(f)
+        target_training_artifacts = pickle.load(f)

-    synth = target_data_and_result["trained_results"][0].synthetic_data
+    synth = target_training_artifacts["trained_results"][0].synthetic_data
     assert synth is not None, "Target model pickle missing synthetic_data."
     df_synthetic = synth.copy()

And update line 96:

     blending_attacker = BlendingPlusPlus(
         config=config,
         attack_data_collection=attack_data_collection,
-        target_data=target_data_and_result,
+        target_data=target_training_artifacts,
         meta_classifier_type=meta_classifier_enum,
         random_seed=config.random_seed,
     )
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6cb82f0 and 0f8c851.

📒 Files selected for processing (3)
  • .gitignore (1 hunks)
  • examples/ensemble_attack/run_metaclassifier_training.py (3 hunks)
  • src/midst_toolkit/attacks/ensemble/rmia/rmia_calculation.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • .gitignore
🧰 Additional context used
🧬 Code graph analysis (1)
examples/ensemble_attack/run_metaclassifier_training.py (2)
src/midst_toolkit/attacks/ensemble/data_utils.py (1)
  • load_dataframe (31-52)
src/midst_toolkit/attacks/ensemble/blending.py (3)
  • MetaClassifierType (20-22)
  • BlendingPlusPlus (25-260)
  • predict (209-260)
🪛 Ruff (0.14.1)
src/midst_toolkit/attacks/ensemble/rmia/rmia_calculation.py

212-212: Avoid specifying long messages outside the exception class

(TRY003)


216-216: Avoid specifying long messages outside the exception class

(TRY003)

examples/ensemble_attack/run_metaclassifier_training.py

60-60: pickle and modules that wrap it can be unsafe when used to deserialize untrusted data, possible security issue

(S301)


68-68: pickle and modules that wrap it can be unsafe when used to deserialize untrusted data, possible security issue

(S301)

🔇 Additional comments (9)
src/midst_toolkit/attacks/ensemble/rmia/rmia_calculation.py (3)

93-112: LGTM!

The conditional averaging logic correctly handles edge cases by returning NaN when no values satisfy the condition, preventing division-by-zero issues.
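
For reference, the edge-case handling described here can be illustrated with a small sketch (not the module's actual code):

```python
import numpy as np


def conditional_average(values: np.ndarray, condition: np.ndarray) -> float:
    """Average only the entries where condition holds; NaN when none do."""
    selected = values[condition]
    if selected.size == 0:
        return float("nan")  # no qualifying values: avoid dividing by zero
    return float(selected.mean())
```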


197-217: LGTM!

The validation logic properly guards against empty data collections and ensures k is within valid bounds before proceeding with distance computations.


328-351: LGTM!

The RMIA ratio calculations properly use np.divide with guards to prevent division-by-zero, setting results to NaN when denominators are zero or negative. This ensures robust signal computation.
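
The guarded-division pattern being described looks roughly like this (shapes and names are illustrative):

```python
import numpy as np

numerators = np.array([1.0, 2.0, 3.0])
denominators = np.array([2.0, 0.0, -1.0])

# Divide only where the denominator is strictly positive; elsewhere keep NaN.
ratios = np.divide(
    numerators,
    denominators,
    out=np.full_like(numerators, np.nan),
    where=denominators > 0,
)
# ratios -> array([0.5, nan, nan])
```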

examples/ensemble_attack/run_metaclassifier_training.py (6)

70-72: LGTM! Proper validation of synthetic data.

The explicit check that synthetic data is not None before copying prevents AttributeError if training was skipped or synthesis was disabled. This addresses the previous review feedback effectively.


79-87: LGTM! Proper ID column handling for RMIA alignment.

The explicit extraction of trans_id columns before dropping them ensures correct alignment of RMIA features with the data samples. The assertions guard against missing columns, making the flow robust.


93-101: LGTM! Correct initialization with RMIA data.

The BlendingPlusPlus initialization correctly passes attack_data_collection and target data, aligning with the updated constructor signature for RMIA support.


105-113: LGTM! Proper ID column data passed to fit.

The addition of id_column_data=train_trans_ids correctly provides the ID column for RMIA signal alignment during training, matching the updated method signature.


130-136: LGTM! Proper ID column data passed to predict.

The predict call correctly includes id_column_data=test_trans_ids for RMIA feature alignment during evaluation, matching the updated method signature.


139-140: LGTM! Proper directory creation before saving results.

Using mkdir(parents=True, exist_ok=True) safely ensures the results directory exists before saving predictions, preventing potential FileNotFoundError.

Comment on lines +54 to +68
for model_path in attack_data_paths:
assert model_path.exists(), (
f"No file found at {model_path}. Make sure the path is correct, or run shadow model training first."
)

with open(model_path, "rb") as f:
shadow_data_and_result = pickle.load(f)
attack_data_collection.append(shadow_data_and_result)

assert target_data_path.exists(), (
f"No file found at {target_data_path}. Make sure the path is correct and that you have trained the target model."
)

with open(target_data_path, "rb") as f:
target_data_and_result = pickle.load(f)

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Add path validation before unpickling to improve security posture.

Loading pickled data without validation poses a security risk if paths are manipulated. Although these are internal artifacts, adding basic guardrails ensures that only files from expected directories are loaded, preventing potential exploitation.

Apply this diff to add path validation:

+    # Validate that attack data paths are within expected shadow models directory
+    if hasattr(config.shadow_training, 'shadow_models_output_path'):
+        base_shadow_dir = Path(config.shadow_training.shadow_models_output_path).resolve()
+    
     for model_path in attack_data_paths:
         assert model_path.exists(), (
             f"No file found at {model_path}. Make sure the path is correct, or run shadow model training first."
         )
+        
+        # Security check: ensure path is within expected directory
+        if hasattr(config.shadow_training, 'shadow_models_output_path'):
+            resolved_path = Path(model_path).resolve()
+            try:
+                resolved_path.relative_to(base_shadow_dir)
+            except ValueError:
+                raise ValueError(f"Refusing to load pickle outside {base_shadow_dir}: {resolved_path}")

         with open(model_path, "rb") as f:
             shadow_data_and_result = pickle.load(f)
             attack_data_collection.append(shadow_data_and_result)

     assert target_data_path.exists(), (
         f"No file found at {target_data_path}. Make sure the path is correct and that you have trained the target model."
     )
+    
+    # Security check: ensure target path is within expected directory
+    if hasattr(config.shadow_training, 'target_model_output_path'):
+        base_target_dir = Path(config.shadow_training.target_model_output_path).resolve()
+        resolved_target = Path(target_data_path).resolve()
+        try:
+            resolved_target.relative_to(base_target_dir)
+        except ValueError:
+            raise ValueError(f"Refusing to load pickle outside {base_target_dir}: {resolved_target}")

     with open(target_data_path, "rb") as f:

Run the following script to verify that the config contains the expected base directory paths:


🏁 Script executed:

#!/bin/bash
# Check if shadow_training configuration contains output path specifications
rg -n "shadow_models_output_path|target_model_output_path" --type=yaml

Length of output: 762



🏁 Script executed:

#!/bin/bash
# Examine the full context of the file under review, especially how paths are created
cat examples/ensemble_attack/run_metaclassifier_training.py | head -100

Length of output: 3674


🏁 Script executed:

#!/bin/bash
# Search for where attack_data_paths and target_data_path variables are assigned
rg -n "attack_data_paths\s*=|target_data_path\s*=" examples/ensemble_attack/run_metaclassifier_training.py -A 3

Length of output: 0


🏁 Script executed:

#!/bin/bash
# Check for Path imports and usage context
rg -n "from pathlib|import.*Path|Path\(" examples/ensemble_attack/run_metaclassifier_training.py -B 2 -A 2

Length of output: 1807


🏁 Script executed:

#!/bin/bash
# Check config usage and how base directories are referenced
rg -n "config\.shadow_training" examples/ensemble_attack/run_metaclassifier_training.py -B 2 -A 2

Length of output: 0


🏁 Script executed:

#!/bin/bash
# Look for any existing path validation logic
rg -n "resolve|relative_to|ValueError" examples/ensemble_attack/run_metaclassifier_training.py -B 1 -A 1

Length of output: 0



---

🏁 Script executed:

#!/bin/bash
# Find where run_metaclassifier_training is called
rg -n "run_metaclassifier_training" --type=py

Length of output: 485


🏁 Script executed:

#!/bin/bash
# Look for the main or script entry point that uses ensemble attack
fd -e py "main|__main__|run.*ensemble" examples/ensemble_attack/ --type f

Length of output: 0


🏁 Script executed:

#!/bin/bash
# Check if there's a runner script in examples
ls -la examples/ensemble_attack/ | grep -E "\.py$"

Length of output: 354


🏁 Script executed:

#!/bin/bash
# Search for how attack_data_paths list is constructed
rg -n "attack_data_paths\s*=|\[.*shadow.*\]" --type=py -A 5 | head -50

Length of output: 5231



---

🏁 Script executed:

#!/bin/bash
# Examine run_attack.py to see how attack_data_paths and target_data_path are constructed
cat examples/ensemble_attack/run_attack.py

Length of output: 3909



---

🏁 Script executed:

#!/bin/bash
# Examine run_shadow_model_training to see where paths come from
head -120 examples/ensemble_attack/run_shadow_model_training.py

Length of output: 4314


🏁 Script executed:

#!/bin/bash
# Search for where paths are returned in shadow model training
rg -n "return.*path|append.*path" examples/ensemble_attack/run_shadow_model_training.py -B 2 -A 1

Length of output: 179



---

🏁 Script executed:

#!/bin/bash
# Check if config.shadow_training always has these keys or if they're optional
rg -n "target_model_output_path|shadow_models_output_path" examples/ensemble_attack/config.yaml -B 2 -A 1

Length of output: 1097


🏁 Script executed:

#!/bin/bash
# Verify Path behavior with absolute paths to confirm the concern
python3 << 'EOF'
from pathlib import Path

# Test 1: Normal relative path
base = Path("/models")
rel_path = "shadow_set_1.pkl"
result1 = base / rel_path
print(f"Test 1 - base / rel_path: {result1}")

# Test 2: Absolute path (PROBLEM)
abs_path = "/etc/passwd"
result2 = base / abs_path
print(f"Test 2 - base / abs_path: {result2}")
print(f"Result is absolute? {result2.is_absolute()}")

# Test 3: Traversal with resolve()
traversal_path = "/models/../../../etc/passwd"
resolved = Path(traversal_path).resolve()
print(f"Test 3 - traversal.resolve(): {resolved}")

# Test 4: Validation with relative_to
try:
    Path("/etc/passwd").resolve().relative_to(Path("/models").resolve())
    print("Test 4 - relative_to succeeded (shouldn't)")
except ValueError as e:
    print(f"Test 4 - relative_to failed as expected: {e}")
EOF

Length of output: 296


Add path validation before unpickling to prevent directory traversal attacks.

Loading pickled data without validating that paths stay within expected directories creates a security risk. Although config.shadow_training.shadow_models_output_path and target_model_output_path are defined in the config, the path components read from it may be absolute or contain traversal sequences, and joining such a component with the / operator silently escapes the base directory (an absolute right-hand operand replaces the base entirely).

Add validation to ensure loaded files are within their expected base directories:

from pathlib import Path

# Resolve the expected base directory before the loop
base_shadow_dir = Path(config.shadow_training.shadow_models_output_path).resolve()

for model_path in attack_data_paths:
    assert model_path.exists(), (
        f"No file found at {model_path}. Make sure the path is correct, or run shadow model training first."
    )
    
    # Security check: ensure path is within expected directory
    resolved_path = Path(model_path).resolve()
    try:
        resolved_path.relative_to(base_shadow_dir)
    except ValueError:
        raise ValueError(f"Refusing to load pickle outside {base_shadow_dir}: {resolved_path}")
    
    with open(model_path, "rb") as f:
        shadow_data_and_result = pickle.load(f)
        attack_data_collection.append(shadow_data_and_result)

# Before loading target model
base_target_dir = Path(config.shadow_training.target_model_output_path).resolve()

assert target_data_path.exists(), (
    f"No file found at {target_data_path}. Make sure the path is correct and that you have trained the target model."
)

resolved_target = Path(target_data_path).resolve()
try:
    resolved_target.relative_to(base_target_dir)
except ValueError:
    raise ValueError(f"Refusing to load pickle outside {base_target_dir}: {resolved_target}")

with open(target_data_path, "rb") as f:
    target_data_and_result = pickle.load(f)
🧰 Tools
🪛 Ruff (0.14.1)

60-60: pickle and modules that wrap it can be unsafe when used to deserialize untrusted data, possible security issue

(S301)


68-68: pickle and modules that wrap it can be unsafe when used to deserialize untrusted data, possible security issue

(S301)
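
Beyond path validation, a common mitigation for this class of warning is an allow-listed unpickler that refuses unexpected globals (a generic sketch, not code from this PR):

import io
import pickle

class RestrictedUnpickler(pickle.Unpickler):
    # Only globals on this allow-list may be resolved during load.
    ALLOWED = {("builtins", "dict"), ("builtins", "list")}  # extend as needed

    def find_class(self, module, name):
        if (module, name) in self.ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"Blocked global during unpickling: {module}.{name}")

def restricted_loads(data: bytes):
    return RestrictedUnpickler(io.BytesIO(data)).load()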

🤖 Prompt for AI Agents
In examples/ensemble_attack/run_metaclassifier_training.py around lines 54-68,
add path validation before unpickling: resolve the configured base directories
(shadow_models_output_path and target_model_output_path) to absolute paths, then
resolve each model_path/target_data_path and ensure
resolved_path.relative_to(base_dir) succeeds; if it raises ValueError, raise a
clear error refusing to load pickles outside the base dir, otherwise proceed to
open and pickle.load.

Comment on lines +122 to +127
# Get the synthetic data provided by the challenge for evaluation
# TODO: Check if the file is the correct one.
df_synthetic_original = load_dataframe(
    Path(config.data_paths.processed_attack_data_path),
    "synth.csv",
)

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Address or remove the TODO comment about file correctness.

The TODO suggests uncertainty about whether synth.csv is the correct synthetic data for evaluation. Training uses df_synthetic from the target model (line 72), while evaluation uses df_synthetic_original from this CSV; please verify this split is intentional. If different synthetic datasets are required for training vs. evaluation, the comment should explain the rationale; otherwise, the inconsistency should be resolved.

Run the following script to understand the relationship between the target model's synthetic data and synth.csv:

Do you want me to help clarify the intended usage pattern for these two synthetic datasets, or would you like me to open an issue to track resolving this TODO?


🏁 Script executed:

#!/bin/bash
# Check for documentation or comments explaining the two synthetic datasets
rg -n "synth.csv|synthetic.*evaluation|df_synthetic_original" --type=py -A 3 -B 3

# Look for configuration or documentation about evaluation data
rg -n "processed_attack_data_path|synth" --type=yaml

Length of output: 2267


🏁 Script executed:

#!/bin/bash
# Find where df_synthetic is loaded (mentioned in line 72)
rg -n "df_synthetic\s*=" --type=py -B 5 -A 5 examples/ensemble_attack/run_metaclassifier_training.py | head -60

# Look for how df_synthetic is used in fit call
rg -n "\.fit\(" --type=py -B 5 -A 5 examples/ensemble_attack/run_metaclassifier_training.py

# Check for any documentation comments around training vs evaluation data
rg -n "training|evaluation" --type=py -B 2 -A 2 examples/ensemble_attack/run_metaclassifier_training.py | head -40

Length of output: 287



---

🏁 Script executed:

#!/bin/bash
# Find the file first
fd "run_metaclassifier_training.py" --type f

# Also check what files exist in examples directory
fd --type f --max-depth 3 examples/ 2>/dev/null | head -30

Length of output: 55



---

🏁 Script executed:

#!/bin/bash
# View lines around 72 to see df_synthetic loading
sed -n '65,85p' examples/ensemble_attack/run_metaclassifier_training.py

# View the fit call around line 100-110
sed -n '100,115p' examples/ensemble_attack/run_metaclassifier_training.py

# View the full context around lines 122-136 (synthetic loading and predict)
sed -n '118,140p' examples/ensemble_attack/run_metaclassifier_training.py

Length of output: 2085



---

🏁 Script executed:

```shell
#!/bin/bash
# Search for documentation about synth.csv
rg -n "synth.csv" --type=py --type=md -B 2 -A 2

# Look for README or documentation files in ensemble_attack
fd "README|readme|INSTRUCTIONS|instructions" examples/ensemble_attack/ --type f

# Check for test files that might explain the pattern
fd "test.*metaclassifier" --type f

Length of output: 466



---

🏁 Script executed:

#!/bin/bash
# View the README.md file
cat examples/ensemble_attack/README.md

Length of output: 5238



---

🏁 Script executed:

#!/bin/bash
# View the config.yaml to understand data paths
cat examples/ensemble_attack/config.yaml

# Search for any comments or documentation about synth.csv in config files
rg -n "synth" --type=yaml -B 2 -A 2

# Look at run_attack.py or data processing scripts to see where synth.csv comes from
fd "run_attack|data_process" examples/ensemble_attack/ --type f

# Search for synth.csv file creation or references in Python files
rg -n "synth\.csv|synth.csv" --type=py -B 3 -A 3 | head -40

Length of output: 6316



---

🏁 Script executed:

#!/bin/bash
# View run_attack.py to understand data processing and synth.csv
cat examples/ensemble_attack/run_attack.py | head -200

# Search for synth.csv creation or handling in run_attack.py
rg -n "synth" examples/ensemble_attack/run_attack.py -B 3 -A 3

Length of output: 3909



---

🏁 Script executed:

#!/bin/bash
# Find the process_split_data module
fd "process_split_data.py" --type f

# Once found, examine its implementation to see what files it creates
rg -n "synth|csv" --type=py -B 2 -A 2 | grep -A 10 -B 10 "process_split_data"

Length of output: 12865



---

🏁 Script executed:

#!/bin/bash
# Global search for synth.csv creation or references
rg -n "synth\.csv|synth_csv" --type=py

# Also search in the shadow model training pipeline
rg -n "\.csv" src/midst_toolkit/attacks/ensemble/ --type=py | grep -i synth

# Check what files might be generated or expected
fd "synth" --type f 2>/dev/null | head -20

Length of output: 254


The TODO comment identifies a genuine issue: synth.csv is not created by the data processing pipeline and has no source in the codebase.

The file is loaded from processed_attack_data_path, but the process_split_data.py pipeline generates only real_train.csv, real_val.csv, real_test.csv, and master_challenge_*.csv. Since synth.csv is never created by the codebase, either:

  1. The evaluation should use df_synthetic from the target model (same as training) instead of loading a separate file, or
  2. synth.csv should be generated by the data processing pipeline but is missing, or
  3. synth.csv is expected to be externally provided without documentation

Clarify the intended behavior and either fix the code path or add the missing data generation step.
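
If option 1 is the intended behavior, the fix is small (a sketch assuming df_synthetic is the in-memory frame produced during target-model training):

# Option 1 sketch: reuse the target model's synthetic data for evaluation
# instead of reading a synth.csv that nothing in the pipeline writes.
assert df_synthetic is not None, "Target model training produced no synthetic data"
df_synthetic_original = df_synthetic.copy()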

🤖 Prompt for AI Agents
In examples/ensemble_attack/run_metaclassifier_training.py around lines 122 to
127, the code attempts to load "synth.csv" which is not produced by the data
processing pipeline; update the code or pipeline to match the intended flow:
either (A) stop loading synth.csv and instead use the synthetic dataframe
already created/returned by the target model (replace the load call with the
in-memory df_synthetic source used during training/evaluation), or (B) add
generation of synth.csv in the data processing pipeline (process_split_data.py)
so that it is created under processed_attack_data_path and document its schema,
or (C) if synth.csv is an external input, add explicit validation and clear
documentation+config to require the external file; choose one of these options
and implement the corresponding change and tests so the file path lookup no
longer fails.

@lotif
Collaborator

lotif commented Oct 27, 2025

@sarakodeiri I am merging main to this branch to stop coderabbit from adding a review to every new commit that is pushed here.
