RMIA attack #64
Conversation
…rts, and fixed a unit test
A good start to the final piece for the attack. There are some things I think we can clean up and be a bit more explicit on. Let me know if you have questions regarding my comments!
```python
attack_data_paths = [
    "initial_model_rmia_1/shadow_workspace/pre_trained_model/rmia_shadows.pkl",
    "initial_model_rmia_2/shadow_workspace/pre_trained_model/rmia_shadows.pkl",
    "shadow_model_rmia_third_set/shadow_workspace/trained_model/rmia_shadows_third_set.pkl",
```
Why is the third path here so different from the first two? Is there a way for us to unify these so it's not the odd one out?
Based on reading below, it looks like we also expect this list to be exactly three model sets? So we should assert it is of the correct length here and throw an exception if not.
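A minimal sketch of the kind of guard being suggested (the exception type and message are illustrative only, not part of the PR):

```python
# Illustrative guard: fail loudly if the collection is not exactly three shadow model sets.
if len(attack_data_paths) != 3:
    raise ValueError(
        f"Expected exactly three shadow model sets, got {len(attack_data_paths)}."
    )
```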
In the original attack repository, they fine-tuned 4 shadow models, then fine-tuned another set of 4 shadow models, and then trained 8 models from scratch. That's why we have three sets and why the last set is the odd one out.
So far, Fatemeh and I decided to keep everything identical to the original repository, make sure it works within the context of this toolkit, and then do a major refactoring at the end. Everything else is also handled with the "three sets" format.
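For reference, a rough sketch of the "three sets" layout being described (dictionary keys taken from snippets elsewhere in this PR; the counts come from the comment above):

```python
# Illustrative only: empty lists stand in for the per-model results.
attack_data_collection = [
    {"fine_tuned_results": [], "fine_tuning_sets": []},  # set 1: 4 fine-tuned shadow models
    {"fine_tuned_results": []},                          # set 2: another 4 fine-tuned shadow models
    {"trained_results": [], "selected_sets": []},        # set 3: 8 shadow models trained from scratch
]
```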
I think that's fine, but as of now, we've "hard-coded" that there should be three. So we want to ensure it is actually that way, in case we never do the refactor or decide that the implementation they performed is sufficient.
W.r.t. the paths being so different, I may be missing something here, but I don't see why those paths need to be this complex and/or different. Can we not just group them together somehow? If they only exist on the cluster and need to be hardcoded, I'd say we have them as part of the config?
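One way the config-based version could look (a sketch only, assuming the shadow_models_output_path and final_shadow_models_path keys that config.yaml defines later in this PR):

```python
from pathlib import Path

# Sketch: relative paths listed in the config, resolved against the shadow model output directory.
attack_data_paths = [
    Path(config.shadow_training.shadow_models_output_path) / relative_path
    for relative_path in config.shadow_training.final_shadow_models_path
]
```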
I don't think my final question w.r.t the paths here is answered?
Thanks for all of the changes and improvements. Some additional comments or responses to consider. Getting closer to ready in my view 🙂
```python
attack_data_paths = shadow_pipeline.run_shadow_model_training(config)
attack_data_paths = [Path(path) for path in attack_data_paths]

target_data_path = shadow_pipeline.run_target_model_training(config)
```
This tightly couples shadow model training with target model training. Based on our discussion, the target model is the one we're attacking right? Theoretically, this model may already exist and we just want to attack it. I.e. we may not always want to or be able to train it? Again, I may still be misunderstanding our vocabulary here.
From my understanding, the target model is the model we're attacking in the simulated setting, and is a shadow model. The main difference between the target model and the other shadow models is the data it's being trained on. I might be wrong, but I think the target model is being trained on the entire "real data" while the other shadow models are being trained/fine-tuned on different combinations and subsets of the population data. The docstring on train_three_sets_of_shadow_models in midst_toolkit.attacks.ensemble.rmia.shadow_model_training explains it in more detail.
Our vocabulary and the original implementation's vocabulary isn't the most simple and I'm still not sure if I've done the right thing here. I want to keep the PR open a bit longer to get more eyes on it and the concepts.
Sure sounds good. Like I said, I'm not as deeply integrated in the vocabulary. So this may be perfectly reasonable. Just wanted to ask the question.
```python
attack_data_paths = [
    "initial_model_rmia_1/shadow_workspace/pre_trained_model/rmia_shadows.pkl",
    "initial_model_rmia_2/shadow_workspace/pre_trained_model/rmia_shadows.pkl",
    "shadow_model_rmia_third_set/shadow_workspace/trained_model/rmia_shadows_third_set.pkl",
```
I don't think my final question w.r.t the paths here is answered?
```python
def calculate_rmia_signals(
    df_input: pd.DataFrame,
    attack_data_collection: list[dict[str, list]],
```
If you can be more specific about the inner list type, that would be great. If you can't, then list[Any] at least closes it off.
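A small sketch of the two annotation options being weighed (types only; no behavior change implied):

```python
from typing import Any

# Current: the inner list element type is left open.
attack_data_collection_untyped: list[dict[str, list]] = []

# Suggested fallback when the element type can't easily be pinned down.
attack_data_collection_closed: list[dict[str, list[Any]]] = []
```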
```python
    id_column_name: str,
    id_column_data: pd.Series,
    k: int = 5,
    random_seed: int = 42,
```
As in other comments, we don't want to make determinism the default if no random seed is provided.
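A sketch of the non-deterministic-by-default pattern being asked for (the helper name here is hypothetical and used only to illustrate the seeding behavior):

```python
import numpy as np

def sample_row_indices(n_total: int, n_sample: int, random_seed: int | None = None) -> np.ndarray:
    # None -> fresh entropy on every call; an int -> reproducible sampling.
    rng = np.random.default_rng(random_seed)
    return rng.choice(n_total, size=n_sample, replace=False)
```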
```python
    [len(data.synthetic_data) for data in attack_data_collection[1]["fine_tuned_results"]],
    [len(data.synthetic_data) for data in attack_data_collection[2]["trained_results"]],
    [len(data) for data in attack_data_collection[2]["selected_sets"]],
    [len(data) for data in attack_data_collection[0]["fine_tuning_sets"]],
```
Very minor, but having the order not go [0], [1], [2] here ([2], [0], [1]) made me worried for a second until I realized it didn't matter for what you intended. However, that kind of "trip up" will likely get others. So I'd order these consistently.
```python
    )
    shadow_gower_list.append(np.array(shadow_model_gower_2))

gower_shadows = np.vstack(shadow_gower_list)
```
I may be missing something here, but why bother with the appending you're doing? You can condense it into just
gower_shadows = np.vstack([np.array(shadow_model_gower_0), np.array(shadow_model_gower_1), np.array(shadow_model_gower_2)])
can't you?
```python
shadow_gower_list = []

all_lengths = [
    [len(data.synthetic_data) for data in attack_data_collection[0]["fine_tuned_results"]],
```
You extract attack_data_collection[i] a bunch of times in this function. Pulling each of them into a named variable and just using that seems like a cleaner thing to do and you can give each an informative name.
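A rough sketch of the suggested refactor (the variable names are illustrative; the dictionary keys come from the snippets above):

```python
# Sketch only: assumes attack_data_collection holds exactly the three sets discussed above.
first_shadow_set, second_shadow_set, scratch_trained_set = attack_data_collection

all_lengths = [
    [len(data.synthetic_data) for data in first_shadow_set["fine_tuned_results"]],
    [len(data.synthetic_data) for data in second_shadow_set["fine_tuned_results"]],
    [len(data.synthetic_data) for data in scratch_trained_set["trained_results"]],
    [len(data) for data in scratch_trained_set["selected_sets"]],
    [len(data) for data in first_shadow_set["fine_tuning_sets"]],
]
```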
Important: Review skipped. Auto incremental reviews are disabled on this repository; this status message can be turned off in the CodeRabbit settings.

📝 Walkthrough

The ensemble attack pipeline was extended to add RMIA signal computation, a new target-model training workflow, updated data/config paths and toggles, and propagated id-column-aware data flow and signatures through metaclassifier training and blending logic.
Sequence Diagram(s)

```mermaid
sequenceDiagram
actor User
participant run_attack as run_attack.py
participant shadow_train as run_shadow_model_training.py
participant target_train as run_target_model_training
participant meta_train as run_metaclassifier_training.py
participant rmia as rmia_calculation.py
participant blend as BlendingPlusPlus
User->>run_attack: start pipeline
alt Shadow Training run
run_attack->>shadow_train: run_shadow_model_training(config)
shadow_train-->>run_attack: attack_data_paths (list[Path])
else Shadow Training skipped
run_attack->>run_attack: build attack_data_paths from config
end
run_attack->>target_train: run_target_model_training(config)
target_train-->>run_attack: target_data_path (Path)
alt Metaclassifier run
run_attack->>meta_train: run_metaclassifier_training(config, attack_data_paths, target_data_path)
meta_train->>meta_train: load attack pickles -> attack_data_collection
meta_train->>meta_train: load target pickle -> target_data
meta_train->>blend: instantiate BlendingPlusPlus(config, attack_data_collection, target_data)
meta_train->>blend: fit(df_meta_train, id_column_data)
rect rgb(220,240,255)
Note over blend,rmia: RMIA computation during feature prep
blend->>rmia: calculate_rmia_signals(df_input, attack_data_collection, target_data, id_column_data)
rmia-->>blend: rmia_signals_df
end
blend->>blend: assemble features (num + Gower + DOMIAS + RMIA)
blend->>blend: train metaclassifier
meta_train->>blend: predict(df_synthetic, id_column_data)
blend-->>meta_train: predictions
meta_train-->>User: save attack results
end
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
Pre-merge checks and finishing touches: ❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
Actionable comments posted: 7
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
examples/ensemble_attack/run_shadow_model_training.py (1)
119-124: Remove duplicate assertion

"trans_id" is asserted twice on df_master_challenge_train; drop one.

Apply:

```diff
-assert "trans_id" in df_master_challenge_train.columns
```

examples/ensemble_attack/run_metaclassifier_training.py (1)

1-147: Replace hardcoded "trans_id" assignments with config-based id_column_name

Hardcoded "trans_id" occurrences found across multiple files. The most practical migrations are:

- run_shadow_model_training.py lines 48 and 134: Replace id_column_name = "trans_id" assignments with values from config (already defined in data_types.json).
- run_metaclassifier_training.py lines 78-85: Consider extracting id_column_name to a config variable to reduce hardcoding in assertions and column operations.
- real_data_collection.py lines 168, 186, 189, 200: Drop operations targeting "trans_id" are tightly coupled to the data format; verify whether these should remain hardcoded or become configurable.

src/midst_toolkit/attacks/ensemble/blending.py (1)

103-145: Exclude ID column from Gower and DOMIAS feature calculations.

As written, df_input/df_synthetic/df_reference still include the ID column, which will leak into Gower/DOMIAS and distort distances/densities. Keep the RMIA call on the full df_input, but drop the ID for Gower/DOMIAS.

```diff
-        df_synthetic = df_synthetic.reset_index(drop=True)[df_input.columns]
+        df_synthetic = df_synthetic.reset_index(drop=True)[df_input.columns]
@@
-        rmia_signals = calculate_rmia_signals(
+        rmia_signals = calculate_rmia_signals(
             df_input=df_input,
             attack_data_collection=self.attack_data_collection,
             target_data=self.target_data,
             categorical_column_names=categorical_cols,
             id_column_name=id_column_name,
             id_column_data=id_column_data,
             random_seed=self.random_seed,
         )
-        # 2. Get Gower distance features
+        # 2. Get Gower distance features (exclude ID)
         log(INFO, "Calculating Gower features...")
-        gower_features = calculate_gower_features(
-            df_input=df_input, df_synthetic=df_synthetic, categorical_column_names=categorical_cols
-        )
+        df_input_no_id = df_input.drop(columns=[id_column_name])
+        df_synth_no_id = df_synthetic.drop(columns=[id_column_name])
+        gower_features = calculate_gower_features(
+            df_input=df_input_no_id, df_synthetic=df_synth_no_id, categorical_column_names=categorical_cols
+        )
-        # 3. Get DOMIAS predictions
+        # 3. Get DOMIAS predictions (exclude ID)
         log(INFO, "Calculating DOMIAS scores...")
-        domias_features = calculate_domias_score(
-            df_input=df_input, df_synthetic=df_synthetic, df_reference=df_reference
-        )
+        df_ref_no_id = df_reference.drop(columns=[id_column_name]) if id_column_name in df_reference.columns else df_reference
+        domias_features = calculate_domias_score(
+            df_input=df_input_no_id, df_synthetic=df_synth_no_id, df_reference=df_ref_no_id
+        )
```
♻️ Duplicate comments (1)
examples/ensemble_attack/run_shadow_model_training.py (1)
142-144: Nice: clearer multi-line log formatting

Matches prior review suggestion and improves readability.

🧹 Nitpick comments (17)

examples/ensemble_attack/data_configs/data_types.json (1)

4-6: Good: explicit id_column_name added

Please consume this in code instead of hardcoding "trans_id" (e.g., in run_shadow_model_training.py and run_metaclassifier_training.py).
examples/ensemble_attack/run_shadow_model_training.py (4)
46-49: Avoid hardcoding id/table names; read from config/data_types.json

Load id_column_name from examples/ensemble_attack/data_configs/data_types.json (already configured) and move table_name to config to prevent drift.

Apply:

```diff
+import json
@@
-    # TODO: Add this to config or .json files
-    table_name = "trans"
-    id_column_name = "trans_id"
+    # Read identifiers from config
+    with open(config.metaclassifier.data_types_file_path, "r") as f:
+        _types = json.load(f)
+    id_column_name = _types["id_column_name"]
+    table_name = config.shadow_training.get("table_name", "trans")
```
32-37: Add explicit check for id column in real_train.csv

Fail fast with a clear error if the id column is missing.

Apply:

```diff
     df_real_data = load_dataframe(
         Path(config.data_paths.processed_attack_data_path),
         "real_train.csv",
     )
+    assert id_column_name in df_real_data.columns, (
+        f"'{id_column_name}' must exist in real_train.csv for target model training."
+    )
```
75-77: Clarify selected_id_lists shape

Your current construction yields a list-of-list as expected by RMIA ([[...]]) — good. Consider a brief comment to document the contract for target_data["selected_sets"].

127-140: Parameterize model counts via config instead of literals

Expose n_models_per_set and n_reps under config.shadow_training to keep logic configurable and aligned with docs.

Apply:

```diff
-        n_models_per_set=4,
-        n_reps=12,
+        n_models_per_set=config.shadow_training.get("n_models_per_set", 4),
+        n_reps=config.shadow_training.get("n_reps", 12),
```

And add these keys in config.yaml (see separate comment).
examples/ensemble_attack/config.yaml (2)
16-21: Add explicit flag to control target model training

Decouple target model training from shadow training with a pipeline flag.

Apply:

```diff
 pipeline:
   run_data_processing: false # Set this to false if you have already saved the processed data
-  run_shadow_model_training: false # Set this to false if shadow models are already trained and saved
+  run_shadow_model_training: false # Set this to false if shadow models are already trained and saved
+  run_target_model_training: false # Set this to false if target model is already trained and saved
   run_metaclassifier_training: true
```
56-66: Expose counts in config; optional table name

Add n_models_per_set/n_reps here and (optionally) table_name to avoid literals in code.

Apply:

```diff
 shadow_training:
 @@
   shadow_models_output_path: ${base_data_dir}/shadow_models_and_data
   target_model_output_path: ${base_data_dir}/target_model_and_data
+  # Optional: main table name for TabDDPM training
+  table_name: trans
+  # Shadow training sizes
+  n_models_per_set: 4 # must be even
+  n_reps: 12 # repetitions per challenge point
   final_shadow_models_path: [
     "initial_model_rmia_1/shadow_workspace/pre_trained_model/rmia_shadows.pkl",
     "initial_model_rmia_2/shadow_workspace/pre_trained_model/rmia_shadows.pkl",
     "shadow_model_rmia_third_set/shadow_workspace/trained_model/rmia_shadows_third_set.pkl",
   ] # Paths to final shadow models used for metaclassifier training (relative to shadow_models_output_path)
   final_target_model_path: "target_model/shadow_workspace/trained_target_model/target_model.pkl"
```

examples/ensemble_attack/run_attack.py (2)
61-67: Gate target model training behind a dedicated flag

Avoid coupling target training with shadow training.

Apply:

```diff
-    if config.pipeline.run_shadow_model_training:
+    if config.pipeline.run_shadow_model_training:
         shadow_pipeline = importlib.import_module("examples.ensemble_attack.run_shadow_model_training")
         attack_data_paths = shadow_pipeline.run_shadow_model_training(config)
-        attack_data_paths = [Path(path) for path in attack_data_paths]
-
-    target_data_path = shadow_pipeline.run_target_model_training(config)
-    target_data_path = Path(target_data_path)
+        # Results are already Paths; no reconversion needed
+    if config.pipeline.get("run_target_model_training", False):
+        target_data_path = shadow_pipeline.run_target_model_training(config)
+        target_data_path = Path(target_data_path)
```
63-64: Remove redundant Path conversion

run_shadow_model_training already returns list[Path].

Apply:

```diff
-    attack_data_paths = [Path(path) for path in attack_data_paths]
+    # attack_data_paths already Path list; no conversion required
```

examples/ensemble_attack/run_metaclassifier_training.py (3)
77-85: Use id_column_name from data_types.json; avoid hardcoded "trans_id"

Keeps alignment with config and shadow/target artifacts.

Apply:

```diff
+    # Load id column name from data_types config
+    import json
+    with open(config.metaclassifier.data_types_file_path, "r") as f:
+        _types = json.load(f)
+    id_col = _types["id_column_name"]
@@
-    assert "trans_id" in df_meta_train.columns, "Meta train data must have trans_id column"
-    train_trans_ids = df_meta_train["trans_id"]
+    assert id_col in df_meta_train.columns, f"Meta train data must have {id_col} column"
+    train_trans_ids = df_meta_train[id_col]
@@
-    assert "trans_id" in df_meta_test.columns, "Meta test data must have trans_id column"
-    test_trans_ids = df_meta_test["trans_id"]
+    assert id_col in df_meta_test.columns, f"Meta test data must have {id_col} column"
+    test_trans_ids = df_meta_test[id_col]
@@
-    df_meta_train = df_meta_train.drop(columns=["trans_id", "account_id"])
-    df_meta_test = df_meta_test.drop(columns=["trans_id", "account_id"])
+    df_meta_train = df_meta_train.drop(columns=[id_col, "account_id"], errors="ignore")
+    df_meta_test = df_meta_test.drop(columns=[id_col, "account_id"], errors="ignore")
```
120-126: Confirm correct synthetic set for evaluation, or document

You switch to the challenge-provided synth.csv here (vs. target synth for training). If intentional, add a brief comment explaining why; otherwise, consider using the same target synthetic set for both phases for consistency.

14-27: Public API change: signature now requires paths

Good change. Document that paths must exist and point to pickles created by this toolkit; consider accepting strings and coercing to Path for robustness.
src/midst_toolkit/attacks/ensemble/rmia/rmia_calculation.py (2)
56-63: Use WARNING level for missing categorical columns.

Message is a warning but logged at INFO. Switch to WARNING for visibility.

```diff
-from logging import INFO
+from logging import INFO, WARNING
@@
-    log(
-        INFO,
+    log(
+        WARNING,
```

37-49: Docstring mismatch for model_data keys.

Function is used with both FINE_TUNED_RESULTS and TRAINED_RESULTS; update Args to reflect either key rather than only fine-tune keys.
src/midst_toolkit/attacks/ensemble/blending.py (1)
147-157: Remove stale TODO; epochs already handled via None default.

The comment is outdated; delete to avoid confusion.

```diff
-    # TODO: Handle epochs parameter
     def fit(
```

tests/unit/attacks/ensemble/test_meta_classifier.py (2)
140-181: Assert that ID is excluded from Gower/DOMIAS inputs.

Add checks on mocked call args to ensure _prepare_meta_features drops the ID column for these calculators.

```diff
     meta_features = bpp._prepare_meta_features(
 @@
     mock_gower.assert_called_once()
     mock_domias.assert_called_once()
     mock_rmia.assert_called_once()
+    # Verify ID column is excluded from Gower/DOMIAS inputs
+    _, gower_kwargs = mock_gower.call_args
+    assert "id_col" not in gower_kwargs["df_input"].columns
+    assert "id_col" not in gower_kwargs["df_synthetic"].columns
+
+    _, domias_kwargs = mock_domias.call_args
+    assert "id_col" not in domias_kwargs["df_input"].columns
+    assert "id_col" not in domias_kwargs["df_synthetic"].columns
+    assert "id_col" not in domias_kwargs["df_reference"].columns
```
25-33: Remove unused cfg fixture (hydra compose).

The fixture isn't referenced; drop it to speed tests and avoid unnecessary Hydra initialization.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)
- .gitignore (1 hunks)
- examples/ensemble_attack/config.yaml (3 hunks)
- examples/ensemble_attack/data_configs/data_types.json (1 hunks)
- examples/ensemble_attack/run_attack.py (1 hunks)
- examples/ensemble_attack/run_metaclassifier_training.py (3 hunks)
- examples/ensemble_attack/run_shadow_model_training.py (2 hunks)
- src/midst_toolkit/attacks/ensemble/blending.py (11 hunks)
- src/midst_toolkit/attacks/ensemble/rmia/rmia_calculation.py (1 hunks)
- src/midst_toolkit/attacks/ensemble/rmia/shadow_model_training.py (1 hunks)
- tests/unit/attacks/ensemble/test_meta_classifier.py (11 hunks)
🧰 Additional context used
🧬 Code graph analysis (5)
examples/ensemble_attack/run_metaclassifier_training.py (2)
src/midst_toolkit/attacks/ensemble/data_utils.py (1): load_dataframe (31-52)
src/midst_toolkit/attacks/ensemble/blending.py (3): MetaClassifierType (20-22), BlendingPlusPlus (25-260), predict (209-260)

examples/ensemble_attack/run_attack.py (2)
examples/ensemble_attack/run_shadow_model_training.py (2): run_shadow_model_training (93-146), run_target_model_training (20-90)
examples/ensemble_attack/run_metaclassifier_training.py (1): run_metaclassifier_training (14-147)

tests/unit/attacks/ensemble/test_meta_classifier.py (2)
src/midst_toolkit/attacks/ensemble/blending.py (5): BlendingPlusPlus (25-260), MetaClassifierType (20-22), _prepare_meta_features (75-145), fit (148-207), predict (209-260)
src/midst_toolkit/attacks/ensemble/xgboost_tuner.py (1): tune_hyperparameters (117-146)

src/midst_toolkit/attacks/ensemble/blending.py (4)
src/midst_toolkit/attacks/ensemble/distance_features.py (2): calculate_domias_score (62-124), calculate_gower_features (10-59)
src/midst_toolkit/attacks/ensemble/rmia/rmia_calculation.py (1): calculate_rmia_signals (114-323)
src/midst_toolkit/attacks/ensemble/train_utils.py (1): get_tpr_at_fpr (7-25)
src/midst_toolkit/attacks/ensemble/xgboost_tuner.py (1): XgBoostHyperparameterTuner (20-146)

examples/ensemble_attack/run_shadow_model_training.py (3)
src/midst_toolkit/attacks/ensemble/data_utils.py (1): load_dataframe (31-52)
src/midst_toolkit/attacks/ensemble/rmia/shadow_model_training.py (1): train_three_sets_of_shadow_models (288-414)
src/midst_toolkit/attacks/ensemble/shadow_model_utils.py (2): save_additional_tabddpm_config (37-77), train_tabddpm_and_synthesize (81-156)
🪛 Ruff (0.14.1)
examples/ensemble_attack/run_metaclassifier_training.py
60-60: pickle and modules that wrap it can be unsafe when used to deserialize untrusted data, possible security issue
(S301)
68-68: pickle and modules that wrap it can be unsafe when used to deserialize untrusted data, possible security issue
(S301)
🔇 Additional comments (1)
src/midst_toolkit/attacks/ensemble/rmia/shadow_model_training.py (1)
312-316: Docstring consistency looks good

The numbers (16 total; 8 fine-tuned + 8 scratch; pre-train size drawn from config) match the implementation. LGTM.
.gitignore
Outdated
```
# Shadow models
examples/ensemble_attack/shadow_models_data/
```
Fix path mismatch with config; also ignore target model artifacts
Config uses shadow_models_and_data and target_model_and_data. Current ignore points to shadow_models_data, so artifacts may be committed.
Apply:
```diff
-# Shadow models
-examples/ensemble_attack/shadow_models_data/
+# Shadow/target model artifacts
+examples/ensemble_attack/shadow_models_and_data/
+examples/ensemble_attack/target_model_and_data/
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```
# Shadow/target model artifacts
examples/ensemble_attack/shadow_models_and_data/
examples/ensemble_attack/target_model_and_data/
```
🤖 Prompt for AI Agents
In .gitignore around lines 39 to 41, the current entry
"examples/ensemble_attack/shadow_models_data/" does not match the config which
expects "shadow_models_and_data" and misses ignoring target model artifacts;
update the ignore paths to "examples/ensemble_attack/shadow_models_and_data/"
(matching config exactly) and add an entry
"examples/ensemble_attack/target_model_and_data/" (or the correct target
directory name used in config) so both shadow and target model artifacts are
ignored and cannot be accidentally committed.
```python
assert len(attack_data_paths) == 3, (
    "At this point of development, the attack_data_paths list must contain exactly three elements."
)

attack_data_collection = []

for model_path in attack_data_paths:
    assert model_path.exists(), (
        f"No file found at {model_path}. Make sure the path is correct, or run shadow model training first."
    )

    with open(model_path, "rb") as f:
        shadow_data_and_result = pickle.load(f)
        attack_data_collection.append(shadow_data_and_result)
```
Validate and constrain loaded pickle inputs (security posture)
pickle.load is unsafe on untrusted files. These are internal artifacts, but add basic guardrails: ensure paths resolve under expected base dirs before loading.
Apply:
+ # Basic safety: ensure attack_data_paths live under configured shadow_models_output_path
+ base_shadow_dir = Path(config.shadow_training.shadow_models_output_path).resolve()
+ for p in attack_data_paths:
+ resolved = Path(p).resolve()
+ assert str(resolved).startswith(str(base_shadow_dir)), f"Refusing to load pickle outside {base_shadow_dir}: {resolved}"
@@
- assert target_data_path.exists(), (
+ assert target_data_path.exists(), (
f"No file found at {target_data_path}. Make sure the path is correct and that you have trained the target model."
)
+ # Ensure target path under expected base dir
+ base_target_dir = Path(config.shadow_training.target_model_output_path).resolve()
+ resolved_target = Path(target_data_path).resolve()
+ assert str(resolved_target).startswith(str(base_target_dir)), f"Refusing to load pickle outside {base_target_dir}: {resolved_target}"

Committable suggestion skipped: line range outside the PR's diff.
🧰 Tools
🪛 Ruff (0.14.1)
60-60: pickle and modules that wrap it can be unsafe when used to deserialize untrusted data, possible security issue
(S301)
🤖 Prompt for AI Agents
In examples/ensemble_attack/run_metaclassifier_training.py around lines 48 to
62, add guardrails before unpickling: define the expected base directory (e.g.,
SHADOW_MODELS_DIR or a configured Path), verify each model_path.resolve() is
under that base (use Path.resolve() + Path.is_relative_to(base) or compare
str(Path(commonpath))), ensure the filename has an expected suffix (like .pkl),
and wrap the pickle.load call in a try/except that raises a clear error if
validation fails or unpickling raises an exception; this prevents loading files
outside the intended directory and surfaces failures safely.
```python
numerical_columns = [col for col in df_input.columns if col not in categorical_column_names]

for i in range(len(model_data[key.value])):
    df_synthetic = model_data[key.value][i].synthetic_data.copy()

    # Convert numerical columns to float (otherwise error in the numpy divide)
    df_synthetic[numerical_columns] = df_synthetic[numerical_columns].astype(float)
    df_input[numerical_columns] = df_input[numerical_columns].astype(float)

    # Sample synthetic data points if there's too many
    if len(df_synthetic) > min_length:
        df_synthetic = df_synthetic.sample(n=min_length, random_state=random_seed)

    df_synthetic = df_synthetic.drop(columns=[id_column_name])

    gower_matrix = gower.gower_matrix(data_x=df_input, data_y=df_synthetic, cat_features=categorical_features)

    assert np.all((gower_matrix >= 0) & (gower_matrix <= 1)), "Distances are falling outside of range [0, 1]."
```
🛠️ Refactor suggestion | 🟠 Major
Fix Gower inputs: drop ID from df_input and avoid in-place mutation.
Currently df_synthetic drops the ID column but df_input keeps it; this mismatches columns and can break gower_matrix. Also, casting df_input in-place mutates caller state.
- Drop ID from df_input as well for Gower.
- Build cat mask after removing ID.
- Cast on copies only.
Apply:
- categorical_features = [column in categorical_column_names for column in df_input.columns]
-
- gower_matrices = []
-
- numerical_columns = [col for col in df_input.columns if col not in categorical_column_names]
+ # Work on copies and exclude the ID column from distance computations
+ df_input_no_id = df_input.drop(columns=[id_column_name]).copy()
+ gower_matrices = []
+ categorical_features = [c in categorical_column_names for c in df_input_no_id.columns]
+ numerical_columns = [c for c in df_input_no_id.columns if c not in categorical_column_names]
@@
- # Convert numerical columns to float (otherwise error in the numpy divide)
- df_synthetic[numerical_columns] = df_synthetic[numerical_columns].astype(float)
- df_input[numerical_columns] = df_input[numerical_columns].astype(float)
+ # Convert numerical columns to float (otherwise error in the numpy divide)
+ df_synthetic[numerical_columns] = df_synthetic[numerical_columns].astype(float)
+ df_input_no_id[numerical_columns] = df_input_no_id[numerical_columns].astype(float)
@@
- df_synthetic = df_synthetic.drop(columns=[id_column_name])
-
- gower_matrix = gower.gower_matrix(data_x=df_input, data_y=df_synthetic, cat_features=categorical_features)
+ df_synthetic = df_synthetic.drop(columns=[id_column_name])
+ gower_matrix = gower.gower_matrix(
+ data_x=df_input_no_id, data_y=df_synthetic, cat_features=categorical_features
+ )

Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In src/midst_toolkit/attacks/ensemble/rmia/rmia_calculation.py around lines 69
to 86, the code drops the ID column only from df_synthetic and casts df_input
in-place which leads to mismatched columns for gower and mutates the caller
state; fix by creating local copies of df_input and df_synthetic, drop the
id_column_name from both copies before computing gower_matrix, rebuild the
categorical_features/categorical_column_names mask after the ID is removed, and
perform numeric astype(float) conversions on those local copies only so the
original data and model_data remain unchanged and columns align for
gower.gower_matrix.
```python
# Create masks for records in/out of training sets
mask_in_training = np.array(
    [
        [results_df[id_column_name][j] in shadow_training_sets[i] for j in range(len(results_df))]
        for i in range(len(shadow_training_sets))
    ]
)

mask_not_in_training = np.array(
    [
        [results_df[id_column_name][j] not in shadow_training_sets[i] for j in range(len(results_df))]
        for i in range(len(shadow_training_sets))
    ]
)
```
Membership mask uses value in DataFrame (checks columns), producing wrong masks.
Compute membership against each training set’s ID column values. Also derive the OUT mask as negation to keep shapes aligned.
- mask_in_training = np.array(
- [
- [results_df[id_column_name][j] in shadow_training_sets[i] for j in range(len(results_df))]
- for i in range(len(shadow_training_sets))
- ]
- )
-
- mask_not_in_training = np.array(
- [
- [results_df[id_column_name][j] not in shadow_training_sets[i] for j in range(len(results_df))]
- for i in range(len(shadow_training_sets))
- ]
- )
+ # Build membership masks per shadow training DF, using its ID column
+ shadow_training_id_sets = [
+ set(df[id_column_name].tolist()) for df in shadow_training_sets # type: ignore[index]
+ ]
+ mask_in_training = np.array(
+ [results_df[id_column_name].isin(id_set).to_numpy() for id_set in shadow_training_id_sets]
+ )
+ mask_not_in_training = ~mask_in_training📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
# Create masks for records in/out of training sets
# Build membership masks per shadow training DF, using its ID column
shadow_training_id_sets = [
    set(df[id_column_name].tolist()) for df in shadow_training_sets  # type: ignore[index]
]
mask_in_training = np.array(
    [results_df[id_column_name].isin(id_set).to_numpy() for id_set in shadow_training_id_sets]
)
mask_not_in_training = ~mask_in_training
```
Actionable comments posted: 2
♻️ Duplicate comments (2)
src/midst_toolkit/attacks/ensemble/rmia/rmia_calculation.py (2)
298-312: Critical: Membership masks check column names instead of ID values.

Lines 301 and 308 use `in shadow_training_sets[i]` where shadow_training_sets[i] is a DataFrame. The `in` operator on a DataFrame checks if the value exists as a column name, not within the data itself. This produces completely incorrect membership masks, corrupting all downstream RMIA signals.

Apply this diff to fix the membership checks:

```diff
-    # Create masks for records in/out of training sets
+    # Create masks for records in/out of training sets by checking ID column values
+    shadow_training_id_sets = [
+        set(df[id_column_name].tolist()) for df in shadow_training_sets
+    ]
+
     mask_in_training = np.array(
-        [
-            [results_df[id_column_name][j] in shadow_training_sets[i] for j in range(len(results_df))]
-            for i in range(len(shadow_training_sets))
-        ]
+        [results_df[id_column_name].isin(id_set).to_numpy() for id_set in shadow_training_id_sets]
     )
-    mask_not_in_training = np.array(
-        [
-            [results_df[id_column_name][j] not in shadow_training_sets[i] for j in range(len(results_df))]
-            for i in range(len(shadow_training_sets))
-        ]
-    )
+    mask_not_in_training = ~mask_in_training
```
65-84: Critical: Fix ID column mismatch and in-place mutation in Gower computation.

The current implementation has two critical issues:

Column mismatch: Line 65 builds the categorical_features mask based on df_input.columns (which includes the ID column), but line 82 drops the ID from df_synthetic only. When gower_matrix is called at line 84, df_input still contains the ID column while df_synthetic doesn't, causing misaligned columns and an incorrect boolean mask.

In-place mutation: Line 76 modifies df_input[numerical_columns] in-place, corrupting the caller's DataFrame.

Apply this diff to fix both issues:

```diff
+    # Work on a copy and exclude ID from distance computations
+    df_input_no_id = df_input.drop(columns=[id_column_name]).copy()
+
-    categorical_features = [column in categorical_column_names for column in df_input.columns]
+    categorical_features = [column in categorical_column_names for column in df_input_no_id.columns]
     gower_matrices = []
-    numerical_columns = [col for col in df_input.columns if col not in categorical_column_names]
+    numerical_columns = [col for col in df_input_no_id.columns if col not in categorical_column_names]
     for i in range(len(model_data[key.value])):
         df_synthetic = model_data[key.value][i].synthetic_data.copy()
         # Convert numerical columns to float (otherwise error in the numpy divide)
         df_synthetic[numerical_columns] = df_synthetic[numerical_columns].astype(float)
-        df_input[numerical_columns] = df_input[numerical_columns].astype(float)
+        df_input_no_id[numerical_columns] = df_input_no_id[numerical_columns].astype(float)
         # Sample synthetic data points if there's too many
         if len(df_synthetic) > min_length:
             df_synthetic = df_synthetic.sample(n=min_length, random_state=random_seed)
         df_synthetic = df_synthetic.drop(columns=[id_column_name])
-        gower_matrix = gower.gower_matrix(data_x=df_input, data_y=df_synthetic, cat_features=categorical_features)
+        gower_matrix = gower.gower_matrix(data_x=df_input_no_id, data_y=df_synthetic, cat_features=categorical_features)
         assert np.all((gower_matrix >= 0) & (gower_matrix <= 1)), "Distances are falling outside of range [0, 1]."
```
🧹 Nitpick comments (3)
examples/ensemble_attack/run_metaclassifier_training.py (3)
14-27: Consider clarifying why three shadow model paths are required.

The docstring mentions "three paths" but doesn't explain the rationale. A brief note about why exactly three sets of shadow models are needed for RMIA would help future maintainers understand this constraint.

Apply this diff to enhance the docstring:

```diff
     Args:
         config: Configuration object set in config.yaml.
         attack_data_paths: List of paths to the trained shadow models and all their attributes and synthetic data.
-            The list should contain three paths, one for each set of shadow models.
+            The list must contain exactly three paths, one for each set of shadow models. The RMIA attack
+            requires three distinct shadow model sets to compute membership inference signals.
         target_data_path: Path to the target model and all its attributes and synthetic data.
```
46-50: Clarify whether the three-path requirement is temporary or permanent.

The assertion message says "at this point of development," which suggests this might be a temporary limitation. If the requirement for exactly three shadow model sets is fundamental to RMIA (as indicated by the algorithm), the message should reflect that it's a permanent constraint rather than a development-phase restriction.

If this is a permanent requirement, apply this diff:

```diff
     assert len(attack_data_paths) == 3, (
-        "At this point of development, the attack_data_paths list must contain exactly three elements."
+        "The RMIA attack requires exactly three shadow model sets. Received {len(attack_data_paths)} paths."
     )
```
59-68: Consider more descriptive variable names for loaded data structures.

Previous reviews noted that shadow_data_and_result and target_data_and_result are somewhat misleading since they contain complete training artifacts (datasets, models, synthetic data, results) rather than just models. More descriptive names like shadow_training_artifacts or shadow_model_bundle would better convey their comprehensive content.

Apply this diff:

```diff
         with open(model_path, "rb") as f:
-            shadow_data_and_result = pickle.load(f)
-            attack_data_collection.append(shadow_data_and_result)
+            shadow_training_artifacts = pickle.load(f)
+            attack_data_collection.append(shadow_training_artifacts)

     assert target_data_path.exists(), (
         f"No file found at {target_data_path}. Make sure the path is correct and that you have trained the target model."
     )

     with open(target_data_path, "rb") as f:
-        target_data_and_result = pickle.load(f)
+        target_training_artifacts = pickle.load(f)

-    synth = target_data_and_result["trained_results"][0].synthetic_data
+    synth = target_training_artifacts["trained_results"][0].synthetic_data
     assert synth is not None, "Target model pickle missing synthetic_data."
     df_synthetic = synth.copy()
```

And update line 96:

```diff
     blending_attacker = BlendingPlusPlus(
         config=config,
         attack_data_collection=attack_data_collection,
-        target_data=target_data_and_result,
+        target_data=target_training_artifacts,
         meta_classifier_type=meta_classifier_enum,
         random_seed=config.random_seed,
     )
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- .gitignore (1 hunks)
- examples/ensemble_attack/run_metaclassifier_training.py (3 hunks)
- src/midst_toolkit/attacks/ensemble/rmia/rmia_calculation.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- .gitignore
🧰 Additional context used
🧬 Code graph analysis (1)
examples/ensemble_attack/run_metaclassifier_training.py (2)
src/midst_toolkit/attacks/ensemble/data_utils.py (1)
load_dataframe(31-52)src/midst_toolkit/attacks/ensemble/blending.py (3)
MetaClassifierType(20-22)BlendingPlusPlus(25-260)predict(209-260)
🪛 Ruff (0.14.1)
src/midst_toolkit/attacks/ensemble/rmia/rmia_calculation.py
212-212: Avoid specifying long messages outside the exception class
(TRY003)
216-216: Avoid specifying long messages outside the exception class
(TRY003)
examples/ensemble_attack/run_metaclassifier_training.py
60-60: pickle and modules that wrap it can be unsafe when used to deserialize untrusted data, possible security issue
(S301)
68-68: pickle and modules that wrap it can be unsafe when used to deserialize untrusted data, possible security issue
(S301)
🔇 Additional comments (9)
src/midst_toolkit/attacks/ensemble/rmia/rmia_calculation.py (3)
93-112: LGTM!

The conditional averaging logic correctly handles edge cases by returning NaN when no values satisfy the condition, preventing division-by-zero issues.
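A rough illustration of the conditional-averaging pattern being praised (the helper name is hypothetical, used only to show the NaN fallback):

```python
import numpy as np

def conditional_mean(values: np.ndarray, condition: np.ndarray) -> float:
    # Average only the values where the condition holds; NaN when nothing matches,
    # which avoids a division by zero downstream.
    selected = values[condition]
    return float(selected.mean()) if selected.size > 0 else float("nan")
```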
197-217: LGTM!

The validation logic properly guards against empty data collections and ensures k is within valid bounds before proceeding with distance computations.

328-351: LGTM!

The RMIA ratio calculations properly use np.divide with guards to prevent division-by-zero, setting results to NaN when denominators are zero or negative. This ensures robust signal computation.

examples/ensemble_attack/run_metaclassifier_training.py (6)
70-72: LGTM! Proper validation of synthetic data.

The explicit check that synthetic data is not None before copying prevents AttributeError if training was skipped or synthesis was disabled. This addresses the previous review feedback effectively.

79-87: LGTM! Proper ID column handling for RMIA alignment.

The explicit extraction of trans_id columns before dropping them ensures correct alignment of RMIA features with the data samples. The assertions guard against missing columns, making the flow robust.

93-101: LGTM! Correct initialization with RMIA data.

The BlendingPlusPlus initialization correctly passes attack_data_collection and target data, aligning with the updated constructor signature for RMIA support.

105-113: LGTM! Proper ID column data passed to fit.

The addition of id_column_data=train_trans_ids correctly provides the ID column for RMIA signal alignment during training, matching the updated method signature.

130-136: LGTM! Proper ID column data passed to predict.

The predict call correctly includes id_column_data=test_trans_ids for RMIA feature alignment during evaluation, matching the updated method signature.

139-140: LGTM! Proper directory creation before saving results.

Using mkdir(parents=True, exist_ok=True) safely ensures the results directory exists before saving predictions, preventing a potential FileNotFoundError.
```python
for model_path in attack_data_paths:
    assert model_path.exists(), (
        f"No file found at {model_path}. Make sure the path is correct, or run shadow model training first."
    )

    with open(model_path, "rb") as f:
        shadow_data_and_result = pickle.load(f)
        attack_data_collection.append(shadow_data_and_result)

assert target_data_path.exists(), (
    f"No file found at {target_data_path}. Make sure the path is correct and that you have trained the target model."
)

with open(target_data_path, "rb") as f:
    target_data_and_result = pickle.load(f)
```
🧩 Analysis chain
Add path validation before unpickling to improve security posture.
Loading pickled data without validation poses a security risk if paths are manipulated. Although these are internal artifacts, adding basic guardrails ensures that only files from expected directories are loaded, preventing potential exploitation.
Apply this diff to add path validation:
+ # Validate that attack data paths are within expected shadow models directory
+ if hasattr(config.shadow_training, 'shadow_models_output_path'):
+ base_shadow_dir = Path(config.shadow_training.shadow_models_output_path).resolve()
+
for model_path in attack_data_paths:
assert model_path.exists(), (
f"No file found at {model_path}. Make sure the path is correct, or run shadow model training first."
)
+
+ # Security check: ensure path is within expected directory
+ if hasattr(config.shadow_training, 'shadow_models_output_path'):
+ resolved_path = Path(model_path).resolve()
+ try:
+ resolved_path.relative_to(base_shadow_dir)
+ except ValueError:
+ raise ValueError(f"Refusing to load pickle outside {base_shadow_dir}: {resolved_path}")
with open(model_path, "rb") as f:
shadow_data_and_result = pickle.load(f)
attack_data_collection.append(shadow_data_and_result)
assert target_data_path.exists(), (
f"No file found at {target_data_path}. Make sure the path is correct and that you have trained the target model."
)
+
+ # Security check: ensure target path is within expected directory
+ if hasattr(config.shadow_training, 'target_model_output_path'):
+ base_target_dir = Path(config.shadow_training.target_model_output_path).resolve()
+ resolved_target = Path(target_data_path).resolve()
+ try:
+ resolved_target.relative_to(base_target_dir)
+ except ValueError:
+ raise ValueError(f"Refusing to load pickle outside {base_target_dir}: {resolved_target}")
with open(target_data_path, "rb") as f:

Run the following script to verify that the config contains the expected base directory paths:
🏁 Script executed:
#!/bin/bash
# Check if shadow_training configuration contains output path specifications
rg -n "shadow_models_output_path|target_model_output_path" --type=yamlLength of output: 762
🏁 Script executed:
#!/bin/bash
# Examine the full context of the file under review, especially how paths are created
cat examples/ensemble_attack/run_metaclassifier_training.py | head -100

Length of output: 3674
🏁 Script executed:
#!/bin/bash
# Search for where attack_data_paths and target_data_path variables are assigned
rg -n "attack_data_paths\s*=|target_data_path\s*=" examples/ensemble_attack/run_metaclassifier_training.py -A 3Length of output: 0
🏁 Script executed:
#!/bin/bash
# Check for Path imports and usage context
rg -n "from pathlib|import.*Path|Path\(" examples/ensemble_attack/run_metaclassifier_training.py -B 2 -A 2Length of output: 1807
🏁 Script executed:
#!/bin/bash
# Check config usage and how base directories are referenced
rg -n "config\.shadow_training" examples/ensemble_attack/run_metaclassifier_training.py -B 2 -A 2Length of output: 0
🏁 Script executed:
#!/bin/bash
# Look for any existing path validation logic
rg -n "resolve|relative_to|ValueError" examples/ensemble_attack/run_metaclassifier_training.py -B 1 -A 1Length of output: 0
---
🏁 Script executed:
```shell
#!/bin/bash
# Find where run_metaclassifier_training is called
rg -n "run_metaclassifier_training" --type=py
Length of output: 485
🏁 Script executed:
#!/bin/bash
# Look for the main or script entry point that uses ensemble attack
fd -e py "main|__main__|run.*ensemble" examples/ensemble_attack/ --type fLength of output: 0
🏁 Script executed:
#!/bin/bash
# Check if there's a runner script in examples
ls -la examples/ensemble_attack/ | grep -E "\.py$"

Length of output: 354
🏁 Script executed:
#!/bin/bash
# Search for how attack_data_paths list is constructed
rg -n "attack_data_paths\s*=|\[.*shadow.*\]" --type=py -A 5 | head -50Length of output: 5231
---
🏁 Script executed:
```shell
#!/bin/bash
# Examine run_attack.py to see how attack_data_paths and target_data_path are constructed
cat examples/ensemble_attack/run_attack.py
Length of output: 3909
---
🏁 Script executed:
```shell
#!/bin/bash
# Examine run_shadow_model_training to see where paths come from
head -120 examples/ensemble_attack/run_shadow_model_training.py
Length of output: 4314
🏁 Script executed:
#!/bin/bash
# Search for where paths are returned in shadow model training
rg -n "return.*path|append.*path" examples/ensemble_attack/run_shadow_model_training.py -B 2 -A 1Length of output: 179
---
🏁 Script executed:
```shell
#!/bin/bash
# Check if config.shadow_training always has these keys or if they're optional
rg -n "target_model_output_path|shadow_models_output_path" examples/ensemble_attack/config.yaml -B 2 -A 1
Length of output: 1097
🏁 Script executed:
#!/bin/bash
# Verify Path behavior with absolute paths to confirm the concern
python3 << 'EOF'
from pathlib import Path
# Test 1: Normal relative path
base = Path("/models")
rel_path = "shadow_set_1.pkl"
result1 = base / rel_path
print(f"Test 1 - base / rel_path: {result1}")
# Test 2: Absolute path (PROBLEM)
abs_path = "/etc/passwd"
result2 = base / abs_path
print(f"Test 2 - base / abs_path: {result2}")
print(f"Result is absolute? {result2.is_absolute()}")
# Test 3: Traversal with resolve()
traversal_path = "/models/../../../etc/passwd"
resolved = Path(traversal_path).resolve()
print(f"Test 3 - traversal.resolve(): {resolved}")
# Test 4: Validation with relative_to
try:
Path("/etc/passwd").resolve().relative_to(Path("/models").resolve())
print("Test 4 - relative_to succeeded (shouldn't)")
except ValueError as e:
print(f"Test 4 - relative_to failed as expected: {e}")
EOF

Length of output: 296
Add path validation before unpickling to prevent directory traversal attacks.
Loading pickled data without validating that paths stay within expected directories creates a security risk. While config.shadow_training.shadow_models_output_path and target_model_output_path are defined in config, the relative paths from config could be absolute or contain traversal sequences that escape the base directory when combined using the / operator.
Add validation to ensure loaded files are within their expected base directories:
from pathlib import Path
# At the start of the for loop
base_shadow_dir = Path(config.shadow_training.shadow_models_output_path).resolve()
for model_path in attack_data_paths:
assert model_path.exists(), (
f"No file found at {model_path}. Make sure the path is correct, or run shadow model training first."
)
# Security check: ensure path is within expected directory
resolved_path = Path(model_path).resolve()
try:
resolved_path.relative_to(base_shadow_dir)
except ValueError:
raise ValueError(f"Refusing to load pickle outside {base_shadow_dir}: {resolved_path}")
with open(model_path, "rb") as f:
shadow_data_and_result = pickle.load(f)
attack_data_collection.append(shadow_data_and_result)
# Before loading target model
base_target_dir = Path(config.shadow_training.target_model_output_path).resolve()
assert target_data_path.exists(), (
f"No file found at {target_data_path}. Make sure the path is correct and that you have trained the target model."
)
resolved_target = Path(target_data_path).resolve()
try:
resolved_target.relative_to(base_target_dir)
except ValueError:
raise ValueError(f"Refusing to load pickle outside {base_target_dir}: {resolved_target}")
with open(target_data_path, "rb") as f:
    target_data_and_result = pickle.load(f)

🧰 Tools
🪛 Ruff (0.14.1)
60-60: pickle and modules that wrap it can be unsafe when used to deserialize untrusted data, possible security issue
(S301)
68-68: pickle and modules that wrap it can be unsafe when used to deserialize untrusted data, possible security issue
(S301)
🤖 Prompt for AI Agents
In examples/ensemble_attack/run_metaclassifier_training.py around lines 54-68,
add path validation before unpickling: resolve the configured base directories
(shadow_models_output_path and target_model_output_path) to absolute paths, then
resolve each model_path/target_data_path and ensure
resolved_path.relative_to(base_dir) succeeds; if it raises ValueError, raise a
clear error refusing to load pickles outside the base dir, otherwise proceed to
open and pickle.load.
```python
# Get the synthetic data provided by the challenge for evaluation
# TODO: Check if the file is the correct one.
df_synthetic_original = load_dataframe(
    Path(config.data_paths.processed_attack_data_path),
    "synth.csv",
)
```
🧩 Analysis chain
Address or remove the TODO comment about file correctness.
The TODO suggests uncertainty about whether synth.csv is the correct synthetic data for evaluation. Since training uses df_synthetic from the target model (line 72), while evaluation uses df_synthetic_original from this CSV, please verify this is the intended behavior. If different synthetic datasets are required for training vs. evaluation, the comment should clarify the rationale; otherwise, the inconsistency should be resolved.
Run the following script to understand the relationship between the target model's synthetic data and synth.csv:
Do you want me to help clarify the intended usage pattern for these two synthetic datasets, or would you like me to open an issue to track resolving this TODO?
🏁 Script executed:
#!/bin/bash
# Check for documentation or comments explaining the two synthetic datasets
rg -n "synth.csv|synthetic.*evaluation|df_synthetic_original" --type=py -A 3 -B 3
# Look for configuration or documentation about evaluation data
rg -n "processed_attack_data_path|synth" --type=yamlLength of output: 2267
🏁 Script executed:
#!/bin/bash
# Find where df_synthetic is loaded (mentioned in line 72)
rg -n "df_synthetic\s*=" --type=py -B 5 -A 5 examples/ensemble_attack/run_metaclassifier_training.py | head -60
# Look for how df_synthetic is used in fit call
rg -n "\.fit\(" --type=py -B 5 -A 5 examples/ensemble_attack/run_metaclassifier_training.py
# Check for any documentation comments around training vs evaluation data
rg -n "training|evaluation" --type=py -B 2 -A 2 examples/ensemble_attack/run_metaclassifier_training.py | head -40Length of output: 287
---
🏁 Script executed:
```shell
#!/bin/bash
# Find the file first
fd "run_metaclassifier_training.py" --type f
# Also check what files exist in examples directory
fd --type f --max-depth 3 examples/ 2>/dev/null | head -30
Length of output: 55
---
🏁 Script executed:
```shell
#!/bin/bash
# View lines around 72 to see df_synthetic loading
sed -n '65,85p' examples/ensemble_attack/run_metaclassifier_training.py
# View the fit call around line 100-110
sed -n '100,115p' examples/ensemble_attack/run_metaclassifier_training.py
# View the full context around lines 122-136 (synthetic loading and predict)
sed -n '118,140p' examples/ensemble_attack/run_metaclassifier_training.py
Length of output: 2085
---
🏁 Script executed:
```shell
#!/bin/bash
# Search for documentation about synth.csv
rg -n "synth.csv" --type=py --type=md -B 2 -A 2
# Look for README or documentation files in ensemble_attack
fd "README|readme|INSTRUCTIONS|instructions" examples/ensemble_attack/ --type f
# Check for test files that might explain the pattern
fd "test.*metaclassifier" --type f
Length of output: 466
---
🏁 Script executed:
```shell
#!/bin/bash
# View the README.md file
cat examples/ensemble_attack/README.md
Length of output: 5238
---
🏁 Script executed:
```shell
#!/bin/bash
# View the config.yaml to understand data paths
cat examples/ensemble_attack/config.yaml
# Search for any comments or documentation about synth.csv in config files
rg -n "synth" --type=yaml -B 2 -A 2
# Look at run_attack.py or data processing scripts to see where synth.csv comes from
fd "run_attack|data_process" examples/ensemble_attack/ --type f
# Search for synth.csv file creation or references in Python files
rg -n "synth\.csv|synth.csv" --type=py -B 3 -A 3 | head -40
Length of output: 6316
---
🏁 Script executed:
```shell
#!/bin/bash
# View run_attack.py to understand data processing and synth.csv
cat examples/ensemble_attack/run_attack.py | head -200
# Search for synth.csv creation or handling in run_attack.py
rg -n "synth" examples/ensemble_attack/run_attack.py -B 3 -A 3
Length of output: 3909
---
🏁 Script executed:
```shell
#!/bin/bash
# Find the process_split_data module
fd "process_split_data.py" --type f
# Once found, examine its implementation to see what files it creates
rg -n "synth|csv" --type=py -B 2 -A 2 | grep -A 10 -B 10 "process_split_data"
Length of output: 12865
---
🏁 Script executed:
```shell
#!/bin/bash
# Global search for synth.csv creation or references
rg -n "synth\.csv|synth_csv" --type=py
# Also search in the shadow model training pipeline
rg -n "\.csv" src/midst_toolkit/attacks/ensemble/ --type=py | grep -i synth
# Check what files might be generated or expected
fd "synth" --type f 2>/dev/null | head -20
Length of output: 254
The TODO comment identifies a genuine issue: synth.csv is not created by the data processing pipeline and has no source in the codebase.
The file is loaded from processed_attack_data_path but the process_split_data.py pipeline only generates: real_train.csv, real_val.csv, real_test.csv, and master_challenge_*.csv files. Since synth.csv is never created by the codebase, either:
- The evaluation should use df_synthetic from the target model (same as training) instead of loading a separate file, or
- synth.csv should be generated by the data processing pipeline but is missing, or
- synth.csv is expected to be externally provided without documentation.
🤖 Prompt for AI Agents
In examples/ensemble_attack/run_metaclassifier_training.py around lines 122 to
127, the code attempts to load "synth.csv" which is not produced by the data
processing pipeline; update the code or pipeline to match the intended flow:
either (A) stop loading synth.csv and instead use the synthetic dataframe
already created/returned by the target model (replace the load call with the
in-memory df_synthetic source used during training/evaluation), or (B) add
generation of synth.csv in the data processing pipeline (process_split_data.py)
so that it is created under processed_attack_data_path and document its schema,
or (C) if synth.csv is an external input, add explicit validation and clear
documentation+config to require the external file; choose one of these options
and implement the corresponding change and tests so the file path lookup no
longer fails.
@sarakodeiri I am merging main to this branch to stop coderabbit from adding a review to every new commit that is pushed here.
PR Type
Feature
Short Description
Clickup Ticket(s): https://app.clickup.com/t/868fatfqq
Using the trained shadow models and their respective generated synthetic data, this PR adds the RMIA attack to the meta-classifier training pipeline.
Main changes:
Refactored the BlendingPlusPlus class in blending.py to accept attack_data_collection (the result of shadow model training) and target_data_collection, and to calculate RMIA signals as part of meta-feature preparation. The _prepare_meta_features, fit, and predict methods now accept and propagate ID column data and names, ensuring correct alignment and integration of RMIA features.
run_metaclassifier_training.py) to load and pass shadow and target model data, extract and usetrans_idcolumns, and ensure the attack results directory is created if missing. The function signature now requires explicit paths to attack and target data, and the pipeline is adjusted to handle these inputs.Added
id_column_nametodata_types.jsonto support flexible and explicit ID column handling throughout the pipeline.Tests Added
Added tests to tests/unit/attacks/ensemble/test_meta_classifier.py
Summary by CodeRabbit
New Features
Configuration
Improvements
Tests