How to set a seed for hyperparameter tuning jobs in AWS SageMaker for reproducibility? #3382
Answered
idanmoradarthas asked this question in Help
Hello, I have a hyperparameter tuning step in my SageMaker pipeline, defined as follows:

```python
def get_pipelineopt_step_hyper(role, instance_type, instance_count, base_job_prefix, sagemaker_session,
                               step_network_config, data_input_bucket, cache_bucket, branch_name,
                               train_start_date, train_end_date, validation_start_date, validation_end_date,
                               search_type, search_iterations, n_jobs, seed, logger_level,
                               execution_id, volume_kms_key=None):
    parameters = [
        "--train_start_date", train_start_date,
        "--train_end_date", train_end_date,
        "--validation_start_date", validation_start_date,
        "--validation_end_date", validation_end_date,
        "--pipeline_steps", steps,
        "--intervention_adjustment", intervention_adjustment,
        "--logger_level", logger_level
    ]
    input_path = ...
    output_path = ...
    estimator = Estimator(
        image_uri=TRAIN_IMAGE_URI,
        role=role,
        instance_count=instance_count,
        instance_type=instance_type,
        volume_size=200,
        base_job_name=f"{base_job_prefix}/mlp-pipeopt",
        entry_point=str(Path(BASE_DIR).joinpath("pipline_extract_metric_param.py")),
        dependencies=[str(Path(BASE_DIR).joinpath('utils')), str(Path(BASE_DIR).joinpath('features'))],
        sagemaker_session=sagemaker_session,
        output_path=output_path,
        subnets=step_network_config.subnets,
        security_group_ids=step_network_config.security_group_ids,
        enable_network_isolation=step_network_config.enable_network_isolation,
        volume_kms_key=volume_kms_key
    )
    hyperparameter_ranges = {
        "learningWindow__windowsize": IntegerParameter(1, 15),
        "SMOTE__k_neighbors": IntegerParameter(3, 17),
        "SMOTE__sampling_strategy": ContinuousParameter(1 / 5, 4 / 5, scaling_type="Linear"),
        "SampleUsers__lower_bound_hash": CategoricalParameter([0]),
        "SampleUsers__upper_bound_hash": CategoricalParameter([15, 20, 30, 40, 50, 70, 90, 100])
    }
    objective_metric_name = "Harmonic mean score"
    metric_definitions = [
        {
            "Name": objective_metric_name,
            "Regex": r"FDR#, FDR\$ Harmonic mean score: (\d+(.|e\-)\d+)"
        }
    ]
    tuner = HyperparameterTuner(
        estimator=estimator,
        objective_metric_name=objective_metric_name,
        metric_definitions=metric_definitions,
        objective_type="Maximize",
        hyperparameter_ranges=hyperparameter_ranges,
        strategy=search_type,
        max_jobs=search_iterations,
        max_parallel_jobs=n_jobs
    )
    step_tuning = TuningStep(
        name="PipelineOpt",
        tuner=tuner,
        inputs={
            "train": TrainingInput(
                s3_data=Join('/', [input_path, "train", Join("_", [train_start_date, train_end_date]), "train.csv"]),
                content_type="text/csv"
            ),
            "validation": TrainingInput(
                s3_data=Join('/',
                             [input_path, "validation", Join("_", [validation_start_date, validation_end_date]),
                              "validation.csv"]),
                content_type="text/csv"
            )
        },
        job_arguments=parameters
    )
    return step_tuning
```

How can I set a seed for hyperparameter tuning jobs in AWS SageMaker for reproducibility?
Answered by idanmoradarthas on Dec 21, 2022
Replies: 1 comment
Fixed in v2.125.0.
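For context on the fix: starting with sagemaker Python SDK v2.125.0, the `HyperparameterTuner` constructor accepts a `random_seed` argument, so the function's existing `seed` parameter can be passed as `HyperparameterTuner(..., random_seed=seed)` (check the tuner API reference for your installed SDK version). Since that call needs a live AWS session, here is a plain-Python sketch of what the seed buys you: a random search driven by a fixed-seed RNG proposes the exact same candidate sequence on every run. The ranges mirror the tuner above; `sample_candidates` is a hypothetical helper, not part of the SDK.

```python
import random

def sample_candidates(seed, n_trials):
    """Draw n_trials hyperparameter candidates from ranges mirroring the
    tuner above, using a dedicated RNG so the draws depend only on the seed."""
    rng = random.Random(seed)
    candidates = []
    for _ in range(n_trials):
        candidates.append({
            "learningWindow__windowsize": rng.randint(1, 15),
            "SMOTE__k_neighbors": rng.randint(3, 17),
            "SMOTE__sampling_strategy": rng.uniform(1 / 5, 4 / 5),
            "SampleUsers__upper_bound_hash": rng.choice([15, 20, 30, 40, 50, 70, 90, 100]),
        })
    return candidates

# Two searches with the same seed propose identical candidates...
assert sample_candidates(42, 10) == sample_candidates(42, 10)
# ...while different seeds (almost surely) diverge.
assert sample_candidates(42, 10) != sample_candidates(7, 10)
```

The same principle applies server-side: with `random_seed` set, repeated runs of a Random or Hyperband tuning job should generate the same hyperparameter configurations, making results comparable across runs.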
Answer selected by idanmoradarthas