Support df_val argument in .fit() for Hyperparameter Tuning Purpose #1345

@jluo41

Description

Feature Request: Add df_val to .fit() for validation on disjoint unique_ids during hyperparameter tuning

In my use case, each unique_id represents a short time series, a 26-hour CGM sequence in which:

  • The first 24 hours (288 steps at 5-minute intervals) serve as input
  • The next 2 hours (24 steps) are the prediction target

Each unique_id corresponds to one input–output pair, meaning intra-series validation via val_size is not applicable.
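To make the window sizes concrete, here is a small arithmetic sketch using only the figures stated above. The variable names are illustrative, not library parameters. It shows that a 26-hour series holds exactly one input/target window, so any steps held out from inside a series leave too few for a full training window.

```python
# Window arithmetic for one CGM series, using the figures stated above.
SAMPLE_MINUTES = 5
STEPS_PER_HOUR = 60 // SAMPLE_MINUTES      # 12 readings per hour

input_size = 24 * STEPS_PER_HOUR           # input steps (first 24 hours)
horizon = 2 * STEPS_PER_HOUR               # target steps (next 2 hours)
series_length = input_size + horizon       # total steps in a 26-hour series

# Number of distinct (input, target) windows a series of this length can hold.
windows_per_series = series_length - (input_size + horizon) + 1

print(input_size, horizon, series_length, windows_per_series)
```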

Problem

When using models like AutoPatchTST, I want to tune hyperparameters via Optuna, but based on generalization to a separate set of unique_ids (e.g., unseen patients). Currently:

  • .fit() only supports val_size, which splits within each time series
  • I cannot pass a df_val with new unique_ids to guide early stopping or hyperparameter selection

Request

Enable:

nf.fit(df_train, df_val=df_valid)

Where df_train and df_val contain disjoint unique_ids.

Use df_val for:

  • Early stopping

  • Evaluation for Optuna tuning

  • Model selection
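For reference, a minimal sketch of how df_train and df_valid with disjoint unique_ids might be produced. It assumes the usual long format with unique_id, ds, and y columns, and uses plain Python rows in place of a DataFrame so it runs without dependencies. The helper name split_by_unique_id and the patient ids are hypothetical.

```python
import random

def split_by_unique_id(rows, val_fraction=0.2, seed=0):
    """Split long-format rows into train/validation sets with disjoint unique_ids."""
    ids = sorted({row["unique_id"] for row in rows})
    rng = random.Random(seed)          # seeded so the split is reproducible
    rng.shuffle(ids)
    n_val = max(1, int(len(ids) * val_fraction))
    val_ids = set(ids[:n_val])
    train = [r for r in rows if r["unique_id"] not in val_ids]
    valid = [r for r in rows if r["unique_id"] in val_ids]
    return train, valid

# Toy data: 10 hypothetical patients with 3 readings each.
rows = [
    {"unique_id": f"patient_{i}", "ds": t, "y": 100.0 + i + t}
    for i in range(10)
    for t in range(3)
]

train_rows, valid_rows = split_by_unique_id(rows, val_fraction=0.2, seed=0)
train_ids = {r["unique_id"] for r in train_rows}
valid_ids = {r["unique_id"] for r in valid_rows}
```

Splitting on the id rather than on rows keeps every series whole, which matters here because each series is a single training example.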


Why This Matters

This is essential for:

  • Short series use cases (like CGM, wearable sensors)

  • Fair and generalizable models across individuals

  • Sample-level time series where each series is a single training example


Current Workaround

I currently fit only on df_train, then evaluate on df_valid manually after .predict(). But this breaks integration with automated tuning and early stopping logic.
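The manual workaround looks roughly like the sketch below. It is not NeuralForecast code: a toy mean forecaster stands in for the fitted model, and the names holdout_score, make_mean_forecaster, and lookback are made up for illustration. The point is the shape of the loop, in which each candidate is scored on held-out unique_ids outside of .fit().

```python
def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)

def holdout_score(predict_fn, val_series, input_size, horizon):
    """Average MAE over held-out series; each series yields one input/target pair."""
    scores = []
    for values in val_series.values():
        history = values[:input_size]
        target = values[input_size:input_size + horizon]
        scores.append(mae(target, predict_fn(history, horizon)))
    return sum(scores) / len(scores)

def make_mean_forecaster(lookback):
    """Stand-in model: forecast the mean of the last `lookback` inputs."""
    def predict(history, horizon):
        level = sum(history[-lookback:]) / lookback
        return [level] * horizon
    return predict

# Toy held-out series (4 input steps, 2 target steps each).
val_series = {
    "patient_a": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "patient_b": [10.0, 12.0, 14.0, 16.0, 18.0, 20.0],
}

# Manual selection loop: score each candidate on the held-out ids, keep the best.
candidates = [1, 2, 4]
scores = {
    lookback: holdout_score(make_mean_forecaster(lookback), val_series, 4, 2)
    for lookback in candidates
}
best_lookback = min(scores, key=scores.get)
```

With Optuna, the body of this loop would become the objective function that returns the held-out score. Early stopping still cannot see that score, because it is computed only after training finishes.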

Would love to see native support for df_val in future releases. Happy to share reproducible examples or help design the API.
