A package for machine learning debugging based on sample size analysis.

License

MIT (LICENSE.md). A separate LICENSE file of unrecognized type is also present.

focardozom/BreakNBuild


BreakNBuild: Optimize Machine Learning Models with Dynamic Data Splits

Overview

BreakNBuild is designed to evaluate model performance through progressively sampled training data. It offers a structured way to analyze how a model’s accuracy, error, or other metrics evolve as the amount of data increases. This iterative sampling approach is particularly useful for identifying bias-variance trade-offs, diagnosing overfitting or underfitting, and understanding how much data is needed to achieve optimal model performance. With BreakNBuild, users can visualize learning curves, helping to fine-tune algorithms, assess generalization, and debug machine learning models efficiently.
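To make the idea of a learning curve concrete, here is a minimal base R sketch. It does not use BreakNBuild and is not the package's implementation; the dataset (`mtcars`), the model (`lm`), the step of 5 rows, and the `rmse` helper are all assumptions chosen for illustration. It holds out a fixed validation set, fits the same model on progressively larger training subsets, and records training and validation error at each size.

```r
# Learning-curve sketch in base R (illustrative only, not BreakNBuild's code).
set.seed(42)

data <- mtcars
n <- nrow(data)

# Shuffle once, then hold out a fixed 20% validation set.
idx <- sample(n)
n_val <- round(0.2 * n)
validation <- data[idx[seq_len(n_val)], ]
train_pool <- data[idx[-seq_len(n_val)], ]

# Hypothetical helper: root mean squared error.
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

# Training sizes grow from 10 rows in steps of 5 (step size is an assumption).
sizes <- seq(10, nrow(train_pool), by = 5)

curve <- do.call(rbind, lapply(sizes, function(k) {
  train <- train_pool[seq_len(k), ]           # nested: first k shuffled rows
  fit <- lm(mpg ~ wt + hp, data = train)
  data.frame(
    train_size = k,
    train_rmse = rmse(train$mpg, predict(fit, train)),
    val_rmse   = rmse(validation$mpg, predict(fit, validation))
  )
}))

print(curve)
```

Reading the resulting table: a large, persistent gap between `train_rmse` and `val_rmse` suggests overfitting (high variance), while two curves that converge at a high error suggest underfitting (high bias).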

Features

  • Progressive Data Splitting: Partition your dataset into training and validation subsets.
  • Customizable Sample Sizes: Control the size of your training data to understand model performance under different conditions.
  • Easy Integration: Built on the rsample package, BreakNBuild seamlessly integrates with the tidymodels framework.

![Schema of progressive splits](man/figures/schema_progressive_splits.svg)
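The splitting scheme described in the features above can be sketched in base R as follows. This is an assumption about what progressive splitting might look like, not the package's actual code: the function name `sketch_progressive_splits`, the `step` argument, and the returned list structure are hypothetical. Only `validation_size` and `start_size` mirror argument names that appear in this README.

```r
# Hypothetical helper: builds nested training index sets of growing size
# alongside one fixed validation set. Not part of BreakNBuild.
sketch_progressive_splits <- function(data, validation_size = 0.2,
                                      start_size = 10, step = 10) {
  n <- nrow(data)
  n_val <- round(validation_size * n)
  stopifnot(n_val >= 1, start_size <= n - n_val)

  idx <- sample(n)                       # shuffle row indices once
  val_idx <- idx[seq_len(n_val)]         # fixed validation rows
  pool_idx <- idx[-seq_len(n_val)]       # rows available for training
  n_pool <- length(pool_idx)

  # Growing training sizes, always ending with the full training pool.
  sizes <- unique(c(seq(start_size, n_pool, by = step), n_pool))

  lapply(sizes, function(k) {
    list(train = pool_idx[seq_len(k)], validation = val_idx)
  })
}

set.seed(1)
splits <- sketch_progressive_splits(iris, validation_size = 0.2,
                                    start_size = 10, step = 20)

# Number of training rows in each successive split.
train_sizes <- sapply(splits, function(s) length(s$train))
print(train_sizes)
```

Because every training set is a prefix of the same shuffled pool, each split contains all rows of the previous one, and the validation rows never appear in any training set.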

Installation

To install the latest version from GitHub, use:

# install.packages("devtools")
devtools::install_github("https://github.com/focardozom/BreakNBuild")

Usage

Here's a quick example to get you started:

library(BreakNBuild)

splits <- progressive_splits(data, validation_size = 0.2, start_size = 10)

This creates a splits object that you can use to train models within the tidymodels ecosystem.

For more details on how to use the BreakNBuild package, please refer to the package vignette.
