Educational and diagnostic stepwise model selection for statsmodels with multiple criteria: AIC, BIC, Adjusted R², and p-values.
This package provides a unified, flexible interface for exploratory stepwise regression with various selection criteria, supporting both OLS and GLM models with advanced features like interaction terms, transformations, and different statistical tests. Designed for educational purposes and model exploration, not production model selection.
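As a rough picture of what a stepwise search does, the sketch below runs a greedy add/remove loop over sets of terms with a generic score where lower is better. It is not the package's implementation: `stepwise_search` and `toy_score` are hypothetical names, and the toy score stands in for a real criterion such as AIC or BIC.

```python
from typing import Callable, FrozenSet, Iterable, List, Tuple

Model = FrozenSet[str]


def stepwise_search(
    initial: Iterable[str],
    upper: Iterable[str],
    score: Callable[[Model], float],
    lower: Iterable[str] = (),
    direction: str = "both",
    max_steps: int = 1000,
) -> Tuple[Model, List[Tuple[Model, float]]]:
    """Greedy search over sets of terms; lower scores are better."""
    current = frozenset(initial)
    best = score(current)
    path = [(current, best)]
    for _ in range(max_steps):
        candidates: List[Model] = []
        if direction in ("both", "forward"):
            candidates += [current | {t} for t in sorted(set(upper) - current)]
        if direction in ("both", "backward"):
            candidates += [current - {t} for t in sorted(current - set(lower))]
        if not candidates:
            break
        challenger = min(candidates, key=score)
        challenger_score = score(challenger)
        if challenger_score >= best:
            break  # no single-term change improves the score
        current, best = challenger, challenger_score
        path.append((current, best))
    return current, path


def toy_score(terms: Model) -> float:
    # Hypothetical score: x1 and x2 reduce it, x3 does not; every term costs 2.
    benefit = {"x1": 10.0, "x2": 5.0, "x3": 0.0}
    return 100.0 - sum(benefit[t] for t in terms) + 2.0 * len(terms)


final, path = stepwise_search(initial=(), upper={"x1", "x2", "x3"}, score=toy_score)
print(sorted(final))
for terms, value in path:
    print(sorted(terms), value)
```

The search stops as soon as no single addition or removal lowers the score, which is why the result depends on the starting model and the direction.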
- 🎯 Main Function: step_criterion() - unified interface for all selection methods
- 📊 Multiple Criteria: AIC, BIC, Adjusted R², and p-value based selection
- 🔧 Convenience Wrappers: Specialized functions for each criterion
- 📈 Model Support: OLS and GLM (including logistic, Poisson, etc.)
- 🧮 Advanced Formulas: Interaction terms, transformations, categorical variables
- ⚡ GLM Flexibility: Multiple test types (likelihood ratio, Wald)
- 🔇 Clean Output: Automatic suppression of technical warnings
- 📋 R-like Results: Familiar ANOVA-style step tables
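For reference, the textbook definitions behind three of these criteria can be written in a few lines. This is a sketch of the standard formulas only. Whether the package computes its scores in exactly this way is an assumption, and the function names are hypothetical.

```python
import math


def information_criterion(loglik: float, n_params: int, k: float = 2.0) -> float:
    """Generalized AIC: -2 * log-likelihood + k * number of parameters."""
    return -2.0 * loglik + k * n_params


def bic(loglik: float, n_params: int, n_obs: int) -> float:
    """BIC is the same formula with the penalty k set to log(n)."""
    return information_criterion(loglik, n_params, k=math.log(n_obs))


def adjusted_r2(r2: float, n_obs: int, n_predictors: int) -> float:
    """Adjusted R-squared for a model with an intercept and n_predictors slopes."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


print(information_criterion(-100.0, 3))   # AIC with the standard penalty k=2
print(bic(-100.0, 3, n_obs=100))          # heavier penalty once log(n) > 2
print(adjusted_r2(0.5, n_obs=11, n_predictors=2))
```

AIC and BIC are minimized while adjusted R² is maximized, so a search written for "lower is better" would use the negated adjusted R².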
This package is designed for educational and exploratory purposes. Stepwise selection has well-documented statistical limitations that users should understand:
- P-value Inflation: Multiple testing inflates Type I error rates. P-values from stepwise procedures are biased and should not be used for inference
- Overfitting: Selected models are optimistic and may not generalize well to new data
- Selection Bias: Standard confidence intervals and hypothesis tests are invalid after model selection
- Multiple Comparisons: The more variables considered, the higher the chance of spurious associations
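The multiple-comparisons point can be made concrete with a little arithmetic. Assuming the candidate tests are independent (a simplification that real stepwise searches do not satisfy), the chance of at least one false positive among k tests at level alpha is 1 - (1 - alpha)^k. The function name below is hypothetical.

```python
def familywise_error_rate(n_tests: int, alpha: float = 0.05) -> float:
    """P(at least one false positive) across n_tests independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


for k in (1, 5, 10, 20):
    print(f"{k:>2} candidate terms -> {familywise_error_rate(k):.3f}")
# →  1 candidate terms -> 0.050
# →  5 candidate terms -> 0.226
# → 10 candidate terms -> 0.401
# → 20 candidate terms -> 0.642
```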
- ✅ Educational: Learning about model selection and variable importance
- ✅ Exploratory Data Analysis: Initial investigation of relationships
- ✅ Diagnostic: Understanding which variables might be relevant
- ✅ Hypothesis Generation: Developing ideas for future confirmatory studies
- ❌ Confirmatory Analysis: Final statistical inference or hypothesis testing
- ❌ Production Models: Automated model selection in production systems
- ❌ P-value Reporting: Publishing p-values from stepwise-selected models
- ❌ Causal Inference: Establishing causal relationships
For reliable inference and model selection, consider:
- Cross-validation with penalized regression (LASSO, Ridge, Elastic Net)
- Information criteria with proper model averaging
- Bootstrap procedures for selection uncertainty
- Post-selection inference methods when stepwise is unavoidable
- Domain knowledge guided model specification
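One way to gauge selection uncertainty with the bootstrap is to rerun the selection on resampled data and count how often each term is chosen. The sketch below is self-contained and uses a deliberately simple stand-in selector (an absolute-correlation threshold) rather than this package. All names are hypothetical, and in practice `select_terms` would wrap whatever selection procedure is being studied.

```python
import math
import random
from collections import Counter


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def select_by_correlation(rows, threshold=0.3):
    """Stand-in selector: keep predictors with |corr(x, y)| above threshold."""
    ys = [r["y"] for r in rows]
    return [name for name in ("x1", "x2")
            if abs(pearson([r[name] for r in rows], ys)) > threshold]


def bootstrap_selection_frequency(rows, select_terms, n_boot=200, seed=0):
    """Fraction of bootstrap resamples in which each term is selected."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_boot):
        sample = rng.choices(rows, k=len(rows))  # resample rows with replacement
        counts.update(set(select_terms(sample)))
    return {term: counts[term] / n_boot for term in sorted(counts)}


# Simulated data: y depends on x1 only; x2 is pure noise.
data_rng = random.Random(1)
rows = []
for _ in range(100):
    x1, x2 = data_rng.gauss(0, 1), data_rng.gauss(0, 1)
    rows.append({"x1": x1, "x2": x2, "y": 2.0 * x1 + data_rng.gauss(0, 1)})

freq = bootstrap_selection_frequency(rows, select_by_correlation)
print(freq)
```

Terms that are selected in nearly every resample are more stable than terms that appear only occasionally.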
pip install step-criterion

import pandas as pd
import statsmodels.api as sm
from step_criterion import step_criterion
# Load your data
df = pd.read_csv("your_data.csv")
# Perform stepwise selection with BIC
result = step_criterion(
data=df,
initial="y ~ 1", # Start with intercept only
scope={"upper": "y ~ x1 + x2 + x3 + x1:x2 + I(x1**2)"},
direction="both", # Forward and backward steps
criterion="bic", # Selection criterion
trace=1 # Show step-by-step progress
)
# View results
print(result.model.summary())
print("\nStep-by-step path:")
print(result.anova)

step_criterion() is the recommended entry point: a unified interface supporting all selection criteria and model types.
step_criterion(
data, # pandas DataFrame
initial, # Initial formula string
scope=None, # Upper/lower bounds for model terms
direction="both", # "both", "forward", or "backward"
criterion="aic", # "aic", "bic", "adjr2", or "p-value"
trace=1, # Verbosity level (0=silent, 1=progress)
family=None, # statsmodels family (None=OLS, or sm.families.*)
glm_test="lr", # For GLM p-value: "lr", "wald", "score", "gradient"
alpha_enter=0.05, # p-value threshold for entering (p-value criterion)
alpha_exit=0.10, # p-value threshold for removal (p-value criterion)
steps=1000, # Maximum number of steps
keep=None, # Optional function to track custom metrics
fit_kwargs=None # Additional arguments passed to model.fit()
)

# Using main function
result = step_criterion(data=df, initial="y ~ 1",
scope={"upper": "y ~ x1 + x2 + x3"},
criterion="aic")
# Using convenience wrapper (allows custom k penalty)
from step_criterion import step_aic
result = step_aic(data=df, initial="y ~ 1",
scope={"upper": "y ~ x1 + x2 + x3"},
k=2.0) # Standard AIC penalty

# BIC automatically uses log(n) penalty
result = step_criterion(data=df, initial="y ~ 1",
scope={"upper": "y ~ x1 + x2 + x3"},
criterion="bic")
# Convenience wrapper
from step_criterion import step_bic
result = step_bic(data=df, initial="y ~ 1",
scope={"upper": "y ~ x1 + x2 + x3"})

# Maximizes adjusted R-squared
result = step_criterion(data=df, initial="y ~ 1",
scope={"upper": "y ~ x1 + x2 + x3"},
criterion="adjr2")
# Convenience wrapper
from step_criterion import step_adjr2
result = step_adjr2(data=df, initial="y ~ 1",
scope={"upper": "y ~ x1 + x2 + x3"})

# OLS with F-tests
result = step_criterion(data=df, initial="y ~ 1",
scope={"upper": "y ~ x1 + x2 + x3"},
criterion="p-value",
alpha_enter=0.05, alpha_exit=0.10)
# GLM with likelihood ratio tests
result = step_criterion(data=df, initial="y ~ 1",
scope={"upper": "y ~ x1 + x2 + x3"},
criterion="p-value",
family=sm.families.Binomial(),
glm_test="lr")
# Convenience wrapper with GLM Wald tests
from step_criterion import step_pvalue
result = step_pvalue(data=df, initial="y ~ 1",
scope={"upper": "y ~ x1 + x2 + x3"},
family=sm.families.Binomial(),
glm_test="wald")

# family=None (default) uses OLS
result = step_criterion(
data=df,
initial="y ~ 1",
scope={"upper": "y ~ x1 + x2 + x3"},
criterion="bic"
)

import statsmodels.api as sm
# Logistic regression
result = step_criterion(
data=df,
initial="binary_outcome ~ 1",
scope={"upper": "binary_outcome ~ x1 + x2 + x3"},
criterion="aic",
family=sm.families.Binomial()
)
# Poisson regression
result = step_criterion(
data=df,
initial="count_outcome ~ 1",
scope={"upper": "count_outcome ~ x1 + x2 + x3"},
criterion="bic",
family=sm.families.Poisson()
)
# Gamma regression
result = step_criterion(
data=df,
initial="positive_outcome ~ 1",
scope={"upper": "positive_outcome ~ x1 + x2 + x3"},
criterion="aic",
family=sm.families.Gamma()
)

Use Patsy formula syntax for complex model specifications:
# Interaction terms
scope = {"upper": "y ~ x1 + x2 + x1:x2"} # Specific interaction
scope = {"upper": "y ~ x1 * x2"} # Main effects + interaction
scope = {"upper": "y ~ (x1 + x2 + x3)**2"} # All pairwise interactions
# Transformations
scope = {"upper": "y ~ x1 + I(x1**2) + I(x1**3)"} # Polynomial terms
scope = {"upper": "y ~ x1 + np.log(x2) + np.sqrt(x3)"} # Math functions
# Categorical variables
scope = {"upper": "y ~ x1 + C(category)"} # Categorical encoding
scope = {"upper": "y ~ x1 + C(category, Treatment(reference='A'))"} # Custom reference
# Mixed interactions
scope = {"upper": "y ~ x1 + x2 + C(group) + x1:C(group) + I(x2**2)"}

For GLM models with the p-value criterion, choose the appropriate test:
# Likelihood Ratio Test (recommended for most cases)
result = step_criterion(data=df, initial="y ~ 1",
scope={"upper": "y ~ x1 + x2 + x3"}, criterion="p-value",
family=sm.families.Binomial(), glm_test="lr")
# Wald Test (faster, asymptotically equivalent)
result = step_criterion(data=df, initial="y ~ 1",
scope={"upper": "y ~ x1 + x2 + x3"}, criterion="p-value",
family=sm.families.Binomial(), glm_test="wald")
# Score and Gradient tests (currently mapped to LR with warning)
result = step_criterion(data=df, initial="y ~ 1",
scope={"upper": "y ~ x1 + x2 + x3"}, criterion="p-value",
family=sm.families.Binomial(), glm_test="score")

Model averaging provides AIC/BIC weights for each model in the stepwise path, allowing you to assess relative model support and account for model uncertainty:
# Enable model averaging with any criterion
result = step_criterion(data=df, initial="y ~ 1",
scope={"upper": "y ~ x1 + x2 + x3"},
criterion="aic", model_averaging=True)
# Or use convenience functions
result = step_aic(data=df, initial="y ~ 1",
scope={"upper": "y ~ x1 + x2 + x3"},
model_averaging=True)
result = step_bic(data=df, initial="y ~ 1",
scope={"upper": "y ~ x1 + x2 + x3"},
model_averaging=True)
# Access the model weights
print(result.model_weights)
#     Model  Score (AIC)  Delta  Weight
# 0  y ~ x1        156.2    0.0   0.524
# 1  y ~ x2        157.8    1.6   0.235
# 2  y ~ x3        159.1    2.9   0.123
# 3   y ~ 1        161.4    5.2   0.039
# Interpret the weights
substantial_support = result.model_weights[result.model_weights['Weight'] > 0.1]
print(f"Models with substantial support: {len(substantial_support)}")
print(f"Top model weight: {result.model_weights['Weight'].iloc[0]:.3f}")

Model weights are calculated as:
- Δᵢ = criterionᵢ - min(criterion)
- wᵢ = exp(-0.5 × Δᵢ) / Σ exp(-0.5 × Δⱼ)
Guidelines for interpretation:
- Weight > 0.1: Substantial support
- Weight between 0.05 and 0.1: Some support
- Weight < 0.05: Little support
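The weight formula can be checked by hand with a short function. The sketch below uses the four AIC scores from the example table. Because it normalizes over just those four scores, the weights it prints need not match a table computed over a longer path. The function name is hypothetical.

```python
import math


def akaike_weights(scores):
    """Convert criterion scores (lower is better) into normalized weights."""
    best = min(scores)
    deltas = [s - best for s in scores]                  # Δᵢ = criterionᵢ - min(criterion)
    relative = [math.exp(-0.5 * d) for d in deltas]      # exp(-0.5 × Δᵢ)
    total = sum(relative)
    return [r / total for r in relative]                 # weights sum to 1


scores = [156.2, 157.8, 159.1, 161.4]
weights = akaike_weights(scores)
for s, w in zip(scores, weights):
    print(f"score={s:.1f}  weight={w:.3f}")
```

Only differences between scores matter, so adding a constant to every score leaves the weights unchanged.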
# Both directions (recommended) - can add and remove terms
result = step_criterion(data=df, initial="y ~ x1", direction="both",
scope={"upper": "y ~ x1 + x2 + x3"})
# Forward only - only adds terms
result = step_criterion(data=df, initial="y ~ 1", direction="forward",
scope={"upper": "y ~ x1 + x2 + x3"})
# Backward only - only removes terms
result = step_criterion(data=df, initial="y ~ x1 + x2 + x3", direction="backward",
scope={"lower": "y ~ 1"})

While step_criterion() is the main interface, specialized convenience functions are available:
from step_criterion import step_aic, step_bic, step_adjr2, step_pvalue
# AIC with custom penalty
result = step_aic(data=df, initial="y ~ 1", scope={"upper": "y ~ x1 + x2"}, k=2.5)
# BIC (automatic log(n) penalty)
result = step_bic(data=df, initial="y ~ 1", scope={"upper": "y ~ x1 + x2"})
# Adjusted R² (OLS only)
result = step_adjr2(data=df, initial="y ~ 1", scope={"upper": "y ~ x1 + x2"})
# P-value with custom thresholds
result = step_pvalue(data=df, initial="y ~ 1", scope={"upper": "y ~ x1 + x2"},
alpha_enter=0.01, alpha_exit=0.05)

All functions return a StepwiseResult object with:
result.model # Final statsmodels Results object
result.anova # Step-by-step path DataFrame
result.keep # Optional custom metrics (if keep function provided)
# Access final model
print(result.model.summary())
print(f"Final AIC: {result.model.aic:.3f}")
print(f"R-squared: {result.model.rsquared:.3f}")
# View selection path
print(result.anova)

      Step   Df  Deviance  Resid. Df  Resid. Dev      AIC
0           NaN       NaN         15     305.619  308.392
1    + GNP  1.0    54.762         14     250.857  256.402
2  + UNEMP  1.0     8.363         13     242.494  250.812
3  + ARMED  1.0     4.177         12     238.317  249.408
4   + YEAR  1.0    18.662         11     219.655  233.518
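In a table like this one, the Deviance column is the drop in residual deviance from the previous row, and Resid. Df falls by Df at each step. A few lines of arithmetic on the numbers shown confirm that reading. The values are copied from the table and nothing is refit.

```python
# Values copied from the example step table.
resid_df = [15, 14, 13, 12, 11]
resid_dev = [305.619, 250.857, 242.494, 238.317, 219.655]
step_df = [1.0, 1.0, 1.0, 1.0]
step_deviance = [54.762, 8.363, 4.177, 18.662]

# Differences between consecutive rows.
dev_drops = [prev - cur for prev, cur in zip(resid_dev, resid_dev[1:])]
df_drops = [prev - cur for prev, cur in zip(resid_df, resid_df[1:])]

for reported, computed in zip(step_deviance, dev_drops):
    print(f"reported={reported:.3f}  computed={computed:.3f}")

deviance_matches = all(abs(r - c) < 1e-6 for r, c in zip(step_deviance, dev_drops))
df_matches = all(r == c for r, c in zip(step_df, df_drops))
print(deviance_matches, df_matches)
```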
Note: The following examples demonstrate the package's capabilities for exploratory analysis. Remember that p-values and model selection results should not be used for confirmatory inference.
import pandas as pd
import statsmodels.api as sm
from step_criterion import step_criterion
# Load Longley economic dataset
longley = sm.datasets.longley.load_pandas().data
longley.rename(columns={'TOTEMP': 'employment'}, inplace=True)
# Stepwise with BIC including interactions and polynomials
result = step_criterion(
data=longley,
initial="employment ~ 1",
scope={"upper": "employment ~ GNP + UNEMP + ARMED + POP + YEAR + GNPDEFL + GNP:YEAR + I(GNP**2)"},
direction="both",
criterion="bic",
trace=1
)
print("Final model:")
print(result.model.summary())

import numpy as np

# Simulated medical data
np.random.seed(42)
n = 1000
data = pd.DataFrame({
'age': np.random.normal(50, 15, n),
'bmi': np.random.normal(25, 5, n),
'cholesterol': np.random.normal(200, 40, n),
'smoking': np.random.choice([0, 1], n, p=[0.7, 0.3]),
'exercise': np.random.normal(3, 2, n) # hours per week
})
# Create outcome with realistic relationships
logit = (-5 + 0.05*data['age'] + 0.1*data['bmi'] +
0.01*data['cholesterol'] + 2*data['smoking'] - 0.2*data['exercise'])
data['disease'] = (np.random.random(n) < 1/(1+np.exp(-logit))).astype(int)
# Stepwise logistic regression
result = step_criterion(
data=data,
initial="disease ~ 1",
scope={"upper": "disease ~ age + bmi + cholesterol + smoking + exercise + age:smoking + I(bmi**2)"},
direction="both",
criterion="p-value",
family=sm.families.Binomial(),
glm_test="lr",
alpha_enter=0.05,
alpha_exit=0.10,
trace=1
)
print("Logistic regression results:")
print(result.model.summary())

from step_criterion import step_criterion, step_aic, step_bic, step_adjr2
# Compare different selection criteria
criteria_results = {}
for criterion in ['aic', 'bic', 'adjr2']:
    result = step_criterion(
        data=df,
        initial="y ~ 1",
        scope={"upper": "y ~ x1 + x2 + x3 + x1:x2 + I(x1**2)"},
        criterion=criterion,
        trace=0 # Silent for comparison
    )
    criteria_results[criterion] = {
        'formula': result.model.model.formula,
        'aic': result.model.aic,
        'bic': result.model.bic,
        'rsquared_adj': getattr(result.model, 'rsquared_adj', None),
        'n_params': len(result.model.params)
    }
# Display comparison
comparison_df = pd.DataFrame(criteria_results).T
print("Comparison of selection criteria:")
print(comparison_df)

import numpy as np

def track_metrics(model, score):
    """Custom function to track additional metrics during selection"""
    return {
        'aic': model.aic,
        'bic': model.bic,
        'rsquared': getattr(model, 'rsquared', None),
        'condition_number': np.linalg.cond(model.model.exog)
    }
result = step_criterion(
data=df,
initial="y ~ 1",
scope={"upper": "y ~ x1 + x2 + x3"},
criterion="bic",
keep=track_metrics # Track custom metrics at each step
)
# View tracked metrics
print(result.keep)

# The package works with statsmodels' missing data handling
result = step_criterion(
data=df_with_missing,
initial="y ~ 1",
scope={"upper": "y ~ x1 + x2 + x3"},
criterion="aic",
fit_kwargs={'missing': 'drop'} # statsmodels accepts 'none', 'drop', or 'raise'
)

Results from this package should be interpreted carefully:
- Use selected models for exploration and hypothesis generation only
- Do not report p-values from stepwise-selected models as if they were from pre-specified models
- Confidence intervals and standard errors are not valid after selection
- Effect sizes may be inflated due to selection bias
- Always validate findings with independent data or proper post-selection methods
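The inflation of effect sizes can be seen in a small simulation. Here every candidate has a true effect of zero, and each estimate is drawn from the sampling distribution of a mean of `n_obs` standard-normal noise values. Reporting only the largest estimate, as a selection procedure effectively does, overstates it relative to a typical estimate. The setup and names are hypothetical and kept deliberately simple.

```python
import math
import random
import statistics


def simulate_selection_bias(n_candidates=20, n_obs=50, n_sims=500, seed=0):
    """Compare the largest |estimate| with the average |estimate| under pure noise."""
    rng = random.Random(seed)
    se = 1.0 / math.sqrt(n_obs)  # standard error of a mean of n_obs N(0, 1) values
    selected, typical = [], []
    for _ in range(n_sims):
        abs_estimates = [abs(rng.gauss(0.0, se)) for _ in range(n_candidates)]
        selected.append(max(abs_estimates))               # the estimate that "wins"
        typical.append(statistics.fmean(abs_estimates))   # an unselected estimate
    return statistics.fmean(selected), statistics.fmean(typical)


selected, typical = simulate_selection_bias()
print(f"mean |estimate| of the selected term: {selected:.3f}")
print(f"mean |estimate| of a typical term:    {typical:.3f}")
```

The true effect is zero throughout, so any gap between the two numbers comes from selection alone.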
- step_criterion(): Unified stepwise selection interface
- step_aic(): AIC-based selection with custom penalty parameter
- step_bic(): BIC-based selection
- step_adjr2(): Adjusted R²-based selection (OLS only)
- step_pvalue(): P-value based selection with test options
- StepwiseResult: Container with model, anova, and optional keep attributes
- Python ≥ 3.9
- pandas ≥ 1.5
- numpy ≥ 1.23
- statsmodels ≥ 0.13
MIT License - see LICENSE file for details.
Contributions are welcome! Please feel free to submit issues, feature requests, or pull requests.
- Issues: GitHub Issues
- Documentation: This README and inline docstrings
- Examples: See examples_usage.ipynb in the repository
- 0.1.0: Initial release with comprehensive stepwise selection support