GPU-Accelerated Multi-Asset Portfolio Optimization with Macro Regime Detection

Overview

A practice repo covering a portfolio optimization system that combines GPU-accelerated machine learning (cuML), Bayesian regime detection, and stochastic programming for multi-asset allocation under market uncertainty.

Key Features

GPU-Accelerated Feature Engineering: 10x faster processing using RAPIDS cuDF/cuML
Market Regime Detection: Hidden Markov Models + Bayesian changepoint detection
Mathematical Optimization: Regime-aware mean-variance optimization with transaction costs
Comprehensive Backtesting: Walk-forward validation with realistic assumptions

Technologies

GPU Computing: RAPIDS cuDF, cuML
ML/AI: Scikit-learn, HMMLearn
Optimization: Pyomo
Bayesian: PyMC

Major Modules

1. Data Ingestion

Complete data pipeline for downloading and preprocessing financial market data for portfolio optimization.

Basic usage:

cd src/data
python download_data.py

This downloads:

~50 assets across equities, bonds, commodities, currencies
~20 macro indicators (GDP, inflation, unemployment, etc.)
Sentiment data (VIX, market stress indicators)

Unit tests are provided in tests/data/test_data_download.py.

pytest tests/data/test_data_download.py -v

2. Feature Engineering

GPU-accelerated technical indicator computation and feature engineering pipeline for portfolio optimization.

Prerequisites: Make sure you've completed the data ingestion module first:

# You should have these files from data ingestion
data/raw/asset_prices.csv
data/raw/macro_data.csv (optional)
data/raw/sentiment_data.csv (optional)

Basic usage:

cd src/feature_engineering
python technical_indicators.py

Features computed:

Technical Indicators (30+)

Returns & Log Returns

Daily returns (1-day)
Multi-period returns (5d, 21d, 63d)
Log returns (better for statistical modeling)
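As a minimal sketch of the return features listed above, assuming a pandas Series of prices for one asset (the function name `compute_returns` and the column names are illustrative, not taken from the repo):

```python
import numpy as np
import pandas as pd

def compute_returns(prices: pd.Series, periods=(1, 5, 21, 63)) -> pd.DataFrame:
    """Simple returns over several horizons plus 1-day log returns."""
    out = pd.DataFrame(index=prices.index)
    for p in periods:
        # Simple return over p periods: P_t / P_{t-p} - 1
        out[f"ret_{p}d"] = prices.pct_change(p)
    # Log return: ln(P_t / P_{t-1})
    out["log_ret_1d"] = np.log(prices / prices.shift(1))
    return out

prices = pd.Series([100.0, 101.0, 102.01, 100.0, 105.0, 110.25])
rets = compute_returns(prices, periods=(1, 5))
```

One reason log returns suit statistical modeling is that they add across time: the sum of daily log returns equals the log return over the whole period.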

Trend Indicators

SMA (Simple Moving Average): 20, 50, 200-day windows
EMA (Exponential Moving Average): 12, 26, 50-day spans

Momentum Indicators

RSI (Relative Strength Index): 14-day window
- Range: 0-100
- >70 = Overbought, <30 = Oversold
MACD (Moving Average Convergence Divergence):
- MACD Line (EMA12 - EMA26)
- Signal Line (EMA9 of MACD)
- Histogram (MACD - Signal)
Momentum: 10, 20, 50-day price momentum
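The RSI and MACD definitions above can be sketched in pandas as follows. This is an illustration under assumptions: the function names are hypothetical, and the RSI uses Wilder-style exponential smoothing (alpha = 1/window), which is one common convention and may differ from what the repo computes.

```python
import pandas as pd

def rsi(prices: pd.Series, window: int = 14) -> pd.Series:
    """Relative Strength Index in [0, 100], Wilder-style smoothing (assumed)."""
    delta = prices.diff()
    gain = delta.clip(lower=0.0)        # positive moves only
    loss = (-delta).clip(lower=0.0)     # magnitude of negative moves only
    avg_gain = gain.ewm(alpha=1.0 / window, min_periods=window, adjust=False).mean()
    avg_loss = loss.ewm(alpha=1.0 / window, min_periods=window, adjust=False).mean()
    rs = avg_gain / avg_loss            # inf when there are no losses
    return 100.0 - 100.0 / (1.0 + rs)

def macd(prices: pd.Series, fast: int = 12, slow: int = 26, signal: int = 9) -> pd.DataFrame:
    """MACD line (EMA12 - EMA26), signal line (EMA9 of MACD), and histogram."""
    ema_fast = prices.ewm(span=fast, adjust=False).mean()
    ema_slow = prices.ewm(span=slow, adjust=False).mean()
    line = ema_fast - ema_slow
    sig = line.ewm(span=signal, adjust=False).mean()
    return pd.DataFrame({"macd": line, "signal": sig, "hist": line - sig})

# Toy series: a steadily rising and a steadily falling price path.
up = pd.Series(range(1, 61), dtype=float)
down = pd.Series(range(60, 0, -1), dtype=float)
rsi_up, rsi_down, macd_up = rsi(up), rsi(down), macd(up)
```

On a series that only rises, the average loss is zero and the RSI saturates at the top of its range; on one that only falls, it sits at the bottom.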

Volatility Indicators

Rolling Volatility: 20, 60, 252-day windows (annualized)
Bollinger Bands:
- Upper/Lower bands (±2σ)
- Bandwidth
- %B (position within bands)
Average True Range (ATR): only if columns High, Low and Adj Close are present
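A hedged sketch of the rolling volatility and Bollinger Band features, assuming daily data and 252 trading days per year. The function names are illustrative, and bandwidth is taken here as (upper - lower) / middle band, a common convention that the repo may or may not follow.

```python
import numpy as np
import pandas as pd

def rolling_volatility(returns: pd.Series, window: int = 20,
                       periods_per_year: int = 252) -> pd.Series:
    """Rolling standard deviation of returns, annualized by sqrt(periods_per_year)."""
    return returns.rolling(window).std() * np.sqrt(periods_per_year)

def bollinger_bands(prices: pd.Series, window: int = 20, n_std: float = 2.0) -> pd.DataFrame:
    """Upper/lower bands at +/- n_std rolling standard deviations, bandwidth, and %B."""
    mid = prices.rolling(window).mean()
    sd = prices.rolling(window).std()
    upper = mid + n_std * sd
    lower = mid - n_std * sd
    return pd.DataFrame({
        "upper": upper,
        "lower": lower,
        "bandwidth": (upper - lower) / mid,
        "pct_b": (prices - lower) / (upper - lower),  # 0 at lower band, 1 at upper band
    })

# Synthetic price path for illustration only.
rng = np.random.default_rng(0)
returns = pd.Series(rng.normal(0.0005, 0.01, 120))
prices = 100.0 * (1.0 + returns).cumprod()
vol_20 = rolling_volatility(returns, window=20)
bands = bollinger_bands(prices, window=20)
```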

Correlation Features

Rolling correlation with benchmark (SPY)
60-day window

Engineered Features

Lag Features

Lagged returns: 1, 5, 21-day lags
Captures momentum and mean reversion

Cross-Sectional Features

Rank: Percentile rank within universe
Z-score: Standardized returns
Deviation from mean: Relative performance
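The three cross-sectional features can be illustrated on a small returns table with dates as rows and assets as columns. The function name and dictionary keys below are hypothetical.

```python
import pandas as pd

def cross_sectional_features(returns: pd.DataFrame) -> dict:
    """Per-date features computed across the asset universe (columns)."""
    mean = returns.mean(axis=1)
    std = returns.std(axis=1)
    demeaned = returns.sub(mean, axis=0)
    return {
        "rank": returns.rank(axis=1, pct=True),   # percentile rank within each date
        "zscore": demeaned.div(std, axis=0),      # standardized returns
        "dev_from_mean": demeaned,                # relative performance
    }

returns = pd.DataFrame({
    "A": [0.01, -0.02],
    "B": [0.03, 0.00],
    "C": [-0.01, 0.02],
})
feats = cross_sectional_features(returns)
```

Because every feature is computed row by row, each date is normalized against the universe on that date only, so no information leaks across time.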

Interaction Features

Pair-wise interactions/products (e.g., sma_20 and ema_12)

External Features (if provided)

Macroeconomic (from data ingestion)

Interest rates, inflation, employment
GDP growth, consumer sentiment
Credit spreads, money supply

Market Sentiment

VIX, SKEW, volatility indices

Unit tests are provided in tests/feature_engineering/test_feature_engineering.py.

3. Regime Detection

Advanced market regime detection using multiple methodologies: volatility analysis, clustering, Hidden Markov Models (HMM), and Bayesian changepoint detection.

Regime detectors:

Volatility-Based Detection (Fastest)

Simple but effective method based on rolling volatility quantiles.

How it works:

Calculates rolling volatility across all assets
Divides into regimes based on quantile thresholds
Identifies: Low Vol, Normal, High Vol, Crisis

Vol_t = σ(r_{t-w:t})
Regime_t = Quantile_bin(Vol_t)

Clustering-Based Detection (GPU-Accelerated)

Uses K-Means clustering on market features to identify regimes.

Features used:

Market returns (equal-weighted)
Rolling volatility
Average correlation
Return dispersion (cross-sectional std)

How it works:

Compute rolling market features
Standardize features
Optional: PCA for dimensionality reduction
K-Means clustering (GPU-accelerated with cuML)
Assign regime names based on characteristics

min Σ ||x_i - μ_{c(i)}||^2
subject to: c(i) ∈ {1,...,K}

Hidden Markov Model (HMM)

Statistical model that assumes market states are "hidden" and inferred from observed data.

How it works:

Assumes market evolves through hidden states
Observes returns and volatility
Uses Expectation-Maximization (EM) to learn:
- Transition probabilities between states
- Emission distributions (what each state looks like)
Viterbi algorithm finds most likely state sequence

P(s_t | s_{t-1}) = Transition matrix
P(x_t | s_t) = Emission distribution
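The decoding step can be sketched with a log-space Viterbi implementation in NumPy. This assumes the transition matrix and Gaussian emission parameters are already known (in the workflow above they would come from EM); the two-state parameters and the return series below are made up for illustration.

```python
import numpy as np

def gaussian_loglik(x, means, stds):
    """log P(x_t | s_t = k) for 1-D Gaussian emissions; returns shape (T, K)."""
    x = np.asarray(x, dtype=float)[:, None]
    return -0.5 * np.log(2 * np.pi * stds ** 2) - 0.5 * ((x - means) / stds) ** 2

def viterbi(log_pi, log_A, log_B):
    """Most likely hidden state sequence.

    log_pi: (K,) initial log-probabilities
    log_A:  (K, K) with log_A[i, j] = log P(s_t = j | s_{t-1} = i)
    log_B:  (T, K) emission log-likelihoods
    """
    T, K = log_B.shape
    delta = np.empty((T, K))            # best log-score of any path ending in state k at t
    psi = np.zeros((T, K), dtype=int)   # best predecessor of state k at t
    delta[0] = log_pi + log_B[0]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_A   # scores[i, j]: come from i, go to j
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_B[t]
    # Backtrack from the best final state.
    path = np.empty(T, dtype=int)
    path[-1] = delta[-1].argmax()
    for t in range(T - 2, -1, -1):
        path[t] = psi[t + 1, path[t + 1]]
    return path

# State 0: calm (small moves); state 1: turbulent (large moves). Illustrative values.
means = np.array([0.001, -0.002])
stds = np.array([0.005, 0.03])
A = np.array([[0.95, 0.05], [0.10, 0.90]])
pi = np.array([0.5, 0.5])
obs = [0.001, 0.002, -0.001, 0.05, -0.06, 0.04, 0.000, 0.001]
path = viterbi(np.log(pi), np.log(A), gaussian_loglik(obs, means, stds))
```

Working in log space avoids numerical underflow on long series, where products of many small probabilities would otherwise round to zero.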

Bayesian Changepoint Detection (Most Advanced)

Uses Bayesian inference to detect structural breaks and regime changes.

How it works:

Places priors on changepoint locations
Places priors on regime means and variances
Uses MCMC (Markov Chain Monte Carlo) sampling
Posterior distribution gives uncertainty estimates

τ ~ Uniform(T_min, T_max)
μ_k ~ Normal(0, σ_μ)
σ_k ~ HalfNormal(σ_σ)
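The following is a deliberately simplified sketch of the idea, not the MCMC model described above. It assumes a single changepoint τ with a uniform prior, a known noise level σ shared by both regimes (instead of a HalfNormal prior on σ_k), and the Normal(0, σ_μ) prior on each regime mean. Under those assumptions the means integrate out in closed form, so the posterior over τ can be computed exactly on a grid. All names are hypothetical.

```python
import numpy as np

def segment_log_marginal(y, sigma, sigma_mu):
    """log p(y) for y_i ~ Normal(mu, sigma) with mu ~ Normal(0, sigma_mu) integrated out."""
    n, s1, s2 = len(y), y.sum(), (y ** 2).sum()
    return (-0.5 * n * np.log(2 * np.pi * sigma ** 2)
            - 0.5 * np.log1p(n * sigma_mu ** 2 / sigma ** 2)
            - 0.5 / sigma ** 2
            * (s2 - sigma_mu ** 2 * s1 ** 2 / (sigma ** 2 + n * sigma_mu ** 2)))

def changepoint_posterior(y, sigma, sigma_mu, min_seg=5):
    """P(tau | y) for one changepoint, tau uniform over admissible split points."""
    T = len(y)
    taus = np.arange(min_seg, T - min_seg + 1)
    logp = np.array([
        segment_log_marginal(y[:t], sigma, sigma_mu)
        + segment_log_marginal(y[t:], sigma, sigma_mu)
        for t in taus
    ])
    post = np.exp(logp - logp.max())   # subtract the max for numerical stability
    return taus, post / post.sum()

# Synthetic series: the mean shifts from 0 to 5 at index 60 (unit noise).
rng = np.random.default_rng(7)
y = np.concatenate([rng.normal(0.0, 1.0, 60), rng.normal(5.0, 1.0, 40)])
taus, post = changepoint_posterior(y, sigma=1.0, sigma_mu=10.0)
tau_map = int(taus[post.argmax()])
```

The posterior vector `post` is the uncertainty estimate mentioned above: a sharp peak means the break date is well identified, while a flat posterior means the data cannot pin it down. Unknown variances or several changepoints generally lose this closed form, which is where MCMC sampling comes in.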

Unit tests are provided in tests/regime_detection/test_regime_detection.py.

4. Stochastic Portfolio Optimization

Advanced portfolio optimization using Pyomo for mathematical modeling with GPU-accelerated covariance computation. Implements multiple optimization strategies including regime-aware allocation.

Prerequisites: Install the IPOPT solver first (instructions for macOS):

brew install ipopt

Optimization methods:

Mean-Variance Optimization (Markowitz)

Classic portfolio optimization balancing return and risk.

Objective:

Maximize: E[R] - λ * Var[R]
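With only a full-investment constraint (weights sum to 1) and short positions allowed, the objective above has a closed-form solution from the Lagrangian first-order conditions. The sketch below uses that closed form in NumPy purely as an illustration; it is not the Pyomo model the repo describes, and the inputs are made-up numbers.

```python
import numpy as np

def mean_variance_weights(mu, cov, risk_aversion):
    """Maximize mu'w - lambda * w'Cov w subject to sum(w) = 1 (shorting allowed).

    First-order condition: mu - 2*lambda*Cov w - gamma*1 = 0, where gamma is
    the multiplier on the budget constraint.
    """
    ones = np.ones(len(mu))
    inv_mu = np.linalg.solve(cov, mu)     # Cov^{-1} mu
    inv_1 = np.linalg.solve(cov, ones)    # Cov^{-1} 1
    gamma = (ones @ inv_mu - 2.0 * risk_aversion) / (ones @ inv_1)
    return (inv_mu - gamma * inv_1) / (2.0 * risk_aversion)

# Illustrative annualized inputs for three assets.
mu = np.array([0.08, 0.05, 0.03])
cov = np.array([
    [0.0400, 0.0060, 0.0000],
    [0.0060, 0.0225, 0.0020],
    [0.0000, 0.0020, 0.0100],
])
w_mv = mean_variance_weights(mu, cov, risk_aversion=3.0)
```

Adding long-only bounds or position limits removes the closed form, which is one reason to hand the problem to a solver such as IPOPT.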

Minimum Variance Portfolio

Finds the portfolio with minimum risk, regardless of return.

Objective:

Minimize: Var[R]
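Under the same simplifying assumptions (budget constraint only, shorting allowed), the minimum variance weights are proportional to the inverse covariance matrix applied to a vector of ones. A short NumPy sketch with a hypothetical function name:

```python
import numpy as np

def min_variance_weights(cov):
    """Minimize w'Cov w subject to sum(w) = 1: w is proportional to Cov^{-1} 1."""
    ones = np.ones(cov.shape[0])
    w = np.linalg.solve(cov, ones)
    return w / w.sum()

# Two uncorrelated assets with volatilities of 20% and 10%.
cov = np.diag([0.04, 0.01])
w_minvar = min_variance_weights(cov)
var_minvar = w_minvar @ cov @ w_minvar
var_equal = np.array([0.5, 0.5]) @ cov @ np.array([0.5, 0.5])
```

For uncorrelated assets the weights are proportional to inverse variances, so the lower-volatility asset receives the larger allocation.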

Maximum Sharpe Ratio

Finds the portfolio with the best risk-adjusted return.

Objective:

Maximize: (E[R] - rf) / σ[R]

Implementation Note: Solved via quadratic reformulation:

Minimize: w'Σw  subject to  (μ - rf)'w = 1

Then normalize: w_final = w / sum(w)
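The reformulation above can be sketched in NumPy. Without further constraints, minimizing w'Σw subject to (μ - rf)'w = 1 gives w proportional to Σ⁻¹(μ - rf), and the final normalization rescales to full investment. The function names and inputs are illustrative, and the normalization step assumes the unnormalized weights sum to a positive number.

```python
import numpy as np

def max_sharpe_weights(mu, cov, rf=0.0):
    """Tangency portfolio via the quadratic reformulation (no bounds on w)."""
    excess = mu - rf
    z = np.linalg.solve(cov, excess)   # direction Cov^{-1} (mu - rf)
    w = z / (excess @ z)               # scaled so that (mu - rf)'w = 1
    return w / w.sum()                 # normalize to sum to 1 (assumes sum > 0)

def sharpe_ratio(w, mu, cov, rf=0.0):
    return (w @ mu - rf * w.sum()) / np.sqrt(w @ cov @ w)

mu = np.array([0.08, 0.05, 0.03])
cov = np.array([
    [0.0400, 0.0060, 0.0000],
    [0.0060, 0.0225, 0.0020],
    [0.0000, 0.0020, 0.0100],
])
w_tan = max_sharpe_weights(mu, cov, rf=0.02)
sr_tan = sharpe_ratio(w_tan, mu, cov, rf=0.02)
```

The reformulation works because the Sharpe ratio does not change when all weights are scaled by a positive constant, so the scale can be fixed with a linear constraint, turning a ratio objective into a quadratic program.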

Risk Parity

Equalizes risk contribution from each asset.

Concept:

Risk Contribution_i = w_i * (Σw)_i / σ_p

Goal: RC_1 = RC_2 = ... = RC_n

Implementation: Simplified inverse volatility weighting

w_i ∝ 1/σ_i
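A short NumPy sketch of the inverse volatility weights alongside the risk contribution formula above, so the two can be compared. Function names are hypothetical.

```python
import numpy as np

def inverse_volatility_weights(cov):
    """w_i proportional to 1 / sigma_i, normalized to sum to 1."""
    inv_vol = 1.0 / np.sqrt(np.diag(cov))
    return inv_vol / inv_vol.sum()

def risk_contributions(w, cov):
    """RC_i = w_i * (Cov w)_i / sigma_p; the contributions sum to sigma_p."""
    sigma_p = np.sqrt(w @ cov @ w)
    return w * (cov @ w) / sigma_p

# Three uncorrelated assets with volatilities of 20%, 10%, and 5%.
cov = np.diag([0.04, 0.01, 0.0025])
w_rp = inverse_volatility_weights(cov)
rc = risk_contributions(w_rp, cov)
sigma_p = np.sqrt(w_rp @ cov @ w_rp)
```

Inverse volatility weighting equalizes risk contributions exactly when all pairwise correlations are equal (for example, all zero, as here). With unequal correlations it is only an approximation, and exact risk parity requires solving for the weights numerically.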

Regime-Aware Optimization

Optimizes considering multiple market regimes and their probabilities.

Objective:

Maximize: Σ_r P(regime=r) * [E[R|r] - λ*Var[R|r]] - TC*|Δw|
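To show how the terms of this objective fit together, the sketch below evaluates it for a candidate allocation. It assumes TC*|Δw| means a proportional cost on the L1 norm of the weight change from the previous allocation; the function name, regime parameters, and cost level are all illustrative.

```python
import numpy as np

def regime_aware_objective(w, w_prev, regime_probs, regime_mu, regime_cov,
                           risk_aversion, tc):
    """Probability-weighted mean-variance utility minus turnover cost."""
    utility = sum(
        p * (mu @ w - risk_aversion * (w @ cov @ w))
        for p, mu, cov in zip(regime_probs, regime_mu, regime_cov)
    )
    turnover = np.abs(w - w_prev).sum()   # L1 norm of the weight change (assumed)
    return utility - tc * turnover

# Two regimes for two assets (equity-like, bond-like); illustrative numbers.
probs = [0.7, 0.3]
mus = [np.array([0.10, 0.03]), np.array([-0.15, 0.04])]
covs = [np.diag([0.02, 0.002]), np.diag([0.09, 0.004])]
w_prev = np.array([0.6, 0.4])
w_new = np.array([0.4, 0.6])

stay = regime_aware_objective(w_prev, w_prev, probs, mus, covs, risk_aversion=3.0, tc=0.001)
move = regime_aware_objective(w_new, w_prev, probs, mus, covs, risk_aversion=3.0, tc=0.001)
```

The absolute value makes the objective non-smooth at Δw = 0. A common modeling device for solvers is to split the trade into nonnegative buy and sell variables, which keeps the cost term linear.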

Unit tests are provided in tests/optimization/test_portfolio_optimization.py.

Future Work

Multi-Regime Forecasting: Separate ML models (with cuML support) per market state
Stochastic Optimization: Use ML forecasting to generate scenarios (from bearish to bullish) for optimization
Visualization: Dashboards and graphical user interfaces (Plotly, Streamlit)

Installation

Clone the repository

git clone git@github.com:bacalfa/gpu-port-opt.git
cd gpu-port-opt

Create a virtual environment or a conda environment (example using uv)

uv venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

Install dependencies (example using uv)

uv sync

Create a file named .env in the top-level folder containing the following text (replace the API key with your own)

# FRED API Key from https://fred.stlouisfed.org/docs/api/api_key.html
FRED_API_KEY=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
