Hempstead/datacenter-cooling-sim

 
 


Data Center Cooling Simulation Framework

A comprehensive open-source framework for simulating and optimizing data center cooling systems, combining AI workload simulation with thermal modeling and control strategies.

Overview

This repository provides a complete simulation environment for data center cooling optimization, featuring:

  • AI Datacenter Simulation Engine: Large-scale AI training workload simulation with GPU-aware thermal analysis
  • AlphaDataCenterCooling: Virtual testbed for evaluating data center cooling control strategies
  • Telemetry & Monitoring: Prometheus and Grafana integration for real-time metrics
  • Dashboard: Web-based visualization and control interface

Repository Structure

.
├── ai-datacenter-sim/          # AI workload simulation and telemetry stack
│   ├── SimAI/                  # SimAI large-scale training simulator
│   ├── simulation/             # Facility simulation (RDHx + CHW plant dynamics)
│   ├── adapters/               # Integration adapters for cooling systems
│   │   ├── alpha_adapter/      # AlphaDataCenterCooling adapter
│   │   └── simai_adapter/      # SimAI workload adapter
│   ├── monitoring/             # Prometheus and Grafana configuration
│   ├── dashboard/              # Frontend dashboard and API
│   └── telemetry/              # Telemetry ingestion utilities
│
├── AlphaDataCenterCooling/      # Virtual testbed for cooling system optimization
│   ├── AlphaDataCenterCooling_Gym/  # Gymnasium environment interface
│   ├── Resources/              # Model files and initialization data
│   └── docs/                   # Documentation and figures
│
└── [Documentation files]        # Project documentation and guides

Quick Start

Prerequisites

  • Docker and Docker Compose
  • Python 3.8+ (for local development)
  • Git

Running the Simulation Stack

  1. Clone the repository:

    git clone <repository-url>
    cd datacenter-cooling-sim
  2. Configure environment variables:

    # Copy .env.example to ai-datacenter-sim directory
    cp .env.example ai-datacenter-sim/.env
    # Edit ai-datacenter-sim/.env and change Grafana admin credentials for production
  3. Start the AI Datacenter Simulation:

    cd ai-datacenter-sim
    docker-compose up -d

    Note: Docker Compose automatically reads the .env file in the same directory. The Grafana credentials (GRAFANA_ADMIN_USER and GRAFANA_ADMIN_PASSWORD) are loaded from this file.

    This starts:

    • Prometheus (metrics collection) on port 9090
    • Grafana (visualization) on port 3000
    • AlphaDataCenterCooling service on port 5001
    • Alpha adapter on port 8085
    • SimAI adapter
    • Dashboard API on port 8001
    • Dashboard frontend on port 5174
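  4. Access the services on the ports listed in step 3, for example Grafana on port 3000 and the dashboard frontend on port 5174.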
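Once the stack is up, the port list from step 3 can be checked with a short script. The following is a minimal sketch, not part of the repository: it assumes a local deployment reachable at `localhost` over plain HTTP, and the helper names (`SERVICE_PORTS`, `service_url`, `is_port_open`, `report`) are hypothetical. Only the port numbers come from this README.

```python
import socket

# Ports as listed in step 3 of this README. The SimAI adapter is omitted
# because no port is documented for it.
SERVICE_PORTS = {
    "Prometheus": 9090,
    "Grafana": 3000,
    "AlphaDataCenterCooling service": 5001,
    "Alpha adapter": 8085,
    "Dashboard API": 8001,
    "Dashboard frontend": 5174,
}


def service_url(name, host="localhost"):
    """Build a base URL for a service (assumes plain HTTP on the given host)."""
    return f"http://{host}:{SERVICE_PORTS[name]}"


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def report(host="localhost"):
    """Return one status line per service; performs one TCP connect per port."""
    lines = []
    for name, port in SERVICE_PORTS.items():
        state = "up" if is_port_open(host, port) else "unreachable"
        lines.append(f"{name:<32} {service_url(name, host)}  {state}")
    return lines
```

Calling `print("\n".join(report()))` after `docker-compose up -d` lists each service with its URL and whether the port accepts connections. An open port only shows that something is listening, not that the service is healthy.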

Running AlphaDataCenterCooling Standalone

cd AlphaDataCenterCooling
docker-compose up

See AlphaDataCenterCooling/README.md for detailed usage instructions.

Documentation

Key Features

AI Datacenter Simulation

  • SimAI integration for large-scale AI training workload simulation
  • GPU-aware thermal modeling
  • Network topology simulation
  • Workload trace analysis

Cooling System Simulation

  • AlphaDataCenterCooling virtual testbed
  • Gymnasium-compatible environment for RL/control algorithms
  • REST API for external integration
  • Real-time disturbance updates
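To illustrate what "Gymnasium-compatible" means for a control algorithm, here is a minimal sketch of the interaction loop. `ToyCoolingEnv` is a hypothetical stand-in written for this example, with made-up one-zone dynamics. It is not the AlphaDataCenterCooling_Gym environment, whose observation and action spaces are described in AlphaDataCenterCooling/README.md. Only the `reset`/`step` call signatures follow the Gymnasium convention.

```python
import random


class ToyCoolingEnv:
    """Hypothetical one-zone model exposing Gymnasium-style reset/step."""

    def __init__(self, setpoint_c=24.0, max_steps=50):
        self.setpoint_c = setpoint_c
        self.max_steps = max_steps
        self.rng = random.Random(0)
        self.temp_c = setpoint_c
        self.steps = 0

    def reset(self, seed=None):
        # Gymnasium convention: reset returns (observation, info).
        self.rng = random.Random(seed)
        self.temp_c = self.setpoint_c + 4.0  # start the zone 4 degrees too warm
        self.steps = 0
        return self.temp_c, {}

    def step(self, action):
        # The action is a cooling effort, clipped to [0, 1].
        effort = min(1.0, max(0.0, float(action)))
        heat = 0.5 + self.rng.uniform(-0.1, 0.1)  # IT load disturbance
        self.temp_c += heat - effort
        self.steps += 1
        # Penalize deviation from the setpoint and, lightly, cooling effort.
        reward = -abs(self.temp_c - self.setpoint_c) - 0.1 * effort
        terminated = False
        truncated = self.steps >= self.max_steps
        # Gymnasium convention: (obs, reward, terminated, truncated, info).
        return self.temp_c, reward, terminated, truncated, {"heat": heat}


def proportional_controller(temp_c, setpoint_c=24.0, gain=0.5):
    """Map temperature error to a cooling effort in [0, 1]."""
    return min(1.0, max(0.0, gain * (temp_c - setpoint_c) + 0.5))


def run_episode(env, policy):
    """Run one episode and return (total reward, final observation)."""
    obs, _info = env.reset(seed=0)
    total, done = 0.0, False
    while not done:
        obs, reward, terminated, truncated, _info = env.step(policy(obs))
        total += reward
        done = terminated or truncated
    return total, obs
```

Because `run_episode` relies only on the `reset`/`step` contract, the same loop should work with any environment that follows it, and an RL agent can be dropped in for `proportional_controller`.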

Monitoring & Visualization

  • Prometheus metrics collection
  • Grafana dashboards
  • Custom web dashboard
  • Real-time telemetry ingestion
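Metrics collected by Prometheus can also be read programmatically through its HTTP API. The sketch below issues an instant query against `/api/v1/query` and flattens the result. It assumes Prometheus is reachable at `localhost:9090` (the port from Quick Start) and uses `up`, a metric Prometheus generates for every scrape target, because this README does not list the framework's own metric names. The function names are illustrative.

```python
import json
import urllib.parse
import urllib.request

PROMETHEUS_URL = "http://localhost:9090"  # port from Quick Start; host is assumed


def build_query_url(promql, base_url=PROMETHEUS_URL):
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base_url}/api/v1/query?" + urllib.parse.urlencode({"query": promql})


def parse_instant_vector(payload):
    """Convert an instant-vector response into a list of (labels, value) pairs."""
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    samples = []
    for series in payload["data"]["result"]:
        _timestamp, value = series["value"]  # the value arrives as a string
        samples.append((series["metric"], float(value)))
    return samples


def query(promql, base_url=PROMETHEUS_URL, timeout=5.0):
    """Run an instant query; requires a running Prometheus instance."""
    url = build_query_url(promql, base_url)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_instant_vector(json.load(response))
```

For example, `query("up")` returns one entry per scrape target, with value `1.0` for targets that Prometheus scraped successfully.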

Architecture

┌─────────────────┐     ┌──────────────────┐     ┌─────────────────┐
│   SimAI         │────▶│  SimAI Adapter   │────▶│  Pushgateway    │
│   Workloads     │     │                  │     │                 │
└─────────────────┘     └──────────────────┘     └────────┬────────┘
                                                          │
┌─────────────────┐     ┌──────────────────┐              │
│ AlphaDataCenter │────▶│  Alpha Adapter   │──────────────┼─────┐
│ Cooling         │     │                  │              │     │
└─────────────────┘     └──────────────────┘              │     │
                                                          ▼     ▼
                                                 ┌─────────────────┐
                                                 │   Prometheus    │
                                                 │   (Metrics DB)  │
                                                 └────────┬────────┘
                                                          │
                                   ┌──────────────────────┼──────────────────────┐
                                   ▼                      ▼                      ▼
                           ┌──────────────┐       ┌──────────────┐       ┌──────────────┐
                           │   Grafana    │       │ Dashboard API│       │  Dashboard   │
                           │              │       │              │       │  Frontend    │
                           └──────────────┘       └──────────────┘       └──────────────┘
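In the diagram, the SimAI adapter reaches Prometheus through a Pushgateway rather than being scraped directly. As a rough sketch of that hop, the code below formats one gauge in the Prometheus text exposition format and prepares a push request. Every specific here is an assumption for illustration: the metric name `rack_inlet_temp_celsius`, the job name `simai_adapter`, and the address `localhost:9091` (the Pushgateway's default port, which this README does not list). The adapter's actual implementation may differ.

```python
import urllib.parse
import urllib.request

PUSHGATEWAY_URL = "http://localhost:9091"  # assumed; not documented in this README


def _escape_label(value):
    """Escape backslash, double quote, and newline in a label value."""
    return str(value).replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def format_gauge(name, value, labels=None):
    """Render one gauge sample in the Prometheus text exposition format."""
    label_text = ""
    if labels:
        pairs = ",".join(f'{k}="{_escape_label(v)}"' for k, v in sorted(labels.items()))
        label_text = "{" + pairs + "}"
    return f"# TYPE {name} gauge\n{name}{label_text} {float(value)}\n"


def build_push_request(body, job, base_url=PUSHGATEWAY_URL):
    """Prepare a PUT that replaces all metrics in the job's grouping key."""
    url = f"{base_url}/metrics/job/{urllib.parse.quote(job, safe='')}"
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "text/plain; version=0.0.4"},
    )


def push(body, job, base_url=PUSHGATEWAY_URL, timeout=5.0):
    """Send the metrics; requires a running Pushgateway."""
    request = build_push_request(body, job, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

A call such as `push(format_gauge("rack_inlet_temp_celsius", 21.5, {"rack": "r1"}), job="simai_adapter")` would make the sample available for Prometheus to scrape from the Pushgateway. PUT replaces the whole group for that job, while POST would replace only metrics with the same names.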

Development

Building Components

Each component can be built independently:

# Build Alpha adapter
cd ai-datacenter-sim/adapters/alpha_adapter
docker build -t alpha-adapter .

# Build SimAI adapter
cd ai-datacenter-sim/adapters/simai_adapter
docker build -t simai-adapter .

# Build dashboard
cd ai-datacenter-sim/dashboard/frontend
npm install
npm run build

Running Tests

See individual component READMEs for testing instructions.

Contributing

We welcome contributions! Please see CONTRIBUTING.md for guidelines.

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

Third-party components have their own licenses:

  • SimAI: Apache License 2.0
  • Astra-Sim (Alibaba Cloud fork): MIT License
  • AlphaDataCenterCooling: See AlphaDataCenterCooling/README.md for citation information

Citation

If you use this framework in your research, please cite:

@misc{datacenter-cooling-sim2025,
  title={Data Center Cooling Simulation Framework},
  author={Kardashev Labs},
  year={2025},
  url={https://github.com/kardashev-lab/datacenter-cooling-sim}
}

Third-Party Components

This framework builds upon the following open-source projects and research:

SimAI - Large-scale AI training simulation framework:

AlphaDataCenterCooling - Virtual testbed for data center cooling optimization:

@article{wu2025alphadatacentercooling,
  title={AlphaDataCenterCooling: A virtual testbed for evaluating operational strategies in data center cooling plants},
  author={Wu, S. and Zheng, W. and Wang, Z. and Chen, G. and Yang, P. and Yue, S. and Li, D. and Wu, Y.},
  journal={Applied Energy},
  volume={380},
  pages={125100},
  year={2025}
}

Support

For questions and issues:

  • Open an issue on GitHub
  • Check the documentation in each component's README
  • Review the telemetry documentation for setup help

Acknowledgments
