ScalaGBM

Overview

ScalaGBM is an efficient GPU-based GBDT (gradient boosting decision tree) system that handles high-dimensional, large-scale datasets and trains quickly.

Prerequisites

cmake 2.8 or above
gcc 11.x for Linux
CUDA 11.7

Introduction

Download

git clone https://github.com/BoruiXu/ScalaGBM.git

Build on Linux. Before building, set the GPU architecture (the -arch flag on line 28 of CMakeLists.txt) to match your GPU. For example, for an Nvidia RTX A6000, use -arch=compute_86.

cd ScalaGBM
mkdir build
cd build
cmake ..
make -j
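
If you are unsure which -arch value matches your GPU, the sketch below shows one way to derive it before running cmake. It assumes nvidia-smi is installed and that your driver supports the compute_cap query field; the helper name cap_to_arch is hypothetical and is not part of ScalaGBM.

```shell
# Hypothetical helper: turn a compute capability such as "8.6"
# into the matching -arch value "compute_86".
cap_to_arch() {
    printf 'compute_%s\n' "$(printf '%s' "$1" | tr -d '.')"
}

# Query the first GPU when nvidia-smi is available; otherwise print a hint.
# Assumption: the driver supports the compute_cap query field.
if command -v nvidia-smi >/dev/null 2>&1; then
    cap="$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader | head -n 1)"
    echo "-arch=$(cap_to_arch "$cap")"
else
    echo "nvidia-smi not found; look up the compute capability of your GPU manually"
fi
```

The printed value still has to be written into CMakeLists.txt by hand, as described above.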

Usage example

./bin/scalagbm-train data=dataset/datasetname objective=binary:logistic tree_method=hist n_trees=40 depth=6
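
To compare several tree depths, the command above can be generated in a loop. This is a minimal sketch that uses only the flags shown in the usage example; the helper name train_cmd and the depth values 4 and 8 are illustrative assumptions, not recommended settings.

```shell
# Hypothetical helper: print the training command for a dataset path,
# a number of trees, and a tree depth. Only flags from the usage
# example are used.
train_cmd() {
    printf './bin/scalagbm-train data=%s objective=binary:logistic tree_method=hist n_trees=%s depth=%s\n' "$1" "$2" "$3"
}

# Print (rather than run) one command per depth.
for depth in 4 6 8; do
    train_cmd dataset/datasetname 40 "$depth"
done
```

The loop only prints the commands, so you can inspect them first and then pipe the output to sh to run them.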

Datasets

All test datasets can be downloaded with the script in the dataset folder.

sh ./dataset/get_datasets.sh

Parameter and Test

The parameters have the same meaning as in ThunderGBM. At present, only the histogram-based training method is supported. We provide a bash script (train_test.sh) to train on the datasets mentioned in our paper. Before running this script, copy it into the build folder. For example, to test the real-sim dataset, run:

sh train_test.sh real-sim

NOTE: All datasets must be stored in the dataset folder.
