CARTE: Pretraining and Transfer for Tabular Learning

This repository contains the implementation of the paper CARTE: Pretraining and Transfer for Tabular Learning.

CARTE is a pretrained model for tabular data. It treats each table row as a star graph and trains a graph transformer on top of this representation.
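To make the star-graph idea concrete, here is a minimal sketch in plain Python. It is not the library's implementation: the `StarGraph` class and `row_to_star_graph` function are hypothetical names, and the choices to label edges with column names and to skip missing cells are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class StarGraph:
    """One table row: a central node linked to one leaf node per non-missing cell."""

    center: str = "row"
    leaves: list = field(default_factory=list)  # cell values, one per leaf node
    edges: list = field(default_factory=list)   # (center, column name, leaf index)


def row_to_star_graph(row: dict) -> StarGraph:
    """Build a star graph from a row given as {column name: cell value}."""
    graph = StarGraph()
    for column, value in row.items():
        if value is None:  # assumption of this sketch: missing cells add no leaf
            continue
        graph.leaves.append(value)
        # Every edge starts at the center, which is what makes the graph a star.
        graph.edges.append((graph.center, column, len(graph.leaves) - 1))
    return graph


# Made-up row, used only to show the shape of the result.
example_row = {"name": "Chablis", "country": "France", "price": 23.5, "vintage": None}
graph = row_to_star_graph(example_row)
print(graph.leaves)
print([column for _, column, _ in graph.edges])
```

In the package itself, this conversion is handled by the Table2GraphTransformer shown in step 2.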

Colab Examples (give it a try):

CARTERegressor on Wine Poland dataset
CARTEClassifier on Spotify dataset

01 Install 🚀

The library has been tested on Linux, macOS and Windows.

CARTE-AI can be installed from PyPI:

pip install carte-ai

Post installation check

After a correct installation, you should be able to import the module without errors:

import carte_ai
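If you also want to know which version was installed, one option is to query the package metadata from the standard library. This is a sketch, not part of the package: `installed_version` is a hypothetical helper, and the distribution name `carte-ai` is taken from the pip command above.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("carte-ai")
print("carte-ai version:", version if version else "not installed")
```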

02 CARTE-AI example on sampled data step by step ➡️

1️⃣ Load the Data 💽

import pandas as pd
from carte_ai.data.load_data import *

num_train = 128  # Example: set the number of training groups/entities
random_state = 1  # Set a random seed for reproducibility
X_train, X_test, y_train, y_test = wina_pl(num_train, random_state)
print("Wina Poland dataset:", X_train.shape, X_test.shape)

2️⃣ Convert Table 2 Graph 🪵

The basic steps are:

Preprocess the raw data
Load the prepared data and configs; set the train/test split
Generate a graph for each table entry (row) using the Table2GraphTransformer
Create an estimator and run inference

import fasttext
from huggingface_hub import hf_hub_download
from carte_ai import Table2GraphTransformer

model_path = hf_hub_download(repo_id="hi-paris/fastText", filename="cc.en.300.bin")

preprocessor = Table2GraphTransformer(fasttext_model_path=model_path)

# Fit and transform the training data
X_train = preprocessor.fit_transform(X_train, y=y_train)

# Transform the test data
X_test = preprocessor.transform(X_test)
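The snippet above calls fit_transform on the training data and only transform on the test data. The toy class below sketches that contract in plain Python. `ToyColumnEncoder` is a hypothetical stand-in that makes no claim about how Table2GraphTransformer works internally: whatever is learned during fit comes from the training rows and is reused unchanged on the test rows.

```python
class ToyColumnEncoder:
    """Hypothetical stand-in that only illustrates the fit/transform contract."""

    def fit(self, rows, y=None):
        # Learn state from the training rows only: here, the set of known columns.
        self.columns_ = sorted({column for row in rows for column in row})
        return self

    def transform(self, rows):
        # Reuse the learned state: unseen columns are dropped, absent ones become None.
        return [[row.get(column) for column in self.columns_] for row in rows]

    def fit_transform(self, rows, y=None):
        return self.fit(rows, y).transform(rows)


# Made-up rows, used only to show the behavior.
train_rows = [{"country": "France", "price": 23.5}, {"country": "Italy", "price": 12.0}]
test_rows = [{"country": "Spain", "rating": 4.1}]

encoder = ToyColumnEncoder()
train_encoded = encoder.fit_transform(train_rows)
test_encoded = encoder.transform(test_rows)  # no refit on the test rows
print(encoder.columns_)
print(test_encoded)
```

Calling fit again on the test rows would change the learned state, so the two splits would no longer be encoded consistently.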

3️⃣ Make Predictions🔮

For learning, CARTE currently follows the scikit-learn interface (fit/predict). The process is:

Define parameters
Set the estimator
Run 'fit' to train the model and 'predict' to make predictions

from sklearn.metrics import r2_score
from carte_ai import CARTERegressor, CARTEClassifier

# Define some parameters
# NOTE: config_directory is not defined in this snippet; it is assumed to be in
# scope from the star import in step 1 (from carte_ai.data.load_data import *).
fixed_params = dict()
fixed_params["num_model"] = 10  # 10 models for the bagging strategy
fixed_params["disable_pbar"] = False  # Set to True to hide the progress bars
fixed_params["random_state"] = 0
fixed_params["device"] = "cpu"
fixed_params["n_jobs"] = 10
fixed_params["pretrained_model_path"] = config_directory["pretrained_model"]

# Define the estimator and run fit/predict
estimator = CARTERegressor(**fixed_params)  # Use CARTEClassifier for classification
estimator.fit(X=X_train, y=y_train)
y_pred = estimator.predict(X_test)

# Obtain the R2 score on the predictions
score = r2_score(y_test, y_pred)
print(f"\nThe R2 score for CARTE: {score:.4f}")
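For readers unfamiliar with the metric, the R2 score reported above is the coefficient of determination, 1 - SS_res / SS_tot. The sketch below computes it by hand on made-up numbers. The function `r2` is a hypothetical helper for illustration and, unlike scikit-learn's r2_score, it does not handle the edge case of constant targets.

```python
def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean target.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up targets and predictions, not results from CARTE.
y_true_demo = [3.0, 5.0, 7.0, 9.0]
y_pred_demo = [2.5, 5.0, 7.5, 9.0]
print(f"{r2(y_true_demo, y_pred_demo):.4f}")
```

A score of 1 means perfect predictions, and a score of 0 means the model does no better than predicting the mean of the targets.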

03 Reproducing paper results ⚙️

➡️ See the installation instructions for setting up the paper experiments.

04 Contribute to the package 🚀

➡️ Read the contribution guidelines.

05 CARTE-AI references 📚

@article{kim2024carte,
  title={CARTE: pretraining and transfer for tabular learning},
  author={Kim, Myung Jun and Grinsztajn, L{\'e}o and Varoquaux, Ga{\"e}l},
  journal={arXiv preprint arXiv:2402.16785},
  year={2024}
}
