Audio-JEPA


📌  Introduction

This repository is based on the excellent Lightning-Hydra-Template, a template for PyTorch Lightning projects with Hydra configuration management.

📌  Running the code

This project uses uv as its package and project manager. Please refer to the uv documentation for installation details.

After cloning the repository, navigate to the project directory (Audio-JEPA) and run the following command:

```shell
uv sync
```

This will install all the dependencies.

To run the pre-training code, you can use the following command:

```shell
uv run src/train.py
```

This will run the training script with the default configuration (on CPU).

If you want to change the configuration, you can either edit the configuration files directly in the config folder or use command-line overrides. For more information, check the Hydra documentation.
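For the config-file route, an override might look like the following sketch. This is illustrative only: the actual file layout, group names, and keys depend on how this repository organizes its Hydra configs (the structure below follows the Lightning-Hydra-Template conventions):

```yaml
# config/train.yaml (hypothetical layout, following Lightning-Hydra-Template)
defaults:
  - trainer: gpu        # select the GPU trainer config instead of the default
  - logger: wandb       # log metrics to Weights & Biases

data:
  batch_size: 64        # override the default batch size

trainer:
  max_steps: 100000
```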

For example, to run the training script on a GPU, with logging to WandB, a specific batch size, and a few additional options, you could use the following command:

```shell
uv run src/train.py logger=wandb trainer=gpu \
    trainer.max_steps=100000 \
    data.batch_size=64 \
    callbacks.model_checkpoint.every_n_train_steps=20000 \
    callbacks.model_checkpoint.save_top_k=-1
```

By default, training checkpoints are saved in the logs/train/runs/<date> folder. You can change this by modifying the logger configuration in the config folder.

To test the model, please refer to our fork of X-ARES where we added support for Audio-JEPA.

About

Audio-JEPA is an adaptation of the Joint-Embedding Predictive Architecture (JEPA) for self-supervised audio representation learning. Built upon the I-JEPA paradigm, it uses a Vision Transformer (ViT) backbone to predict latent representations of masked spectrogram patches.
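To make the idea concrete, here is a minimal, self-contained sketch of a JEPA-style training step on spectrogram patches. Everything here is illustrative: the tiny MLP encoder stands in for the actual ViT backbone, and all names, module sizes, and hyperparameters (e.g. the EMA rate `tau`) are hypothetical, not Audio-JEPA's actual implementation:

```python
# Illustrative JEPA-style step: a context encoder sees only the visible
# patches, a frozen (EMA) target encoder embeds all patches, and a predictor
# is trained to regress the target embeddings of the masked patches.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyEncoder(nn.Module):
    """Stand-in for the ViT backbone: embeds flattened spectrogram patches."""
    def __init__(self, patch_dim, embed_dim):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(patch_dim, embed_dim), nn.GELU(),
            nn.Linear(embed_dim, embed_dim),
        )

    def forward(self, patches):            # (B, N, patch_dim) -> (B, N, E)
        return self.proj(patches)

def jepa_step(spec, mask, context_enc, target_enc, predictor):
    """One step: predict target embeddings of masked patches from context."""
    with torch.no_grad():                  # targets carry no gradient
        targets = target_enc(spec)         # (B, N, E)
    # zero out masked patches so the context encoder only sees visible ones
    ctx = context_enc(spec * (~mask).unsqueeze(-1))
    preds = predictor(ctx)                 # (B, N, E)
    # compute the loss only at masked positions
    return F.smooth_l1_loss(preds[mask], targets[mask])

@torch.no_grad()
def ema_update(target_enc, context_enc, tau=0.996):
    """Target encoder tracks the context encoder via an exponential moving average."""
    for pt, pc in zip(target_enc.parameters(), context_enc.parameters()):
        pt.mul_(tau).add_(pc, alpha=1 - tau)

# Usage: one training step on a random batch of "spectrogram" patches.
patch_dim, embed_dim = 32, 16
ctx_enc = TinyEncoder(patch_dim, embed_dim)
tgt_enc = TinyEncoder(patch_dim, embed_dim)
tgt_enc.load_state_dict(ctx_enc.state_dict())  # start targets = context
pred = nn.Linear(embed_dim, embed_dim)

spec = torch.randn(2, 10, patch_dim)           # (batch, patches, patch_dim)
mask = torch.zeros(2, 10, dtype=torch.bool)
mask[:, 3:7] = True                            # mask 4 patches per example

loss = jepa_step(spec, mask, ctx_enc, tgt_enc, pred)
loss.backward()                                # grads flow to context/predictor only
ema_update(tgt_enc, ctx_enc)
```

Because the targets are produced under `torch.no_grad()` and the target encoder is updated only by EMA, gradients reach just the context encoder and predictor, which is what prevents the trivial collapse a naive masked-embedding loss would invite.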
