This is a write-up of how I tackled the BirdCLEF2023 challenge on Kaggle, including methodology, failed attempts, and learnings.
All of the code files in this directory were written and run on Kaggle and are not meant to be run locally.
There are three folders in this directory:
Identify_Duplicates: Describes how I identified duplicate audio samples so that they do not leak between the train and validation splits.
Speed_up_Audio_Processing: Explains why and how to speed up audio processing on the CPU. Kaggle provides a free P100 GPU but only a 2-core CPU, which creates a severe CPU bottleneck in preprocessing and data augmentation of the audio data.
Experiments_and_Learnings: Describes the main methods I tried, the reasoning behind them, the results, and notes on addressing some of the issues encountered.
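
To illustrate why duplicates matter for the train/validation split, here is a minimal sketch. It is not the code in Identify_Duplicates. It assumes duplicates are byte-identical files, which misses re-encoded or trimmed copies, and the names `group_by_content`, `split_by_group`, and the sample clip ids are made up for this example. The idea is to group clips by a content hash and then assign each whole group to one fold, so that a copy of a training clip cannot end up in validation.

```python
import hashlib
from collections import defaultdict


def group_by_content(clips):
    """Map a content hash to the list of clip ids whose bytes are identical."""
    groups = defaultdict(list)
    for clip_id, audio_bytes in clips.items():
        digest = hashlib.sha1(audio_bytes).hexdigest()
        groups[digest].append(clip_id)
    return groups


def split_by_group(groups, n_folds=5):
    """Assign every clip in a duplicate group to the same fold."""
    fold_of = {}
    for i, digest in enumerate(sorted(groups)):
        for clip_id in groups[digest]:
            fold_of[clip_id] = i % n_folds
    return fold_of


# Hypothetical toy data: a.ogg and b.ogg contain the same bytes.
clips = {
    "a.ogg": b"\x00\x01\x02",
    "b.ogg": b"\x00\x01\x02",
    "c.ogg": b"\x09\x08",
}
groups = group_by_content(clips)
fold_of = split_by_group(groups, n_folds=2)
```

In this toy run, a.ogg and b.ogg always share a fold because they belong to the same duplicate group.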