Hi, we are working on fine-tuning an embedding model for the Kazakh language for a RAG system.
Could you share your source code with us, even if it is just raw, unorganized files? It would help us a lot, since we don't really know where to start. The most important parts for us are the data generation, the training parameters, and how exactly to run the training.
Thank you in advance 🙏