-
Couldn't load subscription status.
- Fork 3.2k
Adding samples for files and fine tuning job #43443
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds sample code demonstrating how to use fine-tuning jobs with the Azure AI Projects SDK. The samples cover three fine-tuning methods: supervised learning, reinforcement learning, and Direct Preference Optimization (DPO). Both synchronous and asynchronous implementations are provided for each method.
Key Changes:
- Added six new Python sample files demonstrating fine-tuning job operations
- Included sample training and validation datasets in JSONL format
- Updated CHANGELOG to document the new samples
Reviewed Changes
Copilot reviewed 12 out of 13 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
sample_finetuning_supervised_job_async.py |
Async implementation for supervised fine-tuning with job creation, retrieval, listing, and cancellation |
sample_finetuning_supervised_job.py |
Synchronous implementation for supervised fine-tuning operations |
sample_finetuning_reinforcement_job_async.py |
Async implementation for reinforcement fine-tuning with pause/resume and checkpoint listing |
sample_finetuning_reinforcement_job.py |
Synchronous implementation for reinforcement fine-tuning operations |
sample_finetuning_dpo_job_async.py |
Async implementation for DPO fine-tuning job creation and retrieval |
sample_finetuning_dpo_job.py |
Synchronous implementation for DPO fine-tuning operations |
validation_set.jsonl |
Sample validation data with conversational prompts and responses |
training_set.jsonl |
Sample training data with conversational prompts and responses |
dpo_validation_set.jsonl |
DPO-specific validation data with preferred and non-preferred outputs |
dpo_training_set.jsonl |
DPO-specific training data with preferred and non-preferred outputs |
countdown_valid_50.jsonl |
Validation dataset for arithmetic problem-solving tasks |
CHANGELOG.md |
Documents the addition of fine-tuning samples |
Description
Please add an informative description that covers that changes made by the pull request and link all relevant issues.
If an SDK is being regenerated based on a new API spec, a link to the pull request containing these API spec changes should be included above.
All SDK Contribution checklist:
General Guidelines and Best Practices
Testing Guidelines