-
Notifications
You must be signed in to change notification settings - Fork 1.5k
Fix up for the GitDiffService #2116
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Fix up for the GitDiffService #2116
Conversation
|
|
||
| beforeEach(() => { | ||
| // Create mock workspace.fs.readFile if it doesn't exist | ||
| if (!vscode.workspace?.fs?.readFile) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The product code here uses workspace.fs directly and not the file service, so I added this mock here so that my changes for this test were just in this test file as opposed to trying to mess with the top level vscode test shims.
|
|
||
| // Add original/modified file paths | ||
| // 100644 is standard file mode for new git files. Saves us from trying to check file permissions and handling | ||
| // UNIX vs Windows permission differences. Skipping calculating the SHA1 hashes as well since they are not strictly necessary |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I put a comment here on this. Did a bit of research into what the diff patch actually contains and what was required to actually apply a patch.
Using 100644 for the file mode simplifies not having to do extra file permission checks for a synthetic patch and handles the basic git new file scenario. Also the SHA1 hash line is skipped as git patch doesn't require it for applying so it's just more calculations that we don't need.
| } | ||
|
|
||
| // Range header and content | ||
| patch.push(`@@ -0,0 +1,${lines.length} @@`); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug with the old code was here, where it was using the buffer.length and not the lines length for the range header. This would lead to patches that looked incorrect in size.
|
|
||
| return patch.join('\n'); | ||
| // The patch itself should always end with a newline per git patch standards | ||
| return patch.join('\n') + '\n'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The patch itself should always end with a newline.
|
|
||
| // Git standard to add this comment if the file does not end with a newline | ||
| if (!content.endsWith('\n')) { | ||
| patch.push('\\ No newline at end of file'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment was also missing in the old code. Prevents you from getting a warning when applying a patch that doesn't end with a new line.
| readFileSpy.mockRestore(); | ||
| }); | ||
|
|
||
| describe('_getUntrackedChangePatch', () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GitDiffService didn't have unit tests previously. But while looking into this some of the cases (newline at end, newline not at end, empty file) were a bit quirky, so added some quick tests in for those specific cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR fixes the _getUntrackedChangePatch method in GitDiffService to generate proper git diff patches for untracked files. The existing implementation had issues with range headers using byte length instead of line count, missing standard git patch headers, and incorrect handling of files without trailing newlines. These issues prevented patches from being applied correctly, affecting the repoInfoTelemetry service used by the science team for model development.
Key Changes:
- Fixed range header calculation to use line count instead of byte length
- Added "new file mode 100644" header and proper handling of files without trailing newlines
- Added comprehensive unit tests covering empty files, files with/without trailing newlines, and single blank line files
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| src/extension/prompt/vscode-node/gitDiffService.ts | Fixed _getUntrackedChangePatch to generate standards-compliant git patches with correct headers, line counts, and newline handling |
| src/extension/prompt/vscode-node/test/gitDiffService.spec.ts | Added comprehensive unit tests for _getUntrackedChangePatch covering various edge cases including empty files, files with/without trailing newlines |
| (vscode as any).workspace = { | ||
| ...vscode.workspace, | ||
| fs: { | ||
| ...vscode.workspace?.fs, | ||
| readFile: vi.fn() | ||
| } | ||
| }; |
Copilot
AI
Nov 20, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using 'as any' to override the vscode object is an anti-pattern. According to the project's testing guidelines (CodingGuidelineID: 1000002), you should prefer explicit Mock classes over mutating real instances. Consider creating a MockFileSystemService instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, so I did notice this was a bit odd. My goal here was that I didn't want to mess with the existing gitDiffService since other code paths are using this existing code. And gitDiffService was using workspace.fs directly and not the file system service for file operation. So this test bit here I did so that I could unit test this function without having to change the product code (workspace.fs -> WorkspaceFileService) or make bigger changes to the vscode test shim. If we don't like this for a test file, down to change it to one of the other options.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did remove the 'as any' for a more strict type here, but I think the core issue might still apply?
|
@Yoyokrazy @sbatten has context on this change. But you can ping me as well if you need any info. |
|
@vritant24 Tagging for context while I'm oof |
|
@lszomoru I'd really appreciate your review |
The repoInfoTelemetry service was being used by the science team for some model development. This telemetry uses the gitDiffService to provide diffs and there were some git diff patches that were not able to be applied coming specifically from new untracked files.
These new untracked files are actually handled a bit differently as the main git services doesn't pick up these changes. Instead a synthetic patch is created via _getUntrackedChangePatch. This function had some issue with generating an actual synthetic patch that could be applied without fixup.
Fixes:
Note: This fix is specifically for the usage of this code path for repoInfoTelemetry, but this existing function was used already by gitCommitMessageService and scmChanges tool. In both those changes the git diffs look to be just handled as a string and passed to an LLM, so improving the accuracy here should not cause any issues. But that is why I avoided added in extra (possibly risky) work for checking file permissions or calculating the SHA1 hash (which are not required to apply the patch) into this work.
Manually tested with pulling the telemetry locally and creating and applying a diff patch for:
Patch applied and created the file as it existed before in all four cases.