Contributor

Copilot AI commented Dec 6, 2025

  • Understand the issue: SourceFile.Save() uses Encoding.UTF8 which adds BOM, breaking shebang lines
  • Locate the problematic code in FileLevelDirectiveHelpers.cs
  • Identify test infrastructure in FileBasedAppSourceEditorTests.cs
  • Fix SourceFile struct to preserve encoding using SourceText.Encoding property
  • Add test case to verify BOM is not added when file doesn't have one
  • Add test case to verify BOM is preserved when file has one
  • Build and run new tests to validate the fix (tests pass)
  • Address PR feedback (round 1):
    • Use stackalloc for 3-byte BOM detection buffer (with netstandard2.0 fallback)
    • Leverage SourceText's Encoding property instead of custom tracking
    • Update tests to use _testAssetsManager.CreateTestDirectory() infrastructure
    • Add issue reference to test documentation
  • Address PR feedback (round 2):
    • Move <see href/> inside <summary> elements
    • Simplify to use SourceText's built-in encoding auto-detection (pass encoding: null)
    • Remove custom BOM detection logic entirely
  • Address PR feedback (round 3):
    • Add test for non-UTF-8 encoding preservation (UTF-16)
    • Verify that SourceText auto-detection works for encodings beyond UTF-8
  • Address PR feedback (round 4):
    • Use collection expressions in assertions (is pattern matching)
    • Verify complete file content in PreservesNoBomEncoding test
  • Address PR feedback (round 5):
    • Fix test to use explicit newlines instead of raw string literal
    • Resolve test failure on Windows due to line ending normalization
  • Manually verify fix works end-to-end with shebang scripts and UTF-16 files
  • Reply to all PR comments
Original prompt

This section details the original issue you should resolve

<issue_title>file-based apps: dotnet add file.cs package changes encoding of file.cs</issue_title>
<issue_description>### Describe the bug
The SDK changes encoding of file-based app files when adding packages.

To Reproduce

1 - Create a file.cs (UTF-8 encoded)
2 - Add a shebang: #!/usr/bin/env dotnet run, then chmod +x file.cs
3 - Run it: ./file.cs
4 - Add any package: dotnet add file.cs package AnyNugetPackageIdHere => file.cs is converted to UTF-8 with BOM
5 - Running ./file.cs again after the package is added results in failure:

> ./file.cs
./file.cs: line 1: #!/usr/bin/env: No such file or directory

Further technical details

.NET 10 SDK, 1.0.100</issue_description>

<agent_instructions>Look at the root cause analysis and suggested fix in the issue comments</agent_instructions>

Comments on the Issue (you are @copilot in this section)

@DamianEdwards Analysis from Copilot:

Root Cause Analysis

The issue is in the SourceFile struct in [src/Cli/Microsoft.DotNet.FileBasedPrograms/FileLevelDirectiveHelpers.cs](https://github.com/dotnet/sdk/blob/e300dea82e0248b0801f706914c96073c1bd5350/src/Cli/Microsoft.DotNet.FileBasedPrograms/FileLevelDirectiveHelpers.cs#L256-L274):

public void Save()
{
    using var stream = File.Open(Path, FileMode.Create, FileAccess.Write);
    using var writer = new StreamWriter(stream, Encoding.UTF8);
    Text.Write(writer);
}

Problem: Encoding.UTF8 (the static property) emits a UTF-8 BOM by default. When dotnet add file.cs package <package> modifies the file and saves it, the BOM (0xEF 0xBB 0xBF) is prepended to the file content.

On Unix-like systems, the shebang #!/usr/bin/env dotnet run becomes <BOM>#!/usr/bin/env dotnet run, which the kernel doesn't recognize as a valid interpreter directive, causing:

./file.cs: line 1: #!/usr/bin/env: No such file or directory

Fix: The Save() method should preserve the original file's encoding/BOM characteristics. Specifically, for files without a BOM, it should use:

new UTF8Encoding(encoderShouldEmitUTF8Identifier: false)

The Load() method should also detect and store whether the original file had a BOM so it can be preserved on save.
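
The difference between the two encodings described in the analysis above can be checked with a minimal sketch that uses only the base class library. The scratch file name and the helper `FirstBytesHex` are hypothetical and exist only for this illustration; the write path mirrors the original `Save()` but is not the SDK's code.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical scratch file, used only for this demonstration.
string path = Path.Combine(Path.GetTempPath(), "bom-demo.cs");
const string content = "#!/usr/bin/env dotnet run\nConsole.WriteLine(\"hi\");\n";

// Encoding.UTF8 carries a 3-byte preamble, which StreamWriter emits
// at the start of an empty stream.
string withBom = FirstBytesHex(path, content, Encoding.UTF8);

// A UTF8Encoding built with encoderShouldEmitUTF8Identifier: false has an empty preamble.
string withoutBom = FirstBytesHex(
    path, content, new UTF8Encoding(encoderShouldEmitUTF8Identifier: false));

Console.WriteLine(withBom);    // EFBBBF
Console.WriteLine(withoutBom); // 23212F, the bytes of "#!/"

// Writes the text the same way the original Save() did,
// then returns the first three bytes of the file as hex.
static string FirstBytesHex(string filePath, string text, Encoding encoding)
{
    using (var stream = File.Open(filePath, FileMode.Create, FileAccess.Write))
    using (var writer = new StreamWriter(stream, encoding))
    {
        writer.Write(text);
    }
    return Convert.ToHexString(File.ReadAllBytes(filePath), 0, 3);
}
```

With the BOM present, the file no longer begins with the two bytes `#!`, which is why the shebang stops being recognized.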



- Modified SourceFile.Load() to detect UTF-8 BOM in files
- Modified SourceFile.Save() to preserve original BOM state
- Added HasUtf8Bom property to SourceFile struct
- Updated InternalAPI.Unshipped.txt with new API members
- Added tests to verify BOM preservation (no-BOM and with-BOM cases)

Fixes issue where dotnet add package command was adding BOM to files
that originally didn't have one, breaking shebang scripts.

Co-authored-by: DamianEdwards <[email protected]>
Copilot AI changed the title from "[WIP] Fix SDK encoding issue when adding package to file-based apps" to "Preserve UTF-8 BOM state in file-based app source files" Dec 6, 2025
Copilot AI requested a review from DamianEdwards December 6, 2025 20:19
Member

@jjonescz jjonescz left a comment

@jjonescz jjonescz added the Area-run-file label (Items related to the "dotnet run <file>" effort) Dec 7, 2025
- Use SourceText's Encoding property instead of custom HasUtf8Bom tracking
- Use stackalloc for 3-byte buffer on .NET Core (with fallback for netstandard2.0)
- Update tests to use _testAssetsManager.CreateTestDirectory() infrastructure
- Add issue reference (#52054) to test docs
- Remove unnecessary API surface from InternalAPI.Unshipped.txt

Co-authored-by: jjonescz <[email protected]>
Copilot AI requested a review from jjonescz December 7, 2025 20:58
Member

@jjonescz jjonescz left a comment

- Remove custom DetectEncoding() method and use SourceText.From(stream, encoding: null)
- SourceText automatically detects encoding including BOM when encoding is null
- Move <see href/> tags inside <summary> elements for better documentation
- Simplifies implementation while maintaining full functionality

Co-authored-by: jjonescz <[email protected]>
Copilot AI requested a review from jjonescz December 7, 2025 21:27
@jjonescz jjonescz requested a review from a team December 8, 2025 09:11
@jjonescz jjonescz marked this pull request as ready for review December 8, 2025 09:12
Copilot AI review requested due to automatic review settings December 8, 2025 09:12
Contributor

Copilot AI left a comment

Pull request overview

This PR fixes a critical bug where dotnet add file.cs package <package> was inadvertently adding a UTF-8 BOM to file-based app source files, breaking shebang (#!) scripts on Unix-like systems. The fix preserves the original encoding state (with or without BOM) by leveraging Roslyn's SourceText encoding detection and preservation capabilities.

Key Changes:

  • Modified SourceFile.Load() to auto-detect encoding by passing encoding: null to SourceText.From()
  • Updated SourceFile.Save() to use the detected encoding from SourceText.Encoding property
  • Added comprehensive tests verifying BOM preservation for both cases (with and without BOM)
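
The Load/Save changes listed above can be sketched roughly as follows. This is an illustration, not the merged code: `SourceFileSketch` is a hypothetical stand-in for the SDK's `SourceFile` struct, the fallback encoding in `Save()` is an assumption of this sketch, and compiling it assumes a reference to the Microsoft.CodeAnalysis.Common (Roslyn) package.

```csharp
// Assumes a reference to the Microsoft.CodeAnalysis.Common NuGet package.
using System.IO;
using System.Text;
using Microsoft.CodeAnalysis.Text;

// Hypothetical stand-in for the SDK's SourceFile struct; the real type has more members.
public readonly record struct SourceFileSketch(string Path, SourceText Text)
{
    public static SourceFileSketch Load(string filePath)
    {
        using var stream = File.OpenRead(filePath);
        // encoding: null lets SourceText detect the encoding, including any BOM.
        return new SourceFileSketch(filePath, SourceText.From(stream, encoding: null));
    }

    public void Save()
    {
        using var stream = File.Open(Path, FileMode.Create, FileAccess.Write);
        // SourceText.Encoding is nullable; falling back to BOM-less UTF-8
        // is an assumption made for this sketch.
        var encoding = Text.Encoding
            ?? new UTF8Encoding(encoderShouldEmitUTF8Identifier: false);
        using var writer = new StreamWriter(stream, encoding);
        Text.Write(writer);
    }
}
```

Because the encoding travels with the `SourceText` instance, no separate BOM flag needs to be stored between `Load()` and `Save()`, which is the simplification the review rounds converged on.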

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| src/Cli/Microsoft.DotNet.FileBasedPrograms/FileLevelDirectiveHelpers.cs | Updated SourceFile.Load() to auto-detect encoding and SourceFile.Save() to preserve it using SourceText.Encoding |
| test/dotnet.Tests/CommandTests/Run/FileBasedAppSourceEditorTests.cs | Added two test cases: PreservesNoBomEncoding() for files without BOM and PreservesBomEncoding() for files with BOM |

- Add PreservesNonUtf8Encoding test to verify UTF-16 encoding is preserved
- Tests that files with UTF-16 LE encoding (with BOM 0xFF 0xFE) maintain their encoding
- Ensures SourceText's auto-detection works for encodings beyond UTF-8
- All encoding preservation tests now pass (UTF-8 no BOM, UTF-8 with BOM, UTF-16)

Co-authored-by: jjonescz <[email protected]>
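
The UTF-16 round trip that the commit above tests can be illustrated without Roslyn. This sketch swaps in `StreamReader`'s BOM detection from the base class library instead of `SourceText`, so it shows the same idea rather than the PR's test; the scratch file name and the sample content are hypothetical.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical scratch file, used only for this demonstration.
string path = Path.Combine(Path.GetTempPath(), "utf16-demo.cs");

// Encoding.Unicode is UTF-16 LE; its preamble is FF FE.
File.WriteAllText(path, "Console.WriteLine();\n", Encoding.Unicode);

string text;
Encoding detected;
using (var reader = new StreamReader(
    path,
    new UTF8Encoding(encoderShouldEmitUTF8Identifier: false), // used when no BOM is found
    detectEncodingFromByteOrderMarks: true))
{
    text = reader.ReadToEnd();
    // CurrentEncoding reflects the detected BOM only after the first read.
    detected = reader.CurrentEncoding;
}

// Simulate an edit, then write back using the detected encoding.
File.WriteAllText(path, "// edited\n" + text, detected);

string firstTwoBytes = Convert.ToHexString(File.ReadAllBytes(path), 0, 2);
Console.WriteLine(detected.WebName); // utf-16
Console.WriteLine(firstTwoBytes);    // FFFE
```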
Copilot AI changed the title from "Preserve UTF-8 BOM state in file-based app source files" to "Preserve file encoding in file-based app source files" Dec 9, 2025
Copilot AI requested a review from jjonescz December 9, 2025 11:55
Member

@333fred 333fred left a comment

A few minor comments on the tests.

@jjonescz
Member

@333fred @RikkiGibson for another look, thanks

Member

@RikkiGibson RikkiGibson left a comment

LGTM, comment is not blocking.

- Replace Assert.StartsWith with Assert.Equal to verify the entire file content
- Ensures the test validates that Save() correctly writes the complete content
- Makes the test more robust by checking the full output including the added package directive

Co-authored-by: jjonescz <[email protected]>
- Replace raw string literal with explicit \n characters in expected content
- Raw string literals with indentation were causing blank lines to be stripped
- Explicit newlines ensure consistent comparison across platforms
- Fixes test failure on Windows where line endings differ

Co-authored-by: jjonescz <[email protected]>
Copilot AI requested a review from jjonescz December 15, 2025 14:07
@jjonescz jjonescz merged commit b300744 into main Dec 15, 2025
29 checks passed
@jjonescz jjonescz deleted the copilot/fix-file-encoding-bug branch December 15, 2025 15:33
@jjonescz
Member

/backport to release/10.0.2xx

@github-actions
Contributor

Started backporting to release/10.0.2xx (link to workflow run)

Labels

Area-run-file Items related to the "dotnet run <file>" effort

Development

Successfully merging this pull request may close these issues.

file-based apps: dotnet add file.cs package <package> changes encoding of file.cs

6 participants