2 changes: 1 addition & 1 deletion common/src/main/java/io/mybatis/common/util/Utils.java
@@ -93,7 +93,7 @@ public static boolean isBlankChar(int c) {
         return Character.isWhitespace(c)
                 || Character.isSpaceChar(c)
                 || c == '\ufeff'
Copilot AI Nov 10, 2025

The explicit check for '\ufeff' on line 95 is now redundant since U+FEFF (Zero Width No-Break Space / BOM) is a FORMAT character and will be matched by Character.getType(c) == Character.FORMAT on line 96. Consider removing the explicit check on line 95 to avoid duplication.

Suggested change (delete this line):
-                || c == '\ufeff'
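
The redundancy claim can be checked directly. The following is a minimal sketch, not code from this PR; the class name `FeffCategoryCheck` and the helper `isFormat` are hypothetical and exist only for illustration. It relies solely on `Character.getType` and `Character.FORMAT` from the JDK.

```java
// Hypothetical helper class, not part of the PR.
final class FeffCategoryCheck {

    // True when the code point is in Unicode general category Cf (FORMAT).
    static boolean isFormat(int codePoint) {
        return Character.getType(codePoint) == Character.FORMAT;
    }

    public static void main(String[] args) {
        // U+FEFF ZERO WIDTH NO-BREAK SPACE (BOM) is a Cf character,
        // so the FORMAT clause alone already matches it.
        System.out.println(isFormat(0xFEFF)); // prints "true"
    }
}
```

If this holds on the target JDK, the explicit U+FEFF comparison adds no behavior. Keeping it would be a readability choice only.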

-                || c == '\u202a';
+                || Character.getType(c) == Character.FORMAT;
Copilot AI Nov 10, 2025

The change from checking a specific character (c == '\u202a') to checking all FORMAT characters (Character.getType(c) == Character.FORMAT) significantly broadens the scope. The Character.FORMAT category includes directional formatting marks (U+202A-U+202E, U+061C) and other format control characters that control text rendering but don't represent whitespace or blank space.

According to the Javadoc comment above this method (lines 83-84), blank characters should include "空格、制表符、全角空格和不间断空格" (space, tab, full-width space, and non-breaking space). Directional formatting marks like U+202A (LEFT-TO-RIGHT EMBEDDING) don't semantically represent blank space - they control text direction.

Consider whether all FORMAT characters should truly be treated as "blank," or if only specific zero-width space characters (like U+200B) should be added to the existing check. If the intent is to detect all invisible characters, the documentation should be updated to reflect this broader definition.
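
To make the difference in scope concrete, here is a hedged sketch comparing the broad check from this diff with one possible narrower alternative. The class `BlankCharSketch` and both method names are hypothetical. The explicit list in `isBlankNarrow` is an assumption about which characters the author cares about, not a recommendation drawn from the project.

```java
// Hypothetical comparison, not part of the PR.
final class BlankCharSketch {

    // Mirrors the patched condition: any Cf (FORMAT) character counts as blank.
    static boolean isBlankBroad(int c) {
        return Character.isWhitespace(c)
                || Character.isSpaceChar(c)
                || Character.getType(c) == Character.FORMAT;
    }

    // Assumed narrower variant: only an explicit list of invisible characters.
    static boolean isBlankNarrow(int c) {
        return Character.isWhitespace(c)
                || Character.isSpaceChar(c)
                || c == 0xFEFF   // ZERO WIDTH NO-BREAK SPACE (BOM)
                || c == 0x202A   // LEFT-TO-RIGHT EMBEDDING (the pre-existing check)
                || c == 0x200B;  // ZERO WIDTH SPACE
    }

    public static void main(String[] args) {
        // U+202E RIGHT-TO-LEFT OVERRIDE and U+00AD SOFT HYPHEN are Cf characters
        // that only the broad variant treats as blank.
        int[] samples = {0x200B, 0x202E, 0x00AD, 'a'};
        for (int c : samples) {
            System.out.printf("U+%04X broad=%b narrow=%b%n",
                    c, isBlankBroad(c), isBlankNarrow(c));
        }
    }
}
```

The narrow variant keeps the documented meaning of "blank" but has to be extended by hand for each new character. The broad variant needs no maintenance but also swallows directional controls and the soft hyphen.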

}

/**
36 changes: 36 additions & 0 deletions common/src/test/java/io/mybatis/common/util/UtilsTest.java
@@ -0,0 +1,36 @@
package io.mybatis.common.util;
Copilot AI Nov 10, 2025

This file is missing the Apache License header that is present in other test files in this project (e.g., I18nTest.java). For consistency, please add the standard license header at the beginning of this file.


import org.junit.Assert;
import org.junit.Test;

/**
 * Unit tests for Utils.
 */
public class UtilsTest {

    /**
     * Tests common blank characters.
     */
    @Test
    public void testBlankChar() {
        // ASCII whitespace
        Assert.assertTrue(Utils.isBlankChar(' '));
        Assert.assertTrue(Utils.isBlankChar('\n'));
        Assert.assertTrue(Utils.isBlankChar('\r'));
        Assert.assertTrue(Utils.isBlankChar('\t'));
        Assert.assertTrue(Utils.isBlankChar('\f'));

        Assert.assertTrue(Utils.isBlankChar('\u00A0')); // NO-BREAK SPACE
        Assert.assertTrue(Utils.isBlankChar('\ufeff')); // ZERO WIDTH NO-BREAK SPACE (BOM)
        Assert.assertTrue(Utils.isBlankChar('\u3000')); // IDEOGRAPHIC (full-width) SPACE
        Assert.assertTrue(Utils.isBlankChar('\u202a')); // LEFT-TO-RIGHT EMBEDDING
        // ZERO WIDTH SPACE; occasionally appears in text that comes from Word
        Assert.assertTrue(Utils.isBlankChar('\u200B'));

        // Visible characters must not be treated as blank
        Assert.assertFalse(Utils.isBlankChar('a'));
        Assert.assertFalse(Utils.isBlankChar('z'));
        Assert.assertFalse(Utils.isBlankChar('0'));
        Assert.assertFalse(Utils.isBlankChar('9'));
        Assert.assertFalse(Utils.isBlankChar('/'));
        Assert.assertFalse(Utils.isBlankChar('\\'));
    }
}
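
For readers wondering which clause of the patched condition accepts each character in the test above, the following standalone sketch may help. It does not call the project's `Utils` class. `BlankClauseSketch` and `matchingClause` are hypothetical names, and the clause order is assumed to match the diff with the explicit U+FEFF comparison left out.

```java
// Hypothetical diagnostic, not part of the PR or the test suite.
final class BlankClauseSketch {

    // Reports the first clause of the (assumed) patched condition that matches.
    static String matchingClause(int c) {
        if (Character.isWhitespace(c)) {
            return "isWhitespace";
        }
        if (Character.isSpaceChar(c)) {
            return "isSpaceChar";
        }
        if (Character.getType(c) == Character.FORMAT) {
            return "FORMAT";
        }
        return "none";
    }

    public static void main(String[] args) {
        int[] samples = {' ', 0x00A0, 0x3000, 0xFEFF, 0x202A, 0x200B, 'a'};
        for (int c : samples) {
            System.out.printf("U+%04X -> %s%n", c, matchingClause(c));
        }
    }
}
```

U+00A0 is caught by `isSpaceChar` because `isWhitespace` deliberately excludes non-breaking spaces. U+FEFF, U+202A and U+200B are caught only by the FORMAT clause.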