Scanner
The PGN Scanner provides functionality to read and tokenize chess games in PGN (Portable Game Notation) format. It handles multiple games in a single input stream, manages game metadata and moves, and properly handles PGN-specific syntax like comments and move variations.
type Scanner struct {
	scanner   *bufio.Scanner // Underlying scanner
	nextGame  *GameScanned   // Buffered next game
	lastError error          // Last encountered error
}

type GameScanned struct {
	Raw string // Raw PGN text
}

- Game Scanning: Reads complete games from input
- Tokenization: Converts raw PGN text into tokens
- State Management: Tracks scanning state and buffers
// Create new scanner
scanner := NewScanner(reader)

The scanner is initialized with:
- Custom split function for PGN games
- Buffer for peek operations
- Error tracking capability
The scanner provides two main methods for reading games:
// Read next game
game, err := scanner.ScanGame()
// Check if more games exist
hasMore := scanner.HasNext()

- Buffered reading
- Error handling
- EOF detection
- Game boundary detection
The split function (splitPGNGames) handles PGN-specific parsing:
- Whitespace Handling
  - Skips leading whitespace
  - Preserves significant whitespace
  - Handles line endings
- Game Boundary Detection
  - Finds game start markers
  - Handles metadata sections
  - Detects game endings
- State Tracking
  - Bracket tracking (for tags)
  - Comment tracking
  - Result detection
The TokenizeGame function processes raw PGN text:
// Convert game text to tokens
tokens, err := TokenizeGame(game)

It handles:
- Move notation
- Comments
- Annotations
- Game metadata
- Special characters
The tokenizer tracks multiple states:
- Bracket State
  - Inside/outside brackets
  - Nested bracket handling
- Comment State
  - Block comments
  - Line comments
  - Nested comment handling
- Game Content
  - Move text
  - Move numbers
  - Game results
  - Annotations
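
The state handling above can be sketched with a small hand-written tokenizer. This is an illustrative approximation rather than the real `TokenizeGame`: the `token` and `tokenKind` types, the `kind...` constants, and the helper names are all assumptions, nested comments are not supported, and anything that is not a tag, comment, parenthesis, move number, or result is simply classified as a move.

```go
package main

import "strings"

type tokenKind int

const (
	kindTag        tokenKind = iota // tag pair without the surrounding brackets
	kindComment                     // {block} or ;line comment text
	kindVariation                   // "(" or ")"
	kindMoveNumber                  // digits followed by dots, e.g. "12." or "12..."
	kindResult                      // 1-0, 0-1, 1/2-1/2 or *
	kindMove                        // everything else: SAN moves, NAGs such as $1
)

type token struct {
	kind tokenKind
	text string
}

// delimiters end a move word.
const delimiters = " \t\r\n{[();"

// readUntil returns the text between s[i] and the next closing byte, and the
// index just past that byte. Unterminated input runs to the end of s.
func readUntil(s string, i int, closing byte) (string, int) {
	end := strings.IndexByte(s[i+1:], closing)
	if end < 0 {
		return s[i+1:], len(s)
	}
	return s[i+1 : i+1+end], i + 2 + end
}

// matchResult returns the numeric result marker that s starts with, if any.
func matchResult(s string) string {
	for _, r := range []string{"1/2-1/2", "1-0", "0-1"} {
		if strings.HasPrefix(s, r) {
			return r
		}
	}
	return ""
}

func tokenizeSketch(pgn string) []token {
	var out []token
	for i := 0; i < len(pgn); {
		var text string
		c := pgn[i]
		switch {
		case c == ' ' || c == '\t' || c == '\r' || c == '\n':
			i++
		case c == '[':
			text, i = readUntil(pgn, i, ']')
			out = append(out, token{kindTag, text})
		case c == '{':
			text, i = readUntil(pgn, i, '}')
			out = append(out, token{kindComment, strings.TrimSpace(text)})
		case c == ';':
			text, i = readUntil(pgn, i, '\n')
			out = append(out, token{kindComment, strings.TrimSpace(text)})
		case c == '(' || c == ')':
			out = append(out, token{kindVariation, string(c)})
			i++
		case c == '*':
			out = append(out, token{kindResult, "*"})
			i++
		case c >= '0' && c <= '9':
			if r := matchResult(pgn[i:]); r != "" {
				out = append(out, token{kindResult, r})
				i += len(r)
				continue
			}
			j := i
			for j < len(pgn) && (pgn[j] == '.' || (pgn[j] >= '0' && pgn[j] <= '9')) {
				j++
			}
			out = append(out, token{kindMoveNumber, pgn[i:j]})
			i = j
		default:
			j := i
			for j < len(pgn) && !strings.ContainsRune(delimiters, rune(pgn[j])) {
				j++
			}
			out = append(out, token{kindMove, pgn[i:j]})
			i = j
		}
	}
	return out
}
```

Checking for a result marker before reading a move number is what lets `1-0` and `1/2-1/2` be told apart from `1.`.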
func skipLeadingWhitespace(data []byte) int

- Skips insignificant whitespace
- Preserves structural whitespace
- Handles multiple whitespace types

func findGameStart(data []byte, start int, atEOF bool) int

- Finds start of games
- Handles multiple game formats
- Manages partial reads

func processGameContent(data []byte, start int, atEOF bool) (int, []byte, error)

- Processes game content
- Manages state transitions
- Handles special cases
Recommended practices:

- Scanner Usage
  scanner := NewScanner(reader)
  for scanner.HasNext() {
      game, err := scanner.ScanGame()
      if err != nil {
          // Handle error
      }
      // Process game
  }
- Error Handling
  - Check scanner errors
  - Handle EOF conditions
  - Manage partial reads
- Memory Management
  - Process games incrementally
  - Avoid loading the entire file
  - Clean up resources
Performance considerations:

- Buffering
  - Uses bufio.Scanner for efficiency
  - Maintains minimal buffer state
  - Handles large files effectively
- State Tracking
  - Minimal state maintenance
  - Efficient string operations
  - Optimized boundary detection
- Memory Usage
  - Streaming processing
  - No unnecessary allocations
  - Efficient buffer management
Common usage patterns:

- Single Game Reading
  game, err := scanner.ScanGame()
  if err != nil {
      // Handle error
  }
  // Process single game
- Multi-Game Processing
  for scanner.HasNext() {
      game, err := scanner.ScanGame()
      if err != nil {
          // Handle error
      }
      // Process each game
  }
- Game Tokenization
  tokens, err := TokenizeGame(game)
  if err != nil {
      // Handle error
  }
  // Process tokens
Current limitations:

- Input Format
  - Requires well-formed PGN
  - Limited error recovery
  - No partial game support
- Memory Usage
  - Game-at-a-time processing
  - Complete game buffering
  - Token list generation
- Error Handling
  - Basic error reporting
  - No detailed error context
  - Limited recovery options
Possible future improvements:

- Enhanced Error Handling
  - Detailed error messages
  - Error recovery options
  - Context preservation
- Performance Optimization
  - Reduced allocations
  - Streaming tokenization
  - Better buffer management
- Feature Extensions
  - Partial game support
  - Better variation handling
  - Enhanced comment processing