Commit 154dbed

feat: enhance Lucene query parser to support full Apache Lucene syntax
Implements comprehensive Apache Lucene query parser syntax support as specified in https://lucene.apache.org/core/2_9_4/queryparsersyntax.html

- **Quoted phrases**: `"hello world"` for exact matching
- **Alternative boolean operators**: `&&` (AND), `||` (OR), `!` (NOT)
- **Required operator**: `+term` (must match)
- **Prohibited operator**: `-term` (must not match)
- **Range queries**: `[min TO max]` (inclusive), `{min TO max}` (exclusive)
- **Open-ended ranges**: `[18 TO *]`, `[* TO 100]`
- **Date ranges**: `[2024-01-01 TO 2024-12-31]`
- **Wildcard searches**: Enhanced support for `*`, `?` patterns
- **Grouping**: Complex nested boolean expressions
- **Special character handling**: Backslash escaping support

- New lexer (lexer.go) for proper tokenization of all Lucene operators
- Enhanced parser (parser_new.go) with recursive descent parsing
- Comparison operators support (`>`, `<`, `>=`, `<=`) for range queries
- Automatic parser selection based on query syntax
- Backward compatibility with existing simple queries

- **SQL**: Generates parameterized WHERE clauses with comparison operators
- **DynamoDB PartiQL**: Converts to PartiQL with range support
- **Map format**: Preserves all query semantics for custom backends

- Comprehensive test suite (parser_test.go) with 60+ test cases
- Tests for all new operators and syntax features
- Performance benchmarks included
- Coverage for edge cases and complex nested queries

- Complete README with usage examples
- Feature matrix showing supported syntax
- Backend compatibility notes
- Performance characteristics
- Known limitations documented

```go
// Range queries
"age:[18 TO 65] AND status:active"

// Alternative operators
"(name:john || name:jane) && !status:inactive"

// Quoted phrases with wildcards
`title:"Apache Lucene" AND email:*@example.com`

// Complex nested queries
"(name:john* OR email:*@example.com) AND age:[25 TO *]"
```

- Zero breaking changes to existing API
- Enhanced parser auto-selected for advanced syntax
- Legacy parser maintained for simple queries

- Best-effort conversion for backends without native support
- Range queries converted to comparison operators in SQL
- Fuzzy/proximity searches approximated with wildcards
1 parent b799c39 commit 154dbed

7 files changed

Lines changed: 2042 additions & 343 deletions

go.mod

Lines changed: 1 addition & 0 deletions
@@ -36,6 +36,7 @@ require (
 	github.com/davecgh/go-spew v1.1.2-0.20180830191138-d8f796af33cc // indirect
 	github.com/go-sql-driver/mysql v1.8.1 // indirect
 	github.com/golang-jwt/jwt/v5 v5.3.0 // indirect
+	github.com/grindlemire/go-lucene v0.0.26 // indirect
 	github.com/jackc/pgpassfile v1.0.0 // indirect
 	github.com/jackc/pgservicefile v0.0.0-20240606120523-5a60cdf6a761 // indirect
 	github.com/jackc/pgx/v5 v5.6.0 // indirect

go.sum

Lines changed: 2 additions & 0 deletions
@@ -75,6 +75,8 @@ github.com/google/go-cmp v0.7.0 h1:wk8382ETsv4JYUZwIsn6YpYiWiBsYLSJiTsyBybVuN8=
 github.com/google/go-cmp v0.7.0/go.mod h1:pXiqmnSA92OHEEa9HXL2W4E7lf9JzCmGVUdgjX3N/iU=
 github.com/google/uuid v1.6.0 h1:NIvaJDMOsjHA8n1jAhLSgzrAzy1Hgr+hNrb57e+94F0=
 github.com/google/uuid v1.6.0/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo=
+github.com/grindlemire/go-lucene v0.0.26 h1:81ttZkMvU3rFD0TfmjdIZT2U0Fd4TT7buDy+xq1x5EQ=
+github.com/grindlemire/go-lucene v0.0.26/go.mod h1:INRJBdhkLjS4jc7XgkGPfzC5wuFg3BHDukXMTc+OTbc=
 github.com/jackc/pgpassfile v1.0.0 h1:/6Hmqy13Ss2zCq62VdNG8tM1wchn8zjSGOBJ6icpsIM=
 github.com/jackc/pgpassfile v1.0.0/go.mod h1:CEx0iS5ambNFdcRtxPj5JhEz+xB6uRky5eyVu/W2HEg=
 github.com/jackc/pgservicefile v0.0.0-20240606120523-5a60cdf6a761 h1:iCEnooe7UlwOQYpKFhBabPMi4aNAfoODPEFNiAnClxo=

mql/parser.go

Lines changed: 4 additions & 3 deletions
@@ -94,15 +94,16 @@ func (p *Parser) parseTerm() (Expr, error) {
 	key := p.text
 	p.next()
 
-	if p.text == ":" {
+	switch p.text {
+	case ":":
 		p.next()
 		val := p.text
 		p.next()
 		return &TermExpr{Key: key, Op: ":", Value: val}, nil
-	} else if p.text == "IN" {
+	case "IN":
 		p.next()
 		return p.parseList(key, "IN")
-	} else if p.text == "NOT" {
+	case "NOT":
 		p.next()
 		if p.text != "IN" {
 			return nil, fmt.Errorf("expected 'IN' after 'NOT'")
