Add ChunkMatch.BestLineMatch to return the best-scoring line #884

jtibshirani · 2025-01-07T23:05:26Z

This PR adds a new field ChunkMatch.BestLineMatch containing the line number of the top-scoring line in the chunk. This will let us address a long-standing issue with our new flexible keyword search, where chunk matches can become very large. Since our search results UX only shows the start of a chunk, the most relevant line may not even be visible. With the best line match available, we can adjust the search results UX to center the chunk on the most relevant line.
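To illustrate how a client might use the new field, here is a minimal sketch of centering a fixed-size display window on the best line. The function name `centerWindow`, its parameters, and the assumption that both the chunk start and the best line are 1-based line numbers in the full file are hypothetical and not taken from this PR.

```go
package main

import (
	"fmt"
	"strings"
)

// centerWindow returns the half-open range [start, end) of indices into a
// chunk's lines to display, keeping the best line as close to the middle of
// the window as the chunk boundaries allow.
//
// Hypothetical helper: chunkStartLine and bestLine are assumed to be 1-based
// line numbers in the full file; numLines is the number of lines in the chunk
// and maxLines is how many lines the UI can show.
func centerWindow(chunkStartLine, bestLine, numLines, maxLines int) (start, end int) {
	if numLines <= 0 || maxLines <= 0 {
		return 0, 0
	}
	if numLines <= maxLines {
		// The whole chunk fits, so there is nothing to center.
		return 0, numLines
	}
	best := bestLine - chunkStartLine
	if best < 0 || best >= numLines {
		// Best line unknown or outside the chunk: show the start of the chunk.
		best = 0
	}
	start = best - maxLines/2
	if start < 0 {
		start = 0
	}
	if start+maxLines > numLines {
		start = numLines - maxLines
	}
	return start, start + maxLines
}

func main() {
	// A ten-line chunk that starts at line 10 of the file; the best line is 17.
	lines := strings.Split("a\nb\nc\nd\ne\nf\ng\nh\ni\nj", "\n")
	start, end := centerWindow(10, 17, len(lines), 3)
	fmt.Println(lines[start:end])
}
```

Clamping at both ends keeps the window full-sized even when the best line is near the first or last line of the chunk.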

Relates to SPLF-188

jtibshirani · 2025-01-07T23:13:55Z

api_test.go

@@ -149,7 +149,7 @@ func TestMatchSize(t *testing.T) {
 		size: 256,
 	}, {
 		v:    ChunkMatch{},
-		size: 112,
+		size: 120,


I ran the fieldalignment tool as this test suggests, and did not see a regression. Here is the output for api.go where ChunkMatch lives ... there is no mention of ChunkMatch or FileMatch:

/Users/jtibshirani/code/zoekt/api.go:232:16: struct with 136 pointer bytes could be 96
/Users/jtibshirani/code/zoekt/api.go:301:24: struct with 32 pointer bytes could be 8
/Users/jtibshirani/code/zoekt/api.go:503:19: struct with 216 pointer bytes could be 24
/Users/jtibshirani/code/zoekt/api.go:561:17: struct of size 224 could be 208
/Users/jtibshirani/code/zoekt/api.go:753:20: struct with 88 pointer bytes could be 56
/Users/jtibshirani/code/zoekt/api.go:833:27: struct with 16 pointer bytes could be 8
/Users/jtibshirani/code/zoekt/api.go:873:15: struct with 32 pointer bytes could be 16
/Users/jtibshirani/code/zoekt/api.go:929:20: struct of size 88 could be 80
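For context on why the expected size moves by 8 bytes (112 to 120) rather than by the size of the new field alone, here is a small sketch. The structs `before` and `after` are hypothetical stand-ins, not the real ChunkMatch layout, and the `uint32` field type is an assumption. On a typical 64-bit platform, a struct whose largest alignment is 8 bytes grows in 8-byte steps, so a 4-byte field is followed by 4 bytes of trailing padding.

```go
package main

import (
	"fmt"
	"unsafe"
)

// wordSize is 8 on 64-bit platforms and 4 on 32-bit platforms.
const wordSize = unsafe.Sizeof(uintptr(0))

// before is a hypothetical struct standing in for the old layout.
type before struct {
	Content []byte  // slice header: pointer, length, capacity
	Score   float64 // 8 bytes
}

// after adds one 4-byte field, as an assumed stand-in for BestLineMatch.
type after struct {
	Content       []byte
	Score         float64
	BestLineMatch uint32 // 4 bytes of data; padding may follow
}

// sizeGrowth reports how many bytes the extra field adds in practice.
func sizeGrowth() uintptr {
	return unsafe.Sizeof(after{}) - unsafe.Sizeof(before{})
}

func main() {
	fmt.Println("before:", unsafe.Sizeof(before{}))
	fmt.Println("after: ", unsafe.Sizeof(after{}))
	fmt.Println("growth:", sizeGrowth())
}
```

Since the new field lands at the end and there is no existing padding hole for it to fill, reordering fields would not recover the space here, which is consistent with fieldalignment reporting nothing new.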

janhartman

Nice!

stefanhengl · 2025-01-08T11:23:42Z

contentprovider.go

@@ -336,7 +335,7 @@ func (p *contentProvider) fillContentChunkMatches(ms []*candidateMatch, numConte
 	chunks := chunkCandidates(ms, newlines, numContextLines)
 	chunkMatches := make([]ChunkMatch, 0, len(chunks))
 	for _, chunk := range chunks {
-		score, debugScore, symbolInfo := p.candidateMatchScore(chunk.candidates, language, debug)
+		bestMatch, symbolInfo := p.candidateMatchScore(chunk.candidates, language, debug)


nice cleanup!

stefanhengl · 2025-01-08T11:31:03Z

contentprovider.go

@@ -364,18 +363,24 @@ func (p *contentProvider) fillContentChunkMatches(ms []*candidateMatch, numConte
 		}
 		firstLineStart := newlines.lineStart(firstLineNumber)

+		bestLineMatch := 0


Is it worth adding this to the debugScore?

jtibshirani

Good idea, I added it to the chunk match debug output. It looks like this:

score:5000.00 <- OverlapSymbol:4000.00, kind:Java:classes:1000.00, (line: 3)

Kind of weird but I couldn't come up with something better...
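For readers wondering where the trailing ", (line: 3)" comes from, here is a rough sketch of how a string in this shape could be assembled. This is not zoekt's actual implementation; the `component` type and `debugScoreString` function are hypothetical names used only for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// component is a hypothetical named contribution to a match score.
type component struct {
	what  string
	score float64
}

// debugScoreString renders the total score, each contribution, and the best
// line. Every contribution is followed by ", ", which is why the line
// annotation ends up after a comma.
func debugScoreString(components []component, bestLine int) string {
	var total float64
	var b strings.Builder
	for _, c := range components {
		total += c.score
		fmt.Fprintf(&b, "%s:%.2f, ", c.what, c.score)
	}
	return fmt.Sprintf("score:%.2f <- %s(line: %d)", total, b.String(), bestLine)
}

func main() {
	s := debugScoreString([]component{
		{what: "OverlapSymbol", score: 4000},
		{what: "kind:Java:classes", score: 1000},
	}, 3)
	fmt.Println(s)
}
```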

camdencheek

nice!

cla-bot bot added the cla-signed label Jan 7, 2025

Add ChunkMatch.BestLineMatch to return the best-scoring line

671bcb9

jtibshirani force-pushed the jtibs/best-line branch from ce59d31 to 671bcb9 Compare January 7, 2025 23:11

jtibshirani requested review from camdencheek and a team January 7, 2025 23:12

janhartman approved these changes Jan 8, 2025

stefanhengl approved these changes Jan 8, 2025

camdencheek approved these changes Jan 8, 2025

camdencheek reviewed Jan 8, 2025

Address review comments

728c237

jtibshirani merged commit b51a233 into main Jan 8, 2025
10 checks passed

jtibshirani deleted the jtibs/best-line branch January 8, 2025 17:39
