Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ChunkMatch.BestLineMatch to return the best-scoring line #884

Merged
merged 2 commits into from
Jan 8, 2025

Conversation

jtibshirani
Copy link
Member

This PR adds a new field ChunkMatch.BestLineMatch with the line number of top-scoring line in the chunk. This will let us address a long-standing issue with our new flexible keyword search, where chunk matches can become very large. Since our search results UX only shows the start of a chunk, the most relevant line may not even be visible. With this information on the best line match, we can adjust the search results UX to center the chunk on the most relevant line.

Relates to SPLF-188

@cla-bot cla-bot bot added the cla-signed label Jan 7, 2025
@jtibshirani jtibshirani requested review from camdencheek and a team January 7, 2025 23:12
@@ -149,7 +149,7 @@ func TestMatchSize(t *testing.T) {
size: 256,
}, {
v: ChunkMatch{},
size: 112,
size: 120,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ran the fieldalignment tool as this test suggests, and did not see a regression. Here is the output for api.go where ChunkMatch lives ... there is no mention of ChunkMatch or FileMatch:

/Users/jtibshirani/code/zoekt/api.go:232:16: struct with 136 pointer bytes could be 96
/Users/jtibshirani/code/zoekt/api.go:301:24: struct with 32 pointer bytes could be 8
/Users/jtibshirani/code/zoekt/api.go:503:19: struct with 216 pointer bytes could be 24
/Users/jtibshirani/code/zoekt/api.go:561:17: struct of size 224 could be 208
/Users/jtibshirani/code/zoekt/api.go:753:20: struct with 88 pointer bytes could be 56
/Users/jtibshirani/code/zoekt/api.go:833:27: struct with 16 pointer bytes could be 8
/Users/jtibshirani/code/zoekt/api.go:873:15: struct with 32 pointer bytes could be 16
/Users/jtibshirani/code/zoekt/api.go:929:20: struct of size 88 could be 80

Copy link

@janhartman janhartman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

contentprovider.go Outdated Show resolved Hide resolved
@@ -336,7 +335,7 @@ func (p *contentProvider) fillContentChunkMatches(ms []*candidateMatch, numConte
chunks := chunkCandidates(ms, newlines, numContextLines)
chunkMatches := make([]ChunkMatch, 0, len(chunks))
for _, chunk := range chunks {
score, debugScore, symbolInfo := p.candidateMatchScore(chunk.candidates, language, debug)
bestMatch, symbolInfo := p.candidateMatchScore(chunk.candidates, language, debug)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice cleanup!

@@ -364,18 +363,24 @@ func (p *contentProvider) fillContentChunkMatches(ms []*candidateMatch, numConte
}
firstLineStart := newlines.lineStart(firstLineNumber)

bestLineMatch := 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is worth adding this to the debugScore?

Copy link
Member Author

@jtibshirani jtibshirani Jan 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, I added it to the chunk match debug output. It looks like this:

score:5000.00 <- OverlapSymbol:4000.00, kind:Java:classes:1000.00, (line: 3)

Kind of weird but I couldn't come up with something better...

Copy link
Member

@camdencheek camdencheek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice!

@jtibshirani jtibshirani merged commit b51a233 into main Jan 8, 2025
10 checks passed
@jtibshirani jtibshirani deleted the jtibs/best-line branch January 8, 2025 17:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants