Topic Change Detection #134

vdpappu · 2019-10-31T02:54:30Z

Let's discuss the potential next steps for improving topic change detection and update the activities here.

reaganrewop · 2019-10-31T17:10:14Z

The ideal goal of topic change detection is to slice the meeting into multiple partitions, where each partition carries enough information to stand on its own as a discussion.

The following needs to be addressed to achieve this:

Cosine similarity is the sole edge weight.
A single segment can contain a mixture of topics (not an ideal case).
What are the factors for grouping segments? (Currently it is the order in which the segments were spoken.)
Pruning of edges: how do we prune irrelevant edges?
Filler sentences in a segment cause overlapping groups.

Building on our current implementation, I made a few extra changes to try to fix the last issue.

Handling spillover sentences was important because they caused many overlapping groups. I currently handle this by checking for segments that appear in more than one community. If a duplicate is found, I remove the segment from a community when the majority of that segment's sentences are placed in a different community.

Doing this increased the accuracy of the communities considerably, and overlapping groups are no longer formed.
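A minimal sketch of the spillover check described above, not the code on staging. It assumes each community is a list of `(segment_id, sentence_id)` pairs; the function name `drop_spillover` and the strict-majority rule (ties leave the segment untouched) are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def drop_spillover(communities):
    """Remove a segment from every community except the one holding
    the strict majority of its sentences.

    communities: list of lists of (segment_id, sentence_id) pairs.
    """
    # Count how many sentences of each segment landed in each community.
    counts = defaultdict(Counter)
    for c_idx, members in enumerate(communities):
        for seg_id, _ in members:
            counts[seg_id][c_idx] += 1

    # A segment gets a "home" community only when one community holds
    # more than half of its sentences (assumption: ties are left alone).
    home = {}
    for seg_id, per_community in counts.items():
        c_idx, n = per_community.most_common(1)[0]
        if n > sum(per_community.values()) / 2:
            home[seg_id] = c_idx

    # Keep a sentence if its segment has no home or this community is its home.
    return [
        [(seg, sent) for seg, sent in members if home.get(seg, c_idx) == c_idx]
        for c_idx, members in enumerate(communities)
    ]
```

For example, if segment `A` has two sentences in community 0 and one in community 1, the stray sentence is dropped from community 1, so `A` no longer appears in both groups.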

reaganrewop · 2019-11-19T10:59:04Z

To improve the current communities approach (the one on staging), or more precisely to understand what works best for communities, I went through a few papers and methods to see how effective the approach can be. Based on that, I made a few changes to the algorithm.

Instead of a fully connected network, we connect two sentences only if they come from the same segment or from the next one. This helps reduce cosine similarity noise.
Graph normalization is now a bit different: we compute a local normalization score for each node, and for each edge shared by two nodes we average the two scores.
The community approach relies on self-loops, so those are added as well.
Based on this paper (https://arxiv.org/pdf/0812.1770.pdf), we add a resolution parameter t, which helps control the stability of the network.
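The changes above can be sketched roughly as follows. This is an illustration under stated assumptions, not the staging implementation: `build_graph` and `linearized_stability` are hypothetical names, negative cosine values are clipped to zero, the local normalization score is assumed to be "edge weight divided by the strongest edge at that node", and the quality function is one linearized form of stability with time parameter `t` (equivalent to modularity at `t = 1`), written from memory of that line of work rather than copied from the linked paper.

```python
import math
from collections import defaultdict


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_graph(sentences, self_loop_weight=1.0):
    """sentences: list of (segment_index, embedding) in spoken order.
    Returns an undirected graph as {(i, j): weight} with i <= j."""
    raw = {}
    for i, (seg_i, vec_i) in enumerate(sentences):
        for j in range(i + 1, len(sentences)):
            seg_j, vec_j = sentences[j]
            # Connect only sentences from the same or the adjacent segment.
            if abs(seg_j - seg_i) <= 1:
                # Assumption: negative similarities are clipped to zero.
                raw[(i, j)] = max(0.0, cosine(vec_i, vec_j))

    # Assumed local normalization: the strongest edge at each node.
    strongest = defaultdict(float)
    for (i, j), w in raw.items():
        strongest[i] = max(strongest[i], w)
        strongest[j] = max(strongest[j], w)

    graph = {}
    for (i, j), w in raw.items():
        score_i = w / strongest[i] if strongest[i] > 0 else 0.0
        score_j = w / strongest[j] if strongest[j] > 0 else 0.0
        # Each edge is scored from both endpoints; average the two scores.
        graph[(i, j)] = (score_i + score_j) / 2.0

    # Self-loops on every node.
    for i in range(len(sentences)):
        graph[(i, i)] = self_loop_weight
    return graph


def linearized_stability(graph, community_of, t=1.0):
    """Quality of a partition at resolution t (assumed linearized form)."""
    degree = defaultdict(float)
    internal = defaultdict(float)
    for (i, j), w in graph.items():
        degree[i] += w
        degree[j] += w  # a self-loop therefore adds 2w to its node
        if community_of[i] == community_of[j]:
            internal[community_of[i]] += w
    two_m = sum(degree.values())
    if two_m == 0:
        return 0.0
    community_degree = defaultdict(float)
    for node, k in degree.items():
        community_degree[community_of[node]] += k
    q = 1.0 - t
    for c, k_c in community_degree.items():
        q += t * 2.0 * internal[c] / two_m - (k_c / two_m) ** 2
    return q
```

In this form, smaller values of `t` tend to favor more, smaller communities and larger values favor fewer, larger ones, which is how a single parameter can steer how finely the meeting is partitioned.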

On the validation set, the accuracy increased from 47 percent to 79 percent.

vdpappu assigned reaganrewop Oct 31, 2019

vdpappu added the priority label Oct 31, 2019

karthikmuralidharan mentioned this issue Nov 20, 2019

Topic based PIMs. #117

Closed

saibaggins unassigned reaganrewop Apr 28, 2020
