DupePredictor should assign more weight for recent samples #1

Open
lopuhin opened this issue Apr 26, 2016 · 0 comments
lopuhin commented Apr 26, 2016

It makes sense to assign more weight to recent samples in DupePredictor. Together with TeamHG-Memex/undercrawler#41, this should handle the case where the crawler first visits a large part A of a website and learns a pattern, then moves on to another part B of the same website where that pattern no longer holds.
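One possible way to weight recent samples more heavily is exponential decay of the evidence counts. The sketch below is only an illustration of that idea, not DupePredictor's actual code or API: the class `DecayedDupeRate`, the `decay` parameter, and the `simulate` helper are all hypothetical names, and the "part A then part B" sample counts are made up for the demonstration.

```python
class DecayedDupeRate:
    """Hypothetical helper: exponentially decayed estimate of how often
    a URL pattern leads to duplicate pages (not part of DupePredictor)."""

    def __init__(self, decay=0.99):
        if not 0.0 < decay <= 1.0:
            raise ValueError("decay must be in (0, 1]")
        self.decay = decay
        self.dup_weight = 0.0    # decayed count of duplicate samples
        self.total_weight = 0.0  # decayed count of all samples

    def update(self, is_duplicate):
        # Shrink the old evidence first, then add the new sample with weight 1,
        # so a sample seen k updates ago contributes decay ** k.
        self.dup_weight = self.dup_weight * self.decay + (1.0 if is_duplicate else 0.0)
        self.total_weight = self.total_weight * self.decay + 1.0

    def probability(self):
        # With no samples yet there is no evidence of duplication.
        if self.total_weight == 0.0:
            return 0.0
        return self.dup_weight / self.total_weight


def simulate(decay):
    """Part A: the pattern always yields duplicates; part B: it never does."""
    estimator = DecayedDupeRate(decay)
    for _ in range(1000):  # crawler visits a large part A
        estimator.update(True)
    for _ in range(100):   # crawler moves to part B
        estimator.update(False)
    return estimator.probability()


if __name__ == "__main__":
    print("no decay:  ", simulate(1.0))
    print("decay 0.95:", simulate(0.95))
```

With `decay=1.0` every sample counts equally, so the estimate stays dominated by part A long after the crawler has left it. With a decay below 1 the estimate follows part B after a comparatively small number of samples. The right decay value would depend on how quickly patterns are expected to change, which this sketch does not try to answer.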

Moved from TeamHG-Memex/undercrawler#42
