Speed #54

Open
hg2051 opened this issue Oct 26, 2020 · 1 comment

Comments

@hg2051

hg2051 commented Oct 26, 2020

spacy-stanza is much slower than using Stanza on its own.

@adrianeboyd
Contributor

There is some overhead in aligning the annotation and creating the spacy Doc, although I wouldn't have expected it to be that significant vs. the stanza processing time.

The main difference may be batching, though. Stanza doesn't have any native support for batching (they just suggest concatenating docs with \n\n), so the spacy-stanza wrapper processes each text individually, even with nlp.pipe, since we also want to be able to process docs that contain \n\n without problems.
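To make the batching trade-off concrete, here is a minimal sketch of how texts could be grouped for the \n\n concatenation approach while keeping texts that already contain \n\n on their own. It is only an illustration, not what spacy-stanza does: `plan_batches`, `is_safe_to_join`, and the `batch_size` parameter are hypothetical names, and no stanza call is made here.

```python
from typing import Iterable, Iterator, List, Tuple

SEPARATOR = "\n\n"  # the separator stanza suggests for concatenating docs


def is_safe_to_join(text: str) -> bool:
    """Hypothetical check: a text can be concatenated only if joining it
    with SEPARATOR cannot blur the boundary to its neighbours."""
    return bool(text) and text == text.strip() and SEPARATOR not in text


def plan_batches(
    texts: Iterable[str], batch_size: int = 32
) -> Iterator[Tuple[List[int], str]]:
    """Yield (original_indices, text_to_process) pairs.

    Safe texts are joined with SEPARATOR in groups of up to batch_size.
    Any other text (empty, padded with whitespace, or containing the
    separator) is yielded alone, so it is processed individually.
    """
    pending_idx: List[int] = []
    pending: List[str] = []
    for i, text in enumerate(texts):
        if not is_safe_to_join(text):
            # Flush the current group first so the output order is preserved.
            if pending:
                yield pending_idx, SEPARATOR.join(pending)
                pending_idx, pending = [], []
            yield [i], text
            continue
        pending_idx.append(i)
        pending.append(text)
        if len(pending) == batch_size:
            yield pending_idx, SEPARATOR.join(pending)
            pending_idx, pending = [], []
    if pending:
        yield pending_idx, SEPARATOR.join(pending)
```

Each joined string could then be passed to a stanza pipeline in a single call. Mapping the resulting sentences back to the original texts is the hard part, and it is deliberately left out of this sketch.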

Can you provide more details about how you're using spacy-stanza?
