CoreNLP error: "Error while loading a tagger model (probably missing model file)" #24

Open
inariksit opened this issue Nov 9, 2021 · 0 comments

I followed the installation instructions, and the ant build succeeded. I am running macOS Big Sur, version 11.6.

However, when I try to run any of the demos, for example cat input-english.txt | sh run-english.sh, I get errors such as:

  • Error while loading a tagger model (probably missing model file)
  • Unable to open "edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger"
    (Full version of the error is at the bottom of this message.)

Googling the error message, I found this issue in CoreNLP itself: stanfordnlp/CoreNLP#1101
Like the original author of that issue, I have verified that english-left3words-distsim.tagger is present in lib/stanford-corenlp-3.6.0-models.jar:

$ unzip -l lib/stanford-corenlp-3.6.0-models.jar | grep words
        0  01-19-2016 04:03   edu/stanford/nlp/models/pos-tagger/english-left3words/
     1522  01-19-2016 04:03   edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger.props
 12409329  01-19-2016 04:03   edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger
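
The listing above checks a single jar. As a rough sketch, the same check can be run over every jar in lib/ to see which ones contain the model path, for instance to spot a missing or duplicated models jar. The function name find_resource_in_jars is made up for this illustration and is not part of UDepLambda or CoreNLP; it only assumes unzip and grep are installed.

```shell
#!/bin/sh
# find_resource_in_jars DIR RESOURCE   (hypothetical helper, not part of UDepLambda)
# Prints every jar in DIR whose file listing mentions RESOURCE.
# Exit status is 0 if at least one jar matched, 1 otherwise.
find_resource_in_jars() {
    dir=$1
    resource=$2
    found=1
    for jar in "$dir"/*.jar; do
        # With no jars present the glob stays unexpanded; skip it.
        [ -e "$jar" ] || continue
        # Same idea as "unzip -l ... | grep ..." above, with a fixed-string match.
        if unzip -l "$jar" 2>/dev/null | grep -F -q -- "$resource"; then
            printf '%s\n' "$jar"
            found=0
        fi
    done
    return "$found"
}

# Example call, using the paths from this report:
# find_resource_in_jars lib edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger
```

This only shows which jars on disk contain the file. It does not show whether the Java class loader can actually see it at runtime, which is the part the error complains about.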

According to CoreNLP issue 1101, the model is not found because of a class loader change in the CoreNLP library. I don't really know Java, so that doesn't tell me much. :-P

Does the CoreNLP library included in UDepLambda need to be updated? Should the Java version requirements be different?
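
To make the Java version question easier to answer, here is a small sketch for reporting the major Java version. The helper name java_major_from_line is hypothetical, and it assumes the first line printed by java -version has the usual form with the version in double quotes (for example java version "1.8.0_292" or openjdk version "11.0.12" 2021-07-20).

```shell
#!/bin/sh
# java_major_from_line LINE   (hypothetical helper)
# Extracts the major version from the first line of `java -version`.
java_major_from_line() {
    # Pull out the quoted version string, e.g. 1.8.0_292 or 11.0.12.
    version=$(printf '%s\n' "$1" | sed -n 's/^[^"]*"\([^"]*\)".*$/\1/p')
    # Old-style versions are written 1.N; the major version is N.
    case $version in
        1.*) version=${version#1.} ;;
    esac
    # Keep only the leading digits.
    printf '%s\n' "${version%%[!0-9]*}"
}

# Example call (java -version writes to stderr, hence 2>&1):
# java_major_from_line "$(java -version 2>&1 | head -n 1)"
```

If the environment prints something else first (for example a notice about JAVA_TOOL_OPTIONS), the first line will not contain the version and the sketch prints an empty line.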

Alternatively, would it be possible to include a script that just reads, e.g., CoNLL-U (.conllu) files and produces the logical forms from them? I tried splitting that part out into a separate .sh file and giving it a tab-separated .conllu file as input, but it could not read the file.
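
For reference, a minimal sketch of the reading step alone: it turns CoNLL-U into one whitespace-tokenized sentence per line, which is the shape the tokenize.whitespace and ssplit.eolonly options suggest. The function name conllu_to_lines is made up here. It keeps only the FORM column, so the dependency parse is dropped, and whether NlpPipeline accepts such lines directly (or needs them wrapped in some other input format) is an assumption I have not verified.

```shell
#!/bin/sh
# conllu_to_lines   (hypothetical helper)
# Reads CoNLL-U on stdin, writes one whitespace-tokenized sentence per line.
conllu_to_lines() {
    awk -F '\t' '
        # A blank line ends the current sentence.
        /^[[:space:]]*$/ {
            if (n > 0) { print line; line = ""; n = 0 }
            next
        }
        # Comment lines start with "#".
        /^#/ { next }
        # Skip multiword-token ranges (1-2) and empty nodes (1.1).
        $1 !~ /^[0-9]+$/ { next }
        # Column 2 is FORM; join the forms with single spaces.
        { line = (n++ ? line " " : "") $2 }
        # Flush the last sentence if the file does not end with a blank line.
        END { if (n > 0) print line }
    '
}

# Example call (file name is a placeholder):
# conllu_to_lines < input.conllu
```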

# split only the semantic parser into its own shellscript
java -Dfile.encoding="UTF-8" -cp bin:lib/* deplambda.others.NlpPipeline `# This pipeline runs semantic parser` \
    annotators tokenize,ssplit \
    tokenize.whitespace true \
    ssplit.eolonly true \
    languageCode en \
    deplambda true \
    deplambda.definedTypesFile lib_data/ud.types.txt \
    deplambda.treeTransformationsFile lib_data/ud-enhancement-rules.proto \
    deplambda.relationPrioritiesFile lib_data/ud-obliqueness-hierarchy.proto  \
    deplambda.lambdaAssignmentRulesFile lib_data/ud-substitution-rules.proto \
    deplambda.lexicalizePredicates true \
    deplambda.debugToFile debug.txt \
    nthreads 1

Finally, here's the full output when I run run-english.sh.

$ cat input-english.txt | sh run-english.sh 
{tokenize.whitespace=true, annotators=tokenize,ssplit, preprocess.addNamedEntities=true, ssplit.eolonly=true, preprocess.addDateEntities=true, nthreads=1}
{annotators=tokenize,ssplit,pos,lemma,ner,depparse, tokenize.language=en, ssplit.eolonly=true, nthreads=1}
{deplambda.lambdaAssignmentRulesFile=lib_data/ud-substitution-rules.proto, tokenize.whitespace=true, deplambda.treeTransformationsFile=lib_data/ud-enhancement-rules.proto, annotators=tokenize,ssplit, deplambda=true, deplambda.lexicalizePredicates=true, deplambda.definedTypesFile=lib_data/ud.types.txt, deplambda.relationPrioritiesFile=lib_data/ud-obliqueness-hierarchy.proto, ssplit.eolonly=true, nthreads=1, languageCode=en, deplambda.debugToFile=debug.txt}
{tokenize.whitespace=true, posTagKey=UD, annotators=tokenize,ssplit,pos, ssplit.eolonly=true, nthreads=1, languageCode=en, pos.model=lib_data/ud-models-v1.2/en/pos-tagger/utb-caseless-en-bidirectional-glove-distsim-lower.full.tagger}
NlpPipeline Specified Options : {preprocess.addDateEntities=true, ssplit.eolonly=true, tokenize.whitespace=true, nthreads=1, preprocess.addNamedEntities=true, annotators=tokenize,ssplit}
NlpPipeline Specified Options : {depparse.extradependencies=ref_only_uncollapsed, ssplit.eolonly=true, tokenize.language=en, nthreads=1, annotators=tokenize,ssplit,pos,lemma,ner,depparse}
NlpPipeline Specified Options : {pos.model=lib_data/ud-models-v1.2/en/pos-tagger/utb-caseless-en-bidirectional-glove-distsim-lower.full.tagger, posTagKey=UD, ssplit.eolonly=true, tokenize.whitespace=true, languageCode=en, nthreads=1, annotators=tokenize,ssplit,pos}NlpPipeline Specified Options : {nthreads=1, deplambda.lambdaAssignmentRulesFile=lib_data/ud-substitution-rules.proto, ssplit.eolonly=true, deplambda.treeTransformationsFile=lib_data/ud-enhancement-rules.proto, annotators=tokenize,ssplit, deplambda=true, tokenize.whitespace=true, deplambda.relationPrioritiesFile=lib_data/ud-obliqueness-hierarchy.proto, deplambda.definedTypesFile=lib_data/ud.types.txt, deplambda.debugToFile=debug.txt, deplambda.lexicalizePredicates=true, languageCode=en}

Adding annotator tokenize
Adding annotator tokenize
Adding annotator tokenize
Adding annotator tokenize
Adding annotator ssplit
Adding annotator ssplit
Adding annotator ssplit
Adding annotator ssplit
Adding annotator pos
Adding annotator pos
{tokenize.whitespace=true, annotators=tokenize,ssplit, preprocess.addNamedEntities=true, ssplit.eolonly=true, preprocess.addDateEntities=true, nthreads=1}
{deplambda.lambdaAssignmentRulesFile=lib_data/ud-substitution-rules.proto, tokenize.whitespace=true, deplambda.treeTransformationsFile=lib_data/ud-enhancement-rules.proto, annotators=tokenize,ssplit, deplambda=true, deplambda.lexicalizePredicates=true, deplambda.definedTypesFile=lib_data/ud.types.txt, deplambda.relationPrioritiesFile=lib_data/ud-obliqueness-hierarchy.proto, ssplit.eolonly=true, nthreads=1, languageCode=en, deplambda.debugToFile=debug.txt}
Loading DepLambda Model.. 
Exception in thread "main" edu.stanford.nlp.io.RuntimeIOException: Error while loading a tagger model (probably missing model file)
	at edu.stanford.nlp.tagger.maxent.MaxentTagger.readModelAndInit(MaxentTagger.java:799)
	at edu.stanford.nlp.tagger.maxent.MaxentTagger.<init>(MaxentTagger.java:320)
	at edu.stanford.nlp.tagger.maxent.MaxentTagger.<init>(MaxentTagger.java:273)
	at edu.stanford.nlp.pipeline.POSTaggerAnnotator.loadModel(POSTaggerAnnotator.java:85)
	at edu.stanford.nlp.pipeline.POSTaggerAnnotator.<init>(POSTaggerAnnotator.java:73)
	at edu.stanford.nlp.pipeline.AnnotatorImplementations.posTagger(AnnotatorImplementations.java:53)
	at edu.stanford.nlp.pipeline.StanfordCoreNLP.lambda$getNamedAnnotators$43(StanfordCoreNLP.java:544)
	at edu.stanford.nlp.pipeline.StanfordCoreNLP.lambda$null$70(StanfordCoreNLP.java:625)
	at edu.stanford.nlp.util.Lazy$3.compute(Lazy.java:126)
	at edu.stanford.nlp.util.Lazy.get(Lazy.java:31)
	at edu.stanford.nlp.pipeline.AnnotatorPool.get(AnnotatorPool.java:149)
	at edu.stanford.nlp.pipeline.StanfordCoreNLP.construct(StanfordCoreNLP.java:495)
	at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:201)
	at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:194)
	at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:181)
	at in.sivareddy.graphparser.util.NlpPipeline.<init>(NlpPipeline.java:144)
	at deplambda.others.NlpPipeline.<init>(NlpPipeline.java:41)
	at deplambda.others.NlpPipeline.main(NlpPipeline.java:137)
Caused by: java.io.IOException: Unable to open "edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger" as class path, filename or URL
	at edu.stanford.nlp.io.IOUtils.getInputStreamFromURLOrClasspathOrFileSystem(IOUtils.java:481)
	at edu.stanford.nlp.tagger.maxent.MaxentTagger.readModelAndInit(MaxentTagger.java:797)
	... 17 more
deplambda.definedTypesFile=lib_data/ud.types.txt
deplambda.treeTransformationsFile=lib_data/ud-enhancement-rules.proto
deplambda.relationPrioritiesFile=lib_data/ud-obliqueness-hierarchy.proto
deplambda.lambdaAssignmentRulesFile=lib_data/ud-substitution-rules.proto
Loaded DepLambda Model.. 
Loading POS tagger from lib_data/ud-models-v1.2/en/pos-tagger/utb-caseless-en-bidirectional-glove-distsim-lower.full.tagger ... done [0.9 sec].
{tokenize.whitespace=true, posTagKey=UD, annotators=tokenize,ssplit,pos, ssplit.eolonly=true, nthreads=1, languageCode=en, pos.model=lib_data/ud-models-v1.2/en/pos-tagger/utb-caseless-en-bidirectional-glove-distsim-lower.full.tagger}