I have phrases with named entities that I want the word_segmentation API to ignore. I tried replacing the named entities with SPECIAL_TOKEN_1, SPECIAL_TOKEN_2 etc in the phrase itself, then passing SPECIAL_TOKEN_1 and SPECIAL_TOKEN_2 as ignore_token to the call to word_segmentation. I cannot get this to work.
phrase = "Hello SPECIAL_TOKEN_1, I am happyto meet you tomorrowmorning. Thanks, SPECIAL_TOKEN_2"
phrase_suggestions = sym_spell.word_segmentation(phrase)
phrase_suggestions looks like this:
Composition(segmented_string='Hello **SPECIAL _TOKEN_ 1,** I am happy to meet you tomorrow morning. Thanks, **SPECIAL_ TOKEN_2**', corrected_string='Hello Special token of I am happy to meet you tomorrow morning Thanks Special Token', distance_sum=14, log_prob_sum=-55.6460931972679)
Notice how SPECIAL_TOKEN_1 and SPECIAL_TOKEN_2 get broken.
I tried using the ignore_token argument but cannot get it to work:
phrase = "Hello SPECIAL_TOKEN_1, I am happyto meet you tomorrowmorning. Thanks, SPECIAL_TOKEN_2"
phrase_suggestions = sym_spell.word_segmentation(phrase, ignore_token='SPECIAL_TOKEN_1')
I get back the same phrase_suggestions as before. I am also not sure how to pass multiple tokens to ignore.
I also tried another call and got the following returned as phrase_suggestions:
Composition(segmented_string='Hello **SPECIAL _TOKEN_ 1**, I am happy to meet you tomorrow morning. Thanks, **SPECIAL_ TOKEN_2**', corrected_string='Hello Special token of I am happy to meet you tomorrow morning Thanks Special Token', distance_sum=14, log_prob_sum=-55.6460931972679)
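On the question of passing several tokens at once: if ignore_token is interpreted as a regular expression (an assumption here, not something confirmed in this thread), then one pattern could cover all placeholders. The sketch below only builds and checks such a pattern with the standard re module; it does not call word_segmentation, so whether the placeholders actually survive segmentation remains the open question.

```python
import re

# Placeholder names taken from the phrase above.
placeholders = ["SPECIAL_TOKEN_1", "SPECIAL_TOKEN_2"]

# Escape each name and join with "|" so a single pattern matches any of them.
ignore_pattern = "|".join(re.escape(name) for name in placeholders)

# A more general alternative, assuming every placeholder follows this naming scheme.
generic_pattern = r"SPECIAL_TOKEN_\d+"

phrase = (
    "Hello SPECIAL_TOKEN_1, I am happyto meet you tomorrowmorning. "
    "Thanks, SPECIAL_TOKEN_2"
)

# Both patterns should find the same two placeholders in the phrase.
matched = re.findall(ignore_pattern, phrase)
matched_generic = re.findall(generic_pattern, phrase)
```

Either pattern string could then be tried as the ignore_token value.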
Could you please help and also add more documentation on using this parameter?
What's the recommended way to deal with named entities?
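One workaround that does not depend on ignore_token at all is to split the phrase around the placeholders, segment only the text in between, and then stitch everything back together. The following is a minimal sketch of that idea, not the library's recommended approach. The helper segment_around_placeholders and the stand-in fake_segment are hypothetical names introduced for illustration; in practice the segment argument would wrap the real segmenter.

```python
import re
from typing import Callable

# Assumption: every placeholder looks like SPECIAL_TOKEN_<number>.
PLACEHOLDER = re.compile(r"(SPECIAL_TOKEN_\d+)")


def segment_around_placeholders(phrase: str, segment: Callable[[str], str]) -> str:
    """Run `segment` only on the text between placeholders."""
    pieces = []
    # Splitting on a capturing group keeps the placeholders in the result list.
    for part in PLACEHOLDER.split(phrase):
        core = part.strip()
        if PLACEHOLDER.fullmatch(part) or not core:
            # Placeholders and whitespace-only gaps pass through untouched.
            pieces.append(part)
            continue
        # Preserve surrounding whitespace in case the segmenter drops it.
        lead = part[: len(part) - len(part.lstrip())]
        trail = part[len(part.rstrip()):]
        pieces.append(lead + segment(core) + trail)
    return "".join(pieces)


def fake_segment(text: str) -> str:
    """Stand-in segmenter for this demo; it only fixes the two fused words."""
    return text.replace("happyto", "happy to").replace(
        "tomorrowmorning", "tomorrow morning"
    )


phrase = (
    "Hello SPECIAL_TOKEN_1, I am happyto meet you tomorrowmorning. "
    "Thanks, SPECIAL_TOKEN_2"
)
result = segment_around_placeholders(phrase, fake_segment)
```

With the real library, the segment argument could be something like `lambda text: sym_spell.word_segmentation(text).segmented_string`, using the sym_spell object and the segmented_string field shown in the output above. Segmenting each fragment separately may change how punctuation next to a placeholder is handled, so the results are worth checking on real data.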