Magpie used in binary classification..... #157

Closed
JessicaKuo opened this issue Sep 25, 2018 · 4 comments

Comments

@JessicaKuo

Hello,

I have used magpie for multi-label text classification before and found that it's a powerful tool.

Recently I tried to use Magpie for binary text classification. I have 217 cases in total, split 4:1 into training and test sets. The output looks like this:

answerICD= tumor positive,, 2
Predict: 0 tumor positive
tumor positive 0.51141936
tumor negative 0.5093596

answerICD= tumor positive,, 2
Predict: 0 tumor positive
tumor positive 0.51141936
tumor negative 0.5093596

answerICD= tumor negative, 1
tumor positive 0.51141936
tumor negative 0.5093596

answerICD= tumor negative, 1
tumor positive 0.51141936
tumor negative 0.5093596

As you can see, the two labels get the same probabilities in every test case. This result is pretty strange. Is there any suggestion or explanation for it?

Thanks for your patience!

@dorg-ekrolewicz

dorg-ekrolewicz commented Sep 25, 2018 via email

@JessicaKuo
Author

Thanks for your reply. I found that I had skipped the preprocessing step that handles line breaks, so only the first line of each text was read. That line has the same content in every document, which is why every case got the same probabilities. It is solved now.
Sorry for the inconvenience, and thanks for your kind reply.
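
For anyone who hits the same symptom, the fix described above might look roughly like the sketch below. It is only an illustration, not Magpie code: `flatten_line_breaks` and `first_lines_are_identical` are hypothetical helper names, and the assumption (taken from the comment above) is that only the first line of each document is read, so the whole text has to sit on one line.

```python
import re


def flatten_line_breaks(text: str) -> str:
    """Collapse every run of whitespace, including line breaks, into one space."""
    return re.sub(r"\s+", " ", text).strip()


def first_lines_are_identical(documents) -> bool:
    """Return True when two or more documents all share the same first line.

    If only the first line is read, such documents look identical to the
    model and would all receive the same probabilities.
    """
    first_lines = {
        doc.strip().splitlines()[0] if doc.strip() else ""
        for doc in documents
    }
    return len(documents) > 1 and len(first_lines) == 1


# Hypothetical example texts that start with the same header line.
raw_docs = [
    "Report header\nTumor cells seen.\r\n\r\nMargins clear.",
    "Report header\nNo tumor cells seen.",
]
print(first_lines_are_identical(raw_docs))  # the symptom: same first line
cleaned_docs = [flatten_line_breaks(doc) for doc in raw_docs]
print(cleaned_docs[0])
```

Running the check on the raw texts before training is a cheap way to see whether identical predictions come from identical inputs rather than from the model.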

@kaundinya5

This is happening to me as well. I'm trying to classify policy numbers and account numbers, both of which are alphanumeric. I trained the model, and I always get the same probabilities. Since each .txt file contains just one word, I changed the minimum number of words in word2vec to 1. Am I doing something wrong?

@jstypka
Collaborator

jstypka commented Sep 26, 2018

As mentioned in #158, Magpie is not of much help if your documents contain only one word.
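
A quick way to spot this case before training is to count the words in each document. The snippet below is a minimal sketch and not part of Magpie: `find_short_documents` is a hypothetical helper, the file names and contents are made up, and it assumes words are separated by whitespace.

```python
def find_short_documents(documents, min_words=2):
    """Return the names of documents with fewer than min_words whitespace-separated words."""
    return [
        name
        for name, text in documents.items()
        if len(text.split()) < min_words
    ]


# Hypothetical corpus: file name -> file content.
corpus = {
    "a.txt": "POL123456",
    "b.txt": "tumor cells seen in the margin",
}
print(find_short_documents(corpus))
```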

@jstypka jstypka closed this as completed Sep 26, 2018