WIP: Added support for SpamAssassin client #43
base: master
Adds the ability to use the SA client for auditing messages instead of an embedded SA instance. This decouples SpamAssassin from this proxy so the two can be managed and run independently.
# Check spamminess (returns Mail::SpamAssassin::PerMsgStatus object)
my $result = $assassin->check($mail);
# use Mail::SpamAssassin::PerMsgStatus object to rewrite message
It shouldn't bother rewriting the message at all unless the isspam or tagall flag is set.
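A minimal sketch of the guard suggested here, assuming a status object that exposes is_spam() and rewrite_mail() as Mail::SpamAssassin::PerMsgStatus does. StubStatus and maybe_rewrite() are hypothetical names used only so the example runs without SpamAssassin installed; they are not part of the PR.

```perl
use strict;
use warnings;

# Hypothetical stand-in for Mail::SpamAssassin::PerMsgStatus (assumption:
# only is_spam() and rewrite_mail() matter for this illustration).
package StubStatus;
sub new          { my ($class, %args) = @_; return bless { spam => $args{spam}, rewrites => 0 }, $class }
sub is_spam      { return $_[0]{spam} }
sub rewrite_mail { my ($self) = @_; $self->{rewrites}++; return "X-Spam-Flag: YES\noriginal\n" }

package main;

# Hypothetical helper: only pay for the rewrite when its result will be used.
sub maybe_rewrite {
    my ($status, $tagall, $original) = @_;
    return $original unless $status->is_spam || $tagall;
    return $status->rewrite_mail;
}

my $ham  = StubStatus->new(spam => 0);
my $spam = StubStatus->new(spam => 1);

my $ham_out  = maybe_rewrite($ham,  0, "original\n");   # rewrite_mail is never called
my $spam_out = maybe_rewrite($spam, 0, "original\n");   # rewrite_mail is called once
```

With the guard in place, non-spam messages pass through untouched unless tagall is set, so the embedded path would skip the rewrite cost for them.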
The SpamAssassin client always rewrites the message, so this is done to keep both implementations consistent. I agree the operation is unwanted, and its result is ignored afterwards.
my $status = $assassin->check($mail);

my $status = $self->audit(@msglines);
undef @msglines;
What happens with SA < v3?
Since @msglines is no longer passed by reference, it's safe to undef it here.
I didn't notice that before. My Perl is rusty, but this means audit() is making a copy of the lines array (in my ($self, $msglines) = @_;), doesn't it? In fact it has to reconstruct the array from individual members (since what is actually passed is audit($msglines[0], $msglines[1], ...)).
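To make the calling-convention point concrete, here is a small core-Perl sketch. The subs first_only(), takes_list() and takes_ref() are hypothetical and only mimic the two ways audit() could receive the lines; they are not the PR's code.

```perl
use strict;
use warnings;

# Same unpacking as quoted above: when called with a flattened array,
# $msglines receives only the first line and the rest are dropped.
sub first_only {
    my ($self, $msglines) = @_;
    return $msglines;
}

# Collects the flattened arguments back into a new array (one copy per line).
sub takes_list {
    my ($self, @lines) = @_;
    return scalar @lines;
}

# Receives one array reference; the caller's array is shared, not copied.
sub takes_ref {
    my ($self, $lines) = @_;
    return $lines;
}

my @msglines = ("Subject: hi\n", "\n", "body\n");

my $first  = first_only(undef, @msglines);   # same as first_only(undef, $msglines[0], $msglines[1], ...)
my $n_list = takes_list(undef, @msglines);
my $ref    = takes_ref(undef, \@msglines);
```

Passing \@msglines hands over a single scalar, so the callee sees the caller's array directly; the trade-off is that the caller must not undef the array while the callee still uses it.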
This is fixed now: instead of passing an array of message lines, I am now passing the message as a string.
Well, now SA has to re-split the message into lines.
SA::parse() ends up here: https://github.com/apache/spamassassin/blob/trunk/lib/Mail/SpamAssassin/Message.pm#L184
What was the issue with passing the array of lines by reference?
We still have the issue of re-writing the message using SA each time even if it's not spam or tag-all.
My number one priority for these changes is that it doesn't impact existing implementations in any significant way. All this text parsing is already super expensive, no further overhead should be introduced. If anything I'd love to find ways to make this more efficient.
Hi, the reason for passing a string instead of an array is that, per the documentation, both methods expect a scalar string containing the mail:
SpamAssassin::Client
public instance (\%) process (String $msg)
SpamAssassin
parse($message, $parse_now [, $suppl_attrib])
Parse will return a Mail::SpamAssassin::Message object with just the headers parsed. When calling this function, there are two optional parameters that can be passed in: $message is either undef (which will use STDIN), a scalar - a string containing an entire message, a reference to such string, an array reference of the message with one line per array element, or either a file glob or an IO::File object which holds the entire contents of the message; and $parse_now, which specifies whether or not to create a MIME tree at parse time or later as necessary.
About tag-all: unfortunately I haven't found a way to keep the behaviour consistent between the two implementations, as the client always returns a processed message.
- rename isspam -> is_spam
- call finish() on the $assassin response
#42