WIP: Added support for SpamAssassin client #43

Open
wants to merge 9 commits into
base: master

Conversation

ramank775

Adds the ability to audit messages with the SA client instead of the embedded SA instance. This decouples SpamAssassin from this proxy, so the two can be managed and run independently.

#42

# Check spamminess (returns Mail::SpamAssassin::PerMsgStatus object)
my $result = $assassin->check($mail);
# use Mail::SpamAssassin::PerMsgStatus object to rewrite message
Owner

It shouldn't bother rewriting the message at all unless the message is spam or the tagall flag is set.
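
A minimal sketch of that guard, under stated assumptions: `StubStatus` and `maybe_rewrite` are hypothetical names used only for illustration, and the stub stands in for a Mail::SpamAssassin::PerMsgStatus object so the example runs without SpamAssassin installed. It is not spampd's actual code.

```perl
use strict;
use warnings;

# Hypothetical stand-in for Mail::SpamAssassin::PerMsgStatus; it only
# records how many times a rewrite was requested.
package StubStatus;
sub new {
    my ($class, %args) = @_;
    return bless { spam => $args{spam}, rewrites => 0 }, $class;
}
sub is_spam      { return $_[0]{spam} }
sub rewrite_mail { my ($self) = @_; $self->{rewrites}++; return "X-Spam-Flag: YES\n" }

package main;

# Rewrite only when the result will be used; otherwise return nothing so
# the caller can pass the original lines through untouched.
sub maybe_rewrite {
    my ($status, $tagall) = @_;
    return unless $status->is_spam || $tagall;
    return $status->rewrite_mail;
}

my $ham  = StubStatus->new(spam => 0);
my $spam = StubStatus->new(spam => 1);

maybe_rewrite($ham,  0);    # skipped: not spam, tagall off
maybe_rewrite($ham,  1);    # rewritten: tagall on
maybe_rewrite($spam, 0);    # rewritten: spam

# prints "ham rewrites: 1, spam rewrites: 1"
printf "ham rewrites: %d, spam rewrites: %d\n", $ham->{rewrites}, $spam->{rewrites};
```

With a guard like this, the cost of rewriting is paid only for messages whose rewritten form is actually sent on.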

Author

The SpamAssassin client always rewrites the message, so this is done to keep both implementations consistent. I agree that the operation is unwanted here and that its result is ignored afterwards.

my $status = $assassin->check($mail);

my $status = $self->audit(@msglines);
undef @msglines;
Owner

What happens with SA < v3?

Author

Since @msglines is no longer passed by reference, it's safe to undef it here.

Owner

I didn't notice that before. My Perl is rusty, but this means audit() is making a copy of the lines array (in my ($self, $msglines) = @_;), doesn't it? In fact it has to reconstruct the array from individual members, since what is actually passed is audit($msglines[0], $msglines[1], ...).
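
To make the concern concrete, here is a small sketch of how Perl passes arrays to subs. The helper names (`by_value`, `by_ref`, `first_only`) are hypothetical and are not taken from spampd; the sample lines are made up.

```perl
use strict;
use warnings;

# Called as by_value(@lines): Perl flattens the array into @_, and copying
# @_ into a new lexical array duplicates every line.
sub by_value {
    my (@lines) = @_;
    $lines[0] = "changed\n";      # modifies the copy only
    return scalar @lines;
}

# Called as by_ref(\@lines): a single reference is passed, so the callee
# works on the caller's array without copying it.
sub by_ref {
    my ($lines) = @_;
    $lines->[0] = "changed\n";    # modifies the caller's array
    return scalar @$lines;
}

# Pitfall: unpacking a flattened list into one scalar keeps only the
# first element and silently drops the rest.
sub first_only {
    my ($lines) = @_;
    return $lines;
}

my @msglines = ("Subject: test\n", "\n", "body\n");

by_value(@msglines);
my $after_value = $msglines[0];   # unchanged: "Subject: test\n"

by_ref(\@msglines);
my $after_ref = $msglines[0];     # "changed\n"

my $got = first_only(@msglines);  # first line only
```

Passing `\@msglines` keeps the call cheap regardless of message size, which is why the by-reference form matters for large mails.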

Author

This is fixed now. Instead of passing an array of message lines, I now pass the message as a string.

Owner

Well, now SA has to re-split the message into lines.
SA::parse() ends up here: https://github.com/apache/spamassassin/blob/trunk/lib/Mail/SpamAssassin/Message.pm#L184

What was the issue with passing the array of lines by reference?

We still have the issue of re-writing the message using SA each time even if it's not spam or tag-all.

My number one priority for these changes is that it doesn't impact existing implementations in any significant way. All this text parsing is already super expensive, no further overhead should be introduced. If anything I'd love to find ways to make this more efficient.

Author

Hi, the reason for passing a string instead of an array is that, per the documentation, both methods expect a scalar string containing the mail:

SpamAssassin::Client

public instance (\%) process (String $msg)

SpamAssassin

parse($message, $parse_now [, $suppl_attrib])
Parse will return a Mail::SpamAssassin::Message object with just the headers parsed. When calling this function, there are two optional parameters that can be passed in: $message is either undef (which will use STDIN), a scalar - a string containing an entire message, a reference to such string, an array reference of the message with one line per array element, or either a file glob or an IO::File object which holds the entire contents of the message; and $parse_now, which specifies whether or not to create a MIME tree at parse time or later as necessary.
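
The quoted parse() documentation lists several accepted input kinds, including an array reference with one line per element. The sketch below illustrates what handling those kinds can look like; `to_lines` is a hypothetical helper written for this example and is not SpamAssassin's implementation. It shows that an array reference can be used as-is, while a string has to be split back into lines.

```perl
use strict;
use warnings;

# Hypothetical helper: accepts a string, a reference to a string, or an
# array reference of lines, and returns an array reference of lines.
sub to_lines {
    my ($message) = @_;

    # Already one line per element: hand it back without copying.
    return $message if ref($message) eq 'ARRAY';

    # Dereference a string reference, otherwise treat it as a plain string.
    my $text = ref($message) eq 'SCALAR' ? $$message : $message;

    # Split at each line start, keeping the line terminators.
    return [ split /^/m, $text ];
}

my @msglines = ("Subject: test\n", "\n", "body\n");
my $joined   = join '', @msglines;

my $same     = to_lines(\@msglines);  # the caller's own array, no extra work
my $split    = to_lines($joined);     # a new array built by re-splitting
my $from_ref = to_lines(\$joined);    # same result via a string reference
```

Under this assumption, joining lines into a string before the call and splitting them again afterwards is work that the array-reference form avoids.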

Author

About tag-all: unfortunately, I haven't found a way to keep the behaviour consistent between the two implementations, because the client always returns a processed message.

3 participants