Instructions for Designing Your Experiments and Creating a Motivation Example #1

Open
pooyanjamshidi opened this issue Sep 19, 2024 · 1 comment

pooyanjamshidi commented Sep 19, 2024

Specific Task:

For this project, your main challenge is to improve phishing detection by developing a real-time, multimodal system that combines transformer-based text models with additional features such as URLs and metadata.

Experiment Design:

  • Dataset: Start by fine-tuning a pre-trained transformer (e.g., BERT, GPT) on datasets such as SpamAssassin or PhishTank.
  • Model: Focus on testing the GRU model from the paper “Multimodel Phishing URL Detection” for real-time classification, as it has lower latency.
  • Model Enhancements: Experiment with combining text-based embeddings with URL and metadata features. Measure how much the multimodal model improves phishing classification accuracy over a text-only model.
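
One possible way to prototype the feature-combination step is sketched below. It is only an illustration: the specific URL features, the metadata fields (`has_attachment`, `num_links`, `reply_to_mismatch`), and the function names are assumptions made for this example and are not taken from the paper. The text embedding is treated as an opaque list of floats produced by whichever transformer you fine-tune.

```python
import re
from urllib.parse import urlparse

# Matches hosts that look like a dotted IPv4 address (rough check, no range validation).
_IPV4 = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")


def url_features(url: str) -> list[float]:
    """Hand-crafted lexical URL features (an illustrative choice, not the paper's set)."""
    # urlparse only fills in the host when a scheme is present, so add one if missing.
    parsed = urlparse(url if "://" in url else "http://" + url)
    host = parsed.hostname or ""
    return [
        float(len(url)),                       # total URL length
        float(host.count(".")),                # number of dots in the host
        float(bool(_IPV4.match(host))),        # host looks like a raw IPv4 address
        float("@" in url),                     # '@' can hide the real host
        float(parsed.scheme == "https"),       # uses HTTPS
        float(sum(c.isdigit() for c in url)),  # digit count
    ]


def metadata_features(has_attachment: bool, num_links: int,
                      reply_to_mismatch: bool) -> list[float]:
    """Hypothetical email metadata features; adapt to what your dataset provides."""
    return [float(has_attachment), float(num_links), float(reply_to_mismatch)]


def fuse(text_embedding: list[float], url_vec: list[float],
         meta_vec: list[float]) -> list[float]:
    """Early fusion: concatenate all modalities into one input vector."""
    return list(text_embedding) + url_vec + meta_vec
```

The fused vector can then be fed to the classifier head. Concatenation is the simplest fusion strategy; comparing it against the text-only input gives the improvement this experiment asks you to measure.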

Motivation Example:

Present a plot comparing phishing classification accuracy and latency between your GRU-based multimodal model and baseline models like BERT or GPT-2. This will demonstrate whether your system can handle real-time detection more efficiently than existing offline solutions.
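
A minimal sketch of how such a plot could be produced is shown below, assuming matplotlib is installed. The `ModelResult` container, the function names, and the choice of F1 as the accuracy axis are assumptions for illustration. No measurements are included, so you supply the numbers from your own experiments.

```python
from typing import NamedTuple


class ModelResult(NamedTuple):
    name: str          # label shown next to the point, e.g. a model name
    f1: float          # F1 score in [0, 1], measured on a held-out test set
    latency_ms: float  # mean per-message inference latency in milliseconds


def check_results(results: list[ModelResult]) -> None:
    """Reject values that cannot be drawn on a log-scaled latency axis."""
    for r in results:
        if r.latency_ms <= 0:
            raise ValueError(f"{r.name}: latency must be positive")
        if not 0.0 <= r.f1 <= 1.0:
            raise ValueError(f"{r.name}: F1 must lie in [0, 1]")


def plot_accuracy_vs_latency(results: list[ModelResult],
                             path: str = "accuracy_vs_latency.png") -> str:
    """Scatter plot with one labelled point per model; returns the output path."""
    check_results(results)
    # Imported here so the helpers above stay usable without matplotlib.
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    for r in results:
        ax.scatter(r.latency_ms, r.f1)
        ax.annotate(r.name, (r.latency_ms, r.f1),
                    textcoords="offset points", xytext=(5, 5))
    ax.set_xscale("log")  # latencies may differ by orders of magnitude
    ax.set_xlabel("Mean inference latency per message (ms, log scale)")
    ax.set_ylabel("F1 score")
    ax.set_title("Phishing detection: accuracy vs. latency")
    fig.savefig(path, dpi=150, bbox_inches="tight")
    plt.close(fig)
    return path
```

Points toward the upper left (high F1, low latency) are the ones that support the real-time claim.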

Evaluation Focus:

  • Metrics: Report the F1 score and precision-recall curves. Compare your results against the figures reported in the existing paper (LSTM: 96.9%, Bi-LSTM: 99%, GRU: 97.5%).
  • Real-time testing: Show how well the model performs on live email streams and evaluate its speed (latency).
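
The two evaluation points above can be sketched with the standard library alone. This is a rough harness under stated assumptions: `predict` stands for any callable that classifies one message (a hypothetical placeholder for your model), labels are encoded as 1 for phishing and 0 for legitimate, and latency is measured per message on a single thread.

```python
import statistics
import time


def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall and F1 for the positive (phishing) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def measure_latency_ms(predict, messages, warmup=3):
    """Time predict() on each message and summarise in milliseconds."""
    # Warm-up calls are not timed, so one-off setup costs do not skew the result.
    for m in messages[:warmup]:
        predict(m)
    timings = []
    for m in messages:
        start = time.perf_counter()
        predict(m)
        timings.append((time.perf_counter() - start) * 1000.0)
    return {
        "mean": statistics.mean(timings),
        "median": statistics.median(timings),
        "max": max(timings),
    }
```

Reporting the median and maximum alongside the mean helps show whether occasional slow messages would break a real-time budget.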

Keshawn7B (Collaborator) commented Sep 19, 2024
