Software Failure Analysis with GPT and Gemini Models

Overview

This project leverages Large Language Models (LLMs) like GPT-4 and Gemini to classify and analyze software failure incidents. The primary objective is to automate the summarization, categorization, and extraction of key information from articles related to software vulnerabilities, supply chain attacks, and other cyber incidents.

The project uses multiple APIs, including OpenAI's GPT API and Google's Gemini API, to generate content, categorize articles, and analyse data. It utilizes environment variables to manage API keys securely.

Files Overview

In src:

1. `Geminiprompts.py`

This script utilizes the Google Gemini API to generate content and analyze software-related articles. It connects to the Gemini API using the gemini-1.5-pro-latest model to generate summaries, identify key vulnerabilities, and classify information.

Key Features:

Configures Gemini models using an API key stored in environment variables.
Summarizes articles to extract critical insights.
Lists available models for content generation.
Demonstrates the usage of the Gemini models for content generation and analysis.

2. `Gpt4oPrompts.py`

This script focuses on utilizing the OpenAI GPT API for content generation. It uses the GPT-4o model to analyze software articles, classify incidents, and generate summaries.

Key Features:

Configures GPT models using environment variables for secure access.
Extracts insights from articles related to software vulnerabilities and incidents.
Includes functionality for categorizing incidents into predefined categories like negligence, malicious maintainers, attack chaining, etc.
Uses metrics like Cohen's Kappa Score for evaluating classification consistency.

In data:

3. `articles.py`

Contains a collection of software-related articles used for testing the Gemini and GPT models. These articles describe real-world incidents involving software vulnerabilities, supply chain attacks, and other cybersecurity issues.

4. `articles.csv`

A dataset containing articles with metadata used for training and evaluating the models.

5. `Classifying articles_analysis.xlsx`, `LLM Data.xlsx`, `Manual Labels.xlsx`

Excel files used for manual classification, data analysis, and evaluation of model performance. These files help in comparing the automated classification results with human-labeled data.

Setup Instructions

Clone the Repository:

git clone https://github.com/YourUsername/YourRepository.git
cd YourRepository

Create a .env File:

Add the following lines to your .env file:

API_KEY=your_gemini_api_key
OPENAI_API_KEY=your_openai_api_key

Install Dependencies:
```
pip install -r requirements.txt
```

Usage

Run the Geminiprompts.py script:
```
python Geminiprompts.py
```
This will generate summaries and analyze the articles using the Gemini API.
Run the Gpt4oPrompts.py script:
```
python Gpt4oPrompts.py
```
This will categorize articles and extract insights using the GPT API.

Future Work

Improve the classification accuracy by fine-tuning the prompts for GPT and Gemini models.
Expand the dataset with more articles to enhance model training.
Integrate additional metrics for performance evaluation.

Name		Name	Last commit message	Last commit date
Latest commit History 24 Commits
data		data
src		src
ECE_570_CP3.pdf		ECE_570_CP3.pdf
Overview_and_Demo.mp4		Overview_and_Demo.mp4
README.md		README.md
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Software Failure Analysis with GPT and Gemini Models

Overview

Table of Contents

Files Overview

In src:

1. `Geminiprompts.py`

2. `Gpt4oPrompts.py`

In data:

3. `articles.py`

4. `articles.csv`

5. `Classifying articles_analysis.xlsx`, `LLM Data.xlsx`, `Manual Labels.xlsx`

Setup Instructions

Usage

Future Work

About

Uh oh!

Releases

Packages

Uh oh!

Languages

Tanmay182003/ECE-570-project

Folders and files

Latest commit

History

Repository files navigation

Software Failure Analysis with GPT and Gemini Models

Overview

Table of Contents

Files Overview

In src:

1. Geminiprompts.py

2. Gpt4oPrompts.py

In data:

3. articles.py

4. articles.csv

5. Classifying articles_analysis.xlsx, LLM Data.xlsx, Manual Labels.xlsx

Setup Instructions

Usage

Future Work

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Languages

1. `Geminiprompts.py`

2. `Gpt4oPrompts.py`

3. `articles.py`

4. `articles.csv`

5. `Classifying articles_analysis.xlsx`, `LLM Data.xlsx`, `Manual Labels.xlsx`

Packages