This repository contains the source code and the experimental results related to the paper "Automating the Correctness Assessment of AI-generated Code for Security Contexts", accepted for publication in the Journal of Systems and Software (JSS).
The paper presents ACCA, a fully automated method to evaluate the correctness of AI-generated code for security purposes. The method uses symbolic execution to assess whether the AI-generated code behaves like a reference implementation. Its results show a very strong correlation with human-based evaluation, which is considered the ground truth for assessment in the field.
This repository contains:
- The source code for ACCA, together with the references file and predictions file needed to perform the semantic evaluation of AI-generated code and replicate our empirical analysis. The folder also contains a README.md file explaining how to run the code and how to test ACCA on a different pair of references and predictions files (`ACCA` folder).
- The files necessary for setting up the working environment, including the NASM assembler (`requirements.txt` and `nasm_setup.sh`).
- The results we obtained by evaluating the code generated by the five AI models covered in our analysis, i.e., Seq2Seq, CodeBERT, CodeT5+, PLBart, and ChatGPT-3.5. The folder contains an XLSX file with the results of our empirical analysis and a README.md file describing how to interpret the results (`Experimental Results` folder).
The README file is written based on our setup experience on Ubuntu 18.04.3 LTS, but ACCA works on both Windows and Linux OS.
It is strongly recommended to set up an Anaconda virtual environment.
Ensure you have Anaconda3 installed. If not, install Python 3.7 from Anaconda with the following steps:
- Install the list of dependencies described here.
- Download the installer here. For example, you can use the `wget` command: `wget https://repo.anaconda.com/archive/Anaconda3-2021.05-Linux-x86_64.sh`, then type `chmod +x Anaconda3-2021.05-Linux-x86_64.sh` and run `bash Anaconda3-2021.05-Linux-x86_64.sh` to complete the installation.
- You may need to add the Anaconda directory to the PATH environment variable (e.g., you can add `export PATH="/path_to_anaconda/anaconda3/bin:$PATH"` to the `.bashrc` file).
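Before downloading the installer, it can help to check whether `conda` is already reachable from the shell. The snippet below is a minimal sketch and is not part of the repository; the function name `check_conda` and its messages are hypothetical.

```shell
# Hypothetical helper (not shipped with ACCA): report whether conda is on PATH.
check_conda() {
    if command -v conda >/dev/null 2>&1; then
        echo "conda found"
    else
        echo "conda missing"
    fi
}

check_conda
```

If the function prints `conda missing` right after an installation, the PATH export described above is the first thing to check.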
- Create an Anaconda Python 3.7 virtual environment using the command `conda create -n yourenvname python=3.7`.
- Activate the environment by typing `source activate yourenvname`.
- Run `pip install -r requirements.txt --user` to install the dependencies.
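After activating the environment, you may want to confirm that the interpreter on PATH is really Python 3.7. The following is a small sketch under that assumption; `is_python37` is a hypothetical helper, not something provided by ACCA.

```shell
# Hypothetical helper: succeed only if the given "python --version" output
# reports a Python 3.7.x interpreter.
is_python37() {
    case "$1" in
        "Python 3.7."*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage inside the activated environment.
if is_python37 "$(python --version 2>&1)"; then
    echo "Python 3.7 environment is active"
else
    echo "Warning: the active interpreter is not Python 3.7"
fi
```

The check is purely textual, so it also reports a warning when no `python` executable is found at all.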
- To perform the automatic evaluation of the syntactic and semantic correctness of the code snippets generated by the NMT models, you need to set up the NASM assembler. To download and install NASM (version 2.15.05), run the following command: `./nasm_setup.sh`.
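Once the setup script has finished, a quick way to confirm the result is to ask NASM for its version with `nasm -v`. The sketch below assumes the version banner starts with `NASM version 2.15.05`; the function name `check_nasm` and its messages are hypothetical and not part of `nasm_setup.sh`.

```shell
# Hypothetical helper: report whether the nasm on PATH is the 2.15.05 release.
check_nasm() {
    if ! command -v nasm >/dev/null 2>&1; then
        echo "nasm not found on PATH"
        return 1
    fi
    # "nasm -v" prints a one-line version banner.
    case "$(nasm -v)" in
        "NASM version 2.15.05"*) echo "nasm 2.15.05 is installed" ;;
        *) echo "nasm is installed, but not version 2.15.05" ;;
    esac
}

check_nasm || true
```

A different version may still work, but 2.15.05 is the one this README was written against.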
If you find this method to be useful for your research, please consider citing:
@article{cotroneo2024automating,
title={Automating the correctness assessment of AI-generated code for security contexts},
author={Cotroneo, Domenico and Foggia, Alessio and Improta, Cristina and Liguori, Pietro and Natella, Roberto},
journal={Journal of Systems and Software},
pages={112113},
year={2024},
publisher={Elsevier}
}
For further information, contact us via email: [email protected] (Pietro) and [email protected] (Cristina).