LogPPT consists of the following components:
- Data Sampling: A few-shot data sampling algorithm, which is used to select $K$ labelled logs for training ($K$ is small).
- Prompt-based Parsing: A module to tune a pre-trained language model for log parsing using prompt tuning.
- Online Parsing: A caching mechanism to support efficient and consistent online parsing (illustrated by the sketch below).
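Conceptually, the online parsing stage consults a cache of already-identified templates before invoking the (more expensive) model on a new log message. The sketch below only illustrates that idea and is not LogPPT's actual implementation; the `TemplateCache` class, its token-count bucketing, and the `<*>`-to-regex conversion are assumptions made for this example.

```python
import re
from collections import defaultdict


class TemplateCache:
    """Illustrative template cache: try to match a new log message against
    previously parsed templates before falling back to the model."""

    def __init__(self):
        # Bucket cached templates by token count to keep lookups cheap.
        self._by_length = defaultdict(list)  # {num_tokens: [(template, compiled_regex)]}

    @staticmethod
    def _template_to_regex(template: str) -> re.Pattern:
        # Escape literal text, then turn each "<*>" placeholder into a wildcard.
        pattern = re.escape(template).replace(re.escape("<*>"), r"\S+")
        return re.compile(rf"^{pattern}$")

    def add(self, template: str) -> None:
        self._by_length[len(template.split())].append(
            (template, self._template_to_regex(template))
        )

    def match(self, log_message: str):
        # Return a cached template, or None so the caller can invoke the model.
        for template, regex in self._by_length[len(log_message.split())]:
            if regex.match(log_message):
                return template
        return None


cache = TemplateCache()
cache.add("Exts skipped : <*>")
print(cache.match("Exts skipped : 17"))  # -> "Exts skipped : <*>" (cache hit)
print(cache.match("Exts warned : 3"))    # -> None (would be parsed by the model)
```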
The main requirements are:
- Python >= 3.9
- torch
- transformers
- ...
To install all required libraries:
$ pip install -r requirements.txt
To download the pre-trained language model:
$ cd pretrained_models/roberta-base
$ bash download.sh
To run the few-shot data sampling:
$ cd demo
$ python 01_sampling.py
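Conceptually, this step selects a small but diverse set of log messages to label. The sketch below shows one simple way such a selection could work (greedy max-min selection over token-level Jaccard similarity); it is an illustrative assumption, not the exact algorithm implemented in `01_sampling.py`.

```python
import random


def diversity_sample(logs: list[str], k: int, seed: int = 0) -> list[str]:
    """Greedily pick the log message farthest from everything selected so far,
    so the K labelled examples cover diverse templates."""
    def similarity(a: str, b: str) -> float:
        ta, tb = set(a.split()), set(b.split())
        return len(ta & tb) / max(len(ta | tb), 1)  # token-level Jaccard

    random.seed(seed)
    selected = [random.choice(logs)]
    while len(selected) < k:
        # A candidate's distance is 1 - similarity to its closest selected log.
        farthest = max(logs, key=lambda log: min(1 - similarity(log, s) for s in selected))
        selected.append(farthest)
    return selected


logs = [
    "Exts skipped : 17",
    "Exts skipped : 9",
    "SKIP Bug #54977 UTF-8 files and folder are not shown",
    "TEST 9/13884 [2/2 concurrent test workers running]",
]
print(diversity_sample(logs, k=2))
```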
To train LogPPT and parse a log dataset (e.g., Apache):
$ cd demo
$ export dataset=Apache
$ python 02_run_logppt.py --log_file ../datasets/loghub-full/$dataset/${dataset}_full.log_structured.csv --model_name_or_path roberta-base --train_file ../datasets/loghub-full/$dataset/samples/logppt_32.json --validation_file ../datasets/loghub-full/$dataset/validation.json --dataset_name $dataset --parsing_num_processes 4 --output_dir ./results/models/$dataset --task_output_dir ./results/logs --max_train_steps 1000
The parsed logs (parsing results) are saved in the outputs folder.
For the descriptions of all parameters, please use:
$ python 02_run_logppt.py --help
To evaluate the parsing results:
$ python 03_evaluate.py
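The evaluation script reports standard log parsing metrics such as grouping accuracy (the fraction of log messages placed into exactly the right group). The sketch below shows one way grouping accuracy can be computed from ground-truth and predicted template assignments; it is illustrative and not necessarily how `03_evaluate.py` implements it.

```python
import pandas as pd


def grouping_accuracy(ground_truth: pd.Series, predicted: pd.Series) -> float:
    """A log message counts as correctly parsed only if its predicted group
    contains exactly the same set of messages as its ground-truth group."""
    correct = 0
    gt_groups = ground_truth.groupby(ground_truth).groups    # template -> row labels
    pred_groups = predicted.groupby(predicted).groups
    for indices in gt_groups.values():
        predicted_templates = predicted.loc[indices].unique()
        if len(predicted_templates) == 1 and \
                len(pred_groups[predicted_templates[0]]) == len(indices):
            correct += len(indices)
    return correct / len(ground_truth)


# Toy example: the second ground-truth event is split into two predicted groups.
gt = pd.Series(["E1", "E1", "E2", "E2"])
pred = pd.Series(["T1", "T1", "T2", "T3"])
print(grouping_accuracy(gt, pred))  # 0.5
```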
Note: Implementations for the baselines are adopted from "Tools and Benchmarks for Automated Log Parsing" and "Guidelines for Assessing the Accuracy of Log Message Template Identification Techniques".
We evaluate LogPPT along the following dimensions:
- Accuracy
- Robustness:
  - Robustness across different log data types
  - Robustness across different numbers of training data
- Accuracy on Unseen Logs
- Efficiency:
  - Running time of different log parsers under different volumes
Ablation study:
- Virtual Label Token Generation: we exclude this module and let the pre-trained model automatically assign the embedding for the virtual label token “I-PAR” (a sketch of the label-word initialisation idea follows below).
- Adaptive Random Sampling: to measure the contribution of this module, we remove it from our model and randomly sample the log messages for labelling.
- Number of label words: we vary the number of label words used in the Virtual Label Token Generation module from 1 to 16.
  - Results with different numbers of label words
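For intuition, the Virtual Label Token Generation module initialises the embedding of the virtual label token from label words instead of leaving it random. Below is a minimal sketch of that idea using Hugging Face Transformers; the label words listed are made-up placeholders (LogPPT derives label words from the sampled training data), and the snippet is an illustration rather than the project's actual code.

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

# Hypothetical label words standing in for frequent parameter tokens.
label_words = ["true", "root", "2048", "user", "null"]

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForMaskedLM.from_pretrained("roberta-base")

# Register the virtual label token and grow the embedding matrix accordingly.
tokenizer.add_tokens(["I-PAR"])
model.resize_token_embeddings(len(tokenizer))

# Initialise the new token's embedding as the mean of the label-word embeddings,
# rather than a random vector; removing this step is what the ablation measures.
embeddings = model.get_input_embeddings().weight
label_ids = [tokenizer.convert_tokens_to_ids(tokenizer.tokenize(w)[0]) for w in label_words]
with torch.no_grad():
    embeddings[tokenizer.convert_tokens_to_ids("I-PAR")] = embeddings[label_ids].mean(dim=0)
```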
We compare LogPPT with fine-tuning, hard-prompt, and soft-prompt.
- Effectiveness:
  - Accuracy across different tuning methods
- Efficiency:
  - Parsing time across different tuning methods
Additional results with PTA and RTA metrics:
- PTA: The ratio of correctly identified templates over the total number of identified templates.
- RTA: The ratio of correctly identified templates over the total number of oracle templates.
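Given the set of identified templates and the set of oracle (ground-truth) templates, the two metrics can be computed as in the short sketch below; counting a template as correct via exact string match is a simplifying assumption for this example.

```python
def template_accuracy(identified: set[str], oracle: set[str]) -> tuple[float, float]:
    """Return (PTA, RTA): correctly identified templates divided by the number
    of identified templates and by the number of oracle templates, respectively."""
    correct = len(identified & oracle)
    pta = correct / len(identified) if identified else 0.0
    rta = correct / len(oracle) if oracle else 0.0
    return pta, rta


# Toy example with hypothetical templates.
identified = {"Exts skipped : <*>", "Exts tested : <*>", "SKIP Bug <*>"}
oracle = {"Exts skipped : <*>", "Exts tested : <*>"}
print(template_accuracy(identified, oracle))  # (0.666..., 1.0)
```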
Sample parsing results:

| Raw logs | Events |
|---|---|
| TEST 9/13884 [2/2 concurrent test workers running] | TEST <*> [<*> concurrent test workers running] |
| (1.039 s) Test touch() function : basic functionality [ext/standard/tests/file/touch_basic.phpt] | <*> Test touch() function : basic functionality <*> |
| (120.099 s) Bug #60120 (proc_open hangs when data in stdin/out/err is getting larger or equal to 2048) [ext/standard/tests/file/bug60120.phpt] | <*> Bug <*> (proc_open hangs when data in <*> is getting larger or equal to <*>) <*> |
| SKIP Bug #54977 UTF-8 files and folder are not shown [ext/standard/tests/file/windows_mb_path/bug54977.phpt] reason: windows only test | SKIP Bug <*> UTF-8 files and folder are not shown <*> reason: windows only test |
| Exts skipped : 17 | Exts skipped : <*> |