LLM Output Similarity Assessment

We assess the similarity between LLM outputs and the input graph content using character-level (e.g., Levenshtein distance) and word-level (e.g., TF-IDF) metrics. For PostgreSQL anomalies, DBAIOps with DeepSeek-R1 671B obtains similarity scores of 0.10 (Levenshtein) and 0.37 (TF-IDF), indicating that the outputs are not merely copied from the knowledge graph.

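| Method | PostgreSQL difflib | PostgreSQL Levenshtein | PostgreSQL Jaccard | PostgreSQL TF-IDF | Oracle difflib | Oracle Levenshtein | Oracle Jaccard | Oracle TF-IDF |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| DBAIOps (DeepSeek-R1 32B) | 0.04 | 0.07 | 0.20 | 0.30 | 0.04 | 0.05 | 0.15 | 0.39 |
| DBAIOps (DeepSeek-R1 671B) | 0.04 | 0.10 | 0.24 | 0.37 | 0.05 | 0.09 | 0.29 | 0.50 |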
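
The four metrics in the table can be sketched in plain Python as follows. This is a minimal illustration, not the evaluation script behind the reported numbers: the tokenizer, the Levenshtein normalization (1 minus distance over the longer length), the TF-IDF variant (smoothed IDF fitted on only the two texts), and all function names are assumptions, since the source does not specify them.

```python
import difflib
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into word tokens (an assumed tokenizer)."""
    return re.findall(r"\w+", text.lower())


def difflib_similarity(a, b):
    """Character-level ratio from difflib.SequenceMatcher, in [0, 1]."""
    return difflib.SequenceMatcher(None, a, b).ratio()


def levenshtein_similarity(a, b):
    """1 - edit_distance / max(len(a), len(b)); the normalization is assumed."""
    if not a and not b:
        return 1.0
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return 1.0 - prev[-1] / max(len(a), len(b))


def jaccard_similarity(a, b):
    """Word-set overlap: |A & B| / |A | B|."""
    sa, sb = set(tokenize(a)), set(tokenize(b))
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def tfidf_cosine_similarity(a, b):
    """Cosine similarity of TF-IDF vectors built from the two texts only."""
    docs = [Counter(tokenize(a)), Counter(tokenize(b))]
    vocab = set(docs[0]) | set(docs[1])
    n = len(docs)
    # Smoothed IDF so that terms shared by both texts keep a nonzero weight.
    idf = {t: math.log((1 + n) / (1 + sum(t in d for d in docs))) + 1
           for t in vocab}
    vecs = [{t: d[t] * idf[t] for t in d} for d in docs]
    dot = sum(w * vecs[1].get(t, 0.0) for t, w in vecs[0].items())
    norms = [math.sqrt(sum(w * w for w in v.values())) for v in vecs]
    if norms[0] == 0 or norms[1] == 0:
        return 0.0
    return dot / (norms[0] * norms[1])


def similarity_report(llm_output, graph_text):
    """Collect all four scores for one (output, graph content) pair."""
    return {
        "difflib": difflib_similarity(llm_output, graph_text),
        "levenshtein": levenshtein_similarity(llm_output, graph_text),
        "jaccard": jaccard_similarity(llm_output, graph_text),
        "tfidf": tfidf_cosine_similarity(llm_output, graph_text),
    }
```

Calling `similarity_report(llm_output, graph_text)` on one diagnosis report and the graph content supplied to the model returns a dictionary with one score per metric. All four scores lie in [0, 1], and values near 0 suggest little verbatim overlap between the two texts.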