DBAIOps/LLM_output_similarity.md at master · OpenDataBox/DBAIOps

LLM Output Similarity Assessment

We assess the similarity between LLM outputs and the input graph content using character-level metrics (e.g., Levenshtein distance) and word-level metrics (e.g., TF-IDF). For PostgreSQL anomalies, DBAIOps with DeepSeek-R1 671B obtains similarity scores of 0.10 (Levenshtein) and 0.37 (TF-IDF), which indicates that the outputs are not merely copied from the knowledge graph.

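| Method | PostgreSQL difflib | PostgreSQL Levenshtein | PostgreSQL Jaccard | PostgreSQL TF-IDF | Oracle difflib | Oracle Levenshtein | Oracle Jaccard | Oracle TF-IDF |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| DBAIOps (DeepSeek-R1 32B) | 0.04 | 0.07 | 0.20 | 0.30 | 0.04 | 0.05 | 0.15 | 0.39 |
| DBAIOps (DeepSeek-R1 671B) | 0.04 | 0.10 | 0.24 | 0.37 | 0.05 | 0.09 | 0.29 | 0.50 |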
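
The four metrics in the table can be approximated with the standard library alone. The following is a minimal sketch, not the scripts used to produce the numbers above: the function names, the whitespace tokenization, the normalization of the Levenshtein distance by the longer string length, and the smoothed IDF formula are all assumptions made for illustration.

```python
import difflib
import math
from collections import Counter


def difflib_ratio(a: str, b: str) -> float:
    """Character-level similarity from difflib.SequenceMatcher, in [0, 1]."""
    return difflib.SequenceMatcher(None, a, b).ratio()


def levenshtein_similarity(a: str, b: str) -> float:
    """1 - edit_distance / max(len). The normalization is an assumption."""
    if not a and not b:
        return 1.0
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return 1.0 - prev[-1] / max(len(a), len(b))


def jaccard_similarity(a: str, b: str) -> float:
    """Word-set overlap: |A ∩ B| / |A ∪ B| on lowercased whitespace tokens."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def tfidf_cosine(a: str, b: str) -> float:
    """Cosine similarity of TF-IDF vectors built from just the two texts."""
    docs = [a.lower().split(), b.lower().split()]
    n_docs = len(docs)
    vocab = set(docs[0]) | set(docs[1])
    # Document frequency: in how many of the two texts each term occurs.
    df = {t: sum(1 for d in docs if t in d) for t in vocab}
    # Smoothed IDF (an assumed variant): log((1 + n) / (1 + df)) + 1.
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1.0 for t in vocab}
    vecs = []
    for d in docs:
        tf = Counter(d)
        vecs.append({t: count * idf[t] for t, count in tf.items()})
    dot = sum(w * vecs[1].get(t, 0.0) for t, w in vecs[0].items())
    norm_a = math.sqrt(sum(w * w for w in vecs[0].values()))
    norm_b = math.sqrt(sum(w * w for w in vecs[1].values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_report(llm_output: str, graph_text: str) -> dict:
    """Collect all four scores for one (LLM output, graph content) pair."""
    return {
        "difflib": difflib_ratio(llm_output, graph_text),
        "Levenshtein": levenshtein_similarity(llm_output, graph_text),
        "Jaccard": jaccard_similarity(llm_output, graph_text),
        "TF-IDF": tfidf_cosine(llm_output, graph_text),
    }
```

Calling `similarity_report(llm_output, graph_text)` on a diagnosis report and the serialized graph content it was generated from returns one score per metric; values near 0 indicate little textual overlap, values near 1 indicate near-verbatim copying.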
