mcstaiano/ITALERT: Italian Emergency Response Text

ITALERT (Italian Emergency Response Text) is a novel bilingual corpus designed to investigate the performance of Large Language Models (LLMs) and Neural Machine Translation (NMT) systems in translating high-stakes emergency messages. The dataset is part of a broader effort to assess translation quality in critical contexts, using a human-centric, post-editing-based metric (HOPE) and inter-annotator agreement analysis.

The initial version of the ITALERT corpus contains 440 sentence-level segments extracted from the official website of the Italian Civil Protection Department, as part of the "Io non rischio" public communication campaign. The texts cover eight crisis scenarios: flooding, earthquake, forest fire, volcanic eruption, tsunami, industrial accident, nuclear risk, and dam failure.

The corpus currently comprises 13,218 words (6,622 in Italian and 6,596 in English) distributed across the eight emergency subdomains. Each segment has been translated automatically by different systems (GPT-4o by OpenAI and Google Translate) and then manually annotated by three human annotators to assess translation quality.

The annotations capture:

Binary error presence
Fine-grained error types (accuracy, fluency, register, terminology, etc.)
Inter-annotator agreement metrics (e.g., Fleiss' Kappa, Krippendorff's Alpha, Cohen's Kappa, IRR)
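
As an illustration of how agreement on the binary error-presence labels can be measured across three annotators, the following is a minimal sketch of Fleiss' Kappa in plain Python. The function name `fleiss_kappa` and the toy labels are assumptions made for this example; they are not taken from the repository, and the actual layout of the files in IAA_data is not described here.

```python
from collections import Counter


def fleiss_kappa(ratings, categories):
    """Fleiss' Kappa for a fixed number of raters per item.

    ratings: list of items, each a list with one label per annotator.
    categories: every label value that may occur.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if any(len(item) != n_raters for item in ratings):
        raise ValueError("every item needs the same number of ratings")

    # How many annotators chose each category, per item.
    counts = [Counter(item) for item in ratings]

    # Observed agreement: share of agreeing annotator pairs, averaged over items.
    p_items = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_observed = sum(p_items) / n_items

    # Expected agreement from the overall category proportions.
    total = n_items * n_raters
    p_expected = sum(
        (sum(c[cat] for c in counts) / total) ** 2 for cat in categories
    )

    if p_expected == 1.0:
        # Only one category was ever used; Kappa is undefined in that case.
        return float("nan")
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical labels: 4 segments x 3 annotators, 1 = error present, 0 = no error.
toy_labels = [
    [1, 1, 1],
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 1],
]
kappa = fleiss_kappa(toy_labels, categories=[0, 1])
print(f"Fleiss' Kappa: {kappa:.3f}")
```

On real data, each inner list would hold the three annotators' labels for one translated segment. The same function also accepts multi-class labels, such as the fine-grained error types, provided every possible label is listed in `categories`.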

Paper:
Staiano, M. C., Han, L., Monti, J., & Chiusaroli, F. (2025). ITALERT: Assessing the Quality of LLMs and NMT in Translating Italian Emergency Response Text. In Proceedings of the 20th Machine Translation Summit, Translator and Users Track. Geneva, Switzerland.

Repository contents:

IAA_data
corpus
README.md
