Official repository for our ADMA'25 paper "LegalDuet: Learning Fine-grained Representations for Legal Judgment Prediction via a Dual-View Contrastive Learning".
🎉 We are honored that LegalDuet has received the Best Paper Award at ADMA 2025 (The 21st International Conference on Advanced Data Mining and Applications)!
This repository provides resources for our paper LegalDuet, which proposes a new method to enhance the accuracy of Legal Judgment Prediction (LJP). Our model leverages a dual-view legal reasoning mechanism designed to emulate a judge's reasoning process when analyzing legal cases. This approach involves:
- Law Case Clustering: Utilizing past legal decisions to inform current judgments.
- Legal Decision Matching: Extracting specific legal rules and triggers to improve prediction quality.
We used the CAIL benchmark, based on the CAIL2018 dataset, to comprehensively evaluate legal judgment prediction models.
Key tasks include:
- Law Article Prediction: Determining the correct legal articles applicable to a given case.
- Charge Prediction: Predicting the correct charge based on the criminal facts.
- Imprisonment Prediction: Estimating the sentence length based on case specifics.
LegalDuet employs two key reasoning modules:
- Law Case Clustering: Uses past cases and decisions to inform new judgments, identifying subtle differences between similar cases to refine predictions.
- Legal Decision Matching: Focuses on the specific legal articles and charges related to a case, enabling a more structured legal decision-making process.
The model is pre-trained using these dual mechanisms, creating a more tailored embedding space for legal tasks.
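The idea behind the dual-view pre-training objective can be illustrated with a minimal sketch. It assumes an InfoNCE-style contrastive loss is applied once per view and that the two terms are summed with equal weight; the function names, the temperature value, and the toy vectors are hypothetical and are not taken from the LegalDuet code. See LegalDuet/README.md for the actual training scripts.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss for one anchor: negative log-softmax score of the positive."""
    a = normalize(anchor)
    candidates = [normalize(positive)] + [normalize(n) for n in negatives]
    logits = [dot(a, c) / temperature for c in candidates]
    # Log-sum-exp with the max subtracted for numerical stability.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_denominator - logits[0]


def dual_view_loss(fact, similar_case, other_cases,
                   matching_decision, other_decisions, temperature=0.1):
    """Sum of two contrastive terms (equal weighting is an assumption)."""
    # View 1 (Law Case Clustering): pull the fact toward a related case.
    case_view = info_nce(fact, similar_case, other_cases, temperature)
    # View 2 (Legal Decision Matching): pull the fact toward its decision.
    decision_view = info_nce(fact, matching_decision, other_decisions, temperature)
    return case_view + decision_view


# Toy 2-d embeddings, purely illustrative.
fact = [1.0, 0.0]
aligned = dual_view_loss(fact, [0.9, 0.1], [[0.0, 1.0], [-1.0, 0.0]],
                         [1.0, 0.2], [[0.0, 1.0]])
misaligned = dual_view_loss(fact, [0.0, 1.0], [[0.9, 0.1], [-1.0, 0.0]],
                            [0.0, 1.0], [[1.0, 0.2]])
print(aligned < misaligned)
```

The loss is small when the fact embedding is closest to its positive in both views and grows when a negative is closer, which is the pressure that shapes the embedding space during pre-training.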
conda create -n LegalDuet_env python==3.8
conda activate LegalDuet_env
Clone the repository and install the requirements.
git clone https://github.com/NEUIR/LegalDuet.git
cd LegalDuet
pip install -r requirements.txt
To quickly start using our model, you can download our pretrained model from Hugging Face: 🤗 Model
Once downloaded, navigate to the Fine-Tuning directory to begin fine-tuning:
cd Fine-Tuning
For detailed instructions on how to use the pretrained model, refer to Fine-Tuning/README.md
To reproduce the LegalDuet pretraining process, you will need the pretraining data.
The pretraining data file rest_data.json can be downloaded from the following link: 📂 Pretraining Dataset
Once downloaded, navigate to the LegalDuet directory to begin reproducing:
cd LegalDuet
For detailed instructions on how to reproduce the pretraining, refer to LegalDuet/README.md
We conducted a comparative study of embedding spaces to evaluate the discriminative power of LegalDuet embeddings. Using t-SNE, we visualized the embedding spaces of BERT and BERT+LegalDuet, and we calculated the reduction in the Davies-Bouldin Index (DBI) to further investigate LegalDuet's capability for fine-grained differentiation. A lower DBI indicates more compact, better-separated clusters.
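As a rough illustration of the metric, the sketch below computes a DBI from labelled embeddings and reports the reduction between two embedding spaces. It assumes DBI refers to the standard Davies-Bouldin Index with Euclidean distances, and that "reduction" means the baseline score minus the LegalDuet score; the function names and toy points are hypothetical and do not come from the paper's evaluation code.

```python
import math


def centroid(points):
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]


def davies_bouldin_index(embeddings, labels):
    """Standard Davies-Bouldin Index; needs at least two clusters
    whose centroids do not coincide."""
    clusters = {}
    for vec, label in zip(embeddings, labels):
        clusters.setdefault(label, []).append(vec)
    names = list(clusters)
    centers = {k: centroid(clusters[k]) for k in names}
    # Scatter: mean distance from each point to its cluster centroid.
    scatter = {
        k: sum(math.dist(p, centers[k]) for p in clusters[k]) / len(clusters[k])
        for k in names
    }
    worst = []
    for i in names:
        # For each cluster, keep its worst (largest) similarity ratio.
        ratios = [
            (scatter[i] + scatter[j]) / math.dist(centers[i], centers[j])
            for j in names if j != i
        ]
        worst.append(max(ratios))
    return sum(worst) / len(worst)


labels = [0, 0, 1, 1]
# Toy stand-ins for a baseline space and a more discriminative space.
baseline_space = [[0.0, 0.0], [2.0, 0.0], [3.0, 0.0], [5.0, 0.0]]
tighter_space = [[0.0, 0.0], [0.2, 0.0], [4.0, 0.0], [4.2, 0.0]]

dbi_baseline = davies_bouldin_index(baseline_space, labels)
dbi_tighter = davies_bouldin_index(tighter_space, labels)
dbi_reduction = dbi_baseline - dbi_tighter  # positive means better separation
print(dbi_reduction > 0)
```

In practice the labels would be the gold charges or law articles and the embeddings would come from the encoder under comparison; `sklearn.metrics.davies_bouldin_score` computes the same index if scikit-learn is available.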
The Legal Judgment Prediction Performance on the CAIL-small Dataset. The best evaluation results are highlighted in bold, and the underlined scores indicate the second-best results across all models.
The Legal Judgment Prediction Performance on the CAIL-big Dataset. The best evaluation results are highlighted in bold, and the underlined scores indicate the second-best results across all models.
Please cite the paper and star the repo if you use LegalDuet and find it helpful.
@inproceedings{xu2025legalduet,
title={LegalDuet: Learning Fine-Grained Representations for Legal Judgment Prediction via a Dual-View Contrastive Learning},
author={Xu, Buqiang and Dai, Xin and Liu, Zhenghao and Xie, Huiyuan and Yi, Xiaoyuan and Wang, Shuo and Yan, Yukun and Yang, Liner and Gu, Yu and Yu, Ge},
booktitle={International Conference on Advanced Data Mining and Applications},
pages={337--352},
year={2025},
organization={Springer}
}
Feel free to contact 20223953@stu.neu.edu.cn or open an issue if you have any questions.



