From fcb63e09f8f5b5bf702208ba9a062940b307fa4d Mon Sep 17 00:00:00 2001
From: Leon Knauer
Date: Tue, 18 Jun 2024 16:01:37 +0200
Subject: [PATCH 1/3] Fix incorrect links to papers

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 10644b5..66e149d 100644
--- a/README.md
+++ b/README.md
@@ -141,7 +141,7 @@ _Methods for analysing LLM deliberation and assessing reasoning quality._
 
 - 🎓👩‍💻 Comprehensive LLM-based reasoning analytics that breaks texts down into individual reasons. "DCR-Consistency: Divide-Conquer-Reasoning for Consistency Evaluation and Improvement of Large Language Models." 2024-01-04. [[>paper](https://arxiv.org/abs/2304.10703)] [[>code](https://github.com/intuit-ai-research/DCR-consistency)]
 - 🎓🤗 Highly performant, open LLM (T5-based) for inference verification. "Minds versus Machines: Rethinking Entailment Verification with Language Models." 2024-02-06. [[>paper](https://arxiv.org/abs/2402.03686)] [[>model](soumyasanyal/entailment-verifier-xxl)]
-- 🎓👩‍💻 Test dataset for CoT evaluators. "A Chain-of-Thought Is as Strong as Its Weakest Link: A Benchmark for Verifiers of Reasoning Chains." 2023-11-23. [[>paper](https://arxiv.org/abs/2304.10703)] [[>dataset](https://huggingface.co/datasets/google/reveal)]
+- 🎓👩‍💻 Test dataset for CoT evaluators. "A Chain-of-Thought Is as Strong as Its Weakest Link: A Benchmark for Verifiers of Reasoning Chains." 2023-11-23. [[>paper](https://arxiv.org/abs/2402.00559)] [[>dataset](https://huggingface.co/datasets/google/reveal)]
 - 🎓👩‍💻 Framework for evaluating reasoning chains by viewing them as informal proofs that derive the final answer. "ReCEval: Evaluating Reasoning Chains via Correctness and Informativeness." 2023-11-23. [[>paper](https://arxiv.org/abs/2304.10703)] [[>code](https://github.com/archiki/ReCEval)]
 - 🎓 GPT-4 is 5x better at predicting whether math reasoning is correct than GPT-3.5. "Challenge LLMs to Reason About Reasoning: A Benchmark to Unveil Cognitive Depth in LLMs." 2023-12-28. [[>paper](https://arxiv.org/abs/2312.17080)]
 - 🎓 Minimalistic GPT-4 prompts for assessing reasoning quality. "SocREval: Large Language Models with the Socratic Method for Reference-Free Reasoning Evaluation." 2023-09-29. [[>paper](https://arxiv.org/abs/2310.00074)] [[>code](https://github.com/facebookresearch/ParlAI/tree/main/projects/roscoe#meta-evaluation)]

From d3f2c1c0e2be00aa5820d179ea89cf39ea97c4b6 Mon Sep 17 00:00:00 2001
From: Leon Knauer
Date: Tue, 18 Jun 2024 16:04:12 +0200
Subject: [PATCH 2/3] Fix incorrect paper link

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 66e149d..854fdd5 100644
--- a/README.md
+++ b/README.md
@@ -139,7 +139,7 @@ _Let LLMs self-correct their deliberation._
 
 _Methods for analysing LLM deliberation and assessing reasoning quality._
 
-- 🎓👩‍💻 Comprehensive LLM-based reasoning analytics that breaks texts down into individual reasons. "DCR-Consistency: Divide-Conquer-Reasoning for Consistency Evaluation and Improvement of Large Language Models." 2024-01-04. [[>paper](https://arxiv.org/abs/2304.10703)] [[>code](https://github.com/intuit-ai-research/DCR-consistency)]
+- 🎓👩‍💻 Comprehensive LLM-based reasoning analytics that breaks texts down into individual reasons. "DCR-Consistency: Divide-Conquer-Reasoning for Consistency Evaluation and Improvement of Large Language Models." 2024-01-04. [[>paper](https://arxiv.org/abs/2212.07919)] [[>code](https://github.com/intuit-ai-research/DCR-consistency)]
 - 🎓🤗 Highly performant, open LLM (T5-based) for inference verification. "Minds versus Machines: Rethinking Entailment Verification with Language Models." 2024-02-06. [[>paper](https://arxiv.org/abs/2402.03686)] [[>model](soumyasanyal/entailment-verifier-xxl)]
 - 🎓👩‍💻 Test dataset for CoT evaluators. "A Chain-of-Thought Is as Strong as Its Weakest Link: A Benchmark for Verifiers of Reasoning Chains." 2023-11-23. [[>paper](https://arxiv.org/abs/2402.00559)] [[>dataset](https://huggingface.co/datasets/google/reveal)]
 - 🎓👩‍💻 Framework for evaluating reasoning chains by viewing them as informal proofs that derive the final answer. "ReCEval: Evaluating Reasoning Chains via Correctness and Informativeness." 2023-11-23. [[>paper](https://arxiv.org/abs/2304.10703)] [[>code](https://github.com/archiki/ReCEval)]

From b695b6af8a73353c49c8af65320b3e6d59926cfe Mon Sep 17 00:00:00 2001
From: Leon Knauer
Date: Tue, 18 Jun 2024 16:06:23 +0200
Subject: [PATCH 3/3] Fix incorrect paper links

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 854fdd5..a879a1b 100644
--- a/README.md
+++ b/README.md
@@ -139,7 +139,7 @@ _Let LLMs self-correct their deliberation._
 
 _Methods for analysing LLM deliberation and assessing reasoning quality._
 
-- 🎓👩‍💻 Comprehensive LLM-based reasoning analytics that breaks texts down into individual reasons. "DCR-Consistency: Divide-Conquer-Reasoning for Consistency Evaluation and Improvement of Large Language Models." 2024-01-04. [[>paper](https://arxiv.org/abs/2212.07919)] [[>code](https://github.com/intuit-ai-research/DCR-consistency)]
+- 🎓👩‍💻 Comprehensive LLM-based reasoning analytics that breaks texts down into individual reasons. "DCR-Consistency: Divide-Conquer-Reasoning for Consistency Evaluation and Improvement of Large Language Models." 2024-01-04. [[>paper](https://arxiv.org/abs/2401.02132)] [[>code](https://github.com/intuit-ai-research/DCR-consistency)]
 - 🎓🤗 Highly performant, open LLM (T5-based) for inference verification. "Minds versus Machines: Rethinking Entailment Verification with Language Models." 2024-02-06. [[>paper](https://arxiv.org/abs/2402.03686)] [[>model](soumyasanyal/entailment-verifier-xxl)]
 - 🎓👩‍💻 Test dataset for CoT evaluators. "A Chain-of-Thought Is as Strong as Its Weakest Link: A Benchmark for Verifiers of Reasoning Chains." 2023-11-23. [[>paper](https://arxiv.org/abs/2402.00559)] [[>dataset](https://huggingface.co/datasets/google/reveal)]
 - 🎓👩‍💻 Framework for evaluating reasoning chains by viewing them as informal proofs that derive the final answer. "ReCEval: Evaluating Reasoning Chains via Correctness and Informativeness." 2023-11-23. [[>paper](https://arxiv.org/abs/2304.10703)] [[>code](https://github.com/archiki/ReCEval)]