Zexiong Ma


2024

Can LLMs Learn From Mistakes? An Empirical Study on Reasoning Tasks
Shengnan An | Zexiong Ma | Siqi Cai | Zeqi Lin | Nanning Zheng | Jian-Guang Lou | Weizhu Chen
Findings of the Association for Computational Linguistics: EMNLP 2024

Towards enhancing the chain-of-thought (CoT) reasoning of large language models (LLMs), much existing work has demonstrated the effectiveness of straightforward learning on annotated or generated CoT paths. However, there is still little evidence that reasoning capabilities can be enhanced through a reverse learning process, i.e., learning from potential mistakes in reasoning. To investigate whether LLMs can learn from mistakes, we construct mistake-correction datasets, using GPT-4 to identify and correct the mistakes in inaccurate CoTs. With these mistake-correction datasets, we fine-tune open-source LLMs and arrive at the following conclusions. (1) LLMs can indeed learn from mistakes to enhance their CoT reasoning performance. (2) Compared to CoT data, the mistake-correction data provides additional knowledge about the explanations and reasons for potential mistakes in CoTs, which consistently contributes to the effectiveness of learning from mistakes. (3) Evolution techniques, especially the correction-centric evolution we introduce, can further enhance the effectiveness of learning from mistakes.
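
The abstract describes a pipeline: collect inaccurate CoTs, have GPT-4 identify and correct the mistakes, and fine-tune on the resulting corrections. Below is a minimal sketch of constructing one such mistake-correction training example; the field names, prompt wording, and helper `make_correction_example` are illustrative assumptions, not the paper's exact data format, and the sketch assumes the OpenAI Python SDK.

```python
# Minimal sketch of building one mistake-correction fine-tuning example.
# Field names and prompt wording are illustrative assumptions, not the
# paper's exact format. Assumes the OpenAI Python SDK (`pip install openai`)
# with OPENAI_API_KEY set in the environment.
from openai import OpenAI

client = OpenAI()

CORRECTION_PROMPT = (
    "The following chain-of-thought answer is incorrect.\n"
    "Question: {question}\n"
    "Incorrect reasoning: {inaccurate_cot}\n"
    "Identify the mistake, explain why it is wrong, and give a corrected "
    "step-by-step solution."
)


def make_correction_example(question: str, inaccurate_cot: str) -> dict:
    """Ask GPT-4 to identify and correct the mistake in an inaccurate CoT,
    returning a (prompt, response) pair usable for supervised fine-tuning."""
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[{
            "role": "user",
            "content": CORRECTION_PROMPT.format(
                question=question, inaccurate_cot=inaccurate_cot
            ),
        }],
    )
    correction = resp.choices[0].message.content
    # The fine-tuning target pairs the flawed reasoning with its correction,
    # so the model sees an explanation of *why* the CoT was wrong, not just
    # the correct final answer.
    return {
        "prompt": (
            f"Question: {question}\n"
            f"Incorrect reasoning: {inaccurate_cot}\n"
            "Find and fix the mistake:"
        ),
        "response": correction,
    }
```

Applied over a pool of inaccurate CoTs, such pairs form a mistake-correction dataset for fine-tuning an open-source LLM, in the spirit of the study's setup.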