
InheritSumm: A General, Versatile and Compact Summarizer by Distilling from GPT

Yichong Xu, Ruochen Xu, Dan Iter, Yang Liu, Shuohang Wang, Chenguang Zhu, Michael Zeng


Abstract
While large models such as GPT-3 demonstrate exceptional performance in zero-shot and few-shot summarization tasks, their extensive serving and fine-tuning costs hinder their utilization in various applications. Conversely, previous studies have found that although automatic metrics tend to favor smaller fine-tuned models, the quality of the summaries they generate is inferior to that of larger models like GPT-3 when assessed by human evaluators. To address this issue, we propose InheritSumm, a versatile and compact summarization model derived from GPT-3.5 through distillation. InheritSumm not only exhibits zero-shot and few-shot summarization capabilities comparable to GPT-3.5 but is also sufficiently compact for fine-tuning purposes. Experimental results demonstrate that InheritSumm achieves similar or superior performance to GPT-3.5 in zero-shot and few-shot settings. Furthermore, it outperforms the previously established best small models in both prefix-tuning and full-data fine-tuning scenarios.
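The core idea of distilling a large teacher into a compact summarizer can be illustrated with a short sketch: a small student model is fine-tuned on summaries produced by a large teacher such as GPT-3.5. This is a minimal sketch only, assuming a generic sequence-level distillation setup with Hugging Face transformers; the student checkpoint `t5-small`, the toy data, and the hyperparameters are illustrative placeholders and not the training recipe described in the paper.

```python
# Minimal sequence-level distillation sketch: fine-tune a small student
# on teacher-generated summaries. Teacher outputs are assumed to have
# been collected beforehand (e.g., from GPT-3.5); everything below is a
# placeholder setup, not the paper's actual method.
import torch
from torch.optim import AdamW
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Hypothetical distillation pairs: (source document, teacher-written summary).
pairs = [
    ("The quick brown fox jumps over the lazy dog near the river bank.",
     "A fox jumps over a dog by the river."),
]

student_name = "t5-small"  # placeholder student backbone
tokenizer = AutoTokenizer.from_pretrained(student_name)
student = AutoModelForSeq2SeqLM.from_pretrained(student_name)
optimizer = AdamW(student.parameters(), lr=3e-5)

student.train()
for doc, teacher_summary in pairs:
    inputs = tokenizer(doc, return_tensors="pt", truncation=True, max_length=512)
    labels = tokenizer(teacher_summary, return_tensors="pt", truncation=True,
                       max_length=128).input_ids
    # Cross-entropy against the teacher's summary tokens rather than
    # gold references (sequence-level distillation).
    loss = student(**inputs, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

In practice the distillation data would cover many summarization tasks and prompt formats so that the student inherits general zero-shot and few-shot behavior; the sketch above only shows the basic training signal.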
Anthology ID: 2023.findings-emnlp.927
Volume: Findings of the Association for Computational Linguistics: EMNLP 2023
Month: December
Year: 2023
Address: Singapore
Editors: Houda Bouamor, Juan Pino, Kalika Bali
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 13879–13892
URL: https://aclanthology.org/2023.findings-emnlp.927
DOI: 10.18653/v1/2023.findings-emnlp.927
Cite (ACL):
Yichong Xu, Ruochen Xu, Dan Iter, Yang Liu, Shuohang Wang, Chenguang Zhu, and Michael Zeng. 2023. InheritSumm: A General, Versatile and Compact Summarizer by Distilling from GPT. In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 13879–13892, Singapore. Association for Computational Linguistics.
Cite (Informal):
InheritSumm: A General, Versatile and Compact Summarizer by Distilling from GPT (Xu et al., Findings 2023)
PDF: https://aclanthology.org/2023.findings-emnlp.927.pdf