AugNLG: Few-shot Natural Language Generation using Self-trained Data Augmentation

Xinnuo Xu, Guoyin Wang, Young-Bum Kim, Sungjin Lee

Abstract

Natural Language Generation (NLG) is a key component of a task-oriented dialogue system: it converts a structured meaning representation (MR) into natural language. For large-scale conversational systems, which commonly have hundreds of intents and thousands of slots, neither template-based nor model-based approaches scale. Recently, neural NLG models have started leveraging transfer learning and have shown promising results in few-shot settings. This paper proposes AugNLG, a novel data augmentation approach that combines a self-trained neural retrieval model with a few-shot learned NLU model to automatically create MR-to-Text data from open-domain texts. The proposed system mostly outperforms the state-of-the-art methods on the FewshotWOZ data in both BLEU and Slot Error Rate. We further confirm improved results on the FewshotSGD data and provide a comprehensive analysis of the key components of our system. Our code and data are available at https://github.com/XinnuoXu/AugNLG.
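
The abstract reports Slot Error Rate without defining it. As a rough illustration only, the sketch below uses a definition common in the NLG literature, (missing + redundant slots) / total slots in the MR, with plain string matching. The function name `slot_error_rate`, the dictionary-based MR format, and the matching rule are assumptions made for this example and are not taken from the AugNLG code.

```python
from typing import Dict


def slot_error_rate(mr_slots: Dict[str, str], text: str) -> float:
    """Hypothetical helper: (missing + redundant) / total slots for one pair.

    Assumption: a slot is "realized" if its value appears verbatim
    (case-insensitively) in the text; "redundant" counts repeated mentions only.
    """
    if not mr_slots:
        return 0.0
    lowered = text.lower()
    missing = 0
    redundant = 0
    for value in mr_slots.values():
        count = lowered.count(value.lower())
        if count == 0:
            missing += 1          # slot value never mentioned
        elif count > 1:
            redundant += count - 1  # slot value repeated
    return (missing + redundant) / len(mr_slots)


if __name__ == "__main__":
    mr = {"name": "Blue Spice", "area": "riverside"}
    print(slot_error_rate(mr, "Blue Spice is in the riverside area."))
    print(slot_error_rate(mr, "Blue Spice is a nice place."))
```

Real evaluations typically work on delexicalized outputs and handle value variants, so exact string matching here should be read as a simplification.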

Anthology ID: 2021.acl-long.95
Volume: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
Month: August
Year: 2021
Address: Online
Editors: Chengqing Zong, Fei Xia, Wenjie Li, Roberto Navigli
Venues: ACL | IJCNLP
Publisher: Association for Computational Linguistics
Pages: 1183–1195
URL: https://aclanthology.org/2021.acl-long.95
DOI: 10.18653/v1/2021.acl-long.95
Cite (ACL): Xinnuo Xu, Guoyin Wang, Young-Bum Kim, and Sungjin Lee. 2021. AugNLG: Few-shot Natural Language Generation using Self-trained Data Augmentation. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 1183–1195, Online. Association for Computational Linguistics.
Cite (Informal): AugNLG: Few-shot Natural Language Generation using Self-trained Data Augmentation (Xu et al., ACL-IJCNLP 2021)
PDF: https://aclanthology.org/2021.acl-long.95.pdf
Video: https://aclanthology.org/2021.acl-long.95.mp4
Code: XinnuoXu/AugNLG
