TestAug: A Framework for Augmenting Capability-based NLP Tests

Guanqun Yang, Mirazul Haque, Qiaochu Song, Wei Yang, Xueqing Liu

Abstract

The recently proposed capability-based NLP testing allows model developers to test the functional capabilities of NLP models, revealing functional failures in models with good held-out evaluation scores. However, existing work on capability-based testing requires the developer to compose each individual test template from scratch. Such an approach requires extensive manual effort and scales poorly. In this paper, we investigate a different approach that requires the developer to annotate only a few test templates, while leveraging the GPT-3 engine to generate the majority of test cases. While our approach reduces manual effort by design, it also guarantees the correctness of the generated suites with a validity checker. Moreover, our experimental results show that the test suites generated by GPT-3 are more diverse than the manually created ones; they also detect more errors than their manually created counterparts. Our test suites can be downloaded at https://anonymous-researcher-nlp.github.io/testaug/.

Anthology ID: 2022.coling-1.307
Volume: Proceedings of the 29th International Conference on Computational Linguistics
Month: October
Year: 2022
Address: Gyeongju, Republic of Korea
Editors: Nicoletta Calzolari, Chu-Ren Huang, Hansaem Kim, James Pustejovsky, Leo Wanner, Key-Sun Choi, Pum-Mo Ryu, Hsin-Hsi Chen, Lucia Donatelli, Heng Ji, Sadao Kurohashi, Patrizia Paggio, Nianwen Xue, Seokhwan Kim, Younggyun Hahm, Zhong He, Tony Kyungil Lee, Enrico Santus, Francis Bond, Seung-Hoon Na
Venue: COLING
Publisher: International Committee on Computational Linguistics
Pages: 3480–3495
URL: https://aclanthology.org/2022.coling-1.307
Cite (ACL): Guanqun Yang, Mirazul Haque, Qiaochu Song, Wei Yang, and Xueqing Liu. 2022. TestAug: A Framework for Augmenting Capability-based NLP Tests. In Proceedings of the 29th International Conference on Computational Linguistics, pages 3480–3495, Gyeongju, Republic of Korea. International Committee on Computational Linguistics.
Cite (Informal): TestAug: A Framework for Augmenting Capability-based NLP Tests (Yang et al., COLING 2022)
PDF: https://aclanthology.org/2022.coling-1.307.pdf
Code: guanqun-yang/testaug
Data: HELP, SST
