
Noise Stability Regularization for Improving BERT Fine-tuning

Hang Hua, Xingjian Li, Dejing Dou, Chengzhong Xu, Jiebo Luo


Abstract
Fine-tuning pre-trained language models such as BERT has become a common practice dominating leaderboards across various NLP tasks. Despite its recent success and wide adoption, this process is unstable when there are only a small number of training samples available. The brittleness of this process is often reflected by the sensitivity to random seeds. In this paper, we propose to tackle this problem based on the noise stability property of deep nets, which is investigated in recent literature (Arora et al., 2018; Sanyal et al., 2020). Specifically, we introduce a novel and effective regularization method to improve fine-tuning on NLP tasks, referred to as Layer-wise Noise Stability Regularization (LNSR). We extend the theories about adding noise to the input and prove that our method gives a more stable regularization effect. We provide supportive evidence by experimentally confirming that well-performing models show a low sensitivity to noise and that fine-tuning with LNSR exhibits clearly better generalizability and stability. Furthermore, our method also demonstrates advantages over other state-of-the-art algorithms including L2-SP (Li et al., 2018), Mixout (Lee et al., 2020) and SMART (Jiang et al., 2020).
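
The paper itself provides no code on this page; the snippet below is a minimal, illustrative sketch of the layer-wise noise-stability idea described in the abstract: inject Gaussian noise into an intermediate BERT representation and penalize how much the outputs of the subsequent layers deviate from their clean values. The function and parameter names (`lnsr_penalty`, `noise_std`, `start_layer`, `reg_weight`) are hypothetical, and the exact penalty used by the authors may differ from this reconstruction.

```python
# Illustrative sketch of a layer-wise noise stability penalty for BERT fine-tuning.
# Not the authors' implementation; names and hyperparameters are assumptions.
import torch
from transformers import BertModel, BertTokenizer

def lnsr_penalty(model, input_ids, attention_mask, noise_std=0.01, start_layer=8):
    """Mean squared deviation between clean and noise-perturbed outputs of the top layers."""
    # Clean forward pass, keeping all intermediate hidden states.
    clean = model(input_ids=input_ids, attention_mask=attention_mask,
                  output_hidden_states=True).hidden_states

    # Perturb the hidden state feeding layer `start_layer` with Gaussian noise,
    # then re-run only the remaining encoder layers on the noisy input.
    noisy_hidden = clean[start_layer] + noise_std * torch.randn_like(clean[start_layer])
    ext_mask = model.get_extended_attention_mask(attention_mask, input_ids.shape)

    penalty = 0.0
    hidden = noisy_hidden
    for i, layer in enumerate(model.encoder.layer[start_layer:], start=start_layer):
        hidden = layer(hidden, attention_mask=ext_mask)[0]
        # hidden_states[i + 1] is the clean output of encoder layer i.
        penalty = penalty + torch.mean((hidden - clean[i + 1]) ** 2)
    return penalty

# Usage: add the penalty to the task loss during fine-tuning.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
batch = tokenizer(["an example sentence"], return_tensors="pt")
reg = lnsr_penalty(model, batch["input_ids"], batch["attention_mask"])
# total_loss = task_loss + reg_weight * reg   # reg_weight is a tuning knob
```

In this sketch the regularizer is computed only for the top encoder layers (from `start_layer` up), reflecting the layer-wise nature of the method; which layers to regularize and how to weight the penalty are choices left to experimentation.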
Anthology ID:
2021.naacl-main.258
Volume:
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Month:
June
Year:
2021
Address:
Online
Editors:
Kristina Toutanova, Anna Rumshisky, Luke Zettlemoyer, Dilek Hakkani-Tur, Iz Beltagy, Steven Bethard, Ryan Cotterell, Tanmoy Chakraborty, Yichao Zhou
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
3229–3241
URL:
https://aclanthology.org/2021.naacl-main.258
DOI:
10.18653/v1/2021.naacl-main.258
Cite (ACL):
Hang Hua, Xingjian Li, Dejing Dou, Chengzhong Xu, and Jiebo Luo. 2021. Noise Stability Regularization for Improving BERT Fine-tuning. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 3229–3241, Online. Association for Computational Linguistics.
Cite (Informal):
Noise Stability Regularization for Improving BERT Fine-tuning (Hua et al., NAACL 2021)
PDF:
https://aclanthology.org/2021.naacl-main.258.pdf
Video:
https://aclanthology.org/2021.naacl-main.258.mp4
Data
CoLA, GLUE, MRPC