k-SemStamp: A Clustering-Based Semantic Watermark for Detection of Machine-Generated Text

Abe Hou, Jingyu Zhang, Yichen Wang, Daniel Khashabi, Tianxing He

Abstract

Recent watermarked generation algorithms inject detectable signatures during language generation to facilitate post-hoc detection. While token-level watermarks are vulnerable to paraphrase attacks, SemStamp (Hou et al., 2023) applies the watermark to the semantic representation of sentences and demonstrates promising robustness. SemStamp employs locality-sensitive hashing (LSH) to partition the semantic space with arbitrary hyperplanes, which results in a suboptimal tradeoff between robustness and speed. We propose k-SemStamp, a simple yet effective enhancement of SemStamp that uses k-means clustering as an alternative to LSH, partitioning the embedding space with awareness of its inherent semantic structure. Experimental results indicate that k-SemStamp substantially improves robustness and sampling efficiency while preserving generation quality, providing a more effective tool for machine-generated text detection.
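The contrast the abstract draws between the two partitioning schemes can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the 2-D vectors, the hand-picked hyperplane normals, and the hand-picked centroids are all assumptions made for illustration. In practice the centroids would presumably come from k-means fitted on sentence embeddings, and Euclidean nearest-centroid assignment is assumed here for simplicity.

```python
import math


def lsh_region(embedding, hyperplane_normals):
    """Region id built from the sign of the dot product with each hyperplane normal."""
    region = 0
    for normal in hyperplane_normals:
        dot = sum(e * n for e, n in zip(embedding, normal))
        # Append one bit per hyperplane: 1 if the embedding lies on the positive side.
        region = (region << 1) | (1 if dot > 0 else 0)
    return region


def kmeans_region(embedding, centroids):
    """Region id is the index of the nearest centroid (Euclidean distance, an assumption)."""
    return min(range(len(centroids)), key=lambda i: math.dist(embedding, centroids[i]))


# Hypothetical toy data: two axis-aligned hyperplanes and three centroids.
hyperplane_normals = [(1.0, 0.0), (0.0, 1.0)]
centroids = [(1.0, 0.0), (-1.0, 0.0), (0.0, -1.0)]

# Two nearby points standing in for a sentence and its paraphrase.
original = (0.9, 0.05)
paraphrase = (0.9, -0.05)

lsh_original = lsh_region(original, hyperplane_normals)
lsh_paraphrase = lsh_region(paraphrase, hyperplane_normals)
km_original = kmeans_region(original, centroids)
km_paraphrase = kmeans_region(paraphrase, centroids)

print("LSH regions:", lsh_original, lsh_paraphrase)
print("k-means regions:", km_original, km_paraphrase)
```

In this constructed case the two points straddle one hyperplane, so their LSH regions differ (3 and 2), while both fall in the region of the same centroid (index 0). The example is deliberately built to show that failure mode; it says nothing about how often it occurs with real embeddings.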

Anthology ID: 2024.findings-acl.98
Volume: Findings of the Association for Computational Linguistics: ACL 2024
Month: August
Year: 2024
Address: Bangkok, Thailand
Editors: Lun-Wei Ku, Andre Martins, Vivek Srikumar
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 1706–1715
URL: https://aclanthology.org/2024.findings-acl.98
DOI: 10.18653/v1/2024.findings-acl.98
Cite (ACL): Abe Hou, Jingyu Zhang, Yichen Wang, Daniel Khashabi, and Tianxing He. 2024. k-SemStamp: A Clustering-Based Semantic Watermark for Detection of Machine-Generated Text. In Findings of the Association for Computational Linguistics: ACL 2024, pages 1706–1715, Bangkok, Thailand. Association for Computational Linguistics.
Cite (Informal): k-SemStamp: A Clustering-Based Semantic Watermark for Detection of Machine-Generated Text (Hou et al., Findings 2024)
PDF: https://aclanthology.org/2024.findings-acl.98.pdf
