
A Structure-Aware Generative Adversarial Network for Bilingual Lexicon Induction

Bocheng Han, Qian Tao, Lusi Li, Zhihao Xiong


Abstract
Bilingual lexicon induction (BLI) is the task of inducing word translations with a learned mapping function that aligns monolingual word embedding spaces in two different languages. However, most previous methods treat word embeddings as isolated entities and fail to jointly consider both the intra-space and inter-space topological relations between words. This limitation makes it challenging to align words from embedding spaces with distinct topological structures, especially when the assumption of isomorphism may not hold. To this end, we propose a novel approach called the Structure-Aware Generative Adversarial Network (SA-GAN) model to explicitly capture multiple topological structure information to achieve accurate BLI. Our model first incorporates two lightweight graph convolutional networks (GCNs) to leverage intra-space topological correlations between words for generating source and target embeddings. We then employ a GAN model to explore inter-space topological structures by learning a global mapping function that initially maps the source embeddings to the target embedding space. To further align the coarse-grained structures, we develop a pair-wised local mapping (PLM) strategy that enables word-specific transformations in an unsupervised manner. Extensive experiments conducted on public datasets, including languages with both distant and close etymological relationships, demonstrate the effectiveness of our proposed SA-GAN model.
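The basic BLI retrieval step described in the abstract's first sentence can be sketched as follows. This is a minimal illustration only, not the SA-GAN model: the mapping `W` is a hand-picked rotation rather than one learned by GCNs or a GAN, the embeddings and words are made-up toy data, and the function names (`apply_mapping`, `cosine`, `induce_lexicon`) are hypothetical.

```python
import math

def apply_mapping(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def induce_lexicon(src_emb, tgt_emb, W):
    """Map each source embedding with W, then pick the target word
    whose embedding is the nearest neighbour by cosine similarity."""
    lexicon = {}
    for src_word, x in src_emb.items():
        mapped = apply_mapping(W, x)
        lexicon[src_word] = max(tgt_emb, key=lambda t: cosine(mapped, tgt_emb[t]))
    return lexicon

# Toy 2-D embeddings (assumed for illustration, not from the paper).
src_emb = {"dog": [1.0, 0.0], "cat": [0.0, 1.0]}
tgt_emb = {"perro": [0.0, 1.0], "gato": [-1.0, 0.0]}

# Assumed mapping: a 90-degree rotation that happens to align the toy spaces.
W = [[0.0, -1.0],
     [1.0, 0.0]]

lexicon = induce_lexicon(src_emb, tgt_emb, W)
print(lexicon)
```

A single global matrix like `W` can only align the two spaces well when they are approximately isomorphic, which is the assumption the abstract says may not hold; the word-specific transformations of the PLM strategy are outside the scope of this sketch.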
Anthology ID:
2023.findings-emnlp.721
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2023
Month:
December
Year:
2023
Address:
Singapore
Editors:
Houda Bouamor, Juan Pino, Kalika Bali
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
10763–10775
URL:
https://aclanthology.org/2023.findings-emnlp.721
DOI:
10.18653/v1/2023.findings-emnlp.721
Cite (ACL):
Bocheng Han, Qian Tao, Lusi Li, and Zhihao Xiong. 2023. A Structure-Aware Generative Adversarial Network for Bilingual Lexicon Induction. In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 10763–10775, Singapore. Association for Computational Linguistics.
Cite (Informal):
A Structure-Aware Generative Adversarial Network for Bilingual Lexicon Induction (Han et al., Findings 2023)
PDF:
https://aclanthology.org/2023.findings-emnlp.721.pdf