
Reordered Search, and Tuple Unfolding for Ngram-based SMT

Josep M. Crego, José B. Mariño, Adrià de Gispert


Abstract
In Statistical Machine Translation, the use of reordering for certain language pairs can produce a significant improvement in translation accuracy. However, the search problem has been shown to be NP-hard when arbitrary reorderings are allowed. This paper addresses the question of reordering for an Ngram-based SMT approach following two complementary strategies, namely reordered search and tuple unfolding. These strategies interact to improve translation quality in a Chinese-to-English task. On the one hand, we allow an Ngram-based decoder (MARIE) to perform a reordered search over the source sentence, while combining a translation tuple Ngram model, a target language model, a word penalty and a word distance model. Interestingly, even though the translation units are learnt sequentially, the reordered search produces an improved translation. On the other hand, we allow for a modification of the translation units that unfolds the tuples, so that shorter units are learnt from a new parallel corpus, where the source sentences are reordered according to the target language. This tuple unfolding technique reduces data sparseness and, when combined with the reordered search, further boosts translation performance. Translation accuracy and efficiency results are reported for the IWSLT 2004 Chinese-to-English task.
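The reordering step behind tuple unfolding, in which source sentences are rearranged to follow target-language order, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `reorder_source_by_target`, the `(source_index, target_index)` alignment format, and the rule for unaligned source words are all assumptions made for illustration, and the abstract does not specify any of them.

```python
from typing import Dict, List, Tuple


def reorder_source_by_target(
    source: List[str], links: List[Tuple[int, int]]
) -> List[str]:
    """Return the source words rearranged to follow target-side order.

    `links` holds hypothetical (source_index, target_index) alignment pairs.
    Assumption (not from the paper): an unaligned source word inherits the
    sort key of the nearest aligned word to its left, or -1 if there is none,
    so it stays next to its left neighbour.
    """
    # Earliest target position each aligned source word points to.
    first_target: Dict[int, int] = {}
    for s, t in links:
        first_target[s] = min(t, first_target.get(s, t))

    keys = []
    last_key = -1
    for i in range(len(source)):
        last_key = first_target.get(i, last_key)
        # The original index breaks ties, keeping the sort stable.
        keys.append((last_key, i))

    order = sorted(range(len(source)), key=lambda i: keys[i])
    return [source[i] for i in order]


# Placeholder tokens: s0 aligns to the last target word, s1 to the first.
reordered = reorder_source_by_target(["s0", "s1", "s2"], [(0, 2), (1, 0), (2, 1)])
print(reordered)
```

With a crossing alignment like the one above, the reordered source becomes monotone with respect to the target, which is the condition under which shorter translation units could be extracted from the new parallel corpus.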
Anthology ID:
2005.mtsummit-papers.37
Volume:
Proceedings of Machine Translation Summit X: Papers
Month:
September 13-15
Year:
2005
Address:
Phuket, Thailand
Venue:
MTSummit
Pages:
283–289
URL:
https://aclanthology.org/2005.mtsummit-papers.37
Cite (ACL):
Josep M. Crego, José B. Mariño, and Adrià de Gispert. 2005. Reordered Search, and Tuple Unfolding for Ngram-based SMT. In Proceedings of Machine Translation Summit X: Papers, pages 283–289, Phuket, Thailand.
Cite (Informal):
Reordered Search, and Tuple Unfolding for Ngram-based SMT (Crego et al., MTSummit 2005)
PDF:
https://aclanthology.org/2005.mtsummit-papers.37.pdf