
  Summary of the paper

Title Use of Domain-Specific Language Resources in Machine Translation
Authors Sanja Štajner, Andreia Querido, Nuno Rendeiro, João António Rodrigues and António Branco
Abstract In this paper, we address the problem of Machine Translation (MT) for a specialised domain in a language pair for which only a very small domain-specific parallel corpus is available. We conduct a series of experiments using a purely phrase-based SMT (PBSMT) system and a hybrid MT system (TectoMT), testing three different strategies to overcome the small amount of in-domain training data. Our results show that adding a small in-domain bilingual terminology to the small in-domain training corpus leads to the largest improvements for the hybrid MT system, while the PBSMT system achieves its best results when a combination of in-domain bilingual terminology and a larger out-of-domain corpus is added. We focus on qualitative human evaluation of the output of the two best systems (one for each approach) and perform a systematic in-depth error analysis, which reveals advantages of the hybrid MT system over the pure PBSMT system for this specific task.
Topics Machine Translation, SpeechToSpeech Translation, Evaluation Methodologies, Tools, Systems, Applications
Full paper Use of Domain-Specific Language Resources in Machine Translation
Bibtex @InProceedings{TAJNER16.179,
  author = {Sanja Štajner and Andreia Querido and Nuno Rendeiro and João António Rodrigues and António Branco},
  title = {Use of Domain-Specific Language Resources in Machine Translation},
  booktitle = {Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)},
  year = {2016},
  month = {may},
  date = {23-28},
  location = {Portorož, Slovenia},
  editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Thierry Declerck and Sara Goggi and Marko Grobelnik and Bente Maegaard and Joseph Mariani and Helene Mazo and Asuncion Moreno and Jan Odijk and Stelios Piperidis},
  publisher = {European Language Resources Association (ELRA)},
  address = {Paris, France},
  isbn = {978-2-9517408-9-1},
  language = {english}
 }