Domain adaptable semantic clustering in statistical NLG
B Howald, R Kondadadi, F Schilder - Proceedings of the 10th …, 2013 - aclanthology.org
Proceedings of the 10th International Conference on Computational …, 2013 • aclanthology.org
Abstract
We present a hybrid natural language generation system that utilizes Discourse Representation Structures (DRSs) for statistically learning syntactic templates from a given domain of discourse in sentence “micro” planning. In particular, given a training corpus of target texts, we extract semantic predicates and domain-general tags from each sentence and then organize the sentences using supervised clustering to represent the “conceptual meaning” of the corpus. The sentences, additionally tagged with domain-specific information (determined separately), are reduced to templates. We use an SVM ranking model trained on a subset of the corpus to determine the optimal template during generation. The combination of the conceptual unit, a set of ranked syntactic templates, and a given set of information constrains output selection and yields acceptable texts. Our system is evaluated with automatic, non-expert crowdsourced, and expert evaluation metrics and, for generated weather, financial and biography texts, falls within acceptable ranges. Consequently, we argue that our DRS-driven statistical and template-based method is robust and domain adaptable: while content will be dictated by a target domain of discourse, significant investments in sentence planning can be minimized without sacrificing performance.
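The generation step described in the abstract (a conceptual unit, a set of ranked templates, and a given set of information jointly constraining the output) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the template strings, the slot names such as `COMPANY` and `MONEY`, and the `score` function, which is a simple coverage heuristic standing in for the trained SVM ranking model and is not the authors' method.

```python
import re

# Hypothetical templates for a single conceptual unit (cluster).
# The slot names are illustrative, not the paper's tag set.
TEMPLATES = [
    "[COMPANY] reported earnings of [MONEY] for [DATE].",
    "[COMPANY] posted [MONEY] in earnings.",
    "Earnings at [COMPANY] rose in [DATE].",
]

SLOT = re.compile(r"\[([A-Z]+)\]")


def slots(template):
    """Return the set of slot names a template needs."""
    return set(SLOT.findall(template))


def score(template, info):
    """Stand-in ranker (assumption): the fraction of the given information
    that the template expresses. The paper trains an SVM ranker instead."""
    if not info:
        return 0.0
    return len(slots(template) & set(info)) / len(info)


def generate(templates, info):
    """Fill the best-scoring template whose slots are all available."""
    usable = [t for t in templates if slots(t) <= set(info)]
    if not usable:
        return None
    best = max(usable, key=lambda t: score(t, info))
    return SLOT.sub(lambda m: info[m.group(1)], best)


if __name__ == "__main__":
    print(generate(TEMPLATES, {"COMPANY": "Acme", "MONEY": "$2M"}))
```

In this sketch the given information does two jobs: it filters out templates that need a slot that is not available, and it feeds the ranking among the templates that remain. A learned ranker would replace `score` without changing `generate`.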