A statistical NLG framework for aggregated planning and realization

R Kondadadi, B Howald, F Schilder - Proceedings of the 51st …, 2013 - aclanthology.org
Proceedings of the 51st Annual Meeting of the Association for …, 2013 - aclanthology.org
Abstract
We present a hybrid natural language generation (NLG) system that consolidates macro and micro planning and surface realization tasks into one statistical learning process. Our novel approach is based on automatically deriving a template bank from a corpus of texts from a target domain. First, we identify domain-specific entity tags and Discourse Representation Structures on a per-sentence basis. Sentences are then organized into semantically similar groups (each representing a domain-specific concept) by k-means clustering. After this semi-automatic processing (human review of cluster assignments), a number of corpus-level statistics are compiled and used as features by a ranking SVM to develop model weights from a training corpus. At generation time, a set of input data, the collection of semantically organized templates, and the model weights are used to select optimal templates. Our system is evaluated with automatic, non-expert crowdsourced, and expert evaluation metrics. We also introduce a novel automatic metric, syntactic variability, that represents linguistic variation as a measure of unique template sequences across a collection of automatically generated documents. The metrics for generated weather and biography texts fall within acceptable ranges. In sum, we argue that our statistical approach to NLG reduces the need for complicated knowledge-based architectures and readily adapts to different domains with reduced development time.
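The abstract describes syntactic variability only as a measure of unique template sequences across generated documents and gives no formula. The sketch below shows one plausible reading, under the assumption that each document is represented by the ordered list of template IDs used to generate it and that the score is the number of distinct sequences divided by the number of documents. The function name `syntactic_variability`, the template IDs, and the normalization are all illustrative assumptions, not the authors' definition.

```python
from typing import Hashable, Sequence


def syntactic_variability(docs: Sequence[Sequence[Hashable]]) -> float:
    """Distinct template sequences divided by number of documents.

    Assumption (not from the paper): each document is the ordered list of
    template IDs used to generate it, and the score is normalized by the
    total number of documents.
    """
    if not docs:
        raise ValueError("need at least one generated document")
    # Tuples are hashable, so a set collapses repeated sequences.
    distinct = {tuple(doc) for doc in docs}
    return len(distinct) / len(docs)


# Hypothetical template IDs for four generated weather reports.
generated = [
    ["t1", "t4", "t7"],
    ["t1", "t4", "t7"],  # same sequence as the first document
    ["t2", "t4", "t7"],
    ["t1", "t5"],
]
print(syntactic_variability(generated))  # 3 distinct sequences / 4 documents = 0.75
```

Under this reading, a score of 1.0 would mean every generated document uses a different template sequence, while a score near 0 would mean the system keeps reusing the same sequence.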