Data-to-text Generation with Entity Modeling

Ratish Puduppully, Li Dong, Mirella Lapata

Abstract

Recent approaches to data-to-text generation have shown great promise thanks to the use of large-scale datasets and the application of neural network architectures which are trained end-to-end. These models rely on representation learning to select content appropriately, structure it coherently, and verbalize it grammatically, treating entities as nothing more than vocabulary tokens. In this work we propose an entity-centric neural architecture for data-to-text generation. Our model creates entity-specific representations which are dynamically updated. Text is generated conditioned on the data input and entity memory representations using hierarchical attention at each time step. We present experiments on the RotoWire benchmark and a (five times larger) new dataset on the baseball domain which we create. Our results show that the proposed model outperforms competitive baselines in automatic and human evaluation.
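The two mechanisms named in the abstract, dynamically updated entity memories and hierarchical attention, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`update_memory`, `hierarchical_attention`), the dot-product scoring, and the scalar gate without learned weights are all assumptions made for illustration. It only shows the general shape of the idea: each entity keeps a memory vector that is revised at every decoding step, attention is computed first over the records of each entity and then over the entities themselves, and the entity-level weights combine the per-entity summaries into one context vector.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def weighted_sum(weights, vectors):
    """Sum of vectors, each scaled by its weight."""
    dim = len(vectors[0])
    return [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]


def update_memory(memory, decoder_state):
    """Gated interpolation between the old memory and the decoder state.

    Assumption: a single scalar gate from a dot product; a trained model
    would use learned projections instead.
    """
    gate = 1.0 / (1.0 + math.exp(-dot(memory, decoder_state)))
    return [(1.0 - gate) * m + gate * d for m, d in zip(memory, decoder_state)]


def hierarchical_attention(decoder_state, entity_records, entity_memories):
    """Attend over records within each entity, then over entities."""
    # Lower level: one summary vector per entity from its own records.
    entity_contexts = []
    for records in entity_records:
        record_weights = softmax([dot(decoder_state, r) for r in records])
        entity_contexts.append(weighted_sum(record_weights, records))
    # Upper level: score entities by their memory vectors.
    entity_weights = softmax([dot(decoder_state, u) for u in entity_memories])
    context = weighted_sum(entity_weights, entity_contexts)
    return context, entity_weights


# Toy example: two entities with 2-dimensional record vectors.
decoder_state = [1.0, 0.0]
entity_records = [[[1.0, 0.0], [0.0, 1.0]], [[0.5, 0.5]]]
entity_memories = [[1.0, 0.0], [0.0, 1.0]]

# One decoding step: refresh the memories, then compute the context vector.
entity_memories = [update_memory(u, decoder_state) for u in entity_memories]
context, entity_weights = hierarchical_attention(
    decoder_state, entity_records, entity_memories
)
```

In this toy setup the first entity, whose memory is closer to the decoder state, receives the larger entity-level weight. In a real system the memories, record encodings, and gates would be learned jointly with the decoder.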

Anthology ID: P19-1195
Volume: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
Month: July
Year: 2019
Address: Florence, Italy
Editors: Anna Korhonen, David Traum, Lluís Màrquez
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 2023–2035
URL: https://aclanthology.org/P19-1195
DOI: 10.18653/v1/P19-1195
Cite (ACL): Ratish Puduppully, Li Dong, and Mirella Lapata. 2019. Data-to-text Generation with Entity Modeling. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 2023–2035, Florence, Italy. Association for Computational Linguistics.
Cite (Informal): Data-to-text Generation with Entity Modeling (Puduppully et al., ACL 2019)
PDF: https://aclanthology.org/P19-1195.pdf
Code: ratishsp/data2text-entity-py
Data: MLB Dataset, RotoWire
