PromptStream: Self-Supervised News Story Discovery Using Topic-Aware Article Representations

Arezoo Hatefi, Anton Eklund, Mona Forsman

Abstract

Given the importance of identifying and monitoring news stories within the continuous flow of news articles, this paper presents PromptStream, a novel method for unsupervised news story discovery. In order to identify coherent and comprehensive stories across the stream, it is crucial to create article representations that incorporate as much topic-related information from the articles as possible. PromptStream constructs these article embeddings using cloze-style prompting. These representations continually adjust to the evolving context of the news stream through self-supervised learning, employing a contrastive loss and a memory of the most confident article-story assignments from the most recent days. Extensive experiments with real news datasets highlight the notable performance of our model, establishing a new state of the art. Additionally, we delve into selected news stories to reveal how the model’s structuring of the article stream aligns with story progression.
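
The abstract names the loop only at a high level: assign each article to a story, keep the most confident assignments in a memory, and use a contrastive loss to adapt the representations. The sketch below illustrates that general pattern in plain Python and is not the authors' implementation. The function names, the cosine-similarity assignment rule, the InfoNCE-style form of the loss, the temperature, and the confidence threshold are all assumptions introduced for illustration. The abstract does not specify any of them, and the cloze-prompted encoder that would produce the embeddings is replaced here by toy 2-D vectors.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def assign_story(article, centroids):
    """Return (index, similarity) of the most similar story centroid."""
    sims = [cosine(article, c) for c in centroids]
    best = max(range(len(sims)), key=sims.__getitem__)
    return best, sims[best]

def contrastive_loss(article, centroids, positive, temperature=0.1):
    """InfoNCE-style loss (an assumed form): low when the article is
    closer to its positive story centroid than to the other centroids."""
    logits = [cosine(article, c) / temperature for c in centroids]
    peak = max(logits)  # subtract the max for numerical stability
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_norm - logits[positive]

# Toy data: two story centroids and one article embedding (hypothetical values).
centroids = [[1.0, 0.0], [0.0, 1.0]]
article = [0.9, 0.1]

story, confidence = assign_story(article, centroids)

# Keep only confident assignments as training signal (threshold is hypothetical).
CONFIDENCE_THRESHOLD = 0.8
memory = [(article, story)] if confidence >= CONFIDENCE_THRESHOLD else []

loss_right = contrastive_loss(article, centroids, story)
loss_wrong = contrastive_loss(article, centroids, 1 - story)
```

In this toy setting the loss is small when the article is paired with the story it was assigned to and large when it is paired with the other story, which is the signal a self-supervised update would use to pull confident articles toward their stories.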

Anthology ID: 2024.lrec-main.1157
Volume: Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Month: May
Year: 2024
Address: Torino, Italia
Editors: Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Venues: LREC | COLING
Publisher: ELRA and ICCL
Pages: 13222–13232
URL: https://aclanthology.org/2024.lrec-main.1157
Cite (ACL): Arezoo Hatefi, Anton Eklund, and Mona Forsman. 2024. PromptStream: Self-Supervised News Story Discovery Using Topic-Aware Article Representations. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 13222–13232, Torino, Italia. ELRA and ICCL.
Cite (Informal): PromptStream: Self-Supervised News Story Discovery Using Topic-Aware Article Representations (Hatefi et al., LREC-COLING 2024)
PDF: https://aclanthology.org/2024.lrec-main.1157.pdf
