2024
pdf
bib
abs
Rethinking the Reversal Curse of LLMs: a Prescription from Human Knowledge Reversal
Zhicong Lu
|
Li Jin
|
Peiguang Li
|
Yu Tian
|
Linhao Zhang
|
Sirui Wang
|
Guangluan Xu
|
Changyuan Tian
|
Xunliang Cai
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Large Language Models (LLMs) have exhibited exceptional performance across diverse domains. However, recent studies reveal that LLMs are plagued by the “reversal curse”. Most existing methods rely on aggressive sample permutation and pay little attention to delving into the underlying reasons for this issue, resulting in only partial mitigation. In this paper, inspired by human knowledge reversal, we investigate and quantify the individual influence of three potential reasons on the reversal curse: 1) knowledge clarity, 2) entity correlation modeling, and 3) pairwise relationship reasoning capability. Motivated by the analysis of these reasons, we propose a novel **P**airwise entity **O**rder- and **R**elationship-**E**nhanced (**PORE**) data strategy, which facilitates bidirectional entity correlation modeling and pairwise relationship reasoning to overcome the reversal curse. Specifically, PORE augments the samples with entity order-reversal and semantically preserved question-answer pairs, enhancing the encoding of entity correlations in both directions. PORE also employs entity-interleaved pairwise relationship data, which elevates the model’s capability for relationship reasoning. Additionally, to improve the recall of reverse relationships, we leverage knowledge clarity to construct high-clarity data for PORE. Extensive experimental results on available and two newly assembled datasets demonstrate the effectiveness and generalization of our method in both data-sufficient and -constrained situations.
pdf
bib
abs
GOME: Grounding-based Metaphor Binding With Conceptual Elaboration For Figurative Language Illustration
Linhao Zhang
|
Jintao Liu
|
Li Jin
|
Hao Wang
|
Kaiwen Wei
|
Guangluan Xu
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
The illustration or visualization of figurative language, such as linguistic metaphors, is an emerging challenge for existing Large Language Models (LLMs) and multimodal models. Due to their comparison of seemingly unrelated concepts in metaphors, existing LLMs have a tendency of over-literalization, which illustrates figurative language solely based on literal objects, ignoring the underlying groundings and associations across disparate metaphorical domains. Furthermore, prior approaches have ignored the binding process between visual objects and metaphorical attributes, which further intensifies the infidelity of visual metaphors. To address the issues above, we propose GOME (Grounding-based Metaphor Binding), which illustrates linguistic metaphors from the grounding perspective elaborated through LLMs. GOME consists of two steps for metaphor illustration, including grounding-based elaboration and scenario visualization. In the elaboration step, metaphorical knowledge is integrated into systematic instructions for LLMs, which employs a CoT prompting method rooted in rhetoric. This approach specifies metaphorical devices such as vehicles and groundings, to ensure accurate and faithful descriptions consumed by text-to-image models. In the visualization step, an inference-time metaphor binding method is realized based on elaboration outputs, which register attentional control during the diffusion process, and captures the underlying attributes from the abstract metaphorical domain. Comprehensive evaluations using multiple downstream tasks confirm that, GOME is superior to isolated LLMs, diffusion models, or their direct collaboration.
2023
pdf
bib
abs
Guide the Many-to-One Assignment: Open Information Extraction via IoU-aware Optimal Transport
Kaiwen Wei
|
Yiran Yang
|
Li Jin
|
Xian Sun
|
Zequn Zhang
|
Jingyuan Zhang
|
Xiao Li
|
Linhao Zhang
|
Jintao Liu
|
Guo Zhi
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Open Information Extraction (OIE) seeks to extract structured information from raw text without the limitations of close ontology. Recently, the detection-based OIE methods have received great attention from the community due to their parallelism. However, as the essential step of those models, how to assign ground truth labels to the parallelly generated tuple proposals remains under-exploited. The commonly utilized Hungarian algorithm for this procedure is restricted to handling one-to-one assignment among the desired tuples and tuple proposals, which ignores the correlation between proposals and affects the recall of the models. To solve this problem, we propose a dynamic many-to-one label assignment strategy named IOT. Concretely, the label assignment process in OIE is formulated as an Optimal Transport (OT) problem. We leverage the intersection-over-union (IoU) as the assignment quality measurement, and convert the problem of finding the best assignment solution to the one of solving the optimal transport plan by maximizing the IoU values. To further utilize the knowledge from the assignment, we design an Assignment-guided Multi-granularity loss (AM) by simultaneously considering word-level and tuple-level information. Experiment results show the proposed method outperforms the state-of-the-art models on three benchmarks.
2022
pdf
bib
abs
PILE: Pairwise Iterative Logits Ensemble for Multi-Teacher Labeled Distillation
Lianshang Cai
|
Linhao Zhang
|
Dehong Ma
|
Jun Fan
|
Daiting Shi
|
Yi Wu
|
Zhicong Cheng
|
Simiu Gu
|
Dawei Yin
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: Industry Track
Pre-trained language models have become a crucial part of ranking systems and achieved very impressive effects recently. To maintain high performance while keeping efficient computations, knowledge distillation is widely used. In this paper, we focus on two key questions in knowledge distillation for ranking models: 1) how to ensemble knowledge from multi-teacher; 2) how to utilize the label information of data in the distillation process. We propose a unified algorithm called Pairwise Iterative Logits Ensemble (PILE) to tackle these two questions simultaneously. PILE ensembles multi-teacher logits supervised by label information in an iterative way and achieved competitive performance in both offline and online experiments. The proposed method has been deployed in a real-world commercial search system.
2021
pdf
bib
Do It Once: An Embarrassingly Simple Joint Matching Approach to Response Selection
Linhao Zhang
|
Dehong Ma
|
Sujian Li
|
Houfeng Wang
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021
2020
pdf
bib
abs
Syntax-Aware Graph Attention Network for Aspect-Level Sentiment Classification
Lianzhe Huang
|
Xin Sun
|
Sujian Li
|
Linhao Zhang
|
Houfeng Wang
Proceedings of the 28th International Conference on Computational Linguistics
Aspect-level sentiment classification aims to distinguish the sentiment polarities over aspect terms in a sentence. Existing approaches mostly focus on modeling the relationship between the given aspect words and their contexts with attention, and ignore the use of more elaborate knowledge implicit in the context. In this paper, we exploit syntactic awareness to the model by the graph attention network on the dependency tree structure and external pre-training knowledge by BERT language model, which helps to model the interaction between the context and aspect words better. And the subwords of BERT are integrated into the dependency tree graphs, which can obtain more accurate representations of words by graph attention. Experiments demonstrate the effectiveness of our model.