The official code for paper "Enhancing Language Representation with Constructional Information for Natural Language Understanding"
🔗 Data • Tutorial • Guideline • Quick Start • Related Work • FAQ❓
Note
This repository is still under construction and will take some time to complete.
Construction Grammar (CxG) is a branch of cognitive linguistics. It holds that grammar is a meaningful continuum of lexicon, morphology, and syntax, and it defines constructions as linguistic patterns that pair form with meaning. Because the meaning of a construction is attached to the pattern itself rather than to specific words, constructional information is hard for PLMs to learn and demands large amounts of training data, which can lead to failures on NLU tasks.
This motivates us to incorporate construction grammar into PLMs. We therefore propose a preliminary framework, HyCxG (Hypergraph network of Construction Grammar), which enhances language representation with constructional information through a three-stage solution. First, we extract constructions from the input sentence and select the discriminative ones. Next, a Relational Hypergraph Attention Network is applied to attach the constructional information to the words. Finally, the resulting representation is fine-tuned on a variety of downstream tasks.
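Conceptually, each selected construction acts as a hyperedge that connects all the tokens it covers, and the hypergraph attention operates over these hyperedges. The sketch below is purely illustrative (the span-based input format and the function name are our assumptions, not the repository's API); it shows how a token-by-hyperedge incidence matrix could be built from selected construction spans:

```python
import numpy as np

def build_incidence(num_tokens, construction_spans):
    # construction_spans: list of (start, end) token indices (end exclusive),
    # one per selected construction. Each construction becomes one hyperedge
    # connecting every token inside its span.
    H = np.zeros((num_tokens, len(construction_spans)), dtype=np.float32)
    for e, (start, end) in enumerate(construction_spans):
        H[start:end, e] = 1.0
    return H

# Example: a 7-token sentence with two selected constructions
# covering tokens [0, 3) and [2, 6).
print(build_incidence(7, [(0, 3), (2, 6)]))
```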
The contents of this repository are organized as follows:
- HyCxG contains the complete code of the HyCxG framework.
- Data contains all the datasets used in this work together with processing scripts. Most of the datasets are downloaded from our mirror source, and data processing scripts for several baseline models are also provided.
- Tutorial includes some tutorials for HyCxG and related resources to our work.
- Guideline (Under construction) illustrates the information about baseline models & FAQ.
1 Experimental environment setup
We adopt Python 3.8.5 as the base environment. You can create the environment and install the dependencies with the following commands:
conda create -n hycxg_env python=3.8.5
source activate hycxg_env
pip install -r requirements.txt
2 Prepare the dataset
We provide a data download script in the data folder. You can fetch the data directly with the following commands:
cd data
bash data_pipeline.sh
After downloading the data, please move each data folder (e.g., JSONABSA_MAMS) to the HyCxG/dataset
directory.
3 Prepare the data for components
Before running the code, you need to download the data required by the components (e.g., construction lists). The download scripts are located under HyCxG/dataset and HyCxG/Tokenizer, respectively. You can obtain the data directly with the following commands:
cd HyCxG/dataset
bash download_vocab.sh
cd ../Tokenizer
bash download_cxgdict.sh
4 Run HyCxG
We provide example commands for running HyCxG in HyCxG/run_hycxg.sh.
HyCxG relies on the following open-source packages (a usage sketch follows this list):
- c2xg for extracting constructions from sentences
- simanneal, a convenient simulated annealing framework for solving combinatorial optimization problems
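To give a concrete sense of how simanneal can be used for construction selection, here is a minimal, self-contained sketch that picks a subset of candidate construction spans by rewarding token coverage and penalizing overlap. The objective, the span encoding, and the class name are simplifying assumptions for illustration, not the exact formulation used in HyCxG:

```python
import random
from simanneal import Annealer

class ConstructionSelector(Annealer):
    # Toy selector: state is a 0/1 mask over candidate construction spans.

    def __init__(self, state, spans):
        self.spans = spans  # list of (start, end) token indices, end exclusive
        super().__init__(state)

    def move(self):
        # Flip the selection bit of one random candidate.
        i = random.randrange(len(self.state))
        self.state[i] = 1 - self.state[i]

    def energy(self):
        # Lower energy is better: reward covered tokens, penalize overlaps.
        counts = {}
        for keep, (start, end) in zip(self.state, self.spans):
            if keep:
                for t in range(start, end):
                    counts[t] = counts.get(t, 0) + 1
        covered = len(counts)
        overlap = sum(c - 1 for c in counts.values())
        return -(covered - 2 * overlap)

spans = [(0, 3), (2, 6), (4, 7), (8, 10)]
selector = ConstructionSelector([1] * len(spans), spans)
selector.steps = 2000
best_mask, best_energy = selector.anneal()
print(best_mask, best_energy)
```

Running this prints the selected 0/1 mask and its final energy; in HyCxG the candidates and the objective come from the construction extraction and selection stages rather than this toy setup.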
If you find our work helpful, please cite our paper "Enhancing Language Representation with Constructional Information for Natural Language Understanding":
@inproceedings{xu2023enhancing,
title = "Enhancing Language Representation with Constructional Information for Natural Language Understanding",
author = "Xu, Lvxiaowei and
Wu, Jianwang and
Peng, Jiawei and
Gong, Zhilin and
Cai, Ming and
Wang, Tianxiang",
booktitle = "Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
year = "2023",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2023.acl-long.258",
pages = "4685--4705",
}
If you have any questions about the code, feel free to open an Issue or contact xlxw@zju.edu.cn.