This repository contains the source code, data, and documentation for the research paper:
@inproceedings{kachwala-etal-2024-rematch,
title = "{REMATCH}: Robust and Efficient Matching of Local Knowledge Graphs to Improve Structural and Semantic Similarity",
author = "Kachwala, Zoher and
An, Jisun and
Kwak, Haewoon and
Menczer, Filippo",
editor = "Duh, Kevin and
Gomez, Helena and
Bethard, Steven",
booktitle = "Findings of the Association for Computational Linguistics: NAACL 2024",
month = jun,
year = "2024",
address = "Mexico City, Mexico",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2024.findings-naacl.64",
doi = "10.18653/v1/2024.findings-naacl.64",
pages = "1018--1028",
abstract = "Knowledge graphs play a pivotal role in various applications, such as question-answering and fact-checking. Abstract Meaning Representation (AMR) represents text as knowledge graphs. Evaluating the quality of these graphs involves matching them structurally to each other and semantically to the source text. Existing AMR metrics are inefficient and struggle to capture semantic similarity. We also lack a systematic evaluation benchmark for assessing structural similarity between AMR graphs. To overcome these limitations, we introduce a novel AMR similarity metric, rematch, alongside a new evaluation for structural similarity called RARE. Among state-of-the-art metrics, rematch ranks second in structural similarity; and first in semantic similarity by 1{--}5 percentage points on the STS-B and SICK-R benchmarks. Rematch is also five times faster than the next most efficient metric.",
}
An example of rematch similarity calculation for a pair of AMRs. After AMRs are parsed from sentences, rematch has a two-step process to calculate similarity. First, sets of motifs are generated. Second, the two sets are used to calculate the Jaccard similarity (intersecting motifs shown in color).
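The second step described above, Jaccard similarity between two motif sets, can be sketched as follows. This is a minimal illustration and not the repository's implementation: the motif strings are hypothetical placeholders, and treating two empty sets as identical is an assumption made here to avoid division by zero.

```python
def jaccard(a: set, b: set) -> float:
    """Return |a ∩ b| / |a ∪ b| for two sets of motifs."""
    union = a | b
    if not union:
        # Assumption for this sketch: two empty motif sets count as identical.
        return 1.0
    return len(a & b) / len(union)


# Hypothetical motif sets, written as plain strings for illustration only.
motifs_1 = {"want-01", "boy", "want-01 :ARG0 boy", "go-02"}
motifs_2 = {"want-01", "girl", "want-01 :ARG0 girl", "go-02"}

score = jaccard(motifs_1, motifs_2)
print(f"{score:.3f}")
```

Here two motifs are shared out of six distinct motifs overall, so the score is 2/6.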
Knowledge graphs play a pivotal role in various applications, such as question-answering and fact-checking. Abstract Meaning Representation (AMR) represents text as knowledge graphs. Evaluating the quality of these graphs involves matching them structurally to each other and semantically to the source text. Existing AMR metrics are inefficient and struggle to capture semantic similarity. We also lack a systematic evaluation benchmark for assessing structural similarity between AMR graphs. To overcome these limitations, we introduce a novel AMR similarity metric, rematch, alongside a new evaluation for structural similarity called RARE. Among state-of-the-art metrics, rematch ranks second in structural similarity; and first in semantic similarity by 1--5 percentage points on the STS-B and SICK-R benchmarks. Rematch is also five times faster than the next most efficient metric.
Knowledge Graphs, Graph Matching, Abstract Meaning Representation (AMR), Semantic Graphs, Graph Isomorphism, Semantic Similarity, Structural Similarity.
- Clone the repository:
git clone https://github.com/Zoher15/Rematch-RARE.git
- Create and activate the conda environment:
conda env create -f rematch_rare.yml
conda activate rematch_rare
- License and download AMR Annotation 3.0
- Preprocess the data:
bash methods/preprocess_data/preprocess_amr3.sh <dir>
where <dir> is the directory containing your amr_annotation_3.0_LDC2020T02.tgz file.
Steps to reproduce the structural consistency results:
- Generate Randomized AMRs with Rewired Edges (RARE):
python experiments/structural_consistency/randomize_amr_rewire.py
- Evaluate any metric on the RARE test set:
bash experiments/structural_consistency/structural_consistency.sh <metric>
<metric> should be one of rematch, smatch, s2match, sembleu, wlk or wwlk. Depending on the metric, this could take a while to run.
Steps to reproduce the semantic consistency results:
- Parse AMRs from STS-B and SICK-R:
a. Follow the instructions to install the transition_amr_parser. We highly recommend creating an independent conda environment called transition_amr_parser. Parse with AMR3-structbart-L-smpl and AMR3-joint-ontowiki-seed42 by activating the environment and executing the script (requires CUDA):
conda env create -f transition_amr_parser.yml
conda activate transition_amr_parser
bash experiments/semantic_consistency/parse_amrs.sh
b. (optional) Parse with Spring by cloning the repo and following the instructions to install. We highly recommend creating an independent conda environment called spring. Also download and unzip the AMR3 pretrained checkpoint, and ensure that the resulting unzipped file (AMR3.parsing.pt) is in the cloned repo directory spring/. Then run the following, where <spring_dir> is the location of your Spring repo (requires CUDA):
conda env create -f spring.yml
conda activate spring
bash experiments/semantic_consistency/parse_spring.sh <spring_dir>
c. (optional) Parse with Amrbart by cloning the repo and following the instructions to install. We highly recommend creating an independent conda environment called amrbart. Then run the following, where <amrbart_dir> is the location of your Amrbart repo (requires CUDA):
conda env create -f amrbart.yml
conda activate amrbart
bash experiments/semantic_consistency/parse_amrbart.sh <amrbart_dir>
- Evaluate a metric on the test set:
conda activate rematch_rare
bash experiments/semantic_consistency/semantic_consistency.sh <metric> <parser>
<metric> should be one of rematch, smatch, s2match, sembleu, wlk or wwlk. <parser> should be one of AMR3-structbart-L-smpl, AMR3-joint-ontowiki-seed42, spring_unwiki or amrbart_unwiki. Ensure the chosen <parser> has been executed in the previous step.
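To cover every metric and parser combination listed above, the command lines can be assembled programmatically. The sketch below is a hypothetical helper and not part of the repository: it only validates the arguments and prints the commands, and it assumes you run them from the repository root with the rematch_rare environment active.

```python
from itertools import product

# Allowed values, copied from the instructions above.
METRICS = ("rematch", "smatch", "s2match", "sembleu", "wlk", "wwlk")
PARSERS = (
    "AMR3-structbart-L-smpl",
    "AMR3-joint-ontowiki-seed42",
    "spring_unwiki",
    "amrbart_unwiki",
)
SCRIPT = "experiments/semantic_consistency/semantic_consistency.sh"


def build_command(metric: str, parser: str) -> list[str]:
    """Return the argv list for one evaluation run, rejecting unknown names."""
    if metric not in METRICS:
        raise ValueError(f"unknown metric: {metric!r}")
    if parser not in PARSERS:
        raise ValueError(f"unknown parser: {parser!r}")
    return ["bash", SCRIPT, metric, parser]


# Dry run: print each command instead of executing it.
commands = [build_command(m, p) for m, p in product(METRICS, PARSERS)]
for cmd in commands:
    print(" ".join(cmd))
```

To actually run a command, pass its list to `subprocess.run(cmd, check=True)`. Skip any parser whose AMRs you did not generate in the previous step.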
Please follow the instructions in the Bamboo repo. Note that Bamboo uses pearsonr by default, whereas we chose spearmanr for our analysis. To make that change, find and replace pearsonr with spearmanr in the evaluation script.
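The practical difference between the two correlation measures can be seen in a small pure-Python sketch. It is an illustration only, not Bamboo's evaluation code: the function names and the toy scores are made up here, and Spearman correlation is computed as the Pearson correlation of average ranks.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_ranks(values):
    """Rank values from 1; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy data: predictions are monotonic in the gold scores but not linear.
gold = [1, 2, 3, 4, 5]
pred = [1, 4, 9, 16, 25]
print(f"pearson  = {pearson(gold, pred):.3f}")
print(f"spearman = {spearman(gold, pred):.3f}")
```

Because Spearman depends only on the ordering, it rewards a metric that ranks pairs correctly even when its scores are not linearly related to the human judgments.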
AMR Metric | Time (s) | RAM (GB) |
---|---|---|
smatch | 927 | 0.2 |
s2match | 7718 | 2 |
sembleu | 275 | 0.2 |
WLK | 315 | 30 |
rematch | 51 | 0.2 |
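As a quick check of the timings in the table above, the relative slowdown of each metric against rematch can be computed directly. The numbers are copied from the table; the variable names are chosen here for illustration.

```python
# Wall-clock times in seconds, copied from the table above.
timings = {"smatch": 927, "s2match": 7718, "sembleu": 275, "WLK": 315, "rematch": 51}

baseline = timings["rematch"]
slowdowns = {
    name: seconds / baseline
    for name, seconds in timings.items()
    if name != "rematch"
}

# List the other metrics from fastest to slowest relative to rematch.
for name, ratio in sorted(slowdowns.items(), key=lambda kv: kv[1]):
    print(f"{name}: {ratio:.1f}x the time of rematch")
```

The smallest ratio belongs to sembleu (275 / 51, roughly 5.4), which is consistent with the abstract's statement that rematch is five times faster than the next most efficient metric.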
Steps to reproduce this experiment:
- Generate the time testbed:
conda activate rematch_rare
python experiments/efficiency/generate_matchups.py
- Evaluate a specific <metric>, one of rematch, smatch, s2match, sembleu or wlk:
bash experiments/efficiency/efficiency.sh <metric>
- Once all metrics have been executed, the plots from the paper can be reproduced (saved in data/processed/AMR3.0) by:
python experiments/efficiency/plot_complexity.py