
SUDA-LA / ucca-parser

Licence: MIT license
[SemEval'19] Code for "HLT@SUDA at SemEval 2019 Task 1: UCCA Graph Parsing as Constituent Tree Parsing"

Programming Languages

python
shell

Projects that are alternatives of or similar to ucca-parser

ContextualSP
Multiple paper open-source codes of the Microsoft Research Asia DKI group
Stars: ✭ 224 (+1144.44%)
Mutual labels:  semantic-parsing
lang2logic-PyTorch
PyTorch port of the paper "Language to Logical Form with Neural Attention"
Stars: ✭ 34 (+88.89%)
Mutual labels:  semantic-parsing
parse seq2seq
A tensorflow implementation of neural sequence-to-sequence parser for converting natural language queries to logical form.
Stars: ✭ 26 (+44.44%)
Mutual labels:  semantic-parsing
sede
Text-to-SQL in the Wild: A Naturally-Occurring Dataset Based on Stack Exchange Data
Stars: ✭ 83 (+361.11%)
Mutual labels:  semantic-parsing
gap-text2sql
GAP-text2SQL: Learning Contextual Representations for Semantic Parsing with Generation-Augmented Pre-Training
Stars: ✭ 83 (+361.11%)
Mutual labels:  semantic-parsing
r2sql
🌶️ R²SQL: "Dynamic Hybrid Relation Network for Cross-Domain Context-Dependent Semantic Parsing." (AAAI 2021)
Stars: ✭ 60 (+233.33%)
Mutual labels:  semantic-parsing
spring
SPRING is a seq2seq model for Text-to-AMR and AMR-to-Text (AAAI2021).
Stars: ✭ 103 (+472.22%)
Mutual labels:  semantic-parsing
Hanlp
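Chinese word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, constituency parsing, semantic dependency parsing, semantic role labeling, coreference resolution, style transfer, semantic similarity, new word discovery, keyword and phrase extraction, automatic summarization, text classification and clustering, pinyin and simplified/traditional Chinese conversion, natural language processing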
Stars: ✭ 24,626 (+136711.11%)
Mutual labels:  semantic-parsing
Compositional-Generalization-in-Natural-Language-Processing
Compositional Generalization in Natural Language Processing. A roadmap.
Stars: ✭ 26 (+44.44%)
Mutual labels:  semantic-parsing
flowsense
FlowSense: A Natural Language Interface for Visual Data Exploration within a Dataflow System
Stars: ✭ 40 (+122.22%)
Mutual labels:  semantic-parsing
SPARQA
SPARQA: Skeleton-based Semantic Parsing for Complex Questions over Knowledge Bases, AAAI 2020
Stars: ✭ 64 (+255.56%)
Mutual labels:  semantic-parsing
WikiTableQuestions
A dataset of complex questions on semi-structured Wikipedia tables
Stars: ✭ 81 (+350%)
Mutual labels:  semantic-parsing
Question-Answering
Question Answering over Knowledge Bases
Stars: ✭ 24 (+33.33%)
Mutual labels:  semantic-parsing
text2sql-lgesql
Source code and pre-trained models for the ACL 2021 long paper "LGESQL: Line Graph Enhanced Text-to-SQL Model with Mixed Local and Non-Local Relations".
Stars: ✭ 68 (+277.78%)
Mutual labels:  semantic-parsing
weak-supervised-Rule-Text2SQL
Using Database Rule for Weak Supervised Text-to-SQL Generation
Stars: ✭ 13 (-27.78%)
Mutual labels:  semantic-parsing
semantic-parsing-dual
Source code and data for the ACL 2019 long paper "Semantic Parsing with Dual Learning".
Stars: ✭ 17 (-5.56%)
Mutual labels:  semantic-parsing
TabularSemanticParsing
Translating natural language questions to a structured query language
Stars: ✭ 148 (+722.22%)
Mutual labels:  semantic-parsing

UCCA Parser

An implementation of "HLT@SUDA at SemEval 2019 Task 1: UCCA Graph Parsing as Constituent Tree Parsing".

This version of the implementation uses lexical features in the corpus, including POS tags, dependency labels and entity labels, just as described in the paper.

For other models or versions, please see the other branches.

Requirements

python >= 3.6.0
pytorch == 1.0.0
ucca == 1.0.127

Note that the code has not been tested with the newest version of the ucca module.

Datasets

The datasets are all provided by SemEval-2019 Task 1: Cross-lingual Semantic Parsing with UCCA. The official website is https://competitions.codalab.org/competitions/19160.

Pre-trained embeddings: http://fasttext.cc
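
As a rough illustration of what a pre-trained embedding file can look like, the sketch below reads the fastText text (.vec) layout: a header line with the word count and dimension, then one word and its values per line. The helper name load_vec and the sample data are hypothetical, and whether --emb_path expects exactly this layout is an assumption, not something this README states.

```python
import io
from typing import Dict, List, Optional, TextIO


def load_vec(handle: TextIO, limit: Optional[int] = None) -> Dict[str, List[float]]:
    """Read fastText-style text vectors into a word -> vector dictionary."""
    header = handle.readline().split()
    dim = int(header[1])  # header is "<word count> <dimension>"
    vectors: Dict[str, List[float]] = {}
    for line in handle:
        parts = line.rstrip().split(" ")
        if len(parts) != dim + 1:
            continue  # skip malformed lines instead of failing
        vectors[parts[0]] = [float(value) for value in parts[1:]]
        if limit is not None and len(vectors) >= limit:
            break  # optionally stop early to save memory
    return vectors


# Tiny in-memory stand-in for a real .vec file (hypothetical data).
sample = io.StringIO("2 3\nthe 0.1 0.2 0.3\ncat 0.4 0.5 0.6\n")
emb = load_vec(sample)
print(len(emb), emb["cat"])
```

For a real file, pass an open handle such as open(path, encoding="utf-8") and consider setting limit, since the full vector files are large.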

Performance

Here are the results from my re-run on June 13, 2019; they are almost the same as the results in the paper.

| description | dev primary | dev remote | dev average | test wiki primary | test wiki remote | test wiki average | test 20K primary | test 20K remote | test 20K average |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| English-Topdown-Lexical | 79.7 | 52.2 | 79.2 | 77.9 | 48.0 | 77.4 | 74.0 | 23.4 | 73.0 |
| German-Topdown-Lexical | 82.9 | 57.1 | 82.4 | / | / | / | 83.5 | 61.1 | 83.0 |

Usage

You can start training, evaluation and prediction with the subcommands registered in parser.cmds, or with the shell scripts included in the repository.

$ python run.py -h
usage: run.py [-h] {train,predict,evaluate} ...

UCCA Parser.

optional arguments:
  -h, --help            show this help message and exit

Commands:
  {train,predict,evaluate}
    train               Train a model.
    predict             Use a trained model to make predictions.
    evaluate            Evaluate the specified model and dataset.
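
The help output above has the shape that Python's argparse produces for a parser with subcommands. The following is a minimal sketch of how such a command line could be declared; it is not the code in parser.cmds. The flag names are taken from the subcommand help shown further down, while build_parser, add_runtime_options, the dest name "mode", the int types and the example paths are all assumptions made for illustration.

```python
import argparse


def add_runtime_options(sub: argparse.ArgumentParser) -> None:
    # Options that appear under all three subcommands in the help text.
    # The int types are assumptions; the help text does not state them.
    sub.add_argument("--gpu", type=int, help="gpu id")
    sub.add_argument("--seed", type=int, help="random seed")
    sub.add_argument("--threads", type=int, help="thread num")


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="run.py", description="UCCA Parser.")
    subparsers = parser.add_subparsers(title="Commands", dest="mode")

    train = subparsers.add_parser("train", help="Train a model.")
    for flag in ("--train_path", "--dev_path", "--save_path", "--config_path"):
        train.add_argument(flag, required=True)
    for flag in ("--emb_path", "--test_wiki_path", "--test_20k_path"):
        train.add_argument(flag)
    add_runtime_options(train)

    evaluate = subparsers.add_parser(
        "evaluate", help="Evaluate the specified model and dataset."
    )
    for flag in ("--gold_path", "--save_path"):
        evaluate.add_argument(flag, required=True)
    evaluate.add_argument("--batch_size", type=int, help="batch size")
    add_runtime_options(evaluate)

    predict = subparsers.add_parser(
        "predict", help="Use a trained model to make predictions."
    )
    for flag in ("--test_path", "--save_path", "--pred_path"):
        predict.add_argument(flag, required=True)
    predict.add_argument("--batch_size", type=int, help="batch size")
    add_runtime_options(predict)
    return parser


# Hypothetical paths, only to show how the parsed namespace looks.
args = build_parser().parse_args(
    ["predict", "--test_path", "data/test", "--save_path", "exp/model",
     "--pred_path", "exp/pred"]
)
print(args.mode, args.test_path, args.batch_size)
```

Sharing the runtime options through one helper keeps the three subcommands consistent, which matches the repeated --gpu, --seed and --threads entries in the help text.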

Note that the path for saving the model (--save_path) is a directory. After training, it contains three files: "config.json", "vocab.pt" and "parser.pt".

The arguments of each subcommand are as follows:

$ python run.py train -h
usage: run.py train [-h] --train_path TRAIN_PATH --dev_path DEV_PATH
                    [--emb_path EMB_PATH] --save_path SAVE_PATH --config_path
                    CONFIG_PATH [--test_wiki_path TEST_WIKI_PATH]
                    [--test_20k_path TEST_20K_PATH] [--gpu GPU] [--seed SEED]
                    [--threads THREADS]

optional arguments:
  -h, --help            show this help message and exit
  --train_path TRAIN_PATH
                        train data dir
  --dev_path DEV_PATH   dev data dir
  --emb_path EMB_PATH   pretrained embedding path
  --save_path SAVE_PATH
                        dic to save all file
  --config_path CONFIG_PATH
                        dic to save all file
  --test_wiki_path TEST_WIKI_PATH
                        wiki test data dir
  --test_20k_path TEST_20K_PATH
                        20k data dir
  --gpu GPU             gpu id
  --seed SEED           random seed
  --threads THREADS     thread num
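
To make the required and optional training flags concrete, here is a small sketch that assembles a train invocation from Python. The function train_command and every path in the example are hypothetical; only the flag names come from the help text above.

```python
import shlex
import sys
from typing import List, Optional


def train_command(
    train_path: str,
    dev_path: str,
    save_path: str,
    config_path: str,
    emb_path: Optional[str] = None,
    gpu: Optional[int] = None,
) -> List[str]:
    """Build the argument list for `run.py train`."""
    # The four flags below are the ones the usage line shows without brackets.
    cmd = [
        sys.executable, "run.py", "train",
        "--train_path", train_path,
        "--dev_path", dev_path,
        "--save_path", save_path,
        "--config_path", config_path,
    ]
    # Bracketed flags in the usage line are optional, so add them only if given.
    if emb_path is not None:
        cmd += ["--emb_path", emb_path]
    if gpu is not None:
        cmd += ["--gpu", str(gpu)]
    return cmd


# Hypothetical locations; replace them with your own data and config paths.
cmd = train_command("data/train", "data/dev", "exp/model", "config.json", gpu=0)
print(" ".join(shlex.quote(part) for part in cmd))
# To actually launch training: subprocess.run(cmd, check=True)
```

Passing the command as a list avoids shell quoting problems when paths contain spaces.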


$ python run.py evaluate -h
usage: run.py evaluate [-h] --gold_path GOLD_PATH --save_path SAVE_PATH
                       [--batch_size BATCH_SIZE] [--gpu GPU] [--seed SEED]
                       [--threads THREADS]

optional arguments:
  -h, --help            show this help message and exit
  --gold_path GOLD_PATH
                        gold test data dir
  --save_path SAVE_PATH
                        path to save the model
  --batch_size BATCH_SIZE
                        batch size
  --gpu GPU             gpu id
  --seed SEED           random seed
  --threads THREADS     thread num


$ python run.py predict -h
usage: run.py predict [-h] --test_path TEST_PATH --save_path SAVE_PATH
                      --pred_path PRED_PATH [--batch_size BATCH_SIZE]
                      [--gpu GPU] [--seed SEED] [--threads THREADS]

optional arguments:
  -h, --help            show this help message and exit
  --test_path TEST_PATH
                        test data dir
  --save_path SAVE_PATH
                        path to save the model
  --pred_path PRED_PATH
                        save predict passages
  --batch_size BATCH_SIZE
                        batch size
  --gpu GPU             gpu id
  --seed SEED           random seed
  --threads THREADS     thread num
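
Both evaluate and predict load a trained model through --save_path, which according to the note above is a directory holding "config.json", "vocab.pt" and "parser.pt". A quick check like the sketch below can catch an incomplete directory before a run starts. The helper missing_model_files is hypothetical and is not part of the repository.

```python
import tempfile
from pathlib import Path
from typing import List

# File names taken from the note on the training output directory.
EXPECTED_FILES = ("config.json", "vocab.pt", "parser.pt")


def missing_model_files(save_path: str) -> List[str]:
    """Return the expected model files that are absent from save_path."""
    directory = Path(save_path)
    return [name for name in EXPECTED_FILES if not (directory / name).is_file()]


# Demonstration on a temporary directory that lacks parser.pt.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "config.json").write_text("{}")
    (Path(tmp) / "vocab.pt").write_bytes(b"")
    missing = missing_model_files(tmp)

print(missing)
```

An empty list means all three files are present.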

Conversion

Conversion code is included in parser.convert. The function UCCA2tree converts a UCCA passage to a tree, and the function to_UCCA converts a tree to a UCCA passage. Remote edge recovery is implemented separately in parser.submodel.remote_parser.py.
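
The general idea of treating a graph as a tree while handling remote edges separately can be shown on a toy example. The sketch below is not UCCA2tree or to_UCCA, whose signatures are not shown here and which operate on ucca passages. The Edge type, the helper names and the tiny graph are all made up for illustration; only the category letters H, A and P are standard UCCA labels.

```python
from typing import List, NamedTuple, Tuple


class Edge(NamedTuple):
    parent: str
    label: str
    child: str
    remote: bool = False


def split_remote(edges: List[Edge]) -> Tuple[List[Edge], List[Edge]]:
    """Separate primary edges (which form a tree) from remote edges."""
    primary = [edge for edge in edges if not edge.remote]
    remote = [edge for edge in edges if edge.remote]
    return primary, remote


def bracket(node: str, primary: List[Edge]) -> str:
    """Render the subtree under node, using each edge label as a constituent label."""
    children = [edge for edge in primary if edge.parent == node]
    if not children:
        return node  # a leaf is printed as its own name
    return " ".join(f"({e.label} {bracket(e.child, primary)})" for e in children)


# Toy graph: "John" is a participant of scene1 and a remote participant of scene2.
edges = [
    Edge("root", "H", "scene1"),
    Edge("scene1", "A", "John"),
    Edge("scene1", "P", "left"),
    Edge("root", "H", "scene2"),
    Edge("scene2", "P", "slept"),
    Edge("scene2", "A", "John", remote=True),
]

primary, remote = split_remote(edges)
tree = "(ROOT " + bracket("root", primary) + ")"
print(tree)

# Going back to the graph amounts to re-attaching the remote edges.
restored = primary + remote
print(sorted(restored) == sorted(edges))
```

Setting the remote edge aside is what makes every node have a single parent, so the remaining structure can be written as a bracketed tree.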
