
caldreaming / CAIL

Licence: other
A model for the reading comprehension task of the CAIL 2019 (法研杯) competition.

Programming Languages

python

Projects that are alternatives of or similar to CAIL

TriB-QA
We are serious about bragging.
Stars: ✭ 45 (+32.35%)
Mutual labels:  mrc, bert
ChineseNER
All about Chinese NER
Stars: ✭ 241 (+608.82%)
Mutual labels:  mrc, bert
MRC Competition Dureader
Machine reading comprehension: champion/runner-up competition code and a Chinese pre-trained MRC model
Stars: ✭ 552 (+1523.53%)
Mutual labels:  mrc, bert
korpatbert
KorPatBERT, a Korean AI language model specialized for the patent domain
Stars: ✭ 48 (+41.18%)
Mutual labels:  mrc, bert
Transformer-QG-on-SQuAD
Implement Question Generator with SOTA pre-trained Language Models (RoBERTa, BERT, GPT, BART, T5, etc.)
Stars: ✭ 28 (-17.65%)
Mutual labels:  bert
TabFormer
Code & Data for "Tabular Transformers for Modeling Multivariate Time Series" (ICASSP, 2021)
Stars: ✭ 209 (+514.71%)
Mutual labels:  bert
R-AT
Regularized Adversarial Training
Stars: ✭ 19 (-44.12%)
Mutual labels:  bert
GEANet-BioMed-Event-Extraction
Code for the paper Biomedical Event Extraction with Hierarchical Knowledge Graphs
Stars: ✭ 52 (+52.94%)
Mutual labels:  bert
backprop
Backprop makes it simple to use, finetune, and deploy state-of-the-art ML models.
Stars: ✭ 229 (+573.53%)
Mutual labels:  bert
DiscEval
Discourse Based Evaluation of Language Understanding
Stars: ✭ 18 (-47.06%)
Mutual labels:  bert
BiaffineDependencyParsing
BERT+Self-attention Encoder ; Biaffine Decoder ; Pytorch Implement
Stars: ✭ 67 (+97.06%)
Mutual labels:  bert
BERT-chinese-text-classification-pytorch
This repo contains a PyTorch implementation of a pretrained BERT model for text classification.
Stars: ✭ 92 (+170.59%)
Mutual labels:  bert
question generator
An NLP system for generating reading comprehension questions
Stars: ✭ 188 (+452.94%)
Mutual labels:  bert
ExpBERT
Code for our ACL '20 paper "Representation Engineering with Natural Language Explanations"
Stars: ✭ 28 (-17.65%)
Mutual labels:  bert
Sohu2019
2019 Sohu Campus Algorithm Competition
Stars: ✭ 26 (-23.53%)
Mutual labels:  bert
rasa-bert-finetune
BERT fine-tuning with rasa-nlu support
Stars: ✭ 46 (+35.29%)
Mutual labels:  bert
CheXbert
Combining Automatic Labelers and Expert Annotations for Accurate Radiology Report Labeling Using BERT
Stars: ✭ 51 (+50%)
Mutual labels:  bert
oreilly-bert-nlp
This repository contains code for the O'Reilly Live Online Training for BERT
Stars: ✭ 19 (-44.12%)
Mutual labels:  bert
BERT-QE
Code and resources for the paper "BERT-QE: Contextualized Query Expansion for Document Re-ranking".
Stars: ✭ 43 (+26.47%)
Mutual labels:  bert
neural-ranking-kd
Improving Efficient Neural Ranking Models with Cross-Architecture Knowledge Distillation
Stars: ✭ 74 (+117.65%)
Mutual labels:  bert

Model

A BERT-based reading comprehension model. On top of the answer-span extraction network, an additional classifier is added to identify yes/no questions and unanswerable questions.
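For illustration, here is a minimal sketch of the two output heads this describes, assuming TensorFlow 1.x and a BERT encoder in the style of google-research/bert. The function name, variable names, and the answer-type label set are hypothetical, not the repository's actual code:

import tensorflow as tf

def mrc_heads(sequence_output, pooled_output, num_answer_types=4):
    """Span-extraction head plus an answer-type classifier.

    sequence_output: [batch, seq_len, hidden] token representations.
    pooled_output:   [batch, hidden] representation of the [CLS] token.
    num_answer_types: e.g. SPAN / YES / NO / UNANSWERABLE (label set assumed).
    """
    hidden_size = sequence_output.shape[-1].value

    # Span head: project every token to a (start, end) logit pair.
    span_kernel = tf.get_variable(
        "span_kernel", [hidden_size, 2],
        initializer=tf.truncated_normal_initializer(stddev=0.02))
    span_bias = tf.get_variable(
        "span_bias", [2], initializer=tf.zeros_initializer())
    logits = tf.einsum("bsh,ho->bso", sequence_output, span_kernel) + span_bias
    start_logits, end_logits = tf.unstack(logits, axis=-1)  # each [batch, seq_len]

    # Extra classifier on [CLS] for yes/no and unanswerable questions.
    type_logits = tf.layers.dense(
        pooled_output, num_answer_types, name="answer_type")
    return start_logits, end_logits, type_logits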

Dataset

The model is designed for the reading comprehension track of the 法研杯 (CAIL) competition (official competition website). The competition dataset is the legal-document question-answering dataset provided by the organizers and can be downloaded here (extraction code: 8w0y).

The dataset format is shown in the figure below. Most questions are answered by spans extracted directly from the document; the data also contains unanswerable (refused) questions and yes/no (YES/NO) questions.
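As a concrete illustration of this format (an assumption based on the SQuAD-2.0-style CJRC schema; the record below is invented, written here as a Python literal):

example = {
    "context": "2018年1月,原告与被告签订借款合同……",
    "qas": [
        {   # span-extraction question: the answer is copied from the context
            "question": "双方何时签订借款合同?",
            "answers": [{"text": "2018年1月", "answer_start": 0}],
            "is_impossible": "false",
        },
        {   # yes/no question: the answer text is "YES" or "NO"
            "question": "被告是否偿还了借款?",
            "answers": [{"text": "NO", "answer_start": -1}],
            "is_impossible": "false",
        },
        {   # unanswerable question: a refusal, with no answer span
            "question": "借款的利率是多少?",
            "answers": [],
            "is_impossible": "true",
        },
    ],
}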

Baseline

https://github.com/china-ai-law-challenge/CAIL2019/tree/master/%E9%98%85%E8%AF%BB%E7%90%86%E8%A7%A3

Requirements

  • Python = 3.6
  • Tensorflow >= 1.11.0

Preparation

Training

python run_cail_with_yorn.py \
  --vocab_file=$MODEL_HOME/vocab.txt \
  --bert_config_file=$MODEL_HOME/bert_config.json \
  --init_checkpoint=$MODEL_HOME/bert_model.ckpt \
  --do_train=True \
  --train_file=$DATA_DIR/big_train_data.json \
  --train_batch_size=8 \
  --learning_rate=3e-5 \
  --num_train_epochs=7.0 \
  --max_seq_length=512 \
  --output_dir=$OUTPUT_DIR/cail_yorn/

$MODEL_HOME is the directory of the pre-trained BERT language model, $DATA_DIR is the path to the dataset, and $OUTPUT_DIR is the output directory where the trained model and other files are saved.
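As a quick sanity check before launching training, the files named in the flags above can be verified like this (a hedged helper; note that TF 1.x checkpoints are sharded, hence the .index suffix):

import os

model_home = os.environ["MODEL_HOME"]
data_dir = os.environ["DATA_DIR"]
for path in [
    os.path.join(model_home, "vocab.txt"),
    os.path.join(model_home, "bert_config.json"),
    os.path.join(model_home, "bert_model.ckpt.index"),  # sharded checkpoint
    os.path.join(data_dir, "big_train_data.json"),
]:
    print(path, "OK" if os.path.exists(path) else "MISSING")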

Prediction

python run_cail_with_yorn.py \
  --vocab_file=$MODEL_HOME/vocab.txt \
  --bert_config_file=$MODEL_HOME/bert_config.json \
  --do_predict=True \
  --predict_file=$DATA_DIR/test_data.json \
  --max_seq_length=512 \
  --output_dir=$OUTPUT_DIR/cail_yorn/
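If run_cail_with_yorn.py follows the google-research/bert run_squad.py convention (an assumption, not confirmed by this README), prediction writes a predictions.json in the output directory mapping each question id to its predicted answer text, which can be inspected like this:

import json
import os

pred_file = os.path.join(os.environ["OUTPUT_DIR"], "cail_yorn", "predictions.json")
with open(pred_file, encoding="utf-8") as f:
    predictions = json.load(f)

# Print the first few predictions; an empty answer would indicate a refusal.
for qid, answer in list(predictions.items())[:5]:
    print(qid, "->", answer if answer else "<refused / unanswerable>")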

TODO

Ensemble model
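One common approach (a sketch of a possible direction, not the planned implementation) is to average the per-token span logits of several fine-tuned checkpoints before decoding the answer:

import numpy as np

def ensemble_span_logits(all_start_logits, all_end_logits):
    """Average span logits across models.

    Each argument is a list of [seq_len] arrays, one per fine-tuned model.
    Returns the averaged start and end logits, ready for span decoding.
    """
    start = np.mean(np.stack(all_start_logits), axis=0)
    end = np.mean(np.stack(all_end_logits), axis=0)
    return start, end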
