zliucr / Crosslingual Nlu

Licence: mit
Zero-shot Cross-lingual Task-Oriented Dialogue Systems (EMNLP 2019)

Zero-shot Cross-lingual Task-Oriented Dialogue Systems

License: MIT

This repository contains the code for the EMNLP 2019 paper "Zero-shot Cross-lingual Dialogue Systems with Transferable Latent Variables".

The code is written in PyTorch. If you use any source code or ideas from this repository in your work, please cite the following paper.

@inproceedings{liu2019zero,
  title={Zero-shot Cross-lingual Dialogue Systems with Transferable Latent Variables},
  author={Liu, Zihan and Shin, Jamin and Xu, Yan and Winata, Genta Indra and Xu, Peng and Madotto, Andrea and Fung, Pascale},
  booktitle={Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)},
  pages={1297--1303},
  year={2019}
}

Abstract

Despite the surging demands for multilingual task-oriented dialog systems (e.g., Alexa, Google Home), there has been less research done in multilingual or cross-lingual scenarios. Hence, we propose a zero-shot adaptation of task-oriented dialogue system to low-resource languages. To tackle this challenge, we first use a set of very few parallel word pairs to refine the aligned cross-lingual word-level representations. We then employ a latent variable model to cope with the variance of similar sentences across different languages, which is induced by imperfect cross-lingual alignments and inherent differences in languages. Finally, the experimental results show that even though we utilize much less external resources, our model achieves better adaptation performance for natural language understanding task (i.e., the intent detection and slot filling) compared to the current state-of-the-art model in the zero-shot scenario.

Data

We evaluate our system on the multilingual task-oriented dialogue dataset published by Schuster et al. (2019), which contains dialogue natural language understanding data in English, Spanish and Thai. The dataset is included in the data folder of this repository.

Model Architecture

Setup

  • Install PyTorch (tested with PyTorch 0.4.0 and Python 3.6)
  • Install the library dependencies
  • Download the cross-lingual word embeddings for English, Spanish and Thai from fastText, and put them in the emb folder.

Note: Refined cross-lingual word embeddings are already included in the refine_emb folder, where "refine.en.align.en-es.vec" and "refine.es.align.vec" are refined pairs for adapting to Spanish, and "refine.en.align.en-th.vec" and "refine.th.align.vec" are for adapting to Thai.
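As a rough illustration of what these embedding files contain, the sketch below parses the plain-text .vec layout used by fastText (a "count dim" header line followed by one word and its vector per line). It assumes the refined files in refine_emb follow the same layout, which the .vec extension suggests but this README does not state. The function name load_vec and the inline sample data are hypothetical and are not part of the repository.

```python
import io


def load_vec(handle, max_words=None):
    """Parse fastText-style .vec text into (dim, {word: vector}).

    Assumed layout: first line is "<count> <dim>", then one line per word:
    the word followed by <dim> space-separated floats.
    """
    header = handle.readline().split()
    dim = int(header[1])  # header[0] is the declared word count (unused here)
    vectors = {}
    for line in handle:
        parts = line.rstrip().split(" ")
        if len(parts) != dim + 1:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
        if max_words is not None and len(vectors) >= max_words:
            break
    return dim, vectors


# Tiny in-memory stand-in for a real embedding file (made-up values).
sample = io.StringIO("2 3\nhello 0.1 0.2 0.3\nhola 0.1 0.2 0.4\n")
dim, vectors = load_vec(sample)
```

With a real file, the handle would come from something like open("./refine_emb/refine.es.align.vec", encoding="utf-8"). Passing max_words keeps memory use small when only a quick inspection is needed.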

Cross-lingual NLU

  • --embnoise: inject Gaussian noise into English embeddings
  • --lvm: use latent variable model
  • --crf: use conditional random field
  • --emb_file_en: path of English embeddings (defaults to the original cross-lingual embeddings from fastText; pass a file from the refine_emb folder to use the refined cross-lingual embeddings)
  • --emb_file_es: path of Spanish embeddings
  • --emb_file_th: path of Thai embeddings
  • --clean_txt: conduct delexicalization
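To make the --embnoise option more concrete, here is a minimal sketch of adding Gaussian noise to an embedding vector. It is not the repository's implementation, which is written in PyTorch. The function name add_gaussian_noise, the standard deviation of 0.1, and the sample vector are all assumptions chosen for illustration.

```python
import random


def add_gaussian_noise(vector, std=0.1, rng=random):
    """Return a copy of `vector` with independent N(0, std^2) noise per component."""
    return [x + rng.gauss(0.0, std) for x in vector]


# A seeded generator makes the example repeatable.
rng = random.Random(0)
clean = [0.5, -0.2, 0.1]  # made-up embedding values
noisy = add_gaussian_noise(clean, std=0.1, rng=rng)
```

The intuition, following the abstract, is that target-language embeddings are only imperfectly aligned with the English ones. Perturbing the English embeddings during training may make the model less sensitive to small shifts of that kind.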

Training

Train English system for Spanish Adaptation

python main.py --exp_name lvm_refine_noise_clean_enes --exp_id 1 --bidirection --freeze_emb --lvm --lvm_dim 100 --batch_size 32 --emb_file_en ./refine_emb/refine.en.align.en-es.vec --embnoise --clean_txt --early_stop 1

Train English system for Thai Adaptation

python main.py --exp_name lvm_refine_noise_clean_enth --exp_id 1 --bidirection --freeze_emb --lvm --lvm_dim 100 --batch_size 32 --emb_file_en ./refine_emb/refine.en.align.en-th.vec --embnoise --clean_txt --early_stop 1

Zero-shot Adaptation

Zero-shot transfer to Spanish

python main.py --exp_name lvm_refine_noise_clean_enes --exp_id 1 --transfer --trans_lang es --bidirection --lvm --emb_file_es ./refine_emb/refine.es.align.vec --clean_txt

Zero-shot transfer to Thai

python main.py --exp_name lvm_refine_noise_clean_enth --exp_id 1 --transfer --trans_lang th --bidirection --lvm --emb_file_th ./refine_emb/refine.th.align.vec --clean_txt
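The training and transfer commands above differ only in the target language and the embedding paths, so they can be generated rather than copied by hand. The sketch below is a hypothetical helper, not part of the repository. It uses only the flags and paths shown in this README; the function and dictionary names are assumptions.

```python
import shlex

# Refined embedding paths as listed in this README.
REFINED_EN = {
    "es": "./refine_emb/refine.en.align.en-es.vec",
    "th": "./refine_emb/refine.en.align.en-th.vec",
}
REFINED_TARGET = {
    "es": "./refine_emb/refine.es.align.vec",
    "th": "./refine_emb/refine.th.align.vec",
}


def train_command(lang, exp_id="1"):
    """Build the English training command for adaptation to `lang` ("es" or "th")."""
    return [
        "python", "main.py",
        "--exp_name", f"lvm_refine_noise_clean_en{lang}",
        "--exp_id", exp_id,
        "--bidirection", "--freeze_emb",
        "--lvm", "--lvm_dim", "100",
        "--batch_size", "32",
        "--emb_file_en", REFINED_EN[lang],
        "--embnoise", "--clean_txt",
        "--early_stop", "1",
    ]


def transfer_command(lang, exp_id="1"):
    """Build the zero-shot transfer command for `lang` ("es" or "th")."""
    return [
        "python", "main.py",
        "--exp_name", f"lvm_refine_noise_clean_en{lang}",
        "--exp_id", exp_id,
        "--transfer", "--trans_lang", lang,
        "--bidirection", "--lvm",
        f"--emb_file_{lang}", REFINED_TARGET[lang],
        "--clean_txt",
    ]


# Print both commands for Spanish as copy-pasteable shell lines.
for command in (train_command("es"), transfer_command("es")):
    print(shlex.join(command))
```

Each list can be passed directly to subprocess.run from the repository root. Calling transfer_command with exp_id="best" yields the commands given in the Reproducibility section.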

Reproducibility

We provide the pretrained checkpoints of our model in the experiment folder to help you reproduce the results in Spanish and Thai by running the following commands:

Zero-shot transfer to Spanish

python main.py --exp_name lvm_refine_noise_clean_enes --exp_id best --transfer --trans_lang es --bidirection --lvm --emb_file_es ./refine_emb/refine.es.align.vec --clean_txt

Zero-shot transfer to Thai

python main.py --exp_name lvm_refine_noise_clean_enth --exp_id best --transfer --trans_lang th --bidirection --lvm --emb_file_th ./refine_emb/refine.th.align.vec --clean_txt