zqhZY / _rasa_chatbot

A Chinese task-oriented chatbot for the IVR (Interactive Voice Response) domain, implemented with rasa. This is a demo with a toy dataset; more data should be added for better performance.

note:

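The project has been updated to follow a newer rasa version; some new features will be added here once they have been tried out. rasa releases move very quickly and this project lags well behind the latest version, so treat it as a reference only and read the latest rasa documentation as needed.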

rasa_chatbot

A Chinese task-oriented chatbot for the IVR (Interactive Voice Response) domain, implemented with rasa nlu and rasa core. This is a demo with a toy dataset.

install dependency:

python3

install or update to python 3

install rasa_core. This also installs rasa nlu, which now supports Chinese.

pip install rasa_core==0.9.0

install sklearn and MITIE

pip install -U scikit-learn sklearn-crfsuite
pip install git+https://github.com/mit-nlp/MITIE.git
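
After the installs above, a quick way to confirm that the packages are importable is to look up their modules without actually importing them. The sketch below assumes the usual import names for these pip packages (`rasa_core`, `rasa_nlu`, `sklearn`, `sklearn_crfsuite`, `mitie`); the helper name `check_modules` is made up for illustration and is not part of this project.

```python
import importlib.util

# Import names assumed to correspond to the pip packages installed above.
REQUIRED_MODULES = ["rasa_core", "rasa_nlu", "sklearn", "sklearn_crfsuite", "mitie"]


def check_modules(names):
    """Return a dict mapping each top-level module name to True if it is installed."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_modules(REQUIRED_MODULES).items():
        print(f"{name}: {'ok' if found else 'MISSING'}")
```

Any module reported as MISSING points at the install step to repeat.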

dir tree

_rasa_chatbot/
├── bot.py
├── chat_detection
├── data
│   ├── mobile_nlu_data.json # train data json format
│   ├── mobile_raw_data.txt # train data raw
│   ├── mobile_story.md # toy dialogue train data 
│   └── total_word_feature_extractor.dat # pretrained mitie word vector
├── httpserver.py # rasa nlu httpserver
├── __init__.py
├── INSTALL.md
├── ivr_chatbot.yml # rasa nlu config file
├── mobile_domain.yml # rasa core config file
├── projects # pretrained models
│   ├── dialogue
│   └── ivr_nlu
├── README.md
├── tools # tools of data process
└── train.sh # train script of rasa nlu

train nlu model

sh train.sh

The command takes a long time to run. When training finishes, it produces:

projects/
└── ivr_nlu
    └── demo
        ├── entity_extractor.dat
        ├── entity_synonyms.json
        ├── intent_classifier_sklearn.pkl
        ├── metadata.json
        └── training_data.json
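
Because training is slow, it can be worth checking that every file in the listing above was actually written before moving on. This is a minimal sketch, assuming the model lands in `projects/ivr_nlu/demo` as shown; the function name `missing_files` is hypothetical.

```python
from pathlib import Path

# File names copied from the directory listing above.
EXPECTED_NLU_FILES = [
    "entity_extractor.dat",
    "entity_synonyms.json",
    "intent_classifier_sklearn.pkl",
    "metadata.json",
    "training_data.json",
]


def missing_files(model_dir, expected=EXPECTED_NLU_FILES):
    """Return the expected file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_files("projects/ivr_nlu/demo")
    print("missing:", missing if missing else "none")
```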

test rasa nlu

$ python httpserver.py
$ curl -X POST localhost:1235/parse -d '{"q":"我的流量还剩多少"}' | python -m json.tool
{
    'q': '我的流量还剩多少', 
    'intent': 'request_search', 
    'entities': {
        'item': '流量'
    }
}
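
The same request can be sent from Python instead of curl. The sketch below uses only the standard library and mirrors the curl call: a POST to `localhost:1235/parse` with a JSON body containing `q`. It assumes the server replies with valid JSON, as the `json.tool` pipe suggests; the function names `build_parse_request` and `parse` are made up for illustration.

```python
import json
import urllib.request

# Endpoint taken from the curl command above.
PARSE_URL = "http://localhost:1235/parse"


def build_parse_request(text, url=PARSE_URL):
    """Build a POST request whose body is {"q": text}, like the curl call.

    No Content-Type header is set, so urllib falls back to
    application/x-www-form-urlencoded, the same default that curl -d uses.
    """
    body = json.dumps({"q": text}, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(url, data=body, method="POST")


def parse(text, url=PARSE_URL, timeout=10):
    """Send text to a running httpserver.py and return the decoded JSON reply."""
    request = build_parse_request(text, url)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With `httpserver.py` running, `parse("我的流量还剩多少")` (roughly "How much data do I have left?") should return the parsed intent and entities as a Python dict.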

train dialogue

python bot.py train-dialogue

When training finishes, it produces:

projects
├── dialogue
│   ├── domain.json
│   ├── domain.yml
│   ├── policy_0_MemoizationPolicy
│   │   ├── featurizer.json
│   │   └── memorized_turns.json
│   ├── policy_1_KerasPolicy
│   │   ├── featurizer.json
│   │   ├── keras_arch.json
│   │   ├── keras_policy.json
│   │   └── keras_weights.h5
│   ├── policy_metadata.json
│   └── stories.md
└── ivr_nlu

train dialogue in online mode

python bot.py online_train

test

Run command below:

python bot.py run

Example1:

Bot loaded. Type a message and press enter : 
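YOU:你是谁  (Who are you?)
BOT:您好!,我是机器人小热,很高兴为您服务。  (Hello! I am the robot Xiaore, glad to be of service.)
YOU:我想看一下消费情况  (I would like to check my spending.)
BOT:您想查询哪个月的消费?  (Which month's spending would you like to check?)
YOU:十月份的  (October's.)
BOT:好,请稍等  (OK, one moment please.)
BOT:您好,您十月份共消费二十八元。  (Hello, you spent a total of 28 yuan in October.)
YOU:好谢谢  (OK, thanks.)
BOT:您开心我就开心  (If you are happy, I am happy.)
BOT:您还想干什么  (What else would you like to do?)
YOU:没什么了  (Nothing else.)
BOT:Bye, 下次再见  (Bye, see you next time.)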

Example2:

Bot loaded. Type a message and press enter : 
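YOU:给我看看我上月用了多少话费  (Show me how much I spent on calls last month.)
BOT:好,请稍等  (OK, one moment please.)
BOT:您好,您上月共消费二十八元。  (Hello, you spent a total of 28 yuan last month.)
BOT:您还想干什么  (What else would you like to do?)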

train word vector

You can train your own MITIE word-vector model as follows:

$ git clone https://github.com/mit-nlp/MITIE.git
$ cd MITIE/tools/wordrep
$ mkdir build
$ cd build
$ cmake ..
$ cmake --build . --config Release
$ ./wordrep -e /path/to/your/folder_of_cutted_text_files

Here /path/to/your/folder_of_cutted_text_files is a directory containing the word-segmented text files to train on. This process may take one or two days.
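
The README does not spell out the expected file format. The sketch below assumes that "cutted" means each line has already been split into words, with tokens separated by single spaces. It writes such a file from raw lines. The default `char_segment` is only a placeholder that splits on every character; a real Chinese word segmenter should be passed in as `segment`. All names here are hypothetical.

```python
from pathlib import Path


def char_segment(line):
    """Placeholder segmenter: treats every non-space character as a token.

    This is only a stand-in; pass a real word segmenter to write_segmented.
    """
    return [ch for ch in line if not ch.isspace()]


def write_segmented(lines, out_path, segment=char_segment):
    """Write one space-joined, segmented line per non-empty input line.

    Returns the number of lines written.
    """
    out_path = Path(out_path)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    count = 0
    with out_path.open("w", encoding="utf-8") as fh:
        for line in lines:
            tokens = segment(line.strip())
            if tokens:
                fh.write(" ".join(tokens) + "\n")
                count += 1
    return count
```

The directory holding the resulting files is what would be passed to `./wordrep -e`.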
