Rasa NLU for Chinese, a fork from RasaHQ/rasa_nlu.

Please refer to the latest instructions in the official Rasa NLU documentation.

Chinese blog (中文Blog)

Files you should have:

data/total_word_feature_extractor_zh.dat

Trained on a Chinese corpus with the MITIE wordrep tool (training takes 2-3 days).

For training, please build the MITIE Wordrep Tool. Note that the Chinese corpus must be tokenized before it is fed into the tool. A closed-domain corpus that closely matches your use case works best.
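The tokenization step can be sketched as follows. This is a minimal illustration, not the author's preprocessing script: the names `jieba_cut` and `tokenize_corpus` are hypothetical, and it assumes the wordrep tool accepts plain text whose tokens are separated by spaces.

```python
from typing import Callable, Iterable, Iterator, List


def jieba_cut(text: str) -> List[str]:
    """Segment text with Jieba (imported lazily; requires `pip install jieba`)."""
    import jieba
    return [tok for tok in jieba.cut(text) if tok.strip()]


def tokenize_corpus(lines: Iterable[str],
                    cut: Callable[[str], List[str]]) -> Iterator[str]:
    """Yield each non-empty line as space-separated tokens."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        yield " ".join(cut(line))
```

Pass `jieba_cut` as `cut`, iterate over the lines of the raw corpus file, and write each yielded line to the file you hand to the wordrep tool. Any other segmenter with the same signature can be swapped in.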

A model trained on the Chinese Wikipedia dump and Baidu Baike can be downloaded from the Chinese blog (中文Blog).

data/examples/rasa/demo-rasa_zh.json

Add as many examples as possible.
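As a rough sketch of how such examples could be generated, the snippet below builds entries with character offsets for each entity. It assumes the legacy rasa_nlu JSON layout (`rasa_nlu_data` containing `common_examples`); check the demo file for the exact structure. The helper `make_example` is hypothetical and is not part of Rasa NLU.

```python
import json
from typing import Dict, List


def make_example(text: str, intent: str, entities: Dict[str, str]) -> dict:
    """Build one training example; `entities` maps entity value -> entity name."""
    spans: List[dict] = []
    for value, name in entities.items():
        start = text.find(value)  # first occurrence only
        if start < 0:
            raise ValueError(f"{value!r} not found in {text!r}")
        spans.append({"start": start, "end": start + len(value),
                      "value": value, "entity": name})
    return {"text": text, "intent": intent, "entities": spans}


examples = [
    make_example("我发烧了该吃什么药？", "medical", {"发烧": "disease"}),
    make_example("你好", "greet", {}),
]

# Assumed top-level layout of the training file.
training_data = {"rasa_nlu_data": {"common_examples": examples}}
serialized = json.dumps(training_data, ensure_ascii=False, indent=2)
```

Offsets are counted in characters, with `end` exclusive, which matches the `start`/`end` values shown in the server response later in this README.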

Usage:

Clone this project, and run

python setup.py install

Modify configuration.

Currently for Chinese we have two pipelines:

Use MITIE+Jieba (sample_configs/config_jieba_mitie.yml):

language: "zh"

pipeline:
- name: "nlp_mitie"
  model: "data/total_word_feature_extractor_zh.dat"
- name: "tokenizer_jieba"
- name: "ner_mitie"
- name: "ner_synonyms"
- name: "intent_entity_featurizer_regex"
- name: "intent_classifier_mitie"

RECOMMENDED: Use MITIE+Jieba+sklearn (sample_configs/config_jieba_mitie_sklearn.yml):

language: "zh"

pipeline:
- name: "nlp_mitie"
  model: "data/total_word_feature_extractor_zh.dat"
- name: "tokenizer_jieba"
- name: "ner_mitie"
- name: "ner_synonyms"
- name: "intent_entity_featurizer_regex"
- name: "intent_featurizer_mitie"
- name: "intent_classifier_sklearn"

(Optional) Use a Jieba user-defined dictionary or switch the Jieba default dictionary:

You can set "user_dicts" to either a file path or a directory path. (sample_configs/config_jieba_mitie_sklearn_plus_dict_path.yml)

language: "zh"

pipeline:
- name: "nlp_mitie"
  model: "data/total_word_feature_extractor_zh.dat"
- name: "tokenizer_jieba"
  default_dict: "./default_dict.big"
  user_dicts: "./jieba_userdict"
#  user_dicts: "./jieba_userdict/jieba_userdict.txt"
- name: "ner_mitie"
- name: "ner_synonyms"
- name: "intent_entity_featurizer_regex"
- name: "intent_featurizer_mitie"
- name: "intent_classifier_sklearn"
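A user dictionary file for Jieba lists one entry per line: the word, then an optional frequency and an optional part-of-speech tag, separated by spaces. The sketch below writes such a file; the helper names and the sample entries are made up for illustration.

```python
from pathlib import Path
from typing import Iterable, Optional, Tuple

Entry = Tuple[str, Optional[int], Optional[str]]


def format_entry(word: str, freq: Optional[int] = None,
                 tag: Optional[str] = None) -> str:
    """Format one dictionary line: word, optional frequency, optional tag."""
    parts = [word]
    if freq is not None:
        parts.append(str(freq))
    if tag is not None:
        parts.append(tag)
    return " ".join(parts)


def write_user_dict(path: Path, entries: Iterable[Entry]) -> None:
    """Write entries to a UTF-8 dictionary file, creating parent folders."""
    path.parent.mkdir(parents=True, exist_ok=True)
    lines = [format_entry(*entry) for entry in entries]
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
```

For example, `write_user_dict(Path("jieba_userdict/jieba_userdict.txt"), [("感冒药", 10, "n"), ("发烧", None, None)])` would produce a file at the path used in the commented-out "user_dicts" line of the config.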

Train the model by running the command below.

If you specify a project name in the configuration file, the model is saved at /models/your_project_name.

Otherwise, the model is saved at /models/default.

python -m rasa_nlu.train -c sample_configs/config_jieba_mitie_sklearn.yml --data data/examples/rasa/demo-rasa_zh.json --path models

Run the rasa_nlu server:

python -m rasa_nlu.server -c sample_configs/config_jieba_mitie_sklearn.yml --path models

Open a new terminal and query the server with curl, for example:

$ curl -XPOST localhost:5000/parse -d '{"q":"我发烧了该吃什么药？", "project": "rasa_nlu_test", "model": "model_20170921-170911"}' | python -mjson.tool
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   652    0   552  100   100    157     28  0:00:03  0:00:03 --:--:--   157
{
    "entities": [
        {
            "end": 3,
            "entity": "disease",
            "extractor": "ner_mitie",
            "start": 1,
            "value": "发烧"
        }
    ],
    "intent": {
        "confidence": 0.5397186422631861,
        "name": "medical"
    },
    "intent_ranking": [
        {
            "confidence": 0.5397186422631861,
            "name": "medical"
        },
        {
            "confidence": 0.16206323981749196,
            "name": "restaurant_search"
        },
        {
            "confidence": 0.1212448457737397,
            "name": "affirm"
        },
        {
            "confidence": 0.10333600028547868,
            "name": "goodbye"
        },
        {
            "confidence": 0.07363727186010374,
            "name": "greet"
        }
    ],
    "text": "我发烧了该吃什么药？"
}
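
The same request can be issued from Python. The following is a minimal sketch using only the standard library: it assumes the server listens on localhost:5000 as in the curl call, and the helper names and the 0.5 confidence threshold are hypothetical choices, not part of Rasa NLU.

```python
import json
from typing import Dict, Optional
from urllib import request


def build_parse_request(text: str, project: Optional[str] = None,
                        model: Optional[str] = None,
                        url: str = "http://localhost:5000/parse") -> request.Request:
    """Build the POST request that the curl example sends."""
    payload = {"q": text}
    if project:
        payload["project"] = project
    if model:
        payload["model"] = model
    return request.Request(url, data=json.dumps(payload).encode("utf-8"),
                           method="POST")


def top_intent(response: dict, threshold: float = 0.5) -> Optional[str]:
    """Return the intent name, or None if its confidence is below threshold."""
    intent = response.get("intent") or {}
    if intent.get("confidence", 0.0) >= threshold:
        return intent.get("name")
    return None


def entity_values(response: dict) -> Dict[str, str]:
    """Map entity name -> the substring of the text at [start, end)."""
    text = response["text"]
    return {e["entity"]: text[e["start"]:e["end"]]
            for e in response.get("entities", [])}
```

With the server running, `with request.urlopen(build_parse_request("我发烧了该吃什么药？")) as resp: result = json.load(resp)` fetches the parsed result, which can then be passed to `top_intent` and `entity_values`. A threshold is useful because, as the ranking above shows, the top intent may win with only moderate confidence.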
