The ATIS (Airline Travel Information System) Dataset

This repository contains the ATIS dataset in Python pickle format and in Rasa NLU JSON format (https://rasa.com/docs/nlu/dataformat/#json-format), together with a loading script and example code.

Data Samples

Raw Format

   0:         flight: BOS i want to fly from boston at 838 am and arrive in denver at 1110 in the morning EOS
                              BOS                                        O
                                i                                        O
                             want                                        O
                               to                                        O
                              fly                                        O
                             from                                        O
                           boston                      B-fromloc.city_name
                               at                                        O
                              838                       B-depart_time.time
                               am                       I-depart_time.time
                              and                                        O
                           arrive                                        O
                               in                                        O
                           denver                        B-toloc.city_name
                               at                                        O
                             1110                       B-arrive_time.time
                               in                                        O
                              the                                        O
                          morning              B-arrive_time.period_of_day
                              EOS                                        O
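
The sample above pairs each token with a BIO slot tag (B- opens a slot, I- continues it, O marks tokens outside any slot). As a rough illustration of how such a tagged sentence can be collapsed into slot values, here is a minimal sketch. The function name bio_to_spans is hypothetical and not part of this repository; the tokens and tags are copied from the sample.

```python
from typing import List, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (slot_name, text) pairs.

    Hypothetical helper for illustration; not taken from the repository.
    """
    spans: List[Tuple[str, str]] = []
    slot, words = None, []
    for token, tag in zip(tokens, tags):
        # An I- tag with the same slot name extends the open span.
        if tag.startswith("I-") and slot == tag[2:]:
            words.append(token)
            continue
        # Any other tag closes the span in progress, if there is one.
        if slot is not None:
            spans.append((slot, " ".join(words)))
            slot, words = None, []
        # B- opens a new span; a stray I- is treated as an opening tag too.
        if tag.startswith(("B-", "I-")):
            slot, words = tag[2:], [token]
    if slot is not None:
        spans.append((slot, " ".join(words)))
    return spans


tokens = ("BOS i want to fly from boston at 838 am and arrive in denver "
          "at 1110 in the morning EOS").split()
tags = [
    "O", "O", "O", "O", "O", "O", "B-fromloc.city_name", "O",
    "B-depart_time.time", "I-depart_time.time", "O", "O", "O",
    "B-toloc.city_name", "O", "B-arrive_time.time", "O", "O",
    "B-arrive_time.period_of_day", "O",
]

for slot_name, text in bio_to_spans(tokens, tags):
    print(f"{slot_name}: {text}")
```

Treating a stray I- tag as the start of a span is a design choice of this sketch; a stricter reader could raise an error instead.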

Rasa NLU JSON Format

{
    "rasa_nlu_data": {
        "common_examples": [
            {
                "text": "i would like to find a flight from charlotte to las vegas that makes a stop in st. louis",
                "intent": "flight",
                "entities": [
                    {
                        "start": 35,
                        "end": 44,
                        "value": "charlotte",
                        "entity": "fromloc.city_name"
                    },
                    {
                        "start": 48,
                        "end": 57,
                        "value": "las vegas",
                        "entity": "toloc.city_name"
                    },
                    {
                        "start": 79,
                        "end": 88,
                        "value": "st. louis",
                        "entity": "stoploc.city_name"
                    }
                ]
            },
            ...
        ]
    }
}
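
In the JSON sample, start and end are character offsets into text, with end exclusive. The following minimal sketch counts intents and checks that each entity's offsets slice out its value. It embeds the sample from above (without the elided entries) so that it runs on its own; the function name check_examples is hypothetical. With the real files, the same function could presumably be applied to the result of json.load on train.json or test.json.

```python
import json
from collections import Counter
from typing import List, Tuple

SAMPLE = """
{
    "rasa_nlu_data": {
        "common_examples": [
            {
                "text": "i would like to find a flight from charlotte to las vegas that makes a stop in st. louis",
                "intent": "flight",
                "entities": [
                    {"start": 35, "end": 44, "value": "charlotte", "entity": "fromloc.city_name"},
                    {"start": 48, "end": 57, "value": "las vegas", "entity": "toloc.city_name"},
                    {"start": 79, "end": 88, "value": "st. louis", "entity": "stoploc.city_name"}
                ]
            }
        ]
    }
}
"""


def check_examples(data: dict) -> Tuple[Counter, List[Tuple[str, str, str]]]:
    """Count intents and list entities whose offsets do not match their value.

    Hypothetical helper for illustration; not taken from the repository.
    """
    intents: Counter = Counter()
    mismatches: List[Tuple[str, str, str]] = []
    for example in data["rasa_nlu_data"]["common_examples"]:
        intents[example["intent"]] += 1
        for entity in example.get("entities", []):
            # "end" is exclusive, so a plain slice recovers the surface text.
            surface = example["text"][entity["start"]:entity["end"]]
            if surface != entity["value"]:
                mismatches.append((entity["entity"], entity["value"], surface))
    return intents, mismatches


intents, mismatches = check_examples(json.loads(SAMPLE))
print(dict(intents))
print(mismatches)
```

A mismatch is not necessarily an error in every Rasa NLU dataset, since the format allows a value that differs from the surface text (for example with synonyms); in the sample shown here the two agree.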

Dataset Statistics

Samples                          Vocabulary size   Entities   Intents
4978 (training) + 893 (test)     943               129        26

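Example Code

summary_data.py contains the code for reading the raw data. Users can refer to it when implementing their own loader for the raw files.

Download

Data format        Training set     Test set
Python 3 pickle    atis.train.pkl   atis.test.pkl
Rasa NLU JSON      train.json       test.json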
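
The internal layout of the pickle files is not described in this README, so summary_data.py remains the reference for reading them. As a first look before reading that script, a minimal sketch like the one below loads a pickle and reports the type and length of the top-level object. The helper names load_pickle and describe are hypothetical, and the sketch writes a small stand-in file so that it runs without the dataset; pointing load_pickle at atis.train.pkl is the intended use. Only unpickle files from a source you trust, since loading a pickle can execute arbitrary code.

```python
import os
import pickle
import tempfile
from typing import Any


def load_pickle(path: str) -> Any:
    """Load one pickled object from a file opened in binary mode."""
    with open(path, "rb") as stream:
        return pickle.load(stream)


def describe(obj: Any) -> str:
    """Report the top-level type and length without assuming any layout."""
    size = len(obj) if hasattr(obj, "__len__") else "n/a"
    return f"{type(obj).__name__} (len={size})"


# Stand-in data so the sketch runs on its own; it does NOT mirror the
# real structure of atis.train.pkl.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.pkl")
    with open(demo_path, "wb") as stream:
        pickle.dump({"placeholder": [1, 2, 3]}, stream)
    demo_summary = describe(load_pickle(demo_path))

print(demo_summary)
```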

Credit

Similar Projects
