A2Zadeh / Social-IQ

Licence: other
[CVPR 2019 Oral] Social-IQ: A Question Answering Benchmark for Artificial Social Intelligence

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to Social-IQ

circDeep
End-to-End learning framework for circular RNA classification from other long non-coding RNA using multimodal deep learning
Stars: ✭ 21 (-43.24%)
Mutual labels:  multimodal-deep-learning
MultiGraphGAN
MultiGraphGAN for predicting multiple target graphs from a source graph using geometric deep learning.
Stars: ✭ 16 (-56.76%)
Mutual labels:  multimodal-deep-learning
MSAF
Official implementation of the paper "MSAF: Multimodal Split Attention Fusion"
Stars: ✭ 47 (+27.03%)
Mutual labels:  multimodal-deep-learning
BBFN
This repository contains the implementation of the paper -- Bi-Bimodal Modality Fusion for Correlation-Controlled Multimodal Sentiment Analysis
Stars: ✭ 42 (+13.51%)
Mutual labels:  multimodal-deep-learning
slp
Utils and modules for Speech Language and Multimodal processing using pytorch and pytorch lightning
Stars: ✭ 17 (-54.05%)
Mutual labels:  multimodal-deep-learning
scarches
Reference mapping for single-cell genomics
Stars: ✭ 175 (+372.97%)
Mutual labels:  multimodal-deep-learning
muscaps
Source code for "MusCaps: Generating Captions for Music Audio" (IJCNN 2021)
Stars: ✭ 39 (+5.41%)
Mutual labels:  multimodal-deep-learning
Robust-Deep-Learning-Pipeline
Deep Convolutional Bidirectional LSTM for Complex Activity Recognition with Missing Data. Human Activity Recognition Challenge. Springer SIST (2020)
Stars: ✭ 20 (-45.95%)
Mutual labels:  multimodal-deep-learning
Multimodal-Future-Prediction
The official repository for the CVPR 2019 paper "Overcoming Limitations of Mixture Density Networks: A Sampling and Fitting Framework for Multimodal Future Prediction"
Stars: ✭ 38 (+2.7%)
Mutual labels:  multimodal-deep-learning
MISE
Multimodal Image Synthesis and Editing: A Survey
Stars: ✭ 214 (+478.38%)
Mutual labels:  multimodal-deep-learning
Self-Supervised-Embedding-Fusion-Transformer
The code for our IEEE ACCESS (2020) paper Multimodal Emotion Recognition with Transformer-Based Self Supervised Feature Fusion.
Stars: ✭ 57 (+54.05%)
Mutual labels:  multimodal-deep-learning
hateful memes-hate detectron
Detecting Hate Speech in Memes Using Multimodal Deep Learning Approaches: Prize-winning solution to Hateful Memes Challenge. https://arxiv.org/abs/2012.12975
Stars: ✭ 35 (-5.41%)
Mutual labels:  multimodal-deep-learning
referit3d
Code accompanying our ECCV-2020 paper on 3D Neural Listeners.
Stars: ✭ 59 (+59.46%)
Mutual labels:  multimodal-deep-learning
iMIX
A framework for Multimodal Intelligence research from Inspur HSSLAB.
Stars: ✭ 21 (-43.24%)
Mutual labels:  multimodal-deep-learning
multimodal-deep-learning-for-disaster-response
Damage Identification in Social Media Posts using Multimodal Deep Learning: code and dataset
Stars: ✭ 43 (+16.22%)
Mutual labels:  multimodal-deep-learning
attentive-modality-hopping-for-SER
TensorFlow implementation of "Attentive Modality Hopping for Speech Emotion Recognition," ICASSP-20
Stars: ✭ 25 (-32.43%)
Mutual labels:  multimodal-deep-learning
mmd
This repository contains the Pytorch implementation for our SCAI (EMNLP-2018) submission "A Knowledge-Grounded Multimodal Search-Based Conversational Agent"
Stars: ✭ 28 (-24.32%)
Mutual labels:  multimodal-deep-learning

Social-IQ Dataset

Download the paper here.

Human language offers a unique, unconstrained way to probe social situations through questions and to reason about them through answers. This unconstrained approach extends previous attempts to model social intelligence through numeric supervision (e.g., sentiment and emotion labels). Social-IQ is an unconstrained benchmark designed to train and evaluate socially intelligent technologies. By providing a rich source of open-ended questions and answers, Social-IQ opens the door to explainable social intelligence. The dataset contains rigorously annotated and validated videos, questions, and answers, as well as annotations for the complexity level of each question and answer. Social-IQ contains 1,250 natural in-the-wild social situations, 7,500 questions, and 52,500 correct and incorrect answers. Although humans can reason about social situations with very high accuracy (95.08%), existing state-of-the-art computational models struggle on this task.

Social-IQ Statistics

Figure 2: Overview of Social-IQ dataset statistics.

Question Statistics: The Social-IQ dataset contains a total of 7,500 questions (6 per video). Figure 2 (a) shows the distribution of question length in number of words; the average question in Social-IQ is 10.87 words long. Figure 2 (c) shows the question types in the Social-IQ dataset. Questions starting with why and how, which often require causal reasoning, form the largest group of questions in Social-IQ. This is a unique feature of Social-IQ and distinguishes it from other multimodal QA datasets, where what (object) and who questions are most common. Figure 2 (e) shows the distribution of complexity across Social-IQ questions. The majority of the dataset consists of advanced and intermediate questions (in roughly equal shares), while easy questions make up only a small portion. The distribution of question types and complexity levels demonstrates the challenging nature of the dataset.
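
For illustration only (this sketch is not part of the released code, and the example questions are hypothetical), a breakdown like the one in Figure 2 (c) could be approximated by grouping questions by their leading interrogative word, assuming the questions are available as plain Python strings:

from collections import Counter

QUESTION_WORDS = ("why", "how", "what", "who", "where", "when")

def question_type_distribution(questions):
    # Fraction of questions starting with each interrogative word; anything
    # else is grouped under "other". Purely illustrative of Figure 2 (c).
    counts = Counter()
    for q in questions:
        words = q.strip().lower().split()
        first = words[0] if words else ""
        counts[first if first in QUESTION_WORDS else "other"] += 1
    total = sum(counts.values()) or 1
    return {qtype: n / total for qtype, n in counts.items()}

# Hypothetical example questions (not drawn from the dataset):
print(question_type_distribution([
    "Why does the speaker pause before responding?",
    "How do the listeners react to the joke?",
    "What is on the table?",
]))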

Answer Statistics: Social-IQ contains a total of 30,000 correct answers (4 per question) and 22,500 incorrect answers (3 per question). Figure 2 (b) shows the distribution of answer length in words; correct (green) and incorrect (red) answers follow similar distributions. On average, an answer in Social-IQ contains 10.46 words. This is another unique characteristic of the dataset, since the average answer is longer than in other multimodal QA datasets (whose average answer lengths range from 1.24 to 5.3 words); this length reflects the level of detail in Social-IQ answers. The presence of multiple correct answers allows for modeling diversity and subjectivity across annotators when several explanations are correct for a given question. Furthermore, having multiple correct answers enables answer generation tasks, which often require multiple correct answers for successful evaluation.

Multimedia Statistics: The Social-IQ dataset consists of 1,250 videos from YouTube. Figure 2 (f) gives an overview of the video categories in Social-IQ. In total there are 1,239 minutes of annotated video content (drawn from 10,529 minutes of full videos). Figure 2 (d) shows the distribution of the number of characters per video. All videos in the Social-IQ dataset come with manual transcriptions and detailed timestamps.

Acquiring the data

The data will be released as part of our CMU Multimodal SDK (https://github.com/A2Zadeh/CMU-MultimodalSDK). To download all of the processed data, simply use:

>>> from mmsdk import mmdatasdk
>>> # download the processed (high-level) Social-IQ features into the socialiq/ folder
>>> socialiq_highlevel = mmdatasdk.mmdataset(mmdatasdk.socialiq.highlevel, 'socialiq/')
>>> # standard train/validation folds
>>> folds = mmdatasdk.socialiq.standard_train_fold, mmdatasdk.socialiq.standard_valid_fold
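
As a quick sanity check, the loaded object can be inspected. This is a sketch assuming the standard CMU-MultimodalSDK mmdataset interface, in which downloaded features live in a computational_sequences dictionary and each fold is a plain list of video IDs:

>>> # assumes mmdataset exposes a computational_sequences dict (standard in the CMU Multimodal SDK)
>>> list(socialiq_highlevel.computational_sequences.keys())
>>> # folds is the (train, validation) pair defined above
>>> train_ids, valid_ids = folds
>>> len(train_ids), len(valid_ids)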

Social-IQ 1.0 has no public test data, since the test set will be used for challenges and workshops. However, we are releasing a public test set here. This public test set is different from the challenge test set, on which the original paper reports results. To submit to the challenge test set, please contact us by email.

You can also download the raw data here.

Running the Tensor-MFN code

Please find the code in the code folder. First, run dl_and_align.py to download the dataset; then run tmfn_bert.py.
