
kushalkafle / DVQA_dataset

Licence: other
DVQA Dataset: A Bar chart question answering dataset presented at CVPR 2018

Projects that are alternatives of or similar to DVQA dataset

MICCAI21 MMQ
Multiple Meta-model Quantifying for Medical Visual Question Answering
Stars: ✭ 16 (-20%)
Mutual labels:  vqa, question-answering
Vqa Tensorflow
Tensorflow Implementation of Deeper LSTM+ normalized CNN for Visual Question Answering
Stars: ✭ 98 (+390%)
Mutual labels:  vqa, question-answering
Mac Network
Implementation for the paper "Compositional Attention Networks for Machine Reasoning" (Hudson and Manning, ICLR 2018)
Stars: ✭ 444 (+2120%)
Mutual labels:  vqa, question-answering
VideoNavQA
An alternative EQA paradigm and informative benchmark + models (BMVC 2019, ViGIL 2019 spotlight)
Stars: ✭ 22 (+10%)
Mutual labels:  vqa, question-answering
Mullowbivqa
Hadamard Product for Low-rank Bilinear Pooling
Stars: ✭ 57 (+185%)
Mutual labels:  vqa, question-answering
hcrn-videoqa
Implementation for the paper "Hierarchical Conditional Relation Networks for Video Question Answering" (Le et al., CVPR 2020, Oral)
Stars: ✭ 111 (+455%)
Mutual labels:  vqa, question-answering
head-qa
HEAD-QA: A Healthcare Dataset for Complex Reasoning
Stars: ✭ 20 (+0%)
Mutual labels:  question-answering
MSMARCO
Machine Comprehension Train on MSMARCO with S-NET Extraction Modification
Stars: ✭ 31 (+55%)
Mutual labels:  question-answering
unanswerable qa
The official implementation for ACL 2021 "Challenges in Information Seeking QA: Unanswerable Questions and Paragraph Retrieval".
Stars: ✭ 21 (+5%)
Mutual labels:  question-answering
just-ask
[TPAMI Special Issue on ICCV 2021 Best Papers, Oral] Just Ask: Learning to Answer Questions from Millions of Narrated Videos
Stars: ✭ 57 (+185%)
Mutual labels:  vqa
WikiQA
Very Simple Question Answer System using Chinese Wikipedia Data
Stars: ✭ 24 (+20%)
Mutual labels:  question-answering
query-focused-sum
Official code repository for "Exploring Neural Models for Query-Focused Summarization".
Stars: ✭ 17 (-15%)
Mutual labels:  question-answering
Stargraph
StarGraph (aka *graph) is a graph database to query large Knowledge Graphs. Playing with Knowledge Graphs can be useful if you are developing AI applications or doing data analysis over complex domains.
Stars: ✭ 24 (+20%)
Mutual labels:  question-answering
Medi-CoQA
Conversational Question Answering on Clinical Text
Stars: ✭ 22 (+10%)
Mutual labels:  question-answering
DocQN
Author implementation of "Learning to Search in Long Documents Using Document Structure" (Mor Geva and Jonathan Berant, 2018)
Stars: ✭ 21 (+5%)
Mutual labels:  question-answering
VoxelMorph-PyTorch
An unofficial PyTorch implementation of VoxelMorph- An unsupervised 3D deformable image registration method
Stars: ✭ 68 (+240%)
Mutual labels:  cvpr2018
XORQA
This is the official repository for NAACL 2021, "XOR QA: Cross-lingual Open-Retrieval Question Answering".
Stars: ✭ 61 (+205%)
Mutual labels:  question-answering
BarChart
SwiftUI Bar Chart
Stars: ✭ 156 (+680%)
Mutual labels:  bar-chart
DisguiseNet
Code for DisguiseNet : A Contrastive Approach for Disguised Face Verification in the Wild
Stars: ✭ 20 (+0%)
Mutual labels:  cvpr2018
Giveme5W
Extraction of the five journalistic W-questions (5W) from news articles
Stars: ✭ 16 (-20%)
Mutual labels:  question-answering

DVQA

This repository provides the images, metadata and question-answer pairs described in the paper:

DVQA: Understanding Data Visualizations via Question Answering
Kushal Kafle, Brian Price, Scott Cohen, Christopher Kanan

Presented at CVPR 2018

Please cite the following if you use the DVQA dataset in your work:

@inproceedings{kafle2018dvqa,
  title={DVQA: Understanding Data Visualizations via Question Answering},
  author={Kafle, Kushal and Cohen, Scott and Price, Brian and Kanan, Christopher},
  booktitle={CVPR},
  year={2018}
}

A live demo of our SANDY algorithm, as described in the paper above, can be found at this URL.

Download Links

Images

Download the images using this URL. The images are all in the same folder and are named

bar_{split}_xxxxxxxx.png
where,
xxxxxxxx = image_id padded (right-justified) to a length of 8 characters
split = train, val_easy, or val_hard

The images expand to about 6.5 GB.
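For concreteness, here is a minimal sketch of how an image filename can be reconstructed from a split name and an integer image_id. It assumes the padding character is 0, which matches the 8-character, right-justified pattern above.

# Hypothetical helper: build a DVQA image filename from a split and an image_id.
# Assumes zero-padding, consistent with the 8-character, right-justified
# pattern described above.
def image_filename(split: str, image_id: int) -> str:
    return f"bar_{split}_{image_id:08d}.png"

print(image_filename("train", 42))       # -> bar_train_00000042.png
print(image_filename("val_easy", 1234))  # -> bar_val_easy_00001234.png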

Question Answer Pairs

The question-answer pairs can be downloaded from this URL. The download consists of three files, one for each split of the dataset, named {split}_qa.json. Each question-answer pair contains the following fields:

image: The image filename that the given question-answer pair applies to
question: The question
answer: The answer to the question. Note that numerals (1, 2, 3, ...) are used when the answer denotes a value and words (one, two, three, ...) are used when the answer denotes a count
question_type: Denotes whether the question is of the structure, data, or reasoning type
bbox_answer: If the answer is a text element in the bar chart, its bounding box in the form [x, y, w, h]; otherwise []
question_id: Unique question_id associated with the question

The question-answer pairs expand to about 750 MB.
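As a quick sanity check, the sketch below loads one split and inspects a single record. It assumes the JSON file holds a list of question-answer records with the fields listed above; the path is whatever location you extracted the download to.

import json

# Load one QA split (the path is an assumption; point it at your extracted files).
with open("train_qa.json") as f:
    qa_pairs = json.load(f)  # assumed to be a list of QA records

sample = qa_pairs[0]
print(sample["image"])          # image filename the pair applies to
print(sample["question"])
print(sample["answer"])
print(sample["question_type"])  # structure, data, or reasoning
print(sample["bbox_answer"])    # [x, y, w, h] for text answers, else []
print(sample["question_id"])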

Bar-chart metadata

In addition to question-answer pairs, we also provide detailed annotations for every object in the bar chart, which can serve either as a source of additional supervision (as in our SANDY and MOM models) or as a basis for additional analysis of your algorithm's performance.

Metadata for the bar charts can be downloaded using this URL. The download consists of three files, one for each split of the dataset, named {split}_metadata.json. Each entry contains the following fields:

image: The image filename that the given metadata applies to
bars:
	bboxes: Bounding boxes for the bars (number_of_bars x number_of_legends x 4)
	names: Names for each bar (number_of_bars x number_of_legends)
	colors: Color of each bar (number_of_bars x number_of_legends)

texts:
	text: The string of the text-block in the bar chart
	text_function: The function of the text (e.g., title, legend, etc.)
	bbox: The bounding box surrounding the text-block

table: The underlying table used to create the chart, saved in the following format:

	single row charts:
		C_1 	C_2 	C_3	...	C_N
		-------------------------------------
		V_1	V_2	V_3	... 	V_N
		
	multi row charts:
		
		None |	C_1 	C_2 	C_3	...	C_N
		-----|---------------------------------------
		R_1  |	V_11	V_21	V_31	... 	V_N1
		R_2  |	V_12	V_22	V_32	... 	V_N2
		...  |	...	...	... 	... 	...
		R_M  |	V_1M	V_2M	V_3M	... 	V_NM
	

Since numpy arrays are not supported by JSON, the tables are saved as nested lists. Converting them to a numpy array, e.g., table = np.array(metadata['table']), makes the elements easier to access: for multi-row charts, table[1:,1:] contains the numeric data, table[1:,0] contains the row names, and table[0,1:] contains the column names.
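A minimal sketch of that conversion follows. It assumes the metadata file holds a list of per-image records and that the bars/texts fields are nested as the descriptions above suggest; the path and the exact JSON layout are assumptions, so adjust as needed.

import json
import numpy as np

# Load one metadata split (the path is an assumption; adjust to your setup).
with open("train_metadata.json") as f:
    metadata = json.load(f)  # assumed to be a list of per-image records

record = metadata[0]

# Bars and text-blocks, with the layout assumed from the field descriptions above.
bar_bboxes = record["bars"]["bboxes"]  # number_of_bars x number_of_legends x 4
text_blocks = record["texts"]          # assumed list of {text, text_function, bbox}

# Convert the nested-list table to a numpy array for easier slicing.
table = np.array(record["table"])

# For a multi-row chart:
values = table[1:, 1:]    # numeric data (cast with .astype(float) if stored as strings)
row_names = table[1:, 0]  # row names
col_names = table[0, 1:]  # column names
print(values.shape, list(row_names), list(col_names))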

The annotations expand to about 800 MB.

Contact

Feel free to contact us (contact details are in the paper PDF) with any questions, suggestions, or comments about either the dataset or the methods used in the paper.
