
U-Alberta / Exemplar

Licence: gpl-3.0
An open relation extraction system

Programming Languages

java

Projects that are alternatives of or similar to Exemplar

Reside
EMNLP 2018: RESIDE: Improving Distantly-Supervised Neural Relation Extraction using Side Information
Stars: ✭ 222 (+382.61%)
Mutual labels:  natural-language-processing, relation-extraction
Languagecrunch
LanguageCrunch NLP server docker image
Stars: ✭ 281 (+510.87%)
Mutual labels:  natural-language-processing, relation-extraction
Pytorch graph Rel
A PyTorch implementation of GraphRel
Stars: ✭ 204 (+343.48%)
Mutual labels:  natural-language-processing, relation-extraction
Deeplearning nlp
A natural language processing library based on deep learning
Stars: ✭ 154 (+234.78%)
Mutual labels:  natural-language-processing, relation-extraction
Fewrel
A Large-Scale Few-Shot Relation Extraction Dataset
Stars: ✭ 526 (+1043.48%)
Mutual labels:  natural-language-processing, relation-extraction
Oie Resources
A curated list of Open Information Extraction (OIE) resources: papers, code, data, etc.
Stars: ✭ 283 (+515.22%)
Mutual labels:  natural-language-processing, relation-extraction
Tacred Relation
PyTorch implementation of the position-aware attention model for relation extraction
Stars: ✭ 271 (+489.13%)
Mutual labels:  natural-language-processing, relation-extraction
Knowledge Graphs
A collection of research on knowledge graphs
Stars: ✭ 845 (+1736.96%)
Mutual labels:  natural-language-processing, relation-extraction
Usc Ds Relationextraction
Distantly Supervised Relation Extraction
Stars: ✭ 378 (+721.74%)
Mutual labels:  natural-language-processing, relation-extraction
Gcn Over Pruned Trees
Graph Convolution over Pruned Dependency Trees Improves Relation Extraction (authors' PyTorch implementation)
Stars: ✭ 312 (+578.26%)
Mutual labels:  natural-language-processing, relation-extraction
Awesome Relation Extraction
📖 A curated list of awesome resources dedicated to Relation Extraction, one of the most important tasks in Natural Language Processing (NLP).
Stars: ✭ 656 (+1326.09%)
Mutual labels:  natural-language-processing, relation-extraction
Rex
REx: Relation Extraction. Modernized re-write of the code in the master's thesis: "Relation Extraction using Distant Supervision, SVMs, and Probabalistic First-Order Logic"
Stars: ✭ 21 (-54.35%)
Mutual labels:  natural-language-processing, relation-extraction
Ml Classify Text Js
Machine learning based text classification in JavaScript using n-grams and cosine similarity
Stars: ✭ 38 (-17.39%)
Mutual labels:  natural-language-processing
Bbw
Semantic annotator: Matching CSV to a Wikibase instance (e.g., Wikidata) via Meta-lookup
Stars: ✭ 42 (-8.7%)
Mutual labels:  relation-extraction
Reading comprehension tf
Machine Reading Comprehension in Tensorflow
Stars: ✭ 37 (-19.57%)
Mutual labels:  natural-language-processing
Gsoc2018 3gm
💫 Automated codification of Greek Legislation with NLP
Stars: ✭ 36 (-21.74%)
Mutual labels:  natural-language-processing
Style Transfer In Text
Paper List for Style Transfer in Text
Stars: ✭ 1,030 (+2139.13%)
Mutual labels:  natural-language-processing
Rebiber
A simple tool to update bib entries with their official information (e.g., DBLP or the ACL anthology).
Stars: ✭ 1,005 (+2084.78%)
Mutual labels:  natural-language-processing
Understanding Financial Reports Using Natural Language Processing
Investigate how mutual funds leverage credit derivatives by studying their routine filings to the SEC using NLP techniques 📈🤑
Stars: ✭ 36 (-21.74%)
Mutual labels:  natural-language-processing
Vale
📝 A syntax-aware linter for prose built with speed and extensibility in mind.
Stars: ✭ 978 (+2026.09%)
Mutual labels:  natural-language-processing

EXEMPLAR

EXEMPLAR is an open relation extraction system originating from a research project at the University of Alberta. Given a text corpus, relation extraction is the task of identifying relations (e.g., acquisition, spouse, employment) among named entities (e.g., people, organizations). While traditional systems are limited to relations predetermined by the user, open relation extraction systems like EXEMPLAR can identify instances of any relation described in the text.

EXEMPLAR takes text files as input and extracts relations with two or more arguments. For instance, consider the following sentence:

NFL approves Falcons' new stadium in Atlanta. 

Given this sentence, EXEMPLAR extracts an instance of the relation "approve new stadium" whose arguments are "NFL", "Falcons" and "Atlanta".

Relation: approve new stadium
    SUBJ: NFL
    POBJ-OF: Falcons
    POBJ-IN: Atlanta

The role of an argument can be one of the following: SUBJ (subject), DOBJ (direct object) and POBJ (prepositional object). We often append the preposition of a POBJ argument to its role (e.g., "POBJ-IN" for preposition "in"). EXEMPLAR uses heuristics to choose a preposition for a POBJ argument whose preposition is implicit. This is the case for "Falcons" in the above example.
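As a minimal sketch of the role labels described above, the following Java class splits a label such as "POBJ-IN" into its role and its preposition. The class `RoleLabel` and its method names are hypothetical and are not part of EXEMPLAR; the sketch also assumes that a label carries at most one preposition, appended after the first hyphen.

```java
import java.util.Locale;

/** Hypothetical helper (not part of EXEMPLAR): splits a role label such as "POBJ-IN". */
final class RoleLabel {
    /** The three roles named in this README. */
    enum Role { SUBJ, DOBJ, POBJ }

    final Role role;
    final String preposition; // lower-cased, or null when the label carries none

    private RoleLabel(Role role, String preposition) {
        this.role = role;
        this.preposition = preposition;
    }

    /** Parses "SUBJ", "DOBJ", "POBJ" or "POBJ-<PREP>"; throws IllegalArgumentException on unknown roles. */
    static RoleLabel parse(String label) {
        int dash = label.indexOf('-');
        if (dash < 0) {
            return new RoleLabel(Role.valueOf(label), null);
        }
        Role role = Role.valueOf(label.substring(0, dash));
        String prep = label.substring(dash + 1).toLowerCase(Locale.ROOT);
        return new RoleLabel(role, prep);
    }

    public static void main(String[] args) {
        // Labels taken from the stadium example above.
        for (String label : new String[] {"SUBJ", "POBJ-OF", "POBJ-IN"}) {
            RoleLabel parsed = parse(label);
            System.out.println(label + " -> role=" + parsed.role + ", preposition=" + parsed.preposition);
        }
    }
}
```

Keeping the preposition separate from the role makes it easy to group all prepositional objects together regardless of which preposition EXEMPLAR chose for them.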

People

Building

Download all dependencies:

$ sh dependencies.sh 

Compile and build jar with all dependencies:

$ sh build.sh 

Running

$ sh exemplar.sh [options] <input> <output>
   -b,--benchmark <arg>   expects input to be a benchmark file (arg = binary | nary)
   -h,--help              shows this message
   -p,--parser <arg>      defines which parser to use (arg = stanford | malt)
  • input: path to the document file or directory containing the document files. The tool will recursively look for .txt files in subdirectories.
  • output: path to the file where the triples will be stored.
  • benchmark option: assumes the input file is formatted like the ground-truth files of our benchmarks for open relation extraction. These benchmarks are discussed in the paper "Effectiveness and Efficiency of Open Relation Extraction" (see the reference in the 'Citing' section). Valid values for this option are 'binary' (for binary relation extraction) and 'nary' (for n-ary relation extraction). If this option is not provided, the system assumes the input file contains plain text.
  • parser option: defines which dependency parser to use. Valid values are 'stanford' and 'malt'. If this option is not provided, the system will use the Malt parser.
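For readers who want to launch the tool from Java code, the options above can be assembled into a command line as in the sketch below. The class `ExemplarCommand` is hypothetical, as are the example paths `docs/` and `relations.tsv`; only the script name, the `-p` and `-b` flags and their valid values come from the usage message above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not part of EXEMPLAR): assembles an exemplar.sh command line. */
final class ExemplarCommand {
    /**
     * Builds "sh exemplar.sh [options] <input> <output>".
     * Pass null for parser or benchmark to omit that option.
     */
    static List<String> build(String parser, String benchmark, String input, String output) {
        if (parser != null && !Arrays.asList("stanford", "malt").contains(parser)) {
            throw new IllegalArgumentException("parser must be 'stanford' or 'malt': " + parser);
        }
        if (benchmark != null && !Arrays.asList("binary", "nary").contains(benchmark)) {
            throw new IllegalArgumentException("benchmark must be 'binary' or 'nary': " + benchmark);
        }
        List<String> cmd = new ArrayList<>();
        cmd.add("sh");
        cmd.add("exemplar.sh");
        if (parser != null) {
            cmd.add("-p");
            cmd.add(parser);
        }
        if (benchmark != null) {
            cmd.add("-b");
            cmd.add(benchmark);
        }
        cmd.add(input);
        cmd.add(output);
        return cmd;
    }

    public static void main(String[] args) {
        // Example paths are placeholders; replace them with real locations.
        List<String> cmd = build("stanford", null, "docs/", "relations.tsv");
        System.out.println(String.join(" ", cmd));
        // To actually run it from the EXEMPLAR checkout directory:
        // new ProcessBuilder(cmd).inheritIO().start().waitFor();
    }
}
```

Validating the option values before launching the script gives an early error in the calling program instead of a failure inside the shell script.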

Sample Output

The output file contains one relation per line. Fields are separated by a tab in the following order: Subjects, Relation, Objects, Normalized Relation and Sentence. This is the output for our example:

SUBJ:NFL#ORG <tab> approves new stadium <tab> POBJ-OF:Falcons#ORG,,POBJ-IN:Atlanta <tab> approve new stadium <tab> NFL approves Falcons ' new stadium in Atlanta .

The suffix after "#" in each argument corresponds to its entity type. Possible types are person (PER), organization (ORG), location (LOC) and miscellaneous (MISC). When a field contains more than one subject or object, they are separated by a double comma (",,").
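The format described above can be read back with a few lines of Java. The sketch below is a hypothetical reader, not part of EXEMPLAR: the class `ExemplarLine` and its fields are invented for illustration. It assumes that fields are separated by a single tab character (the `<tab>` placeholders in the example), that the first ":" in an argument ends its role, and that an argument without a "#" suffix has no type.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical reader for one line of EXEMPLAR output; not part of the tool itself. */
final class ExemplarLine {
    /** One argument such as "POBJ-OF:Falcons#ORG". */
    static final class Argument {
        final String role; // e.g. "SUBJ" or "POBJ-OF"
        final String text; // e.g. "Falcons"
        final String type; // e.g. "ORG", or null when no "#TYPE" suffix is present

        Argument(String role, String text, String type) {
            this.role = role;
            this.text = text;
            this.type = type;
        }
    }

    final List<Argument> subjects;
    final String relation;
    final List<Argument> objects;
    final String normalizedRelation;
    final String sentence;

    private ExemplarLine(List<Argument> subjects, String relation, List<Argument> objects,
                         String normalizedRelation, String sentence) {
        this.subjects = subjects;
        this.relation = relation;
        this.objects = objects;
        this.normalizedRelation = normalizedRelation;
        this.sentence = sentence;
    }

    /** Splits a line into its five tab-separated fields. */
    static ExemplarLine parse(String line) {
        String[] f = line.split("\t", -1);
        if (f.length != 5) {
            throw new IllegalArgumentException("expected 5 tab-separated fields, got " + f.length);
        }
        return new ExemplarLine(parseArguments(f[0].trim()), f[1].trim(),
                parseArguments(f[2].trim()), f[3].trim(), f[4].trim());
    }

    /** Splits a subjects or objects field on ",," and decodes each ROLE:text#TYPE item. */
    static List<Argument> parseArguments(String field) {
        List<Argument> result = new ArrayList<>();
        if (field.isEmpty()) {
            return result;
        }
        for (String item : field.split(",,")) {
            int colon = item.indexOf(':');
            if (colon < 0) {
                throw new IllegalArgumentException("argument without role: " + item);
            }
            int hash = item.lastIndexOf('#');
            boolean typed = hash > colon;
            String text = typed ? item.substring(colon + 1, hash) : item.substring(colon + 1);
            String type = typed ? item.substring(hash + 1) : null;
            result.add(new Argument(item.substring(0, colon), text, type));
        }
        return result;
    }

    public static void main(String[] args) {
        String line = "SUBJ:NFL#ORG\tapproves new stadium\tPOBJ-OF:Falcons#ORG,,POBJ-IN:Atlanta"
                + "\tapprove new stadium\tNFL approves Falcons ' new stadium in Atlanta .";
        ExemplarLine parsed = parse(line);
        System.out.println(parsed.normalizedRelation);
        for (Argument a : parsed.objects) {
            System.out.println(a.role + " / " + a.text + " / " + a.type);
        }
    }
}
```

An entity whose text itself contains ",," would break this simple split; the sketch does not try to handle that case.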

Libraries

The main libraries used in this tool are:

  • Stanford Parser: tokenization, lemmatization, part-of-speech tagging, named entity recognition and dependency parsing.
  • Malt Parser: dependency parsing.

Citing

If you use this code in your research, please acknowledge that by citing:

@INPROCEEDINGS { mesquita-schmidek-barbosa:2013:EMNLP, 
	AUTHOR = { Filipe Mesquita and Jordan Schmidek and Denilson Barbosa }, 
	BOOKTITLE = { Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing }, 
	MONTH = { October }, 
	PAGES = { 447--457 }, 
	PUBLISHER = { Association for Computational Linguistics }, 
	TITLE = { Effectiveness and Efficiency of Open Relation Extraction }, 
	PDF = { http://www.aclweb.org/anthology/D13-1043 }, 
	YEAR = { 2013 }
} 

Acknowledgements

This work was primarily funded by the NSERC Business Intelligence Network (BIN).
