
INK-USC / USC-DS-RelationExtraction

License: MIT
Distantly Supervised Relation Extraction

Projects that are alternatives to or similar to USC-DS-RelationExtraction

Gcn Over Pruned Trees
Graph Convolution over Pruned Dependency Trees Improves Relation Extraction (authors' PyTorch implementation)
Stars: ✭ 312 (-17.46%)
Mutual labels:  natural-language-processing, relation-extraction, information-extraction
Oie Resources
A curated list of Open Information Extraction (OIE) resources: papers, code, data, etc.
Stars: ✭ 283 (-25.13%)
Mutual labels:  natural-language-processing, relation-extraction, information-extraction
Tacred Relation
PyTorch implementation of the position-aware attention model for relation extraction
Stars: ✭ 271 (-28.31%)
Mutual labels:  natural-language-processing, relation-extraction, information-extraction
PLE
Label Noise Reduction in Entity Typing (KDD'16)
Stars: ✭ 53 (-85.98%)
Mutual labels:  information-extraction, knowledgebase
DocuNet
Code and dataset for the IJCAI 2021 paper "Document-level Relation Extraction as Semantic Segmentation".
Stars: ✭ 84 (-77.78%)
Mutual labels:  information-extraction, relation-extraction
CogIE
CogIE: An Information Extraction Toolkit for Bridging Text and CogNet. ACL 2021
Stars: ✭ 47 (-87.57%)
Mutual labels:  information-extraction, relation-extraction
Zamia Ai
Free and open source A.I. system based on Python, TensorFlow and Prolog.
Stars: ✭ 133 (-64.81%)
Mutual labels:  knowledgebase, natural-language-processing
knowledge-graph-nlp-in-action
From model training to deployment: hands-on Knowledge Graph and Natural Language Processing (NLP). Uses TensorFlow, BERT + Bi-LSTM + CRF, Neo4j, etc., and covers tasks such as Named Entity Recognition, Text Classification, Information Extraction, and Relation Extraction.
Stars: ✭ 58 (-84.66%)
Mutual labels:  information-extraction, relation-extraction
IE Paper Notes
Paper notes for Information Extraction, including Relation Extraction (RE), Named Entity Recognition (NER), Entity Linking (EL), Event Extraction (EE), Named Entity Disambiguation (NED).
Stars: ✭ 14 (-96.3%)
Mutual labels:  information-extraction, relation-extraction
Multiple Relations Extraction Only Look Once
Multiple-Relations-Extraction-Only-Look-Once. Just look at the sentence once and extract the multiple pairs of entities and their corresponding relations. An end-to-end joint multi-relation extraction model, usable for the information extraction task at http://lic2019.ccf.org.cn/kg.
Stars: ✭ 269 (-28.84%)
Mutual labels:  relation-extraction, information-extraction
Languagecrunch
LanguageCrunch NLP server docker image
Stars: ✭ 281 (-25.66%)
Mutual labels:  natural-language-processing, relation-extraction
Medacy
🏥 Medical Text Mining and Information Extraction with spaCy
Stars: ✭ 287 (-24.07%)
Mutual labels:  natural-language-processing, information-extraction
Open Entity Relation Extraction
Knowledge triples extraction and knowledge base construction based on dependency syntax for open domain text.
Stars: ✭ 350 (-7.41%)
Mutual labels:  relation-extraction, information-extraction
ReQuest
Indirect Supervision for Relation Extraction Using Question-Answer Pairs (WSDM'18)
Stars: ✭ 26 (-93.12%)
Mutual labels:  information-extraction, relation-extraction
InformationExtractionSystem
Information Extraction System can perform NLP tasks like Named Entity Recognition, Sentence Simplification, Relation Extraction etc.
Stars: ✭ 27 (-92.86%)
Mutual labels:  information-extraction, relation-extraction
lima
The Libre Multilingual Analyzer, a Natural Language Processing (NLP) C++ toolkit.
Stars: ✭ 75 (-80.16%)
Mutual labels:  information-extraction, relation-extraction
PSPE
Pretrained Span and Span Pair Encoder; code for "Pre-training Entity Relation Encoder with Intra-span and Inter-span Information", EMNLP 2020. It is based on our NERE toolkit (https://github.com/Receiling/NERE).
Stars: ✭ 17 (-95.5%)
Mutual labels:  information-extraction, relation-extraction
Mitie
MITIE: library and tools for information extraction
Stars: ✭ 2,693 (+612.43%)
Mutual labels:  natural-language-processing, information-extraction
Open Semantic Entity Search Api
Open Source REST API for named entity extraction, named entity linking, named entity disambiguation, recommendation & reconciliation of entities like persons, organizations and places for (semi)automatic semantic tagging & analysis of documents by linked data knowledge graph like SKOS thesaurus, RDF ontology, database(s) or list(s) of names
Stars: ✭ 98 (-74.07%)
Mutual labels:  knowledgebase, natural-language-processing
Aggcn
Attention Guided Graph Convolutional Networks for Relation Extraction (authors' PyTorch implementation for the ACL19 paper)
Stars: ✭ 318 (-15.87%)
Mutual labels:  relation-extraction, information-extraction

USC Distantly-supervised Relation Extraction System

This repository puts together recent models and data sets for sentence-level relation extraction using knowledge bases (i.e., distant supervision). In particular, it contains the source code for WWW'17 paper CoType: Joint Extraction of Typed Entities and Relations with Knowledge Bases.

Please also check out our new repository on handling shifted label distribution in distant supervision.

Task: given a text corpus in which entity mentions have been detected and heuristically labeled using distant supervision, identify the relation types/labels between a pair of entity mentions based on the sentence context in which they co-occur.
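To make the labeling heuristic concrete, below is a minimal sketch of distant supervision (illustrative Python, not code from this repository): every sentence in which both arguments of a knowledge-base fact co-occur is heuristically tagged with that fact's relation. The heuristic is noisy by design, since a sentence may mention both entities without actually expressing the relation, which is why the resulting labels are only distant supervision.

# Minimal sketch of distant-supervision labeling (illustrative, not the repository's code).
# kb_facts: (head entity, relation, tail entity) triples from a knowledge base such as Freebase.
kb_facts = {
    ("Barack Obama", "per:city_of_birth", "Honolulu"),
    ("Honolulu", "loc:contained_by", "Hawaii"),
}

def distant_label(entity_mentions):
    """Label every co-occurring entity-mention pair that matches a KB fact."""
    labels = []
    for e1 in entity_mentions:
        for e2 in entity_mentions:
            for head, relation, tail in kb_facts:
                if e1 == head and e2 == tail:
                    labels.append((e1, relation, e2))
    return labels

# Sentence: "Barack Obama was born in Honolulu, Hawaii."
print(distant_label(["Barack Obama", "Honolulu", "Hawaii"]))
# [('Barack Obama', 'per:city_of_birth', 'Honolulu'), ('Honolulu', 'loc:contained_by', 'Hawaii')]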

Quick Start

Blog Posts

Data

For evaluating sentence-level extraction, we processed (using our data pipeline) three public datasets into our JSON format. We ran Stanford NER on the training sets to detect entity mentions, mapped entity names to Freebase entities using DBpedia Spotlight, aligned Freebase facts to sentences, and assigned the entity types of the Freebase entities to their mapped names in the sentences:

  • PubMed-BioInfer: 100k PubMed paper abstracts as training data and 1,530 manually labeled biomedical paper abstracts from BioInfer (Pyysalo et al., 2007) as test data. It consists of 94 relation types (protein-protein interactions) and over 2,000 entity types (from MESH ontology). (Download)

  • NYT-manual: 1.18M sentences sampled from 294K New York Times news articles, which were then aligned with Freebase facts by (Riedel et al., ECML'10) (link to Riedel's data). For the test set, 395 sentences are manually annotated with 24 relation types and 47 entity types (Hoffmann et al., ACL'11) (link to Hoffmann's data). (Download)

  • Wiki-KBP: the training corpus contains 1.5M sentences sampled from 780k Wikipedia articles (Ling & Weld, 2012), plus ~7,000 sentences from the 2013 KBP corpus. The test data consists of 14k system-labeled sentences from the 2013 KBP slot filling assessment results. It has 7 relation types and 126 entity types after filtering out numeric-value relations. (Download)

Please put the data files in the corresponding subdirectories under data/source.
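Each line of the train/test files is one sentence serialized as a JSON object together with its detected entity mentions and DS-labeled relation mentions. The example below is only an assumed illustration of that shape (field names follow the public CoType data release; verify them against the downloaded files):

{"articleId": "42", "sentId": "3",
 "sentText": "Barack Obama was born in Honolulu , Hawaii .",
 "entityMentions": [{"start": 0, "text": "Barack Obama", "label": "PERSON"},
                    {"start": 5, "text": "Honolulu", "label": "LOCATION"}],
 "relationMentions": [{"em1Text": "Barack Obama", "em2Text": "Honolulu",
                       "label": "per:city_of_birth"}]}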

Benchmark

Performance comparison with several relation extraction systems over KBP 2013 dataset (sentence-level extraction).

Method Precision Recall F1
Mintz (our implementation, Mintz et al., 2009) 0.296 0.387 0.335
LINE + Dist Sup (Tang et al., 2015) 0.360 0.257 0.299
MultiR (Hoffmann et al., 2011) 0.325 0.278 0.301
FCM + Dist Sup (Gormley et al., 2015) 0.151 0.498 0.300
HypeNet (our implementation, Shwartz et al., 2016) 0.210 0.315 0.252
CNN (our implementation, Zeng et al., 2014) 0.198 0.334 0.242
PCNN (our implementation, Zeng et al., 2015) 0.220 0.452 0.295
LSTM (our implementation) 0.274 0.500 0.350
Bi-GRU (our implementation) 0.301 0.465 0.362
SDP-LSTM (our implementation, Xu et al., 2015) 0.300 0.436 0.356
Position-Aware LSTM (Zhang et al., 2017) 0.265 0.598 0.367
CoType-RM (Ren et al., 2017) 0.303 0.407 0.347
CoType (Ren et al., 2017) 0.348 0.406 0.369

Note: for models trained on sentences annotated with a single label (HypeNet, CNN/PCNN, LSTM, SDP/PA-LSTMs, Bi-GRU), we form one training instance for each sentence-label pair based on their DS-annotated data.
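Concretely, a sentence whose entity pair carries several DS labels is duplicated once per label for those single-label models; a schematic sketch:

# Schematic expansion of a multi-label DS sentence into single-label training instances
# (illustrative only, not the repository's preprocessing code).
def expand_to_single_label(sentence, entity_pair, ds_labels):
    return [(sentence, entity_pair, label) for label in ds_labels]

instances = expand_to_single_label(
    "Obama returned to Honolulu in 1971 .",
    ("Obama", "Honolulu"),
    ["per:city_of_birth", "per:cities_of_residence"],
)
# Two training instances, one per DS label.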

Usage

Dependencies

We use Ubuntu as an example.

  • python 2.7
  • Python library dependencies
$ pip install pexpect ujson tqdm
$ cd code/DataProcessor/
$ git clone git@github.com:stanfordnlp/stanza.git
$ cd stanza
$ pip install -e .
$ wget http://nlp.stanford.edu/software/stanford-corenlp-full-2016-10-31.zip
$ unzip stanford-corenlp-full-2016-10-31.zip

We have included compiled binaries. If you need to re-compile retype.cpp under your own g++ environment:

$ cd code/Model/retype; make

Default Run

As an example, we show how to run CoType on the Wiki-KBP dataset.

Start the Stanford CoreNLP server for the Python wrapper.

$ java -mx4g -cp "code/DataProcessor/stanford-corenlp-full-2016-10-31/*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer
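Optionally, confirm the server is responding before running the pipeline. The snippet below simply posts a sentence to the CoreNLP server's REST endpoint; it assumes the default port 9000 and is not part of run.sh.

# Optional sanity check for the CoreNLP server (assumes the default port 9000).
import json
import requests  # pip install requests

props = {"annotators": "tokenize,ssplit,pos,ner", "outputFormat": "json"}
resp = requests.post(
    "http://localhost:9000/",
    params={"properties": json.dumps(props)},
    data="Barack Obama was born in Honolulu .".encode("utf-8"),
)
print(resp.json()["sentences"][0]["tokens"][0]["word"])  # expect "Barack"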

Feature extraction, embedding learning on training data, and evaluation on test data.

$ ./run.sh  

For relation classification, the "none"-labeled instances need to be removed from the train/test JSON files first. The hyperparameters for embedding learning are included in the run.sh script.
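One possible way to do that filtering is sketched below; it is not a script shipped with this repository, and the "relationMentions"/"label" field names are assumptions about the JSON schema, so adjust them to match the actual files.

# Hypothetical filtering step: drop "None"-labeled relation mentions before relation classification.
# The "relationMentions" and "label" field names are assumptions; check the actual JSON files.
import json

def strip_none_labels(in_path, out_path):
    with open(in_path) as fin, open(out_path, "w") as fout:
        for line in fin:
            sent = json.loads(line)
            kept = [rm for rm in sent.get("relationMentions", [])
                    if rm.get("label") != "None"]
            if kept:
                sent["relationMentions"] = kept
                fout.write(json.dumps(sent) + "\n")

strip_none_labels("data/source/KBP/train.json", "data/source/KBP/train_rc.json")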

Parameters

  • Dataset to run on:
Data="KBP"
  • Hyperparameters for relation extraction:
    - KBP: -negative 3 -iters 400 -lr 0.02 -transWeight 1.0
    - NYT: -negative 5 -iters 700 -lr 0.02 -transWeight 7.0
    - BioInfer: -negative 5 -iters 700 -lr 0.02 -transWeight 7.0

Hyperparameters for relation classification are included in the run.sh script.

Evaluation

Evaluate relation extraction performance (precision, recall, F1): produce predictions along with their confidence scores, then filter the predicted instances by tuning the threshold.

$ python code/Evaluation/emb_test.py extract KBP retype cosine 0.0
$ python code/Evaluation/tune_threshold.py extract KBP emb retype cosine
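For intuition, the reported precision/recall/F1 amount to a set comparison between the thresholded predictions and the gold relation mentions; a schematic re-implementation (not the repository's evaluation code):

# Schematic precision/recall/F1 over (sentence_id, entity_pair, label) tuples.
def precision_recall_f1(scored_predictions, gold, threshold):
    predicted = {p for p, score in scored_predictions if score >= threshold}
    tp = len(predicted & gold)
    precision = tp / float(len(predicted)) if predicted else 0.0
    recall = tp / float(len(gold)) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1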

In-text Prediction

The last command in run.sh generates a JSON file of predicted results, in the same format as test.json in data/source/$DATANAME, except that only the predicted relation mention labels are output. Replace the second parameter with whatever threshold you would like.

$ python code/Evaluation/convertPredictionToJson.py $Data 0.0

Customized Run

Code for producing the JSON files from a raw corpus for running CoType and baseline models is here.

Baselines

You can find our implementations of some recent relation extraction models under the code/Model/ directory.

References

Contributors

  • Ellen Wu
  • Meng Qu
  • Frank Xu
  • Wenqi He
  • Maosen Zhang
  • Qinyuan Ye
  • Xiang Ren