philipperemy / Stanford Openie Python

Licence: ISC
Stanford Open Information Extraction made simple!

Python3 wrapper for Stanford OpenIE

Open information extraction (open IE) refers to the extraction of structured relation triples from plain text, such that the schema for these relations does not need to be specified in advance. For example, the sentence "Barack Obama was born in Hawaii" yields the triple (Barack Obama; was born in; Hawaii), corresponding to the open domain relation "was born in". CoreNLP includes a Java implementation of an open IE system, which this package wraps for Python.
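As a rough illustration of the data shape, a triple can be modeled as a dictionary with subject, relation and object keys, which is the form that appears in the expected output of this README. The helper name format_triple is hypothetical and is not part of the package; it only renders a triple in the notation used above.

```python
# Hypothetical helper, not part of stanford_openie: renders a triple dict
# in the "(subject; relation; object)" notation used in the text above.
def format_triple(triple):
    return '(%s; %s; %s)' % (
        triple['subject'],
        triple['relation'],
        triple['object'],
    )


# A triple in the dictionary form shown in this README's expected output.
triple = {'subject': 'Barack Obama', 'relation': 'was born in', 'object': 'Hawaii'}
formatted = format_triple(triple)
print(formatted)
```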

More information can be found here: http://nlp.stanford.edu/software/openie.html

The OpenIE library is available for English only: https://stanfordnlp.github.io/CoreNLP/human-languages.html

Installation

pip install stanford_openie

Example

from openie import StanfordOpenIE

with StanfordOpenIE() as client:
    # Extract triples from a short text.
    text = 'Barack Obama was born in Hawaii. Richard Manning wrote this sentence.'
    print('Text: %s' % text)
    for triple in client.annotate(text):
        print('|-', triple)

    # Render the triples extracted from the text as a graph image.
    graph_image = 'graph.png'
    client.generate_graphviz_graph(text, graph_image)
    print('Graph generated: %s.' % graph_image)

    # Read a larger corpus and flatten it to a single line.
    with open('corpus/pg6130.txt', 'r', encoding='utf8') as r:
        corpus = r.read().replace('\n', ' ').replace('\r', '')

    # Annotate only the first 50,000 characters of the corpus.
    triples_corpus = client.annotate(corpus[0:50000])
    print('Corpus: %s [...].' % corpus[0:80])
    print('Found %s triples in the corpus.' % len(triples_corpus))
    for triple in triples_corpus[:3]:
        print('|-', triple)

Expected output

|- {'subject': 'Barack Obama', 'relation': 'was', 'object': 'born'}
|- {'subject': 'Barack Obama', 'relation': 'was born in', 'object': 'Hawaii'}
|- {'subject': 'Richard Manning', 'relation': 'wrote', 'object': 'sentence'}
Graph generated: graph.png.
Corpus: According to this document, the city of Cumae in Æolia, was, at an early period [...].
Found 1664 triples in the corpus.
|- {'subject': 'city', 'relation': 'is in', 'object': 'Æolia'}
|- {'subject': 'Menapolus', 'relation': 'son of', 'object': 'Ithagenes'}
|- {'subject': 'Menapolus', 'relation': 'was Among', 'object': 'immigrants'}
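Since each triple is a plain dictionary, the results can be post-processed with ordinary Python. The following is a minimal sketch that groups triples by subject, using the first three triples from the output above as sample data. The function name group_by_subject is hypothetical and is not part of the package.

```python
from collections import defaultdict


# Hypothetical helper, not part of stanford_openie: maps each subject to
# the list of (relation, object) pairs extracted for it, in input order.
def group_by_subject(triples):
    grouped = defaultdict(list)
    for triple in triples:
        grouped[triple['subject']].append((triple['relation'], triple['object']))
    return dict(grouped)


# Sample data copied from the expected output above.
triples = [
    {'subject': 'Barack Obama', 'relation': 'was', 'object': 'born'},
    {'subject': 'Barack Obama', 'relation': 'was born in', 'object': 'Hawaii'},
    {'subject': 'Richard Manning', 'relation': 'wrote', 'object': 'sentence'},
]

grouped = group_by_subject(triples)
for subject, facts in grouped.items():
    print(subject, facts)
```

Grouping like this makes it easier to see when a short triple such as (Barack Obama; was; born) sits alongside a more informative one for the same subject.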

The example also renders the extracted triples as a GraphViz graph and saves it to graph.png.



Note: GraphViz must be installed beforehand. Run the dot command (for example, dot -V) to check whether it is available. If it is not and you are running Ubuntu, install it with sudo apt-get install graphviz.
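To make the relationship between triples and the graph more concrete, here is a minimal sketch of how triples could be turned into DOT source, with one labelled edge per triple, together with a check that the dot executable is on the PATH. This is an illustration under stated assumptions and not the package's own implementation; the names triples_to_dot and render_png and the output file name are hypothetical.

```python
import shutil
import subprocess


def quote(value):
    # Escape backslashes and double quotes so the value is a valid DOT string.
    return '"%s"' % value.replace('\\', '\\\\').replace('"', '\\"')


# Hypothetical helper: builds DOT source with one labelled edge per triple.
def triples_to_dot(triples):
    lines = ['digraph triples {']
    for triple in triples:
        lines.append('    %s -> %s [label=%s];' % (
            quote(triple['subject']),
            quote(triple['object']),
            quote(triple['relation']),
        ))
    lines.append('}')
    return '\n'.join(lines)


# Hypothetical helper: pipes DOT source to the dot executable to write a PNG.
def render_png(dot_source, output_path):
    subprocess.run(
        ['dot', '-Tpng', '-o', output_path],
        input=dot_source.encode('utf8'),
        check=True,
    )


triples = [{'subject': 'Barack Obama', 'relation': 'was born in', 'object': 'Hawaii'}]
dot_source = triples_to_dot(triples)
print(dot_source)

# shutil.which returns None when the executable is not found on the PATH.
dot_available = shutil.which('dot') is not None
print('dot found on PATH: %s' % dot_available)
```

If dot is available, calling render_png(dot_source, 'triples_sketch.png') would write the image; the call is left out of the sketch so that it also runs on machines without GraphViz.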

V1

The previous version (v1) is still available.

Cite

@misc{StanfordOpenIEWrapper,
  author = {Philippe Remy},
  title = {Python wrapper for Stanford OpenIE},
  year = {2020},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/philipperemy/Stanford-OpenIE-Python}},
}