liaoziyang / Openie Spider

Licence: mit
Extract Information from web corpus using Open Information Extraction.

Programming Languages

python


Stanford-OpenIE-Spider

Extract information from a web corpus using Stanford Open Information Extraction.

About Stanford IE

Open information extraction (open IE) refers to the extraction of structured relation triples from plain text, such that the schema for these relations does not need to be specified in advance. For example, the sentence "Barack Obama was born in Hawaii" would create the triple (Barack Obama; was born in; Hawaii), corresponding to the open domain relation "was born in". Stanford OpenIE is a Java implementation of an open IE system as described in the paper:

Gabor Angeli, Melvin Johnson Premkumar, and Christopher D. Manning. Leveraging Linguistic Structure For Open Domain Information Extraction. In Proceedings of the Association of Computational Linguistics (ACL), 2015.

The system first splits each sentence into a set of entailed clauses. Each clause is then maximally shortened, producing a set of entailed shorter sentence fragments. These fragments are then segmented into OpenIE triples, and output by the system.
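As a rough illustration of the triple format described above, the sketch below models a triple in Python and checks it against a partial query. The field names arg1, rel, and arg2 follow the spider arguments documented in this README; the Triple class and the matches helper are hypothetical and are not part of Stanford OpenIE or of this project.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One open IE relation triple (illustrative type, not a library class)."""
    arg1: str  # noun phrase on the left side of the relation
    rel: str   # relation phrase
    arg2: str  # noun phrase on the right side of the relation


def matches(triple, arg1=None, rel=None, arg2=None):
    """Return True if every field that was given equals the triple's field.

    A value of None acts as a wildcard; comparison ignores case.
    """
    query = {"arg1": arg1, "rel": rel, "arg2": arg2}
    return all(
        wanted is None or getattr(triple, field).lower() == wanted.lower()
        for field, wanted in query.items()
    )


triple = Triple("Barack Obama", "was born in", "Hawaii")
print(matches(triple, rel="was born in"))  # True
print(matches(triple, arg2="Kenya"))       # False
```

Leaving a field as None mirrors a question such as "who was born in Hawaii?", where only rel and arg2 are fixed.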

More information can be found here: http://nlp.stanford.edu/software/openie.html

About Open Information Extraction (http://openie.allenai.org/)

This is a web service that implements information extraction over a web corpus using Stanford IE. You can use the website to search for relations found on the web.

Usage

First, make sure Python is installed.

python --version

Install scrapy.

sudo pip install scrapy

You can also consult the official installation guide: https://doc.scrapy.org/en/latest/intro/install.html

Install beautifulsoup4.

sudo pip install beautifulsoup4

Clone the repository to your local machine.

git clone git@github.com:liaoziyang/Stanford-OpenIE-Spider.git
cd Stanford-OpenIE-Spider

Arguments

At least one of the parameters below is required. Pass each one with the -a option.

  • arg1: Noun on the left side of the relationship. Default null.
  • rel: The relationship. Default null.
  • arg2: Noun on the right side of the relationship. Default null.

You can also write the result to a file with the -o option.
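Scrapy hands each -a name=value pair to the spider's constructor as a keyword argument. The sketch below shows one way the "at least one parameter" rule could be enforced and the given values turned into query parameters. It is not the code in openie_spider.py; build_query is a hypothetical helper written only for illustration.

```python
from urllib.parse import urlencode


def build_query(arg1=None, rel=None, arg2=None):
    """Collect the parameters that were given (hypothetical helper).

    Raises ValueError when all three are missing, mirroring the rule
    that at least one of arg1, rel, arg2 is required.
    """
    params = {"arg1": arg1, "rel": rel, "arg2": arg2}
    given = {name: value for name, value in params.items() if value is not None}
    if not given:
        raise ValueError("give at least one of arg1, rel, arg2")
    return given


# "What kills bacteria?" leaves arg1 open and fixes rel and arg2.
query = build_query(rel="kills", arg2="bacteria")
print(urlencode(query))  # rel=kills&arg2=bacteria
```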

Example

Extract the information "What kills bacteria?" from the web corpus.

scrapy runspider -a rel=kills -a arg2=bacteria openie_spider.py -o result.json

The output in result.json looks like the following. Each key is the string representation of a result, and each value is the number of times that result appeared in the web corpus.

[
{"Antibiotic": 165},
{"Chlorine": 76},
{"Water": 59},
{"Benzoyl peroxide": 45},
{"Heat": 40},
{"Antiseptic": 38},
{"Pasteurization": 35},
{"Cooking": 34},
{"Vinegar": 34},
{"Honey": 28},
{"Tea tree oil": 24},
{"Ultraviolet": 24},
{"Alcohol": 24},
{"The process": 21},
{"This drug": 21},
{"the oil": 19},
{"the skin": 17},
{"Food": 16},
{"chemicals": 15},
{"Cell (biology)": 14},
{"Isoniazid": 2},
{"sebum production": 2},
{"the UV light feature": 2},
{"Lidocaine": 2},
{"Patent-pending germicidal chamber": 2},
{"Titanium dioxide": 2},
{"shedding": 2},
{"Gut flora": 2},
{"six antiseptic agents": 2},
{"The warm salt water": 2},
{"Chlorhexidine": 2},
{"bitter component": 2},
{"a small anti-microbial bomb": 2},
{"an herb": 2},
{"Cotton": 2},
{"a day": 2},
{"the microwave": 2},
{"cancer cells": 2},
{"Food irradiation": 2},
{"Virus": 2},
{"Studies": 2},
{"Polymer": 2},
{"Active ingredient": 2},
{"Electron": 2},
{"Freezing": 2},
{"Fahrenheit": 2},
{"These peptides": 2},
{"Clarithromycin": 2},
{"redness": 2},
{"Saltwater": 2},
{"BP": 2},
{"White blood cell": 2},
{"a salve": 2},
{"the soap": 2},
{"Spice": 2},
{"Egg (food)": 2},
{"most toiletries": 2},
{"both": 2},
{"Neem": 2},
{"a role": 2},
{"Chamomile": 2},
{"Clindamycin": 2},
{"the formation": 2},
{"humans": 2},
{"three minutes": 2},
{"Sparfloxacin": 2},
{"one minute": 2},
{"Burning": 2},
{"High heat": 2},
{"A sanitizer test kit": 2},
{"Air purifier": 2},
{"the teeth and gums": 2},
{"Intestine": 2},
{"Concentration": 2},
{"Juice": 2},
{"Your goal": 2},
{"Sodium": 2},
{"Zinc": 2},
{"The face": 3},
{"Swimming pool": 3},
{"Vancomycin": 3},
{"Alcoholic beverage": 3},
{"Urine": 3},
{"Nitric oxide": 3},
{"Fever": 3},
{"This combination": 3},
{"The wine": 3},
{"the heart": 3},
{"Electricity": 3},
{"lemon": 3},
{"soothes": 3},
{"Phagocyte": 3},
{"devices": 3},
{"The Aprilaire 5000 Air Cleaner": 3},
{"Tanning bed": 2},
{"dirt and dust": 2},
{"These rinses": 2},
{"This medication": 2},
{"Bacteriophage": 3},
{"Toothpaste": 3},
{"chemicals and particles": 3},
{"Cell wall": 3},
{"Amoxicillin": 3},
{"Blood pressure": 3},
{"Miphil": 3},
{"Roasting": 3},
{"plaque": 3},
{"all": 3},
{"the tight-lidded pot": 3},
{"UV germicidal light": 3},
{"The idea": 3},
{"Azelaic acid": 3},
{"energy": 3},
{"Sulfur": 3},
{"Ammonium hydroxide": 3},
{"The purpose": 3},
{"these beds": 3},
{"160 degrees": 3},
{"Tea": 4},
{"Macrophage": 4},
{"Apple cider vinegar": 4},
{"Sodium bicarbonate": 4},
{"Additives": 4},
{"Acne vulgaris": 4},
{"Stomach": 4},
{"Antibacterial soap": 4},
{"Citric acid": 4},
{"15 seconds": 4},
{"Infection": 4},
{"Iodine": 4},
{"Disinfectant": 4},
{"Bactericidal antibiotics": 4},
{"Coconut oil": 4},
{"the same time": 4},
{"formula": 4},
{"Levofloxacin": 4},
{"The Lampe Berger": 4},
{"Nanoparticle": 3},
{"Saliva": 6},
{"Ion": 6},
{"the acidity": 6},
{"Odor": 6},
{"your body": 5},
{"Acai": 5},
{"Hair": 5},
{"Protein": 5},
{"Meat": 5},
{"Cabbage": 5},
{"Clothes dryer": 5},
{"Sebaceous gland": 5},
{"Lemon juice": 5},
{"Laser": 5},
{"Water filter": 5},
{"Wound": 5},
{"action": 5},
{"Antimicrobial": 4},
{"Antibody": 4},
{"Aloe": 4},
{"Immune system": 9},
{"Inflammation": 9},
{"Ingredient": 8},
{"Bactericide": 8},
{"Toxin": 8},
{"These medicines": 8},
{"Tap water": 8},
{"Neutrophil granulocyte": 7},
{"Salicylic acid": 7},
{"Colloidal silver": 7},
{"the treatment": 6},
{"Sunlight": 6},
{"Copper": 6},
{"Two-week triple therapy": 6},
{"Essential oil": 6},
{"Chemotherapy": 6},
{"the solution": 6},
{"Penicillin": 6},
{"Radical (chemistry)": 6},
{"Hydrogen peroxide": 6},
{"the sun": 14},
{"Oxygen": 14},
{"the light": 14},
{"the product": 14},
{"Irradiation": 12},
{"pores": 12},
{"Salt": 12},
{"Bleach (manga)": 11},
{"Temperature": 11},
{"Silver": 11},
{"NOT": 11},
{"the ability": 11},
{"Ozone": 11},
{"Milk": 10},
{"Bleach": 10},
{"Garlic": 9},
{"the growth": 9},
{"numerous active agents": 9},
{"Boiling": 9},
{"the steam": 9}
]
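Because the output is a list of single-entry objects rather than one mapping, it can help to merge the entries before ranking them. The following is a minimal sketch of that step using only the standard library; merge_counts is a hypothetical helper, and the sample uses a few entries copied from the output above instead of reading the file.

```python
import json
from collections import Counter


def merge_counts(records):
    """Merge a list of {text: frequency} dicts into a single Counter."""
    counts = Counter()
    for record in records:
        for text, frequency in record.items():
            counts[text] += frequency
    return counts


# A few entries copied from the sample output above.
sample = json.loads(
    '[{"Antibiotic": 165}, {"Chlorine": 76}, {"Water": 59}, {"Zinc": 2}]'
)
counts = merge_counts(sample)

# Print the three most frequent results, highest first.
for text, frequency in counts.most_common(3):
    print(f"{frequency:>4}  {text}")
```

To process a real run, the sample could be replaced with the parsed contents of result.json, for example by calling json.load on the opened file.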