
fhamborg / Giveme5W

Licence: Apache-2.0 License
Extraction of the five journalistic W-questions (5W) from news articles

Announcement

We recommend using our new, extended 5W extraction system Giveme5W1H, which has various advantages, most importantly: (1) better extraction performance, (2) extraction of the 'how' question, and (3) easier installation. The Giveme5W repository (the one that contains this page) is not maintained any longer.

Click here to go to Giveme5W1H

Giveme5W

Giveme5W is an open-source system to extract answers to the five journalistic W questions (5Ws). The 5Ws describe the main event of a news article, i.e., who did what, when, where, and why. Giveme5W can be accessed by other software as a Python library and via a RESTful API. The extraction precision is p=0.7.

Note that we are currently working on an improved version of Giveme5W, which will be available here very soon.

Getting started

Installation

The following steps set up Giveme5W on a Linux system. If you are using macOS, see the installation wiki.

  1. Clone the repository
  2. Stanford NER: Download version stanford-ner-2015-12-09 from the Stanford NER website (the tool was tested with stanford-ner-2015-12-09, other versions may work as well)
  3. Unzip its contents into /Giveme5W/extractor/resources (afterward, /Giveme5W/extractor/resources/stanford-ner-2015-12-09 needs to exist)
  4. pip3 install -r requirements.txt
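
Before running Giveme5W, it can help to confirm that step 3 left the Stanford NER files where the extractor expects them. The following is a minimal sketch and not part of Giveme5W: the function name `stanford_ner_installed` and its `repo_root` parameter are hypothetical, and the check only tests that the directory named in step 3 exists inside the cloned repository.

```python
from pathlib import Path

# Directory name taken from step 3 above; adjust it if you use another version.
NER_DIRNAME = "stanford-ner-2015-12-09"


def stanford_ner_installed(repo_root):
    """Return True if <repo_root>/extractor/resources/<NER_DIRNAME> is a directory.

    Hypothetical helper, not part of Giveme5W. `repo_root` is the path of the
    cloned Giveme5W repository.
    """
    ner_dir = Path(repo_root) / "extractor" / "resources" / NER_DIRNAME
    return ner_dir.is_dir()


if __name__ == "__main__":
    # Assumes the script is started from the root of the cloned repository.
    print("Stanford NER found:", stanford_ner_installed("."))
```

If the check prints `False`, the archive was most likely unzipped one level too high or too low.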

Use within your own code

Invoking Giveme5W is straightforward - it only requires a couple of lines of code, and Giveme5W takes care of the rest! If you want to extract the 5Ws from a single article, run the following code.

from extractor.document import Document
from extractor.five_w_extractor import FiveWExtractor

extractor = FiveWExtractor()

# articletext is a string holding the article's text (with or without the headline)
document = Document(articletext)
extractor.parse(document)

Note that while Giveme5W allows you to just put in an article's whole text (including or excluding the headline), you can also separately pass the headline, lead paragraph (which we call description), and main text.

document = Document(title, description, text)

Afterward, you can access the questions and their answers, e.g.:

import pprint
pp = pprint.PrettyPrinter(indent=4)
pp.pprint(document.questions)

Example script

Giveme5W also includes an example script that runs out of the box.

$ python3 examples/fivew_single_article.py

Access via RESTful API

Giveme5W provides a RESTful API to which you can pass a news article. First, start the server script.

$ python3 examples/server.py

Starting up the server takes a few seconds. Once the server is running, you can send GET and POST requests to http://localhost:5000/extract. Pass a single JSON object that contains the text of an article in a field named

  • articletext (the text of the article including or excluding the headline)

Alternatively, you can pass the headline, lead paragraph, and full text as separate fields:

  • title
  • description (the lead paragraph)
  • text (the remainder of the text)
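
As an illustration of the two request forms above, here is a minimal client sketch that uses only the Python standard library. The helper names `build_extract_request` and `extract` are hypothetical and not part of Giveme5W. Sending the JSON object as a POST body with a `Content-Type: application/json` header is an assumption about what the server accepts, and because the response format is not described here, the sketch returns the response body as raw text.

```python
import json
import urllib.request

# Endpoint as given above (server started via examples/server.py).
EXTRACT_URL = "http://localhost:5000/extract"


def build_extract_request(articletext=None, title=None, description=None, text=None):
    """Build a POST request that carries a single JSON object.

    Hypothetical helper. Pass either `articletext`, or the separate fields
    `title`, `description`, and `text`. Only fields that were given are sent.
    """
    fields = {
        "articletext": articletext,
        "title": title,
        "description": description,
        "text": text,
    }
    payload = {name: value for name, value in fields.items() if value is not None}
    if not payload:
        raise ValueError("pass articletext, or title/description/text")
    return urllib.request.Request(
        EXTRACT_URL,
        data=json.dumps(payload).encode("utf-8"),
        # Assumption: the server reads a JSON request body.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract(request):
    """Send the request and return the response body as text (needs a running server)."""
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

With the server running, `extract(build_extract_request(articletext="..."))` sends the whole article in one field, while `extract(build_extract_request(title="...", description="...", text="..."))` sends the three parts separately.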

Giveme5W natively supports articles extracted by our news crawler and extractor news-please.

How to cite

If you are using Giveme5W, please cite our paper (ResearchGate, Mendeley):

@InProceedings{Hamborg2018,
  author    = {Hamborg, Felix and Lachnit, Soeren and Schubotz, Moritz and Hepp, Thomas and Gipp, Bela},
  title     = {Giveme5W: Main Event Retrieval from News Articles by Extraction of the Five Journalistic W Questions},
  booktitle = {Proceedings of the iConference 2018},
  year      = {2018},
  month     = {March},
  location  = {Sheffield, UK},
  url       = {https://doi.org/10.1007/978-3-319-78105-1_39},
  doi       = {10.1007/978-3-319-78105-1_39}
}

You can find more information on this and other news projects on our website.

Contribution and support

Do you want to contribute? Great, we are always happy about any support for this project! Just send a pull request. By contributing to this project, you agree that your contributions will be licensed under the project's license (see below). If you have questions or issues while working on the code, e.g., when implementing a new feature that you would like to have added to Giveme5W, open an issue on GitHub and we'll be happy to help you. Please note that we usually do not have enough resources to implement features requested by users. Instead, we recommend implementing them yourself and sending a pull request.

License

Licensed under the Apache License, Version 2.0 (the "License"); you may not use Giveme5W except in compliance with the License. A copy of the License is included in the project, see the file LICENSE.txt.

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

The Giveme5W logo is courtesy of Mario Hamborg.

Copyright 2017-2018 The Giveme5W team
