jenojp / Negspacy

License: MIT
spaCy pipeline object for negating concepts in text

Programming Languages

python

Projects that are alternatives of or similar to Negspacy

Sense2vec
🦆 Contextually-keyed word vectors
Stars: ✭ 1,184 (+630.86%)
Mutual labels:  spacy
Pytextrank
Python implementation of TextRank for phrase extraction and summarization of text documents
Stars: ✭ 1,675 (+933.95%)
Mutual labels:  spacy
Practical Machine Learning With Python
Master the essential skills needed to recognize and solve complex real-world problems with Machine Learning and Deep Learning by leveraging the highly popular Python Machine Learning Eco-system.
Stars: ✭ 1,868 (+1053.09%)
Mutual labels:  spacy
Dframcy
Dataframe Integration with spaCy.
Stars: ✭ 74 (-54.32%)
Mutual labels:  spacy
Jupyterlab Prodigy
🧬 A JupyterLab extension for annotating data with Prodigy
Stars: ✭ 97 (-40.12%)
Mutual labels:  spacy
Spacy Dev Resources
💫 Scripts, tools and resources for developing spaCy
Stars: ✭ 123 (-24.07%)
Mutual labels:  spacy
Dragonfire
the open-source virtual assistant for Ubuntu based Linux distributions
Stars: ✭ 1,120 (+591.36%)
Mutual labels:  spacy
Spacymoji
💙 Emoji handling and meta data for spaCy with custom extension attributes
Stars: ✭ 151 (-6.79%)
Mutual labels:  spacy
Lemminflect
A python module for English lemmatization and inflection.
Stars: ✭ 105 (-35.19%)
Mutual labels:  spacy
Rasa
💬 Open source machine learning framework to automate text- and voice-based conversations: NLU, dialogue management, connect to Slack, Facebook, and more - Create chatbots and voice assistants
Stars: ✭ 13,219 (+8059.88%)
Mutual labels:  spacy
Spacy Graphql
🤹‍♀️ Query spaCy's linguistic annotations using GraphQL
Stars: ✭ 81 (-50%)
Mutual labels:  spacy
Tageditor
🏖 TagEditor - Annotation tool for spaCy
Stars: ✭ 92 (-43.21%)
Mutual labels:  spacy
Ner Annotator
Named Entity Recognition (NER) annotation tool for spaCy. Generates training data as JSON that can be used directly.
Stars: ✭ 127 (-21.6%)
Mutual labels:  spacy
Python nlp tutorial
This repository provides everything to get started with Python for Text Mining / Natural Language Processing (NLP)
Stars: ✭ 72 (-55.56%)
Mutual labels:  spacy
Wheelwright
🎡 Automated build repo for Python wheels and source packages
Stars: ✭ 148 (-8.64%)
Mutual labels:  spacy
Text Analytics With Python
Learn how to process, classify, cluster, summarize, understand syntax, semantics and sentiment of text data with the power of Python! This repository contains code and datasets used in my book, "Text Analytics with Python" published by Apress/Springer.
Stars: ✭ 1,132 (+598.77%)
Mutual labels:  spacy
Spacy Js
🎀 JavaScript API for spaCy with Python REST API
Stars: ✭ 123 (-24.07%)
Mutual labels:  spacy
Spacy Wordnet
spacy-wordnet creates annotations that easily allow the use of wordnet and wordnet domains by using the nltk wordnet interface
Stars: ✭ 156 (-3.7%)
Mutual labels:  spacy
Spacy Course
👩‍🏫 Advanced NLP with spaCy: A free online course
Stars: ✭ 1,920 (+1085.19%)
Mutual labels:  spacy
Textacy
NLP, before and after spaCy
Stars: ✭ 1,849 (+1041.36%)
Mutual labels:  spacy

negspacy: negation for spaCy

spaCy pipeline object for negating concepts in text. Based on the NegEx algorithm.

NegEx - A Simple Algorithm for Identifying Negated Findings and Diseases in Discharge Summaries. Chapman, Bridewell, Hanbury, Cooper, Buchanan. https://doi.org/10.1006/jbin.2001.1029

What's new

Version 1.0 is a major version update providing support for spaCy 3.0's new interface for adding pipeline components. As a result, it is not backwards compatible with previous versions of negspacy.

If your project uses spaCy 2.3.5 or earlier, you will need to use version 0.1.9. See archived readme.
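
The compatibility rule above can be sketched as a small helper that maps an installed spaCy version to a negspacy requirement string. This is only an illustration: the function name `negspacy_requirement` is hypothetical, and the `negspacy>=1.0` specifier is an assumption based on the statement that version 1.0 targets spaCy 3.0. spaCy versions between 2.3.5 and 3.0 are not covered by the text above, so the sketch refuses to guess for them.

```python
import re


def negspacy_requirement(spacy_version: str) -> str:
    """Return a pip requirement for negspacy given a spaCy version string.

    Hypothetical helper; the mapping follows the README text only.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", spacy_version)
    if match is None:
        raise ValueError(f"unrecognized version: {spacy_version!r}")
    major, minor, patch = (int(g) if g else 0 for g in match.groups())
    version = (major, minor, patch)

    if version <= (2, 3, 5):
        # spaCy 2.3.5 or earlier: use the last pre-1.0 release.
        return "negspacy==0.1.9"
    if version >= (3, 0, 0):
        # Assumption: any 1.x release works with spaCy 3.x.
        return "negspacy>=1.0"
    # Versions in between are not addressed by the README.
    raise ValueError(f"no documented negspacy release for spaCy {spacy_version}")


print(negspacy_requirement("2.3.5"))
print(negspacy_requirement("3.1.0"))
```

In a real project the version string would typically come from `spacy.__version__`.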

Installation and usage

Install the library.

pip install negspacy

Import library and spaCy.

import spacy
from negspacy.negation import Negex

Load a spaCy language model and add the negspacy pipeline object. Filtering on entity types is optional.

nlp = spacy.load("en_core_web_sm")
nlp.add_pipe("negex", config={"ent_types":["PERSON","ORG"]})

View negations.

doc = nlp("She does not like Steve Jobs but likes Apple products.")

for e in doc.ents:
    print(e.text, e._.negex)

# Steve Jobs True
# Apple False
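
Downstream code often wants the affirmed and negated entities in separate lists. A minimal sketch of that step follows. The helper `split_by_negation` is hypothetical (not part of negspacy), and it only assumes each entity exposes `.text` and `._.negex` as in the loop above. So that the snippet runs without a language model installed, it is exercised with stand-in objects built from `SimpleNamespace`. With a real pipeline you would pass `doc.ents` instead.

```python
from types import SimpleNamespace


def split_by_negation(ents):
    """Partition entities into (affirmed, negated) lists of entity texts."""
    affirmed, negated = [], []
    for ent in ents:
        # ent._.negex is the boolean flag set by the negex component.
        (negated if ent._.negex else affirmed).append(ent.text)
    return affirmed, negated


def fake_ent(text, negex):
    """Stand-in for a spaCy Span, for illustration only."""
    return SimpleNamespace(text=text, _=SimpleNamespace(negex=negex))


ents = [fake_ent("Steve Jobs", True), fake_ent("Apple", False)]
affirmed, negated = split_by_negation(ents)
print("affirmed:", affirmed)
print("negated:", negated)
```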

Consider pairing with scispacy to find UMLS concepts in text and process negations.

NegEx Patterns

  • pseudo_negations - phrases that are false triggers, ambiguous negations, or double negatives
  • preceding_negations - negation phrases that precede an entity
  • following_negations - negation phrases that follow an entity
  • termination - phrases that cut a sentence into parts for purposes of negation detection (e.g., "but")
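
To make the interplay of the four categories concrete, here is a deliberately simplified, pure-Python sketch of a NegEx-style check. It is not negspacy's implementation: it matches on lowercased word tokens, looks only at the first occurrence of the entity, and treats the whole input as one sentence. The function names `find_phrases` and `is_negated` are hypothetical.

```python
import re


def find_phrases(tokens, phrases):
    """Return (start, end) token spans where any of the phrases occurs."""
    spans = []
    for phrase in phrases:
        words = phrase.lower().split()
        for i in range(len(tokens) - len(words) + 1):
            if tokens[i:i + len(words)] == words:
                spans.append((i, i + len(words)))
    return spans


def is_negated(sentence, entity, patterns):
    """Toy NegEx: is the first occurrence of `entity` negated in `sentence`?"""
    tokens = re.findall(r"\w+", sentence.lower())
    ent_spans = find_phrases(tokens, [" ".join(re.findall(r"\w+", entity.lower()))])
    if not ent_spans:
        raise ValueError(f"entity {entity!r} not found in sentence")
    ent_start, ent_end = ent_spans[0]

    # Pseudo-negations mask any negation phrase they fully contain.
    pseudo = find_phrases(tokens, patterns["pseudo_negations"])

    def masked(span):
        return any(p[0] <= span[0] and span[1] <= p[1] for p in pseudo)

    preceding = [s for s in find_phrases(tokens, patterns["preceding_negations"]) if not masked(s)]
    following = [s for s in find_phrases(tokens, patterns["following_negations"]) if not masked(s)]
    termination = find_phrases(tokens, patterns["termination"])

    def blocked(lo, hi):
        # True if a termination phrase lies between token positions lo and hi.
        return any(lo <= t[0] and t[1] <= hi for t in termination)

    for _, end in preceding:
        if end <= ent_start and not blocked(end, ent_start):
            return True
    for start, _ in following:
        if start >= ent_end and not blocked(ent_end, start):
            return True
    return False


patterns = {
    "pseudo_negations": ["might not"],
    "preceding_negations": ["not"],
    "following_negations": ["declined"],
    "termination": ["but", "however"],
}
text = "She does not like Steve Jobs but likes Apple products."
print(is_negated(text, "Steve Jobs", patterns))  # negation reaches the entity
print(is_negated(text, "Apple", patterns))       # "but" terminates the scope
```

The termination phrase is what keeps "not" from reaching "Apple" in the second call. The pseudo-negation list plays the opposite role: it suppresses a trigger such as "not" when it appears inside a longer phrase like "might not".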

Termsets

Designate the termset to use; en_clinical is used by default.

  • en = phrases for general English-language text
  • en_clinical (default) = adds phrases specific to the clinical domain to the general English set
  • en_clinical_sensitive = adds further phrases to help rule out historical and possibly irrelevant entities

To set:

import spacy
from negspacy.negation import Negex
from negspacy.termsets import termset

# Use the general-English termset instead of the default en_clinical.
ts = termset("en")

nlp = spacy.load("en_core_web_sm")
nlp.add_pipe(
    "negex",
    config={
        "neg_termset": ts.get_patterns()
    }
)

Additional Functionality

Change patterns or view patterns in use

Replace all patterns with your own set

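import spacy
from negspacy.negation import Negex

nlp = spacy.load("en_core_web_sm")
nlp.add_pipe(
    "negex",
    config={
        # Supply every category when replacing the built-in patterns.
        "neg_termset": {
            "pseudo_negations": ["might not"],
            "preceding_negations": ["not"],
            "following_negations": ["declined"],
            "termination": ["but", "however"]
        }
    }
)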
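
When hand-writing a replacement termset, a misspelled category name is easy to miss. The following sketch checks a dictionary before it is passed to `add_pipe`. The helper `validate_neg_termset` is hypothetical, and the assumption that all four categories must be present (and no others) is inferred from the example above rather than taken from negspacy's documentation.

```python
REQUIRED_KEYS = (
    "pseudo_negations",
    "preceding_negations",
    "following_negations",
    "termination",
)


def validate_neg_termset(neg_termset):
    """Raise if the dict is not shaped like the four-category example above."""
    missing = [k for k in REQUIRED_KEYS if k not in neg_termset]
    unexpected = [k for k in neg_termset if k not in REQUIRED_KEYS]
    if missing or unexpected:
        raise KeyError(f"missing keys: {missing}; unexpected keys: {unexpected}")
    for key, phrases in neg_termset.items():
        if not isinstance(phrases, list) or not all(isinstance(p, str) for p in phrases):
            raise TypeError(f"{key!r} must map to a list of strings")
    return neg_termset


custom = {
    "pseudo_negations": ["might not"],
    "preceding_negations": ["not"],
    "following_negations": ["declined"],
    "termination": ["but", "however"],
}
validate_neg_termset(custom)  # passes silently
```

The validated dictionary can then be used as the value of `"neg_termset"` in the component config.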

Add and remove individual patterns on the fly from built-in termsets

from negspacy.termsets import termset
ts = termset("en")
ts.add_patterns(
    {
        "pseudo_negations": ["my favorite pattern"],
        "termination": ["these are", "great patterns", "but"],
        "preceding_negations": ["wow a negation"],
        "following_negations": ["extra negation"],
    }
)
# OR
ts.remove_patterns(
    {
        "termination": ["these are", "great patterns"],
        "pseudo_negations": ["my favorite pattern"],
        "preceding_negations": ["denied", "wow a negation"],
        "following_negations": ["unlikely", "extra negation"],
    }
)

View patterns in use

from negspacy.termsets import termset
ts = termset("en_clinical")
print(ts.get_patterns())
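
Printing a whole termset is verbose, so after adding or removing patterns it can help to see only what changed. The sketch below compares two pattern dictionaries of the shape returned by `get_patterns()`. The helper `diff_patterns` is hypothetical and works on plain dictionaries, so it runs without negspacy installed. The sample data is made up for illustration.

```python
import copy


def diff_patterns(before, after):
    """Return {category: {"added": [...], "removed": [...]}} for changed categories."""
    changes = {}
    for key in sorted(set(before) | set(after)):
        old, new = set(before.get(key, [])), set(after.get(key, []))
        added, removed = sorted(new - old), sorted(old - new)
        if added or removed:
            changes[key] = {"added": added, "removed": removed}
    return changes


# Made-up sample data standing in for two get_patterns() snapshots.
before = {"preceding_negations": ["not", "denied"], "termination": ["but"]}
after = copy.deepcopy(before)
after["preceding_negations"].remove("denied")
after["termination"].append("however")
print(diff_patterns(before, after))
```

With a real termset, take the first snapshot with `copy.deepcopy(ts.get_patterns())` before calling `add_patterns` or `remove_patterns`. This assumes the returned dictionary may be modified in place by those calls.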

Negations in noun chunks

Depending on the Named Entity Recognition model you are using, you may have negations "chunked together" with nouns. For example:

import spacy

# en_core_sci_sm is a scispacy model and must be installed separately.
nlp = spacy.load("en_core_sci_sm")
doc = nlp("There is no headache.")
for e in doc.ents:
    print(e.text)

# no headache

This would cause the Negex algorithm to miss the preceding negation. To account for this, you can add a chunk_prefix:

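import spacy
from negspacy.negation import Negex
from negspacy.termsets import termset

nlp = spacy.load("en_core_sci_sm")
ts = termset("en_clinical")
nlp.add_pipe(
    "negex",
    config={
        # en_clinical is the default; it is passed explicitly here for clarity.
        "neg_termset": ts.get_patterns(),
        "chunk_prefix": ["no"],
    },
    last=True,
)
doc = nlp("There is no headache.")
for e in doc.ents:
    print(e.text, e._.negex)

# no headache True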
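
The idea behind chunk_prefix can be illustrated without a model: check whether the entity text itself begins with one of the configured prefixes. The sketch below is a simplified, token-level illustration only. The function `split_chunk_prefix` is hypothetical, and negspacy's actual matching may differ (for example, in how multi-word prefixes are handled).

```python
def split_chunk_prefix(entity_text, chunk_prefixes):
    """Return (prefix, remainder) if the first token is a chunk prefix,
    otherwise (None, entity_text)."""
    tokens = entity_text.split()
    prefixes = {p.lower() for p in chunk_prefixes}
    if tokens and tokens[0].lower() in prefixes:
        return tokens[0], " ".join(tokens[1:])
    return None, entity_text


print(split_chunk_prefix("no headache", ["no"]))
print(split_chunk_prefix("nodule", ["no"]))
```

Comparing whole tokens rather than raw string prefixes is what keeps a word such as "nodule" from being treated as negated.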

Contributing

See the contributing guidelines in the repository.

Authors

  • Jeno Pizarro

License

MIT. See the license file in the repository.

Other libraries

This library is featured in the spaCy Universe. Check it out for other useful libraries and inspiration.

If you're looking for a spaCy pipeline object to extract values that correspond to a named entity (e.g., birth dates, account numbers, or laboratory results) take a look at extractacy.
