mholtzscher / spacy_readability

Licence: MIT license
spaCy pipeline component for adding text readability meta data to Doc objects.

spacy_readability

spaCy v2.0 pipeline component for calculating readability scores of text. Provides scores for Flesch-Kincaid grade level, Flesch-Kincaid reading ease, Dale-Chall, SMOG, Coleman-Liau index, Automated Readability Index, and FORCAST.

Installation

pip install spacy-readability

Usage

import spacy
from spacy_readability import Readability

# Load the English model (spaCy v2 shortcut name) and append the component.
nlp = spacy.load('en')
nlp.add_pipe(Readability())

doc = nlp("I am some really difficult text to read because I use obnoxiously large words.")

# Each score is exposed as a custom extension attribute on the Doc.
print(doc._.flesch_kincaid_grade_level)
print(doc._.flesch_kincaid_reading_ease)
print(doc._.dale_chall)
print(doc._.smog)
print(doc._.coleman_liau_index)
print(doc._.automated_readability_index)
print(doc._.forcast)

Readability Scores

Readability is the ease with which a reader can understand a written text. In natural language, the readability of text depends on its content (the complexity of its vocabulary and syntax) and its presentation (such as typographic aspects like font size, line height, and line length).

Popular Metrics

  • The Flesch formulas:

    • Flesch-Kincaid Readability Score

    • Flesch-Kincaid Reading Ease

  • Dale-Chall formula

  • SMOG

  • Coleman-Liau Index

  • Automated Readability Index

  • FORCAST

For more in-depth reading.
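To make a few of the metrics above concrete, here is a minimal sketch of the standard published formulas for Flesch Reading Ease, Flesch-Kincaid Grade Level, and the Automated Readability Index. It is an illustration only: the function names and the sample counts are hypothetical, and the sketch takes pre-computed word, sentence, syllable, and character counts rather than showing how spacy_readability derives them internally, which may differ.

```python
# Illustrative only: standard textbook formulas, not the library's own code.

def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Higher scores indicate text that is easier to read."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Approximate US school grade level needed to understand the text."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def automated_readability_index(characters: int, words: int, sentences: int) -> float:
    """Grade-level estimate based on characters per word, not syllables."""
    return 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43


# Hypothetical counts for a 100-word passage (assumed values, not real data).
words, sentences, syllables, characters = 100, 5, 150, 450

print(round(flesch_reading_ease(words, sentences, syllables), 1))           # 59.6
print(round(flesch_kincaid_grade(words, sentences, syllables), 1))          # 9.9
print(round(automated_readability_index(characters, words, sentences), 1))  # 9.8
```

All three formulas depend on average sentence length; the Flesch formulas add syllables per word, while the Automated Readability Index uses characters per word and so needs no syllable counter.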

Contributing

Setup

  1. Install Poetry
  2. Run make setup to prepare workspace

Testing

  1. Run make test to run all tests

Linting

  1. Run make format to run black code formatter
  2. Run make lint to run pylint
  3. Run make mypy to run mypy