tsproisl / Linguistic_and_stylistic_complexity

Licence: gpl-3.0
Linguistic and stylistic complexity measures for (literary) texts


Linguistic and Stylistic Complexity

This project is a collection of measures that assess the linguistic and stylistic complexity of (literary) texts.

Usage

You can use the script bin/lascomplexity.py to compute all implemented complexity measures from the command line. The vocabulary-based and dependency-based complexity measures are language-independent; the constituent-based measures rely on the NEGRA parsing scheme, i.e. they can only be applied to German data.

The input has to be a CoNLL-style text file with six tab-separated columns and an empty line after each sentence. The six columns are: word index, word, part-of-speech tag, index of dependency head, dependency relation, phrase structure tree. Missing values can be represented by an underscore (_). Here is a short example with two sentences:

1	Das	ART	3	NK	(TOP(S(NP*
2	fremde	ADJA	3	NK	*
3	Schiff	NN	4	SB	*)
4	war	VAFIN	-1	--	*
5	nicht	PTKNEG	6	NG	(AVP*
6	allein	ADV	4	MO	*)
7	.	$.	6	--	*))

1	Sieben	CARD	2	NK	(TOP(S(NP*
2	weitere	ADJA	3	MO	*)
3	begleiteten	VVFIN	-1	--	*
4	es	PPER	3	OA	*
5	.	$.	4	--	*))

Without any options, the script computes all measures:

lascomplexity.py <file>

You can also request subsets of the measures via the -v/--voc, -d/--dep and -c/--const options for vocabulary-based, dependency-based and constituent-based measures. More detailed usage information is available via:

lascomplexity.py -h

Vocabulary-based complexity measures

Measures that use sample size and vocabulary size

  • Type-token ratio
  • Guiraud's R
  • Herdan's C
  • Dugast's k
  • Maas' a2
  • Dugast's U
  • Tuldava's LN
  • Brunet's W
  • Carroll's CTTR
  • Summer's S
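As an illustration of this family, the following Python sketch computes three of the measures from the sample size N (number of tokens) and the vocabulary size V (number of types), using their commonly cited definitions: TTR = V/N, Guiraud's R = V/√N and Herdan's C = log V / log N. The function name sample_size_measures is hypothetical, and the project's implementation may differ in details such as tokenization or case handling.

```python
import math
from typing import Dict, Sequence


def sample_size_measures(tokens: Sequence[str]) -> Dict[str, float]:
    """Compute a few measures that depend only on N and V."""
    n = len(tokens)       # sample size N
    v = len(set(tokens))  # vocabulary size V
    if n < 2:
        raise ValueError("need at least two tokens (log N must be non-zero)")
    return {
        "ttr": v / n,
        "guiraud_r": v / math.sqrt(n),
        "herdan_c": math.log(v) / math.log(n),
    }


# Lower-cased tokens of the two example sentences (lower-casing is an assumption).
tokens = ["das", "fremde", "schiff", "war", "nicht", "allein", ".",
          "sieben", "weitere", "begleiteten", "es", "."]
measures = sample_size_measures(tokens)
print(round(measures["ttr"], 4))  # → 0.9167
```

Here N = 12 and V = 11, because the full stop occurs twice.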

Measures that use part of the frequency spectrum

  • Honoré's H
  • Sichel's S
  • Michéa's M

Measures that use the whole frequency spectrum

  • Entropy
  • Yule's K
  • Simpson's D
  • Herdan's Vm
  • McCarthy and Jarvis' HD-D
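The Python sketch below shows how three of these measures can be derived from type frequencies and the frequency spectrum V(i, N), the number of types that occur exactly i times. It uses the textbook formulas: entropy as −Σ p log p over types, Yule's K = 10⁴ · (Σ i² V(i, N) − N) / N², and Simpson's D = Σ V(i, N) · (i/N) · ((i−1)/(N−1)). The function name and the choice of base-2 logarithms are assumptions, not taken from the project.

```python
import math
from collections import Counter
from typing import Dict, Sequence


def spectrum_measures(tokens: Sequence[str]) -> Dict[str, float]:
    """Entropy, Yule's K and Simpson's D from the full frequency spectrum."""
    n = len(tokens)
    if n < 2:
        raise ValueError("need at least two tokens")
    type_freqs = Counter(tokens)             # type -> frequency
    spectrum = Counter(type_freqs.values())  # frequency i -> V(i, N)

    # Entropy in bits (base 2 is an assumption).
    entropy = -sum((f / n) * math.log2(f / n) for f in type_freqs.values())
    yule_k = 1e4 * (sum(i * i * v_i for i, v_i in spectrum.items()) - n) / (n * n)
    simpson_d = sum(v_i * (i / n) * ((i - 1) / (n - 1)) for i, v_i in spectrum.items())
    return {"entropy": entropy, "yule_k": yule_k, "simpson_d": simpson_d}


result = spectrum_measures(["a", "b", "a", "c"])
print(result["entropy"], result["yule_k"])  # → 1.5 1250.0
```

For the toy input, the spectrum is V(1, 4) = 2 and V(2, 4) = 1, so Σ i² V(i, N) = 6 and Simpson's D is 1/6.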

Parameters of probabilistic models

  • Orlov's Z

Measures that use the whole text

  • Covington and McFall's MATTR
  • MTLD
  • Kubát and Milička's STTR
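To illustrate what "using the whole text" means here, the following Python sketch computes a moving-average type-token ratio (MATTR): the type-token ratio is calculated for every window of a fixed number of consecutive tokens, and the window values are averaged. The function name mattr, the window parameter and the decision to raise an error for texts shorter than the window are assumptions; the project's window size and edge-case handling may differ.

```python
from typing import Sequence


def mattr(tokens: Sequence[str], window: int) -> float:
    """Mean type-token ratio over all windows of `window` consecutive tokens."""
    if window < 1:
        raise ValueError("window must be positive")
    if len(tokens) < window:
        raise ValueError("text is shorter than the window")
    ratios = [
        len(set(tokens[start:start + window])) / window
        for start in range(len(tokens) - window + 1)
    ]
    return sum(ratios) / len(ratios)


toy = ["a", "b", "a", "c"]
print(mattr(toy, 2))  # → 1.0
```

With a window of 3, the same input yields the windows (a, b, a) and (b, a, c) with ratios 2/3 and 1, hence a MATTR of 5/6. This straightforward version recomputes each window from scratch; an incremental update of the type counts would be faster for long texts.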

Shallow syntactic complexity measures

  • Average sentence length
  • Average punctuation per sentence
  • Average punctuation per token
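A rough Python sketch of these shallow measures follows. It makes two assumptions that are not confirmed by the project: punctuation is identified by part-of-speech tags starting with "$" (as in the tag "$." of the example input), and punctuation per token is computed as total punctuation divided by total tokens. The function name shallow_measures is hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def shallow_measures(sentences: Sequence[Sequence[Tuple[str, str]]]) -> Dict[str, float]:
    """Sentence length and punctuation statistics from (word, pos) pairs."""
    n_sentences = len(sentences)
    n_tokens = sum(len(sentence) for sentence in sentences)
    if n_sentences == 0 or n_tokens == 0:
        raise ValueError("need at least one non-empty sentence")
    # Assumption: punctuation tags start with "$".
    n_punct = sum(
        1 for sentence in sentences for _, pos in sentence if pos.startswith("$")
    )
    return {
        "avg_sentence_length": n_tokens / n_sentences,
        "avg_punct_per_sentence": n_punct / n_sentences,
        "avg_punct_per_token": n_punct / n_tokens,
    }


example: List[List[Tuple[str, str]]] = [
    [("Das", "ART"), ("fremde", "ADJA"), ("Schiff", "NN"), ("war", "VAFIN"),
     ("nicht", "PTKNEG"), ("allein", "ADV"), (".", "$.")],
    [("Sieben", "CARD"), ("weitere", "ADJA"), ("begleiteten", "VVFIN"),
     ("es", "PPER"), (".", "$.")],
]
shallow = shallow_measures(example)
print(shallow["avg_sentence_length"])  # → 6.0
```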

Dependency-based measures

  • Average dependency distance
  • Average closeness centrality
  • Average outdegree centralization
  • Average closeness centralization
  • Average longest shortest path
  • Average dependents per token
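As an example from this group, average dependency distance can be sketched in Python as the mean absolute difference between a token's index and the index of its head, computed per sentence and then averaged over sentences. This is one common definition and an assumption here: the project may, for instance, exclude punctuation or aggregate differently. The function name and the (index, head) input format are hypothetical; a head below 1 (such as -1 in the example input) or a missing head is treated as "no head".

```python
from typing import List, Optional, Sequence, Tuple

Edge = Tuple[int, Optional[int]]  # (token index, head index)


def average_dependency_distance(sentences: Sequence[Sequence[Edge]]) -> float:
    """Mean over sentences of the mean |index - head| over non-root tokens."""
    per_sentence: List[float] = []
    for sentence in sentences:
        distances = [
            abs(index - head)
            for index, head in sentence
            if head is not None and head >= 1  # skip the root and missing heads
        ]
        if distances:  # ignore sentences without any dependency edge
            per_sentence.append(sum(distances) / len(distances))
    if not per_sentence:
        raise ValueError("no dependency edges found")
    return sum(per_sentence) / len(per_sentence)


# (index, head) pairs of the two example sentences.
edges = [
    [(1, 3), (2, 3), (3, 4), (4, -1), (5, 6), (6, 4), (7, 6)],
    [(1, 2), (2, 3), (3, -1), (4, 3), (5, 4)],
]
add = average_dependency_distance(edges)
print(round(add, 4))  # → 1.1667
```

The first sentence has six edges with a total distance of 8, the second has four edges with a total distance of 4, so the per-sentence means are 8/6 and 1.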

Constituent-based measures

Language-independent measures:

  • Average number of constituents
  • Average number of constituents without leaves
  • Average height of the parse trees
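For the bracket notation used in the sixth input column, a simple Python sketch can count constituents and nesting depth directly from the brackets. Each opening bracket starts one non-leaf constituent, and each asterisk stands for the token on that line. Reading the maximum nesting depth as the tree height is an assumption, since the project may count nodes or edges differently; the function name bracket_stats is hypothetical.

```python
from typing import Sequence, Tuple


def bracket_stats(tree_column: Sequence[str]) -> Tuple[int, int]:
    """Return (number of bracketed constituents, maximum nesting depth)."""
    depth = 0
    max_depth = 0
    constituents = 0
    for cell in tree_column:
        for char in cell:
            if char == "(":
                constituents += 1
                depth += 1
                max_depth = max(max_depth, depth)
            elif char == ")":
                depth -= 1
    if depth != 0:
        raise ValueError("unbalanced brackets in phrase structure column")
    return constituents, max_depth


# Phrase structure column of the first example sentence.
first = ["(TOP(S(NP*", "*", "*)", "*", "(AVP*", "*)", "*))"]
print(bracket_stats(first))  # → (4, 3)
```

Adding the number of tokens (seven in this sentence) to the first value gives a constituent count that includes the leaves.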

Language-dependent measures (defined for German):

  • Average number of T-units
  • Average number of complex T-units
  • Average number of clauses
  • Average number of dependent clauses
  • Average number of NPs
  • Average number of VPs
  • Average number of PPs
  • Average number of coordinate phrases