sfischer13 / python-arpa

Licence: MIT License
🐍 Python library for n-gram models in ARPA format

Python ARPA Package

Python library for reading ARPA n-gram models.

Setup

Python 3.4+

In order to install the Python 3 version:

$ pip install --user -U arpa

Python 2.7

In order to install the Python 2.7 version:

$ pip install --user -U arpa-backport

Usage

The package may be imported directly:

try:
    import arpa  # Python 3.4+
except ImportError:
    import arpa_backport as arpa  # Python 2.7

models = arpa.loadf("foo.arpa")
lm = models[0]  # ARPA files may contain several models.

# probability p(end|in, the)
lm.p("in the end")
lm.log_p("in the end")

# sentence score w/ sentence markers
lm.s("This is the end .")
lm.log_s("This is the end .")

# sentence score w/o sentence markers
lm.s("This is the end .", sos=False, eos=False)
lm.log_s("This is the end .", sos=False, eos=False)
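
To make the difference between the n-gram probability and the two sentence scores above more concrete, here is a minimal sketch of backoff scoring over a toy bigram model. It is not the package's implementation and does not use the package at all: the table TOY_NGRAMS, its numbers, and the helper functions log_p and log_s below are hypothetical, written only for illustration. It assumes the usual ARPA convention that each n-gram carries a base-10 log probability and a base-10 log backoff weight.

```python
import math

# Hypothetical toy model (illustration only, not read from a real ARPA file).
# Each n-gram maps to (log10 probability, log10 backoff weight).
TOY_NGRAMS = {
    ("<s>",): (-99.0, -0.3),
    ("the",): (-0.7, -0.3),
    ("end",): (-1.0, -0.2),
    ("</s>",): (-0.8, 0.0),
    ("<s>", "the"): (-0.3, 0.0),
    ("the", "end"): (-0.4, 0.0),
}


def log_p(ngram):
    """Return log10 p(last word | preceding words) using simple backoff."""
    ngram = tuple(ngram)
    if ngram in TOY_NGRAMS:
        return TOY_NGRAMS[ngram][0]
    if len(ngram) == 1:
        # This sketch has no unknown-word handling.
        raise KeyError("unseen word: %s" % ngram[0])
    # Back off: add the history's backoff weight (0.0 if the history is
    # unseen) and retry with the shortened history.
    history = ngram[:-1]
    backoff = TOY_NGRAMS.get(history, (0.0, 0.0))[1]
    return backoff + log_p(ngram[1:])


def log_s(sentence, order=2, sos=True, eos=True):
    """Sum the log10 probabilities of all words in a sentence."""
    words = sentence.split()
    if sos:
        words = ["<s>"] + words
    if eos:
        words = words + ["</s>"]
    # When a start marker is present it only serves as context
    # and is not scored itself.
    first = 1 if sos else 0
    total = 0.0
    for i in range(first, len(words)):
        context = words[max(0, i - order + 1): i + 1]
        total += log_p(context)
    return total


if __name__ == "__main__":
    print(log_p(["the", "end"]))                   # bigram found directly
    print(log_p(["end", "</s>"]))                  # bigram missing, backs off
    print(10 ** log_p(["the", "end"]))             # plain probability
    print(log_s("the end"))                        # with sentence markers
    print(log_s("the end", sos=False, eos=False))  # without sentence markers
```

In this sketch the score with sentence markers also pays for starting the sentence with "the" and for ending it after "end", which is why it is lower than the score without markers. Unknown words, which a real model would typically map to an unknown-word token, are deliberately left out.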

Development

Contributions are welcome!
Write a bug report or send a pull request.
Other contributors have done so before.

License

Copyright (c) 2015-2018 Stefan Fischer
The source code is available under the MIT License.
See LICENSE for further details.
