
rguthrie3 / MorphologicalPriorsForWordEmbeddings

Licence: other
Code for EMNLP 2016 paper: Morphological Priors for Probabilistic Word Embeddings

Programming Languages

python

Projects that are alternatives to or similar to MorphologicalPriorsForWordEmbeddings

neuralnets-semantics
Word semantics Deep Learning with Vanilla Python, Keras, Theano, TensorFlow, PyTorch
Stars: ✭ 15 (-71.7%)
Mutual labels:  theano, word-embeddings
entrepot
A list of free GitHub.com hosted WordPress plugins, themes & blocks
Stars: ✭ 29 (-45.28%)
Mutual labels:  blocks
Final-year-project-deep-learning-models
Deep learning for freehand sketch object recognition
Stars: ✭ 22 (-58.49%)
Mutual labels:  theano
dasem
Danish semantic analysis
Stars: ✭ 17 (-67.92%)
Mutual labels:  word-embeddings
mcthings
A Python framework for creating 3D scenes in Minecraft and Minetest
Stars: ✭ 44 (-16.98%)
Mutual labels:  blocks
aino-blocks
Aino blocks are a collection of Gutenberg editor blocks for page building in WordPress.
Stars: ✭ 57 (+7.55%)
Mutual labels:  blocks
VNMT
Code for "Variational Neural Machine Translation" (EMNLP2016)
Stars: ✭ 54 (+1.89%)
Mutual labels:  theano
sortboard
A small ES6 library for easy sorting and filtering of elements.
Stars: ✭ 29 (-45.28%)
Mutual labels:  blocks
DeepLearning-IDS
Network Intrusion Detection System using Deep Learning Techniques
Stars: ✭ 76 (+43.4%)
Mutual labels:  theano
word2vec-on-wikipedia
A pipeline for training word embeddings using word2vec on wikipedia corpus.
Stars: ✭ 68 (+28.3%)
Mutual labels:  word-embeddings
bihm
Bidirectional Helmholtz Machines
Stars: ✭ 40 (-24.53%)
Mutual labels:  theano
fetch-all-the-things
A list of *nix fetch utilities
Stars: ✭ 43 (-18.87%)
Mutual labels:  blocks
poet
Configuration-based post type, taxonomy, block category, and block registration for Sage 10.
Stars: ✭ 124 (+133.96%)
Mutual labels:  blocks
PersianNER
Named-Entity Recognition in Persian Language
Stars: ✭ 48 (-9.43%)
Mutual labels:  word-embeddings
Arabic-Word-Embeddings-Word2vec
Arabic Word Embeddings Word2vec
Stars: ✭ 26 (-50.94%)
Mutual labels:  word-embeddings
cudnn rnn theano benchmarks
No description or website provided.
Stars: ✭ 22 (-58.49%)
Mutual labels:  theano
pgsqlblocks
pgSqlBlocks is a standalone application written in Java that makes it easy to browse processes and get information about locks and waiting queries in a PostgreSQL DBMS. It displays the status of the database connection as well as information about processes in the database.
Stars: ✭ 23 (-56.6%)
Mutual labels:  blocks
contextualLSTM
Contextual LSTM for NLP tasks like word prediction and word embedding creation for Deep Learning
Stars: ✭ 28 (-47.17%)
Mutual labels:  word-embeddings
chainDB
A noSQL database based on blockchain technology
Stars: ✭ 13 (-75.47%)
Mutual labels:  blocks
pair2vec
pair2vec: Compositional Word-Pair Embeddings for Cross-Sentence Inference
Stars: ✭ 62 (+16.98%)
Mutual labels:  word-embeddings

Morphological Priors for Probabilistic Neural Word Embeddings

Implementation of Morphological Priors for Probabilistic Neural Word Embeddings.

[Figure: model demo]

VarEmbed in Blocks

This is the implementation for the following paper, to appear at EMNLP 2016: Morphological Priors for Probabilistic Neural Word Embeddings. Parminder Bhatia, Robert Guthrie, Jacob Eisenstein.

Uses LSTMs to build word embeddings that incorporate both word-level and morpheme-level information, implemented with Blocks and Fuel. The LSTM code is modified from https://github.com/johnarevalo/blocks-char-rnn.git.
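As a conceptual sketch only (this is not the repository's Blocks code, and the paper's actual model places a probabilistic morphological prior on latent embeddings rather than summing deterministically), one simple way to combine word-level and morpheme-level information is:

# Conceptual sketch: combine a word-level vector with the sum of the word's
# morpheme vectors. The paper's model is probabilistic; this deterministic
# sum only illustrates the two information sources being combined.
import numpy as np

def embed(word, word_vecs, morpheme_vecs, segment):
    # segment(word) returns the word's morphemes, e.g. from a Morfessor model
    morph_sum = np.sum([morpheme_vecs[m] for m in segment(word)], axis=0)
    return word_vecs[word] + morph_sum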

Requirements

  • Install Blocks. Please see the documentation for more information.

  • Install Fuel. Please see the documentation for more information.

  • Install the Morfessor Python package.

Results

[Figure: results histogram]

Usage

The input can be any raw, pre-tokenized text. The steps below walk through generating the Morfessor model, preprocessing and packaging the data as NDArrays, and training the model.

You will need to train a Morfessor model on your data. A script for this has been provided. It will output a serialized Morfessor model for later use.

python train_morfessor.py --training-data <input.txt> --output <output.bin>
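For orientation, here is a minimal sketch of what such a script might do with the Morfessor 2.0 Python API; the file names are illustrative, and train_morfessor.py in the repository is the authoritative version.

# Sketch of Morfessor training using the Morfessor 2.0 Python API.
# Illustrative only; see train_morfessor.py for the repository's script.
import morfessor

io = morfessor.MorfessorIO()

# read_corpus_file yields (count, compound) pairs from raw tokenized text
train_data = list(io.read_corpus_file("input.txt"))

model = morfessor.BaselineModel()
model.load_data(train_data)
model.train_batch()  # batch training over the full corpus

io.write_binary_model_file("output.bin", model)  # serialize for later use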

The data set needs to be preprocessed and formatted using preprocess_data.py and make_dataset.py. The -h flag lists the arguments each script needs. Preprocessing downcases the text so that capitalization doesn't affect Morfessor.

python preprocess_data.py <textfile>.trn -o <output_file> -n <unks all but top N words>

python make_dataset.py <textfile>.trn -mf <morfessor_model.bin>
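To make the two steps concrete, here is a hedged sketch of what they amount to. The file names and cutoff below are illustrative; preprocess_data.py and make_dataset.py are the authoritative implementations.

# Illustrative sketch of preprocessing (downcase + UNK rare words) and
# morpheme segmentation; the repository's scripts are authoritative.
from collections import Counter
import morfessor

UNK = "<UNK>"
TOP_N = 50000  # hypothetical cutoff, set via -n in the real script

# Downcase so capitalization doesn't affect Morfessor
tokens = open("textfile.trn").read().lower().split()

# Replace all but the top-N most frequent words with UNK
vocab = {w for w, _ in Counter(tokens).most_common(TOP_N)}
tokens = [w if w in vocab else UNK for w in tokens]

# Segment each vocabulary word into morphemes with the trained model
io = morfessor.MorfessorIO()
model = io.read_binary_model_file("morfessor_model.bin")
segmentations = {w: model.viterbi_segment(w)[0] for w in vocab}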

Next, run train.py to train the model. It will print statistics after each mini-batch.

python train.py <filename>.hdf5

Parameters like batch size, embedding dimension, and the number of epochs can be changed in the config.py file.
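The exact parameter names in config.py may differ; a plausible sketch of such a configuration module:

# Hypothetical sketch of config.py; the repository's actual names may differ.
batch_size = 64        # examples per mini-batch
embedding_dim = 128    # dimensionality of the word embeddings
num_epochs = 10        # passes over the training data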

Finally, word vectors can be output in the format word dim1 dim2 ..., one word per line, via the output_word_vectors.py script. Provide it a vocabulary of words whose vectors should be output, as well as a serialized network from training.
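For illustration, writing vectors in that plain-text format might look like the following; the vocab and vectors variables are assumed inputs, not the script's actual internals.

# Illustrative: write vectors as "word dim1 dim2 ...", one word per line.
# `vocab` and `vectors` are assumed to come from the deserialized network.
with open("vectors.txt", "w") as f:
    for word, vec in zip(vocab, vectors):
        f.write(word + " " + " ".join("%.6f" % x for x in vec) + "\n")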
