mit-nlp / Text.jl

License: Apache-2.0

Numerous tools for text processing

TEXT: Numerous tools for text processing

This package is a Julia implementation of:

Text classification based on bag-of-words (BoW) models (e.g. topic/language ID)
Language ID (training and processing) based on word and character n-grams
Lewis's SMART stop list for English
TF-IDF/TF-LLR text feature normalization
n-gram feature extractors
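
To make the n-gram and TF-IDF items above concrete, here is a minimal standalone sketch in plain Julia. It does not use Text.jl, and the function names (char_ngrams, word_ngrams, count_features, tfidf) are hypothetical, chosen for illustration only; the package's own API may differ, so see test/runtests.jl for the real calls. The sketch assumes whitespace tokenization and the common tf * log(N / df) weighting. TF-LLR is not shown.

```julia
# Illustrative sketch only: these names are hypothetical, not Text.jl's API.

# Character n-grams of a string (indexes characters, not bytes).
function char_ngrams(text::AbstractString, n::Integer)
    chars = collect(text)
    return [String(chars[i:i+n-1]) for i in 1:(length(chars) - n + 1)]
end

# Word n-grams: split on whitespace, join each window with a single space.
function word_ngrams(text::AbstractString, n::Integer)
    tokens = split(text)
    return [join(tokens[i:i+n-1], " ") for i in 1:(length(tokens) - n + 1)]
end

# Bag-of-words style counts: feature => number of occurrences.
function count_features(features)
    counts = Dict{String,Int}()
    for f in features
        counts[f] = get(counts, f, 0) + 1
    end
    return counts
end

# TF-IDF over a collection of count dictionaries.
# tf  = count / total count in the document
# idf = log(number of documents / number of documents containing the term)
function tfidf(docs::Vector{Dict{String,Int}})
    ndocs = length(docs)
    df = Dict{String,Int}()
    for doc in docs, term in keys(doc)
        df[term] = get(df, term, 0) + 1
    end
    return map(docs) do doc
        total = sum(values(doc))
        Dict(term => (c / total) * log(ndocs / df[term]) for (term, c) in doc)
    end
end

docs = ["the cat sat", "the dog sat", "a bird flew"]
counts = [count_features(word_ngrams(d, 1)) for d in docs]
weights = tfidf(counts)
```

In this toy corpus, "cat" occurs in one document and "the" in two, so "cat" receives the larger weight in the first document. Character n-grams from char_ngrams can be passed through count_features and tfidf the same way, which is the usual setup for language ID.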

Prerequisites

Stage - Needed for logging and memoization (Note: requires manual install)
Ollam - online learning modules (Note: requires manual install)
Devectorize - macro-based devectorization
DataStructures - for DefaultDict
GZip
Iterators - for iterator helper functions

Install

This is an experimental package that is not currently registered in the Julia central repository. You can install it via:

Pkg.clone("https://github.com/saltpork/Stage.jl")
Pkg.clone("https://github.com/mit-nlp/Ollam.jl")
Pkg.clone("https://github.com/mit-nlp/Text.jl")

Usage

See test/runtests.jl for detailed usage.

License

This package was created for the DARPA XDATA and Memex programs and is released under the Apache v2 License.
