
yooper / Php Text Analysis

Licence: mit

PHP Text Analysis is a library for performing Information Retrieval (IR) and Natural Language Processing (NLP) tasks using the PHP language

Labels

nlp text-analysis

Projects that are alternatives of or similar to Php Text Analysis

HurdleDMR.jl

Hurdle Distributed Multinomial Regression (HDMR) implemented in Julia

Stars: ✭ 19 (-95.37%)

Mutual labels: text-analysis

kwx

BERT, LDA, and TFIDF based keyword extraction in Python

Stars: ✭ 33 (-91.95%)

Mutual labels: text-analysis

Text mining resources

Resources for learning about Text Mining and Natural Language Processing

Stars: ✭ 358 (-12.68%)

Mutual labels: text-analysis

occupationcoder

Given a job title and job description, the algorithm assigns a standard occupational classification (SOC) code to the job.

Stars: ✭ 30 (-92.68%)

Mutual labels: text-analysis

DaDengAndHisPython

[WeChat official account: 大邓和他的python (Da Deng and His Python)] Python syntax quick start: https://www.bilibili.com/video/av44384851 Python web scraping quick start: https://www.bilibili.com/video/av72010301 My contact email: [email protected]

Stars: ✭ 59 (-85.61%)

Mutual labels: text-analysis

Textpipe

Textpipe: clean and extract metadata from text

Stars: ✭ 284 (-30.73%)

Mutual labels: text-analysis

aylien textapi go

AYLIEN's officially supported Go client library for accessing Text API

Stars: ✭ 15 (-96.34%)

Mutual labels: text-analysis

Jekyll

Jekyll-based static site for The Programming Historian

Stars: ✭ 387 (-5.61%)

Mutual labels: text-analysis

support-tickets-classification

This case study shows how to create a model for text analysis and classification and deploy it as a web service in Azure cloud in order to automatically classify support tickets. This project is a proof of concept made by Microsoft (Commercial Software Engineering team) in collaboration with Endava http://endava.com/en

Stars: ✭ 142 (-65.37%)

Mutual labels: text-analysis

Artificial Adversary

🗣️ Tool to generate adversarial text examples and test machine learning models against them

Stars: ✭ 348 (-15.12%)

Mutual labels: text-analysis

YelpDatasetSQL

Working with the Yelp Dataset in Azure SQL and SQL Server

Stars: ✭ 16 (-96.1%)

Mutual labels: text-analysis

LSX

A word embeddings-based semi-supervised model for document scaling

Stars: ✭ 42 (-89.76%)

Mutual labels: text-analysis

Graphbrain

Language, Knowledge, Cognition

Stars: ✭ 294 (-28.29%)

Mutual labels: text-analysis

rita

Website, documentation and examples for RiTa

Stars: ✭ 42 (-89.76%)

Mutual labels: text-analysis

Python Course

Tutorial and introduction into programming with Python for the humanities and social sciences

Stars: ✭ 370 (-9.76%)

Mutual labels: text-analysis

learning-stm

Learning structural topic modeling using the stm R package.

Stars: ✭ 103 (-74.88%)

Mutual labels: text-analysis

aylien textapi nodejs

AYLIEN's officially supported node.js client library for accessing Text API

Stars: ✭ 13 (-96.83%)

Mutual labels: text-analysis

Whatlang Rs

Natural language detection library for Rust. Try demo online: https://www.greyblake.com/whatlang/

Stars: ✭ 400 (-2.44%)

Mutual labels: text-analysis

Open Semantic Search

Open Source research tool to search, browse, analyze and explore large document collections by Semantic Search Engine and Open Source Text Mining & Text Analytics platform (Integrates ETL for document processing, OCR for images & PDF, named entity recognition for persons, organizations & locations, metadata management by thesaurus & ontologies, search user interface & search apps for fulltext search, faceted search & knowledge graph)

Stars: ✭ 386 (-5.85%)

Mutual labels: text-analysis

Giveme5w1h

Extraction of the journalistic five W and one H questions (5W1H) from news articles: who did what, when, where, why, and how?

Stars: ✭ 316 (-22.93%)

Mutual labels: text-analysis


php-text-analysis

PHP Text Analysis is a library for performing Information Retrieval (IR) and Natural Language Processing (NLP) tasks using the PHP language. There are tools in this library that can perform:

document classification
sentiment analysis
document comparison
frequency analysis
tokenization
stemming
collocations with Pointwise Mutual Information
lexical diversity
corpus analysis
text summarization

All the documentation for this project can be found in the book and wiki.

PHP Text Analysis Book & Wiki

A book is in the works and your contributions are needed. You can find the book at https://github.com/yooper/php-text-analysis-book

Documentation for the library also resides in the wiki: https://github.com/yooper/php-text-analysis/wiki

Installation Instructions

Add PHP Text Analysis to your project

composer require yooper/php-text-analysis

Tokenization

$tokens = tokenize($text);

To use a different tokenizer, pass the name of the tokenizer class as the second argument:

$tokens = tokenize($text, \TextAnalysis\Tokenizers\PennTreeBankTokenizer::class);

The default tokenizer is \TextAnalysis\Tokenizers\GeneralTokenizer::class. Some tokenizers require parameters to be set upon instantiation.
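
The class-name dispatch described above can be sketched in plain PHP. This is only an illustration of the pattern, not the library's code: `SketchTokenizer`, `SketchWhitespaceTokenizer`, `SketchPunctuationTokenizer`, and `sketch_tokenize` are hypothetical names, and the sketch assumes tokenizers that need no constructor parameters.

```php
<?php
// Hypothetical interface and classes, for illustration only.
interface SketchTokenizer
{
    public function tokenize(string $text): array;
}

final class SketchWhitespaceTokenizer implements SketchTokenizer
{
    public function tokenize(string $text): array
    {
        // Split on runs of whitespace and drop empty pieces.
        return preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);
    }
}

final class SketchPunctuationTokenizer implements SketchTokenizer
{
    public function tokenize(string $text): array
    {
        // Split on runs of whitespace or Unicode punctuation.
        return preg_split('/[\s\p{P}]+/u', $text, -1, PREG_SPLIT_NO_EMPTY);
    }
}

// Accepts a class name, instantiates it, and delegates to its tokenize().
function sketch_tokenize(string $text, string $tokenizerClass = SketchWhitespaceTokenizer::class): array
{
    if (!is_a($tokenizerClass, SketchTokenizer::class, true)) {
        throw new InvalidArgumentException("$tokenizerClass is not a SketchTokenizer");
    }
    $tokenizer = new $tokenizerClass();
    return $tokenizer->tokenize($text);
}

$default = sketch_tokenize('Hello, world! Hi');
$custom  = sketch_tokenize('Hello, world! Hi', SketchPunctuationTokenizer::class);
```

Passing a class name rather than an instance keeps the call site short, at the cost of not supporting tokenizers whose constructors need arguments.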

Normalization

By default, normalize_tokens lowercases every token with strtolower. To customize normalization, pass a callable as the second argument, either a closure or a function name string; it is applied to each token via array_map.

$normalizedTokens = normalize_tokens($tokens);

$normalizedTokens = normalize_tokens($tokens, 'mb_strtolower');

$normalizedTokens = normalize_tokens($tokens, function($token){ return mb_strtoupper($token); });
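
The behavior described above amounts to mapping a callable over the token array. Here is a minimal sketch in plain PHP, assuming nothing about the library's internals beyond the array_map description; `sketch_normalize_tokens` is a hypothetical name, not a library function.

```php
<?php
// Hypothetical stand-in for illustration; not the library's implementation.
function sketch_normalize_tokens(array $tokens, ?callable $normalizer = null): array
{
    // Fall back to strtolower, mirroring the default described above.
    $normalizer = $normalizer ?? 'strtolower';
    return array_map($normalizer, $tokens);
}

$tokens = ['The', 'Quick', 'BROWN', 'Fox'];

// Default: lowercase every token.
$lower = sketch_normalize_tokens($tokens);

// Custom: any callable works, here a closure that uppercases.
$upper = sketch_normalize_tokens($tokens, function ($token) {
    return strtoupper($token);
});
```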

Frequency Distributions

The call to freq_dist returns a FreqDist instance.

$freqDist = freq_dist(tokenize($text));
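
Conceptually, a frequency distribution maps each distinct token to the number of times it occurs. The sketch below shows that idea with PHP built-ins only; `sketch_frequencies` is a hypothetical helper, and it says nothing about the methods the library's FreqDist class actually exposes.

```php
<?php
// Hypothetical helper for illustration; not the library's FreqDist.
function sketch_frequencies(array $tokens): array
{
    // Count occurrences of each distinct token.
    $counts = array_count_values($tokens);
    // Sort by count, highest first, keeping token => count keys.
    arsort($counts);
    return $counts;
}

$counts = sketch_frequencies(['to', 'be', 'or', 'not', 'to', 'be']);
```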

Ngram Generation

By default bigrams are generated.

$bigrams = ngrams($tokens);

Customize the ngrams

// create trigrams with a pipe delimiter in between each word
$trigrams = ngrams($tokens, 3, '|');
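
N-gram generation itself is a sliding window over the token list. The following plain-PHP sketch shows the idea; `sketch_ngrams` is a hypothetical function, and the single-space default separator is an assumption rather than a documented default of the library.

```php
<?php
// Hypothetical helper for illustration; not the library's ngrams().
function sketch_ngrams(array $tokens, int $n = 2, string $separator = ' '): array
{
    if ($n < 1) {
        throw new InvalidArgumentException('n must be at least 1');
    }
    // Reindex so array_slice offsets line up with positions.
    $tokens = array_values($tokens);
    $ngrams = [];
    $last = count($tokens) - $n;
    // Slide a window of size n over the tokens and join each window.
    for ($i = 0; $i <= $last; $i++) {
        $ngrams[] = implode($separator, array_slice($tokens, $i, $n));
    }
    return $ngrams;
}

$tokens = ['one', 'two', 'three', 'four'];
$bigrams = sketch_ngrams($tokens);
$trigrams = sketch_ngrams($tokens, 3, '|');
```

With fewer than n tokens the window never fits, so the sketch returns an empty array.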

Stemming

By default, the stem function uses the Porter Stemmer.

$stemmedTokens = stem($tokens);

To use a different stemmer, pass the name of the stemmer class as the second argument:

$stemmedTokens = stem($tokens, \TextAnalysis\Stemmers\MorphStemmer::class);

Keyword Extraction with Rake

There is a shortcut function for the Rake algorithm. Clean your data before using it. The second parameter is the n-gram size of the keywords to extract.

$rake = rake($tokens, 3);
$results = $rake->getKeywordScores();

Sentiment Analysis with Vader

Need sentiment analysis with PHP? Use Vader (https://github.com/cjhutto/vaderSentiment). The PHP implementation can be invoked easily. Just normalize your data beforehand.

$sentimentScores = vader($tokens);

Document Classification with Naive Bayes

Need to do some document classification with PHP? Try the Naive Bayes implementation. An example of classifying movie reviews can be found in the unit tests.

$nb = naive_bayes();
// train with a label and the tokens of an example document
$nb->train('mexican', tokenize('taco nacho enchilada burrito'));
$nb->train('american', tokenize('hamburger burger fries pop'));
// classify an unseen document
$nb->predict(tokenize('my favorite food is a burrito'));
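
To show what training and prediction involve, here is a small multinomial Naive Bayes sketch with add-one (Laplace) smoothing in plain PHP. It is not the library's implementation: `SketchNaiveBayes` is a hypothetical class, `explode` stands in for a real tokenizer, and returning label => log score sorted best first is an assumption about a convenient return shape.

```php
<?php
// Hypothetical class for illustration; not the library's Naive Bayes.
final class SketchNaiveBayes
{
    private array $wordCounts = []; // label => [token => count]
    private array $docCounts = [];  // label => number of training documents
    private array $vocabulary = []; // token => true

    public function train(string $label, array $tokens): void
    {
        $this->docCounts[$label] = ($this->docCounts[$label] ?? 0) + 1;
        foreach ($tokens as $token) {
            $this->wordCounts[$label][$token] = ($this->wordCounts[$label][$token] ?? 0) + 1;
            $this->vocabulary[$token] = true;
        }
    }

    // Returns label => log score, highest score first.
    public function predict(array $tokens): array
    {
        $vocabSize = count($this->vocabulary);
        if ($vocabSize === 0) {
            throw new LogicException('Train with at least one token before predicting');
        }
        $totalDocs = array_sum($this->docCounts);
        $scores = [];
        foreach ($this->docCounts as $label => $docCount) {
            $labelTotal = array_sum($this->wordCounts[$label] ?? []);
            // Log prior: share of training documents carrying this label.
            $score = log($docCount / $totalDocs);
            foreach ($tokens as $token) {
                $count = $this->wordCounts[$label][$token] ?? 0;
                // Add-one smoothing keeps unseen tokens from zeroing the score.
                $score += log(($count + 1) / ($labelTotal + $vocabSize));
            }
            $scores[$label] = $score;
        }
        arsort($scores);
        return $scores;
    }
}

$nb = new SketchNaiveBayes();
$nb->train('mexican', explode(' ', 'taco nacho enchilada burrito'));
$nb->train('american', explode(' ', 'hamburger burger fries pop'));
$scores = $nb->predict(explode(' ', 'my favorite food is a burrito'));
$best = array_key_first($scores);
```

Scores are summed in log space so that long documents do not underflow to zero.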


Stars: ✭ 410
