panggi / Pujangga

Licence: apache-2.0

Pujangga - Indonesian Natural Language Processing Tool with REST API, an Interface for InaNLP and Deeplearning4j's Word2Vec

Pujangga

Indonesian Natural Language Processing REST API

An interface for InaNLP and Deeplearning4j's Word2Vec for Indonesian (Bahasa Indonesia) in the form of a REST API.

[Screenshot: a Pujangga request and response in the Paw REST client]

Local Setup

Install Scala 2.12.2 and Lightbend Activator
Clone the project

$ git clone git@github.com:panggi/pujangga.git

Download the dependencies

$ cd pujangga
$ activator

A pretrained word2vec model can be downloaded from https://drive.google.com/uc?id=0B5YTktu2dOKKNUY1OWJORlZTcUU&export=download
Run the application

$ export WORD2VEC_FILE=/path/to/word2vec_wiki_id   
$ activator run

The API is then available at http://localhost:9000
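
Every endpoint in this README takes a JSON body over POST and returns a JSON object with a "status" and a "data" field. As a minimal client sketch in Python, assuming the server runs locally on port 9000 and expects a JSON content type (the README does not state the header), the helpers below build a request and unwrap a reply. The names build_request, unwrap and call are hypothetical and not part of Pujangga.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9000"  # local address from the setup steps


def build_request(path, payload, base_url=BASE_URL):
    """Build a POST request carrying `payload` as a JSON body."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        base_url + path,
        data=body,
        # Assumption: the server expects a JSON content type.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def unwrap(raw):
    """Return the "data" field, or raise if "status" is not "success"."""
    envelope = json.loads(raw)
    if envelope.get("status") != "success":
        raise RuntimeError("request failed: %r" % (envelope,))
    return envelope["data"]


def call(path, payload):
    """Send the request and unwrap the reply (needs a running server)."""
    with urllib.request.urlopen(build_request(path, payload)) as response:
        return unwrap(response.read().decode("utf-8"))
```

With the server running, call("/stemmer", {"string": "..."}) would return the stemmed string. What a failed request looks like is not documented, so unwrap only checks that the status equals "success".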

API Endpoints

Stemmer

Request:

POST /stemmer

{
  "string": "Prof. Habibie akan melakukan kunjungan resmi ke PT. Pindad di Bandung"
}

Response:

{
  "status": "success",
  "data": "prof Habibie akan laku kunjung resmi ke pt Pindad di bandung"
}

Phrase Chunker

Request:

POST /phrasechunker

{
  "string": "Prof. Habibie akan melakukan kunjungan resmi ke PT. Pindad di Bandung"
}

Response:

{
  "status": "success",
  "data": {
    "map": {
      "Pindad ": "NP",
      "Prof. Habibie ": "NP",
      ".": ".",
      "di Bandung ": "PP",
      "akan melakukan kunjungan resmi ke PT ": "VP"
    },
    "list": [
      "NP",
      "VP",
      "NP",
      "PP"
    ]
  }
}
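
In the sample response, most phrase keys in "map" end with a trailing space, which makes direct lookups easy to get wrong. A small Python sketch of tidying the keys first; clean_chunks is a hypothetical helper name, and whether every response carries the same padding is an assumption based on this one sample.

```python
def clean_chunks(data):
    """Return {phrase: tag} with surrounding whitespace stripped from each phrase."""
    return {phrase.strip(): tag for phrase, tag in data["map"].items()}


# The "data" object from the sample response above.
sample = {
    "map": {
        "Pindad ": "NP",
        "Prof. Habibie ": "NP",
        ".": ".",
        "di Bandung ": "PP",
        "akan melakukan kunjungan resmi ke PT ": "VP",
    },
    "list": ["NP", "VP", "NP", "PP"],
}

chunks = clean_chunks(sample)
print(chunks["di Bandung"])  # PP
```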

Part-of-Speech Tagger

Request:

POST /postagger

{
  "string": "Prof. Habibie akan melakukan kunjungan resmi ke PT. Pindad di Bandung"
}

Response:

{
  "status": "success",
  "data": {
    "map": {
      "resmi": "JJ",
      ".": ".",
      "akan": "MD",
      "ke": "IN",
      "di": "IN",
      "Bandung": "NNP",
      "Pindad": "NNP",
      "PT": "NN",
      "Prof.": "NNP",
      "kunjungan": "NN",
      "Habibie": "NNP",
      "melakukan": "VBT"
    },
    "list": [
      "NNP",
      "NNP",
      "MD",
      "VBT",
      "NN",
      "JJ",
      "IN",
      "NN",
      "NNP",
      "IN",
      "NNP"
    ]
  }
}
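
Because "map" is keyed by token, finding every token that carries a given tag means inverting it. Note also that a JSON object holds each key once, so a token that repeats in a sentence can appear only once in "map". A minimal Python sketch over the sample above; tokens_by_tag is a hypothetical name.

```python
from collections import defaultdict


def tokens_by_tag(tag_map):
    """Invert {token: tag} into {tag: sorted list of tokens}."""
    grouped = defaultdict(list)
    for token, tag in tag_map.items():
        grouped[tag].append(token)
    return {tag: sorted(tokens) for tag, tokens in grouped.items()}


# The "map" object from the sample response above.
sample_map = {
    "resmi": "JJ", ".": ".", "akan": "MD", "ke": "IN", "di": "IN",
    "Bandung": "NNP", "Pindad": "NNP", "PT": "NN", "Prof.": "NNP",
    "kunjungan": "NN", "Habibie": "NNP", "melakukan": "VBT",
}

print(tokens_by_tag(sample_map)["NNP"])
# ['Bandung', 'Habibie', 'Pindad', 'Prof.']
```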

Named-Entity Tagger

Request:

POST /netagger

{
  "string": "Prof. Habibie akan melakukan kunjungan resmi ke PT. Pindad di Bandung"
}

Response:

{
  "status": "success",
  "data": [
    "OTHER",
    "PERSON-B",
    "OTHER",
    "OTHER",
    "OTHER",
    "OTHER",
    "OTHER",
    "LOCATION-B",
    "OTHER",
    "PERSON-B",
    "OTHER",
    "LOCATION-B"
  ]
}
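
The tagger returns one label per token but not the tokens themselves. The sample holds 12 labels, which matches the 12 tokens that appear as keys in the Part-of-Speech Tagger sample for the same sentence. That the labels follow those tokens in sentence order is an assumption, not something the README states. Under that assumption, a Python sketch for pairing tokens with labels and keeping the ones not labelled OTHER (entities is a hypothetical helper):

```python
def entities(tokens, labels):
    """Pair each token with its label and drop tokens labelled OTHER."""
    if len(tokens) != len(labels):
        raise ValueError("token and label counts differ")
    return [(tok, lab) for tok, lab in zip(tokens, labels) if lab != "OTHER"]


# Assumption: tokens in sentence order, split as in the POS tagger sample.
tokens = ["Prof.", "Habibie", "akan", "melakukan", "kunjungan", "resmi",
          "ke", "PT", ".", "Pindad", "di", "Bandung"]
# The "data" list from the sample response above.
labels = ["OTHER", "PERSON-B", "OTHER", "OTHER", "OTHER", "OTHER",
          "OTHER", "LOCATION-B", "OTHER", "PERSON-B", "OTHER", "LOCATION-B"]

print(entities(tokens, labels))
```

The length check guards against silently misaligned pairs if the two inputs were tokenized differently.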

Formalizer

Request:

POST /formalizer

{
  "string": "Sis, lu bisa nggak pesenin gw sepatu newbalance tipe 960? gpl ya. hati2 sama penipuan anak 4l4y"
}

Response:

{
  "status": "success",
  "data": "Sis , kamu bisa tidak pesankan saya sepatu newbalance tipe 960 ? tidak pakai lama iya . hati-hati sama penipuan anak norak "
}
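
The sample output leaves a space before each punctuation mark and a trailing space at the end. If that matters downstream, a small post-processing step can tidy it. The sketch below is Python; tidy is a hypothetical helper, and the punctuation set it handles is an assumption based only on this sample.

```python
import re


def tidy(text):
    """Remove whitespace before . , ? ! and trim both ends."""
    return re.sub(r"\s+([.,?!])", r"\1", text).strip()


# The "data" string from the sample response above.
raw = ("Sis , kamu bisa tidak pesankan saya sepatu newbalance tipe 960 ? "
       "tidak pakai lama iya . hati-hati sama penipuan anak norak ")
print(tidy(raw))
# Sis, kamu bisa tidak pesankan saya sepatu newbalance tipe 960? tidak pakai lama iya. hati-hati sama penipuan anak norak
```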

Stopwords Removal

Request:

POST /stopwords

{
  "string": "Prof. Habibie akan melakukan kunjungan resmi ke PT. Pindad di Bandung"
}

Response:

{
  "status": "success",
  "data": "Prof. Habibie kunjungan resmi PT . Pindad Bandung "
}

Sentence Tokenizer

Request:

POST /sentence/tokenizer

{
  "string": "Saya pergi ke (bagian kanan) rumah sakit Prof. Dr. Soerojo."
}

Response:

{
  "status": "success",
  "data": [
    "Saya",
    "pergi",
    "ke",
    "(",
    "bagian",
    "kanan",
    ")",
    "rumah",
    "sakit",
    "Prof.",
    "Dr.",
    "Soerojo",
    "."
  ]
}

Sentence Tokenizer with Composite Words

Request:

POST /sentence/tokenizer/composite

{
  "string": "Saya pergi ke (bagian kanan) rumah sakit Prof. Dr. Soerojo."
}

Response:

{
  "status": "success",
  "data": [
    "Saya",
    "pergi",
    "ke",
    "(",
    "bagian kanan",
    ")",
    "rumah sakit",
    "Prof.",
    "Dr.",
    "Soerojo",
    "."
  ]
}

Sentence Splitter

Request:

POST /sentence/splitter

{
  "string": "Michael Jeffrey Jordan dilahirkan di Brooklyn, New York, Amerika Serikat, pada 17 Februari 1963 adalah pemain bola basket profesional asal Amerika. Michael Jordan merupakan pemain terkenal di dunia dalam cabang olahraga itu. Setidaknya ia enam kali merebut kejuaraan NBA bersama kelompok Chicago Bulls (1991-1993, 1996-1998). Ia memiliki tinggi badan 198 cm dan merebut gelar pemain terbaik."
}

Response:

{
  "status": "success",
  "data": [
    "Michael Jeffrey Jordan dilahirkan di Brooklyn, New York, Amerika Serikat, pada 17 Februari 1963 adalah pemain bola basket profesional asal Amerika .",
    "Michael Jordan merupakan pemain terkenal di dunia dalam cabang olahraga itu .",
    "Setidaknya ia enam kali merebut kejuaraan NBA bersama kelompok Chicago Bulls (1991-1993, 1996-1998) .",
    "Ia memiliki tinggi badan 198 cm dan merebut gelar pemain terbaik ."
  ]
}

Word2Vec Nearest Words

Request:

POST /word2vec/nearestwords

{
  "string": "mobil",
  "n": 10
}

Response:

{
  "status": "success",
  "data": [
    "motor",
    "dikendarai",
    "sepeda",
    "truk",
    "motornya",
    "mengemudikan",
    "mobil-mobil",
    "mobilnya",
    "mengendarai",
    "pengemudi"
  ]
}

Word2Vec Arithmetic

Request:

POST /word2vec/arithmetic

{
  "first_string": "serang",
  "second_string": "malang",
  "third_string": "surabaya",
  "n": 10
}

Response:

{
  "status": "success",
  "data": [
    "serang",
    "lebak",
    "puloampel",
    "keserangan",
    "bogor",
    "waringinkurung",
    "jawilan",
    "cianjur",
    "garut",
    "padarincang"
  ]
}

Word2Vec Similarity

Request:

POST /word2vec/similarity

{
  "first_string": "sore",
  "second_string": "petang"
}

Response:

{
  "status": "success",
  "data": 0.7748607993125916
}
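
Word2vec similarity scores are conventionally the cosine similarity between the two word vectors. Assuming this endpoint follows that convention (the README does not say), the computation looks like the Python sketch below. The three-dimensional vectors are made up for illustration and are not taken from the pretrained model.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between vectors u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Toy vectors for illustration only; real embeddings have many more dimensions.
vec_a = [1.0, 2.0, 2.0]
vec_b = [2.0, 1.0, 2.0]
print(cosine_similarity(vec_a, vec_b))
```

Under this reading, a score near 1 means the two vectors point in nearly the same direction, and a score near 0 means they are close to orthogonal.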

License

All files in the libs and resource directories are the property of Dr. Eng. Ayu Purwarianti, ST., MT., et al. and are not covered by the license below (Apache License, Version 2.0).

All other custom code written by Panggi Libersa Jasri Akadol is licensed under the Apache License, Version 2.0 (the "License"); you may not use this project except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0.

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
