License: MIT

STIF-Indonesia

An implementation of "Semi-Supervised Low-Resource Style Transfer of Indonesian Informal to Formal Language with Iterative Forward-Translation".

You can also find Indonesian informal-formal parallel corpus in this repository.

Description

Our research addresses transforming sentences from informal Indonesian into their formal form. We frame this style transfer from informal to formal Indonesian as a low-resource machine translation problem and benchmark several strategies for performing it.

In this repository, we provide the phrase-based statistical machine translation approach, which achieved the best results in our experiments. Note that our data is extremely low-resource and domain-specific (customer service), so the system might not be robust to out-of-domain input. Our future work includes exploring more robust style transfer. Stay tuned!

Paper

You can access our paper below:

Semi-Supervised Low-Resource Style Transfer of Indonesian Informal to Formal Language with Iterative Forward-Translation (IALP 2020)

Medium Article: Mengubah Bahasa Indonesia Informal Menjadi Baku Menggunakan Kecerdasan Buatan (In Indonesian)

Requirements

We use the Moses RELEASE-4.0 binaries built for Ubuntu 17.04+, which only work on that OS.

We haven't tested other operating systems (e.g., macOS and Windows). If you want to run the source code, use Ubuntu 17.04+. On Windows, we advise running the code under WSL 2.

In this experiment, we wrap the Moses code with Python's subprocess module, so a Python installation is necessary. The system is tested on Python 3.9. We recommend installing it with Miniconda, which you can get here: https://docs.conda.io/en/latest/miniconda.html
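As a rough sketch of what this wrapping looks like (the `run_tool` helper and the Moses paths in the comment are illustrative assumptions, not the repo's actual code):

```python
import subprocess

def run_tool(cmd, stdin_text=None):
    """Run an external command (e.g. a Moses binary) and return its stdout.

    Raises subprocess.CalledProcessError on a non-zero exit code.
    """
    result = subprocess.run(
        cmd,
        input=stdin_text,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout

# Hypothetical usage -- the real binary and model paths in this repo may differ:
# formal = run_tool(
#     ["moses/bin/moses", "-f", "output/train/model/moses.ini"],
#     stdin_text="an informal sentence",
# )
```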

How To Run

First, clone the repository

git clone https://github.com/haryoa/stif-indonesia.git

Then run the Moses downloader script. It is a shell (.sh) script, so use a shell that can execute it. From the project root directory, run:

bash scripts/download_moses.sh

The script will download the Moses toolkit and extract it automatically.

Install required Python packages

Before running the program, install the prerequisite packages:

pip install -r requirements.txt

Alternatively, if you prefer pipenv, you can run:

pipenv install

NOTE: If you use pipenv, precede each command with pipenv run, e.g.: pipenv run python -m stif_indonesia --exp-scenario supervised

Run Supervised Experiments

To run the supervised experiment, do:

python -m stif_indonesia --exp-scenario supervised

It will read the experiment config in experiment-config/00001_default_supervised_config.json

Run Semi-Supervised Experiments

To run the semi-supervised experiment, do:

python -m stif_indonesia --exp-scenario semi-supervised

It will read the experiment config in experiment-config/00002_default_semi_supervised_config.json

Output

  1. The training process will output the log of the experiment in log.log
  2. The output of the model will be produced in the output folder

Supervised output

It will output evaluation, lm, and train. evaluation contains the predictions on the test set, lm contains the trained language model, and train contains the model produced by the Moses toolkit.

Semi-supervised output

It will output agg_data, best_model_dir, and produced_tgt_data. agg_data is the synthetic data aggregated during iterative forward-translation, best_model_dir contains the best model produced by training, and produced_tgt_data is the prediction output on the test set.
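At a high level, the iterative forward-translation procedure behind these outputs can be sketched as follows (the function names and training interface are illustrative, not the repo's actual API):

```python
def iterative_forward_translation(parallel, monolingual_src, train_fn, n_iters=3):
    """Sketch of semi-supervised training via iterative forward-translation.

    parallel:        list of (informal, formal) sentence pairs
    monolingual_src: informal-only sentences without formal references
    train_fn:        trains on pairs, returns a translate(sentence) -> str callable
    """
    model = train_fn(list(parallel))
    for _ in range(n_iters):
        # Forward-translate the unlabeled informal side to synthesize pairs.
        synthetic = [(src, model(src)) for src in monolingual_src]
        # Aggregate gold and synthetic data, then retrain.
        model = train_fn(list(parallel) + synthetic)
    return model
```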

Score

Please check the log.log file, which contains the output of the process.

Additional Information

If you want to replicate the dictionary-based method, you can use any informal-formal or slang dictionary available on the internet.

For example, you can use this dictionary.
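A minimal sketch of such a dictionary-based baseline (the sample entries below are illustrative; a real lexicon would come from a resource like the one linked above):

```python
# Illustrative informal-to-formal entries; a real dictionary would be far larger.
SLANG_TO_FORMAL = {
    "gak": "tidak",
    "udah": "sudah",
    "gmn": "bagaimana",
}

def dictionary_normalize(sentence, lexicon=SLANG_TO_FORMAL):
    """Replace each whitespace-separated token found in the lexicon."""
    return " ".join(lexicon.get(tok, tok) for tok in sentence.split())
```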

If you want to replicate our GPT-2 experiment, you can use a pre-trained Indonesian GPT-2 such as this one, or train one yourself on the OSCAR corpus. After that, fine-tune it with the dataset we provide here. Follow the paper for how to transform the data when fine-tuning.

We use Huggingface's off-the-shelf implementation to train the model.

Team

  1. Haryo Akbarianto Wibowo @ Kata.ai
  2. Tatag Aziz Prawiro @ Universitas Indonesia
  3. Muhammad Ihsan @ Bina Nusantara
  4. Alham Fikri Aji @ Kata.ai
  5. Radityo Eko Prasojo @ Kata.ai
  6. Rahmad Mahendra @ Universitas Indonesia