955 open source projects that are alternatives to or similar to Open Korean Text

Ekphrasis is a text processing tool geared towards text from social networks such as Twitter or Facebook. Ekphrasis performs tokenization, word normalization, word segmentation (for splitting hashtags), and spell correction, using word statistics from two large corpora (English Wikipedia and Twitter, the latter with 330 million English tweets).

Stars: ✭ 433 (-1.14%)

Mutual labels: tokenizer, text-processing

Char Rnn Tensorflow

Multi-layer Recurrent Neural Networks for character-level language models, implemented in TensorFlow

Stars: ✭ 58 (-86.76%)

Mutual labels: korean, natural-language-processing

Kagome

Self-contained Japanese Morphological Analyzer written in pure Go

Stars: ✭ 554 (+26.48%)

Mutual labels: korean, tokenizer

Py Nltools

A collection of basic python modules for spoken natural language processing

Stars: ✭ 46 (-89.5%)

Mutual labels: tokenizer, natural-language-processing

Tokenizer

Fast and customizable text tokenization library with BPE and SentencePiece support

Stars: ✭ 132 (-69.86%)

Mutual labels: tokenizer, natural-language-processing

Cogcomp Nlpy

CogComp's light-weight Python NLP annotators

Stars: ✭ 115 (-73.74%)

Mutual labels: natural-language-processing, text-processing

Nlpre

Python library for Natural Language Preprocessing (NLPre)

Stars: ✭ 158 (-63.93%)

Mutual labels: natural-language-processing, text-processing

Stringi

THE String Processing Package for R (with ICU)

Stars: ✭ 204 (-53.42%)

Mutual labels: natural-language-processing, text-processing

Textvec

Text vectorization tool to outperform TFIDF for classification tasks

Stars: ✭ 167 (-61.87%)

Mutual labels: natural-language-processing, text-processing

Thot

Thot toolkit for statistical machine translation

Stars: ✭ 53 (-87.9%)

Mutual labels: tokenizer, natural-language-processing

Text-Classification-LSTMs-PyTorch

The aim of this repository is to show a baseline model for text classification by implementing an LSTM-based model in PyTorch. To provide a better understanding of the model, a Tweets dataset provided by Kaggle is used.

Stars: ✭ 45 (-89.73%)

Mutual labels: tokenizer, text-processing

python-mecab

A repository that binds MeCab for Python 3.5+, using neither SWIG nor pybind. (No longer maintained)

Stars: ✭ 27 (-93.84%)

Mutual labels: tokenizer, text-processing

ArabicProcessingCog

A Python package that does stemming, tokenization, sentence breaking, segmentation, normalization, and POS tagging for the Arabic language.

Stars: ✭ 19 (-95.66%)

Mutual labels: tokenizer, text-processing

Kor2vec

Library for Korean morpheme and word vector representation

Stars: ✭ 64 (-85.39%)

Mutual labels: korean, natural-language-processing

Pytorch Bert Crf Ner

A Korean named entity recognizer built with KoBERT and CRF (BERT+CRF based Named Entity Recognition model for Korean)

Stars: ✭ 236 (-46.12%)

Mutual labels: korean, natural-language-processing

Lingua Franca

Mycroft's multilingual text parsing and formatting library

Stars: ✭ 51 (-88.36%)

Mutual labels: natural-language-processing, text-processing

Konoha

🌿 An easy-to-use Japanese text processing tool that makes it possible to switch tokenizers with small code changes.

Stars: ✭ 130 (-70.32%)

Mutual labels: natural-language-processing, text-processing

Kadot

Kadot, the unsupervised natural language processing library.

Stars: ✭ 108 (-75.34%)

Mutual labels: tokenizer, natural-language-processing

Hunspell Dict Ko

Korean spellchecking dictionary for Hunspell

Stars: ✭ 187 (-57.31%)

Mutual labels: korean, natural-language-processing

Prenlp

Preprocessing Library for Natural Language Processing

Stars: ✭ 130 (-70.32%)

Mutual labels: natural-language-processing, text-processing

Stanza Old

Stanford NLP group's shared Python tools.

Stars: ✭ 142 (-67.58%)

Mutual labels: natural-language-processing, text-processing

Greynir

The greynir.is natural language processing website for Icelandic

Stars: ✭ 47 (-89.27%)

Mutual labels: tokenizer, natural-language-processing

Fastnlp

fastNLP: A Modularized and Extensible NLP Framework. Currently still in incubation.

Stars: ✭ 2,441 (+457.31%)

Mutual labels: natural-language-processing, text-processing

Udpipe

R package for Tokenization, Parts of Speech Tagging, Lemmatization and Dependency Parsing Based on the UDPipe Natural Language Processing Toolkit

Stars: ✭ 160 (-63.47%)

Mutual labels: tokenizer, natural-language-processing

hama-py

🦛 A Python library for processing Korean (Hangul) text. Python Korean Morphological Analyzer

Stars: ✭ 16 (-96.35%)

Mutual labels: korean, text-processing

Pynlpl

PyNLPl, pronounced 'pineapple', is a Python library for Natural Language Processing. It contains various modules useful for common, and less common, NLP tasks. PyNLPl can be used for basic tasks such as extracting n-grams and frequency lists, and for building simple language models. There are also more complex data types and algorithms. Moreover, there are parsers for file formats common in NLP (e.g. FoLiA/Giza/Moses/ARPA/Timbl/CQL). There are also clients to interface with various NLP-specific servers. PyNLPl most notably features a very extensive library for working with FoLiA XML (Format for Linguistic Annotation).

Stars: ✭ 426 (-2.74%)

Mutual labels: natural-language-processing, text-processing

Mlinterview

A curated awesome list of AI Startups in India & Machine Learning Interview Guide. Feel free to contribute!

Stars: ✭ 410 (-6.39%)

Mutual labels: natural-language-processing

Transformers Tutorials

GitHub repo with tutorials on fine-tuning transformers for different NLP tasks

Stars: ✭ 384 (-12.33%)

Mutual labels: natural-language-processing

Multiwoz

Source code for end-to-end dialogue model from the MultiWOZ paper (Budzianowski et al. 2018, EMNLP)

Stars: ✭ 384 (-12.33%)

Mutual labels: natural-language-processing

Jflex

The fast scanner generator for Java™ with full Unicode support

Stars: ✭ 380 (-13.24%)

Mutual labels: tokenizer

Ernie

Official implementations for various pre-training models of ERNIE-family, covering topics of Language Understanding & Generation, Multimodal Understanding & Generation, and beyond.

Stars: ✭ 4,659 (+963.7%)

Mutual labels: natural-language-processing

Cogcomp Nlp

CogComp's Natural Language Processing libraries and demos

Stars: ✭ 410 (-6.39%)

Mutual labels: natural-language-processing

Nlpnet

A neural network architecture for NLP tasks, using cython for fast performance. Currently, it can perform POS tagging, SRL and dependency parsing.

Stars: ✭ 379 (-13.47%)

Mutual labels: natural-language-processing

Nlp Progress

Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.

Stars: ✭ 19,518 (+4356.16%)

Mutual labels: natural-language-processing

Portuguese Bert

Portuguese pre-trained BERT models

Stars: ✭ 409 (-6.62%)

Mutual labels: natural-language-processing

Natural Language Processing

Programming Assignments and Lectures for Stanford's CS 224: Natural Language Processing with Deep Learning

Stars: ✭ 377 (-13.93%)

Mutual labels: natural-language-processing

Usc Ds Relationextraction

Distantly Supervised Relation Extraction

Stars: ✭ 378 (-13.7%)

Mutual labels: natural-language-processing

Aho Corasick

A fast implementation of Aho-Corasick in Rust.

Stars: ✭ 424 (-3.2%)

Mutual labels: text-processing

Reductio

Automatic text summarizer in Swift

Stars: ✭ 406 (-7.31%)

Mutual labels: natural-language-processing

Beginner nlp

A curated list of beginner resources in Natural Language Processing

Stars: ✭ 376 (-14.16%)

Mutual labels: natural-language-processing

Nlp Python Deep Learning

NLP in Python with Deep Learning

Stars: ✭ 374 (-14.61%)

Mutual labels: natural-language-processing

Gnn4nlp Papers

A list of recent papers about Graph Neural Network methods applied in NLP areas.

Stars: ✭ 405 (-7.53%)

Mutual labels: natural-language-processing

Data Science

Collection of useful data science topics along with code and articles

Stars: ✭ 315 (-28.08%)

Mutual labels: natural-language-processing

Awesome Text Generation

A curated list of recent models of text generation and application

Stars: ✭ 370 (-15.53%)

Mutual labels: natural-language-processing

Bert Embedding

🔡 Token level embeddings from BERT model on mxnet and gluonnlp

Stars: ✭ 424 (-3.2%)

Mutual labels: natural-language-processing

Ln2sql

A tool to query a database in natural language

Stars: ✭ 403 (-7.99%)

Mutual labels: natural-language-processing

Southkorea Maps

South Korea administrative divisions in ESRI Shapefile, GeoJSON and TopoJSON formats.

Stars: ✭ 367 (-16.21%)

Mutual labels: korean

Nlp

[UNMAINTAINED] Extract values from strings and fill your structs with nlp.

Stars: ✭ 367 (-16.21%)

Mutual labels: natural-language-processing

D2l Vn

An interactive deep learning book with code, math, and discussions. It covers many popular frameworks (TensorFlow, PyTorch & MXNet) and is used at 175 universities.

Stars: ✭ 402 (-8.22%)

Mutual labels: natural-language-processing

Typescript Kr.github.io

🇰🇷 TypeScript Handbook in Korean

Stars: ✭ 364 (-16.89%)

Mutual labels: korean

Matchzoo Py

Facilitating the design, comparison and sharing of deep text matching models.

Stars: ✭ 362 (-17.35%)

Mutual labels: natural-language-processing

Code search

Code For Medium Article: "How To Create Natural Language Semantic Search for Arbitrary Objects With Deep Learning"