datanada / Awesome Korean Nlp

A curated list of resources for NLP (Natural Language Processing) for Korean

Awesome-Korean-NLP

A curated list of Natural Language Processing (NLP) resources covering:

  • NLP of Korean Text
  • NLP information written in Korean.

Feel free to contribute, or blab it here.

Maintainer: Jaemin Cho

Index

  1. Tools
  2. Dataset
  3. Blogs / Slides / Researchers
  4. Papers
  5. Lectures
  6. Journals / Conferences / Institutes / Events
  7. Online Communities
  8. How to contribute

1. Tools

(Korean-specific tools are listed ahead of language-agnostic tools.)

1.1. Morpheme/형태소 분석기 + Part of Speech(PoS)/품사 Tagger

  • Hannanum (한나눔) (Java, C) [link]
    • KoNLPy (Python) [link]
  • Kkma (꼬꼬마) (Java) [link] [paper]
    • KoNLPy (Python) [link]
  • Komoran (Java) [link]
    • KoNLPy (Python) [link]
  • Mecab-ko (C++) [link]
    • KoNLPy (Python) [link]
  • Twitter (Scala, Java) [link]
    • KoNLPy (Python) [link]
    • .NET, Node.js, Python, Ruby, Elasticsearch bindings
  • dparser (REST API) [link]
  • UTagger [link]
  • Arirang (Lucene, Java) [link]
  • Rouzeta [link] [slide] [video]
  • seunjeon (Scala, Java) [link]
  • RHINO (라이노) [link]
  • KTS [paper]
  • 깜짝새 [link]

1.2. Named Entity(NE) Tagger / 개체명 인식기

1.3. Spell Checker / 맞춤법 검사기

  • PNU Spell Checker [link]
  • Naver Spell Checker [link]
  • Daum Spell Checker [link]
  • hunspell-ko [link]

1.4. Syntax Parser / 구문 분석기

  • dparser (REST API) [link]
  • NLP HUB (Java) [link]

1.5. Sentiment Analysis / 감정 분석기

  • OpenHangul (오픈한글) [link] [paper]

1.6. Translator / 번역기

1.7. Packages

1.8. Others / 기타

  • Hangulpy (Python) [link]
    • Automatic particle/suffix attachment, jamo decomposition and composition
  • Hangulize (Python) [link]
    • Transcription of foreign words into Hangul
  • Hanja (Python) [link]
    • Hanja-to-Hangul conversion
  • kroman [link]
  • hangul (Perl) [link]
    • Hangul Romanization
  • textrankr (Python) [link] [demo]
    • TextRank-based Korean document summarization
  • Korean Word2Vec (한국어 Word2Vec) [demo] [paper]
    • Analogy test demo for Korean Word2Vec
  • 나쁜 단어 사전 (Bad Word Dictionary) [link]
    • Crowdsourced dictionary of bad words in Korean
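
As an illustration of the jamo decomposition and composition listed for Hangulpy above, here is a minimal sketch that relies only on the arithmetic layout of the Unicode Hangul Syllables block (U+AC00 to U+D7A3). It does not use Hangulpy's API; the function names `decompose` and `compose` are hypothetical and chosen just for this example.

```python
# Sketch only: plain Unicode arithmetic, not Hangulpy's actual interface.
# Precomposed syllables are laid out as
#   code = 0xAC00 + (lead * 21 + vowel) * 28 + tail

LEADS = [chr(0x1100 + i) for i in range(19)]          # leading consonants
VOWELS = [chr(0x1161 + i) for i in range(21)]         # medial vowels
TAILS = [""] + [chr(0x11A8 + i) for i in range(27)]   # optional final consonants

HANGUL_FIRST = 0xAC00
HANGUL_LAST = 0xD7A3


def decompose(syllable):
    """Split one precomposed Hangul syllable into (lead, vowel, tail) jamo."""
    code = ord(syllable)
    if not HANGUL_FIRST <= code <= HANGUL_LAST:
        raise ValueError("not a precomposed Hangul syllable: %r" % syllable)
    lead, rest = divmod(code - HANGUL_FIRST, 21 * 28)
    vowel, tail = divmod(rest, 28)
    return LEADS[lead], VOWELS[vowel], TAILS[tail]


def compose(lead, vowel, tail=""):
    """Combine conjoining jamo back into one precomposed syllable."""
    index = (LEADS.index(lead) * 21 + VOWELS.index(vowel)) * 28 + TAILS.index(tail)
    return chr(HANGUL_FIRST + index)


if __name__ == "__main__":
    parts = decompose("한")
    print(parts)
    print(compose(*parts))
```

The sketch returns conjoining jamo (U+1100 block) rather than compatibility jamo; a real library may choose differently, so check its documentation before relying on a particular output form.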

2. Dataset

  • Sejong Corpus [link]
  • KAIST Corpus [link]
  • Yonsei Univ. Corpus
  • Korea Univ. Corpus
  • Ulsan Univ. Corpus [link]
  • Wikipedia Dump [link] [Extractor]
  • NamuWiki Dump [link] [Extractor]
  • Naver News Archive [link]
  • Chosun Archive [link]
  • Naver sentiment movie corpus [link]
  • sci-news-sum-kr-50 [link]

3. Blogs / Slides / Researchers

3.1. Blogs

  • dsindex's blog [link]
  • ์—‘์‚ฌ์  , "ํ˜ผ์ž ํž˜์œผ๋กœ ํ•œ๊ตญ์–ด ์ฑ—๋ด‡ ๊ฐœ๋ฐœํ•˜๊ธฐ" [link]
  • Beomsu Kim, "word2vec ๊ด€๋ จ ์ด๋ก  ์ •๋ฆฌ" [link]
  • CPUU, "Google ์ž์—ฐ์–ด ์ฒ˜๋ฆฌ ์˜คํ”ˆ์†Œ์Šค SyntaxNet ๊ณต๊ฐœ" (Korean tranlsation of Google blog) [link]
  • theeluwin, "python-crfsuite๋ฅผ ์‚ฌ์šฉํ•ด์„œ ํ•œ๊ตญ์–ด ์ž๋™ ๋„์–ด์“ฐ๊ธฐ๋ฅผ ํ•™์Šตํ•ด๋ณด์ž" [link]
  • Jaesoo Lim, "ํ•œ๊ตญ์–ด ํ˜•ํƒœ์†Œ ๋ถ„์„๊ธฐ ๋™ํ–ฅ" [link]

3.2. Slides

  • Lucy Park, "ํ•œ๊ตญ์–ด์™€ NLTK, Gensim์˜ ๋งŒ๋‚จ" (PyCon APAC 2015) [link]
  • Jeongkyu Shin, "Building AI Chat bot using Python 3 & TensorFlow" (PyCon APAC 2016) [link]
  • Changki Lee, "RNN & NLP Application" (Kangwon Univ. Machine Learning course) [link]
  • Kyunghoon Kim, "๋‰ด์Šค๋ฅผ ์žฌ๋ฏธ์žˆ๊ฒŒ ๋งŒ๋“œ๋Š” ๋ฐฉ๋ฒ•; ๋‰ด์Šค์žผ" (PyCon APAC 2016) [link]
  • Hongjoo Lee, "Python ์œผ๋กœ 19๋Œ€ ๊ตญํšŒ ๋ฝ€๊ฐœ๊ธฐ" (PyCon APAC 2016) [link]
  • Kyumin Choi,"word2vecแ„‹แ…ต แ„Žแ…ฎแ„Žแ…ฅแ†ซแ„‰แ…ตแ„‰แ…ณแ„แ…ฆแ†ทแ„‹แ…ณแ†ฏ แ„†แ…กแ†ซแ„‚แ…กแ†ปแ„‹แ…ณแ†ฏ แ„„แ…ข" (PyCon APAC 2015) [link]
  • ้€ฒ่—ค่ฃ•ไน‹ (translated by Hongbae Kim), "๋”ฅ๋Ÿฌ๋‹์„ ์ด์šฉํ•œ ์ž์—ฐ์–ด์ฒ˜๋ฆฌ์˜ ์—ฐ๊ตฌ๋™ํ–ฅ" [link]
  • Hongbae Kim, "๋จธ์‹ ๋Ÿฌ๋‹์˜ ์ž์—ฐ์–ด ์ฒ˜๋ฆฌ๊ธฐ์ˆ (I)" [link]
  • Changki Lee, "์ž์—ฐ์–ด์ฒ˜๋ฆฌ๋ฅผ ์œ„ํ•œ ๊ธฐ๊ณ„ํ•™์Šต ์†Œ๊ฐœ" [link]
  • Taeil Kim, Daeneung Son, "๊ธฐ๊ณ„ ๋ฒˆ์—ญ ๋ชจ๋ธ ๊ธฐ๋ฐ˜ ์งˆ์˜ ๊ต์ • ์‹œ์Šคํ…œ" (Naver DEVIEW 2015) [link]

4. Papers

4.1. Korean

  • 김동준, 이연수, 장정선, 임해창, 고려대학교, (주)엔씨소프트, "한국어 대화 화행 분류를 위한 어휘 자질의 임베딩(2015년 동계학술발표회 논문집)" [paper] (link dead)

4.2. English

5. Lectures

5.1. Korean Lectures

  • Kangwon Univ. Natural Language Processing (자연언어처리) [link]
  • Data Science School (데이터 사이언스 스쿨) [link]
  • SNU Data Mining / Business Analytics [link]

5.2. English Lectures

  • Stanford CS224n: Natural Language Processing [link] [YouTube]
  • Stanford CS224d: Deep Learning for Natural Language Processing [link] [YouTube]
  • NLTK with Python 3 for NLP (by Sentdex) [YouTube]
  • LDA Topic Models [link]

6. Conferences / Institutes / Events

6.1. Conferences

  • 한글 및 한국어 정보처리 학술대회 [link]
  • KIPS (한국정보처리학회) [link]
  • 한국음성학회 학술대회 [link]

6.2. Institutes

  • 언어공학연구회 [link]
    • 한글 및 한국어 정보처리 학술대회 (since 1989, held annually) [link]
    • 국어 정보 처리 시스템 경진대회 (since 2010, held annually; hosted by the Ministry of Culture, Sports and Tourism and the National Institute of Korean Language) [link]
    • 자연언어처리 튜토리얼 (held irregularly) [link]
    • 자연어처리 및 정보검색 워크샵 [link]
  • 한국음성학회 [link]

6.3. Events / Contests

  • 국어 정보 처리 시스템 경진 대회 [link]

7. Online Communities

  • Tensorflow KR (Facebook Group) [link]
  • AI Korea (Facebook Group) [link]
  • Bot Group (Facebook Group) [link]
  • 바벨피쉬 (Facebook Group) [link]
  • Reddit Machine Learning Top posts [link]

8. How to contribute

  1. Fork this repository by clicking the "Fork" icon at the top right corner.

  2. Get the link for the forked repo by clicking the green button on your fork's page. It looks something like "https://github.com/[username]/Awesome-Korean-NLP.git".

  3. On your local machine, run "git clone https://github.com/[username]/Awesome-Korean-NLP.git".

  4. Run "cd Awesome-Korean-NLP".

  5. Open "README.md" with your favorite text editor.

  6. Edit.

  7. Run git commit -a -m "added section 8: emoticons" (use a message that describes your own change).

  8. Run git push, and verify the change on your fork.

  9. Go to https://github.com/datanada/Awesome-Korean-NLP and create a pull request.

  10. Choose "compare across forks" with base: datanada/Awesome.. and head: [username]/Awesome..

[beginners guide]
