ko-ichi-h / Khcoder

Licence: gpl-2.0
KH Coder: for Quantitative Content Analysis or Text Mining

Web

Japanese: http://khcoder.net
English: http://khcoder.net/en

Description

KH Coder is free software for quantitative content analysis and text mining. It is also used in computational linguistics. KH Coder can analyze Catalan, Chinese (simplified), Dutch, English, French, German, Italian, Japanese, Korean, Portuguese, Russian, Slovenian and Spanish text.

Screenshots: https://goo.gl/photos/ixn1sTM3jm8o11bP8
Official book (in Japanese): http://amzn.to/2wHFxKg

How to run KH Coder from source on Windows

  1. Download & install Perl: http://strawberryperl.com/
  2. (Fork and) clone this repository
  3. Download the released *.exe file (a WinZip self-extractor) of KH Coder 3
  4. Unzip the downloaded file into the cloned directory
  5. Open a command prompt window, go to the cloned directory, type "perl kh_coder.pl", and press the Enter key

If you get an error like "Can't locate Jcode.pm in @INC", the Perl module "Jcode" is not installed. To install it, type "cpanm Jcode" in the command prompt window and press the Enter key. Install any other module reported this way in the same manner.
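One way to check for a missing module before launching KH Coder is to ask Perl to load it. The following is a minimal sketch that assumes a POSIX-style shell (for example Git Bash on Windows, or a Linux shell) rather than the plain command prompt. The helper name `has_perl_module` is hypothetical and not part of KH Coder, and Jcode is the only module name taken from this README.

```shell
# has_perl_module NAME: succeed if Perl can load the module NAME.
# Hypothetical helper for illustration; it is not part of KH Coder.
# "perl -MNAME -e 1" loads the module and runs a no-op program.
has_perl_module() {
    perl -M"$1" -e 1 >/dev/null 2>&1
}

# Jcode is the module named in the error message above.
if has_perl_module Jcode; then
    echo "Jcode is installed"
else
    echo "Jcode is missing; install it with: cpanm Jcode"
fi
```

The same function can be called with any other module name that appears in a "Can't locate ... in @INC" message.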

The procedure above is for people who want to develop or modify KH Coder. If you just want to try or use KH Coder, you do not need Perl. Download and unzip the released *.exe file, then double-click the extracted "kh_coder.exe".

On Linux or other Un*x-like systems

You need:

  • MySQL
  • Perl (and some Perl modules)
  • R (and some R packages)
  • Morphological analysis and POS tagging software
    • ChaSen or MeCab for analyzing Japanese text
    • FreeLing or Stanford POS Tagger for analyzing English text
    • FreeLing for analyzing Catalan, French, German, Italian, Portuguese, Russian or Spanish text
    • MeCab and HanDic for analyzing Korean text
    • Stanford Word Segmenter and Stanford POS Tagger for analyzing Chinese text
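As a rough aid, the sketch below reports which of the prerequisites listed above are not yet on your PATH. The command names used here (`mysql`, `perl`, `Rscript`, `mecab`) are the usual executable names for those packages but are assumptions, not something this README specifies. The function name `missing_tools` is hypothetical. Adjust the list to the taggers you actually need.

```shell
# missing_tools CMD...: print the space-separated names of the given
# commands that cannot be found on PATH (prints an empty line if none).
# Hypothetical helper for illustration; it is not part of KH Coder.
missing_tools() {
    missing=""
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            missing="$missing $tool"
        fi
    done
    # Strip the leading space left by the concatenation above.
    echo "${missing# }"
}

# Assumed executable names for MySQL, Perl, R and MeCab.
result=$(missing_tools mysql perl Rscript mecab)
if [ -z "$result" ]; then
    echo "All listed tools were found"
else
    echo "Not found on PATH: $result"
fi
```

This only checks that the executables exist. It does not verify versions, Perl modules, or R packages.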

See issue #91 for more details.

License

GNU GPL version 2 or later
