1395 open source projects that are alternatives or similar to Indonesian Nlp Resources

Nlp bahasa resources

A Curated List of Dataset and Usable Library Resources for NLP in Bahasa Indonesia

Stars: ✭ 158 (+10.49%)

Mutual labels: dataset, corpus, sentiment-analysis

Awesome Hungarian Nlp

A curated list of NLP resources for Hungarian

Stars: ✭ 121 (-15.38%)

Mutual labels: dataset, corpus, named-entity-recognition

DeepSentiPers

Repository for the experiments described in the paper named "DeepSentiPers: Novel Deep Learning Models Trained Over Proposed Augmented Persian Sentiment Corpus"

Stars: ✭ 17 (-88.11%)

Mutual labels: sentiment-analysis, corpus

Cluepretrainedmodels

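A collection of high-quality Chinese pre-trained models: state-of-the-art large models, fastest small models, and specialized similarity models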

Stars: ✭ 493 (+244.76%)

Mutual labels: dataset, corpus

Hanlp

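Chinese word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, constituency parsing, semantic dependency parsing, semantic role labeling, coreference resolution, style transfer, semantic similarity, new word discovery, keyphrase extraction, automatic summarization, text classification and clustering, pinyin and Simplified/Traditional Chinese conversion, natural language processing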

Stars: ✭ 24,626 (+17120.98%)

Mutual labels: named-entity-recognition, pos-tagging

gum

Repository for the Georgetown University Multilayer Corpus (GUM)

Stars: ✭ 71 (-50.35%)

Mutual labels: corpus, pos-tagging

CLUEmotionAnalysis2020

CLUE Emotion Analysis Dataset: a fine-grained sentiment analysis dataset

Stars: ✭ 3 (-97.9%)

Mutual labels: sentiment-analysis, corpus

Phobert

PhoBERT: Pre-trained language models for Vietnamese (EMNLP-2020 Findings)

Stars: ✭ 332 (+132.17%)

Mutual labels: named-entity-recognition, pos-tagging

Vncorenlp

A Vietnamese natural language processing toolkit (NAACL 2018)

Stars: ✭ 354 (+147.55%)

Mutual labels: named-entity-recognition, pos-tagging

Gossiping Chinese Corpus

Chinese question-answering corpus from the PTT Gossiping board

Stars: ✭ 137 (-4.2%)

Mutual labels: dataset, corpus

Wikipedia ner

📖 Labeled examples from wiki dumps in Python

Stars: ✭ 61 (-57.34%)

Mutual labels: dataset, named-entity-recognition

Turkish Bert Nlp Pipeline

BERT-base NLP pipeline for Turkish: NER, sentiment analysis, question answering, etc.

Stars: ✭ 85 (-40.56%)

Mutual labels: sentiment-analysis, named-entity-recognition

Paribhasha

paribhasha.herokuapp.com/

Stars: ✭ 21 (-85.31%)

Mutual labels: sentiment-analysis, pos-tagging

TweebankNLP

[LREC 2022] An off-the-shelf pre-trained Tweet NLP Toolkit (NER, tokenization, lemmatization, POS tagging, dependency parsing) + Tweebank-NER dataset

Stars: ✭ 84 (-41.26%)

Mutual labels: named-entity-recognition, pos-tagging

Pytorch-NLU

Pytorch-NLU, a Chinese text classification and sequence labeling toolkit. It supports multi-class and multi-label classification of long and short Chinese texts, and sequence labeling tasks such as Chinese named entity recognition, part-of-speech tagging, and word segmentation.

Stars: ✭ 151 (+5.59%)

Mutual labels: named-entity-recognition, pos-tagging

nlp-cheat-sheet-python

NLP Cheat Sheet, Python, spaCy, LexNLP, NLTK, tokenization, stemming, sentence detection, named entity recognition

Stars: ✭ 69 (-51.75%)

Mutual labels: named-entity-recognition, pos-tagging

Bertweet

BERTweet: A pre-trained language model for English Tweets (EMNLP-2020)

Stars: ✭ 282 (+97.2%)

Mutual labels: sentiment-analysis, named-entity-recognition

Linusrants

Dataset of Linus Torvalds' rants classified by negativity using sentiment analysis

Stars: ✭ 291 (+103.5%)

Mutual labels: dataset, sentiment-analysis

Awesome Persian Nlp Ir

Curated List of Persian Natural Language Processing and Information Retrieval Tools and Resources

Stars: ✭ 460 (+221.68%)

Mutual labels: corpus, named-entity-recognition

Species-Names-Corpus

Species names corpus: plant names and animal names.

Stars: ✭ 23 (-83.92%)

Mutual labels: corpus, dataset

Prosody

Helsinki Prosody Corpus and A System for Predicting Prosodic Prominence from Text

Stars: ✭ 139 (-2.8%)

Mutual labels: dataset, corpus

Harvesttext

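Text mining and preprocessing toolkit (text cleaning, new word discovery, sentiment analysis, entity recognition and linking, keyword extraction, knowledge extraction, syntactic parsing, etc.) using unsupervised or weakly supervised methods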

Stars: ✭ 956 (+568.53%)

Mutual labels: sentiment-analysis, named-entity-recognition

Phonlp

PhoNLP: A BERT-based multi-task learning toolkit for part-of-speech tagging, named entity recognition and dependency parsing (NAACL 2021)

Stars: ✭ 56 (-60.84%)

Mutual labels: named-entity-recognition, pos-tagging

French Sentiment Analysis Dataset

A collection of over 1.5 million tweets translated into French, with their sentiment.

Stars: ✭ 35 (-75.52%)

Mutual labels: dataset, sentiment-analysis

Ua Gec

UA-GEC: Grammatical Error Correction and Fluency Corpus for the Ukrainian Language

Stars: ✭ 108 (-24.48%)

Mutual labels: dataset, corpus

Dan Jurafsky Chris Manning Nlp

My solutions to the Natural Language Processing course taught by Dan Jurafsky and Chris Manning in Winter 2012.

Stars: ✭ 124 (-13.29%)

Mutual labels: sentiment-analysis, named-entity-recognition

Camel tools

A suite of Arabic natural language processing tools developed by the CAMeL Lab at New York University Abu Dhabi.

Stars: ✭ 124 (-13.29%)

Mutual labels: sentiment-analysis, named-entity-recognition

Chinese Names Corpus

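Chinese personal names corpus and name generator. Covers Chinese full names, surnames, given names, forms of address, Japanese names, transliterated names, and English names. Useful for Chinese word segmentation and person-name entity recognition.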

Stars: ✭ 3,053 (+2034.97%)

Mutual labels: dataset, corpus

sequence labeling tf

Sequence Labeling in Tensorflow

Stars: ✭ 18 (-87.41%)

Mutual labels: named-entity-recognition, pos-tagging

Malaya

Natural Language Toolkit for Bahasa Malaysia, https://malaya.readthedocs.io/

Stars: ✭ 239 (+67.13%)

Mutual labels: sentiment-analysis, pos-tagging

rosette-elasticsearch-plugin

Document Enrichment plugin for Elasticsearch

Stars: ✭ 25 (-82.52%)

Mutual labels: sentiment-analysis, named-entity-recognition

wink-nlp

Developer friendly Natural Language Processing ✨

Stars: ✭ 312 (+118.18%)

Mutual labels: sentiment-analysis, pos-tagging

named-entity-recognition-template

Build a deep learning model for predicting the named entities from text.

Stars: ✭ 51 (-64.34%)

Mutual labels: corpus, named-entity-recognition

Spark Nlp

State of the Art Natural Language Processing

Stars: ✭ 2,518 (+1660.84%)

Mutual labels: sentiment-analysis, named-entity-recognition

Weibo terminator workflow

Updated version of weibo_terminator. This is the workflow version, aimed at getting the job done!

Stars: ✭ 259 (+81.12%)

Mutual labels: crawler, sentiment-analysis

Fakenewscorpus

A dataset of millions of news articles scraped from a curated list of data sources.

Stars: ✭ 255 (+78.32%)

Mutual labels: dataset, corpus

Informers

State-of-the-art natural language processing for Ruby

Stars: ✭ 306 (+113.99%)

Mutual labels: sentiment-analysis, named-entity-recognition

Medical-Names-Corpus

Medical corpus. Corpus of medical institution names. Drug standard codes.

Stars: ✭ 26 (-81.82%)

Mutual labels: corpus, dataset

Bookcorpus

Crawl BookCorpus

Stars: ✭ 443 (+209.79%)

Mutual labels: crawler, corpus

Weibo Analyst

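A toolbox for analyzing Chinese social media (Weibo) comments. Features: 1. crawling Weibo comment data; 2. word segmentation and keyword extraction; 3. word clouds and word frequency statistics; 4. sentiment analysis; 5. topic clustering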

Stars: ✭ 430 (+200.7%)

Mutual labels: crawler, sentiment-analysis

Awesome Twitter Data

A list of Twitter datasets and related resources.

Stars: ✭ 533 (+272.73%)

Mutual labels: dataset, sentiment-analysis

Monpa

MONPA (罔拍) is a multi-task model providing Traditional Chinese word segmentation, part-of-speech tagging, and named entity recognition

Stars: ✭ 203 (+41.96%)

Mutual labels: named-entity-recognition, pos-tagging

Company Names Corpus

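Company names corpus and organization names corpus. Covers company short names, abbreviations, brand terms, and enterprise names. Useful for Chinese word segmentation and organization-name entity recognition.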

Stars: ✭ 868 (+506.99%)

Mutual labels: dataset, corpus

Insuranceqa Corpus Zh

🚁 Insurance industry corpus for chatbots

Stars: ✭ 821 (+474.13%)

Mutual labels: dataset, corpus

Clue

Chinese Language Understanding Evaluation Benchmark: datasets, baselines, pre-trained models, corpus and leaderboard

Stars: ✭ 2,425 (+1595.8%)

Mutual labels: dataset, corpus

Nlp chinese corpus

Large Scale Chinese Corpus for NLP

Stars: ✭ 6,656 (+4554.55%)

Mutual labels: dataset, corpus

Coarij

Corpus of Annual Reports in Japan

Stars: ✭ 55 (-61.54%)

Mutual labels: dataset, corpus

Images Web Crawler

This package is a complete tool for creating a large dataset of images (designed especially, but not only, for machine learning enthusiasts). It can crawl the web, download images, rename, resize, and convert the images, and merge folders.

Stars: ✭ 51 (-64.34%)

Mutual labels: dataset, crawler

Dataset List

Lists of text corpora and more (mainly Japanese)

Stars: ✭ 84 (-41.26%)

Mutual labels: dataset, corpus

Cluener2020

CLUENER2020: Fine-Grained Named Entity Recognition for Chinese

Stars: ✭ 689 (+381.82%)

Mutual labels: dataset, named-entity-recognition

Pynlp

A pythonic wrapper for Stanford CoreNLP.