sobhe / Hazm
License: MIT
Python library for digesting Persian text.
Stars: ✭ 595
Programming Languages
python
139335 projects - #7 most used programming language
Projects that are alternatives of or similar to Hazm
Persian Stopwords
Persian (Farsi) Stop Words List
Stars: ✭ 131 (-77.98%)
Mutual labels: persian, natural-language-processing
Nhazm
A C# version of Hazm (Python library for digesting Persian text)
Stars: ✭ 35 (-94.12%)
Mutual labels: persian, natural-language-processing
Fewrel
A Large-Scale Few-Shot Relation Extraction Dataset
Stars: ✭ 526 (-11.6%)
Mutual labels: natural-language-processing
Self Attentive Parser
High-accuracy NLP parser with models for 11 languages.
Stars: ✭ 569 (-4.37%)
Mutual labels: natural-language-processing
Hanlp
Natural language processing toolkit: Chinese word segmentation, POS tagging, named entity recognition, dependency parsing, constituency parsing, semantic dependency parsing, semantic role labeling, coreference resolution, style transfer, semantic similarity, new word discovery, keyphrase extraction, automatic summarization, text classification and clustering, and pinyin / simplified-traditional conversion.
Stars: ✭ 24,626 (+4038.82%)
Mutual labels: natural-language-processing
Leakgan
The codes of paper "Long Text Generation via Adversarial Training with Leaked Information" on AAAI 2018. Text generation using GAN and Hierarchical Reinforcement Learning.
Stars: ✭ 533 (-10.42%)
Mutual labels: natural-language-processing
React Modern Calendar Datepicker
A modern, beautiful, customizable date picker for React
Stars: ✭ 555 (-6.72%)
Mutual labels: persian
Languagetool
Style and Grammar Checker for 25+ Languages
Stars: ✭ 5,641 (+848.07%)
Mutual labels: natural-language-processing
Pythainlp
Thai Natural Language Processing in Python.
Stars: ✭ 582 (-2.18%)
Mutual labels: natural-language-processing
D2l Zh
"Dive into Deep Learning": an interactive, runnable, discussable deep learning book for Chinese readers. Its Chinese and English editions are used for teaching at 300 universities across 55 countries.
Stars: ✭ 29,132 (+4796.13%)
Mutual labels: natural-language-processing
Awesome Bert Nlp
A curated list of NLP resources focused on BERT, attention mechanism, Transformer networks, and transfer learning.
Stars: ✭ 567 (-4.71%)
Mutual labels: natural-language-processing
Sentencepiece
Unsupervised text tokenizer for Neural Network-based text generation.
Stars: ✭ 5,540 (+831.09%)
Mutual labels: natural-language-processing
Awesome Semi Supervised Learning
📜 An up-to-date & curated list of awesome semi-supervised learning papers, methods & resources.
Stars: ✭ 538 (-9.58%)
Mutual labels: natural-language-processing
Mycroft Core
Mycroft Core, the Mycroft Artificial Intelligence platform.
Stars: ✭ 5,489 (+822.52%)
Mutual labels: natural-language-processing
Ner Lstm
Named Entity Recognition using multilayered bidirectional LSTM
Stars: ✭ 532 (-10.59%)
Mutual labels: natural-language-processing
Fast abs rl
Code for ACL 2018 paper: "Fast Abstractive Summarization with Reinforce-Selected Sentence Rewriting. Chen and Bansal"
Stars: ✭ 569 (-4.37%)
Mutual labels: natural-language-processing
Chat
A chatbot based on natural language understanding and machine learning, supporting concurrent multi-user sessions and customizable multi-turn dialogue.
Stars: ✭ 516 (-13.28%)
Mutual labels: natural-language-processing
Hate Speech And Offensive Language
Repository for the paper "Automated Hate Speech Detection and the Problem of Offensive Language", ICWSM 2017
Stars: ✭ 543 (-8.74%)
Mutual labels: natural-language-processing
Pythoncode Tutorials
The Python Code Tutorials
Stars: ✭ 544 (-8.57%)
Mutual labels: natural-language-processing
Talisman
Straightforward fuzzy matching, information retrieval and NLP building blocks for JavaScript.
Stars: ✭ 584 (-1.85%)
Mutual labels: natural-language-processing
Hazm
Python library for digesting Persian text.
- Text cleaning
- Sentence and word tokenizer
- Word lemmatizer
- POS tagger
- Shallow parser
- Dependency parser
- Interfaces for Persian corpora
- NLTK compatible
- Python 2.7, 3.4, 3.5 and 3.6 support
Usage
>>> from __future__ import unicode_literals
>>> from hazm import *
>>> normalizer = Normalizer()
>>> normalizer.normalize('اصلاح نويسه ها و استفاده از نیم‌فاصله پردازش را آسان مي كند')
'اصلاح نویسه‌ها و استفاده از نیم‌فاصله پردازش را آسان می‌کند'
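Two of the things normalization did above are unifying Arabic code points (ي → ی, ك → ک) and gluing the verb prefix «می» to the following word with a "half-space", the zero-width non-joiner U+200C. A minimal, dependency-free sketch of just those two rules (illustrative only; Hazm's Normalizer applies many more):

```python
# Illustrative sketch only -- Hazm's Normalizer handles far more rules
# (spacing, punctuation, diacritics, affix spacing, ...).
ZWNJ = '\u200c'  # zero-width non-joiner, the Persian "half-space"

# Map common Arabic code points to their Persian equivalents.
CHAR_MAP = str.maketrans({
    '\u064a': '\u06cc',  # ARABIC LETTER YEH (ي) -> FARSI YEH (ی)
    '\u0643': '\u06a9',  # ARABIC LETTER KAF (ك) -> KEHEH (ک)
})

def normalize_chars(text):
    """Unify Arabic/Persian code points, then replace the space after
    the verb prefix 'می' with a half-space (ZWNJ)."""
    text = text.translate(CHAR_MAP)
    return text.replace('\u0645\u06cc ', '\u0645\u06cc' + ZWNJ)
```

With this sketch, `normalize_chars('مي كند')` yields «می‌کند», joined by a half-space rather than a full space.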
>>> sent_tokenize('ما هم برای وصل کردن آمدیم! ولی برای پردازش، جدا بهتر نیست؟')
['ما هم برای وصل کردن آمدیم!', 'ولی برای پردازش، جدا بهتر نیست؟']
>>> word_tokenize('ولی برای پردازش، جدا بهتر نیست؟')
['ولی', 'برای', 'پردازش', '،', 'جدا', 'بهتر', 'نیست', '؟']
>>> stemmer = Stemmer()
>>> stemmer.stem('کتاب‌ها')
'کتاب'
>>> lemmatizer = Lemmatizer()
>>> lemmatizer.lemmatize('می‌روم')
'رفت#رو'
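The difference between the two: the stemmer strips surface suffixes (کتاب‌ها → کتاب), while the lemmatizer maps inflected forms to a dictionary entry, here the past/present verb stems «رفت#رو» for «می‌روم» ("I go"). A naive, dependency-free sketch of suffix stripping (a toy suffix list, not Hazm's actual rule set or lexicon):

```python
# Toy illustration only -- Hazm's Stemmer uses real Persian affix rules,
# and its Lemmatizer additionally consults word and verb lexicons.
SUFFIXES = ('\u200c\u0647\u0627', '\u0647\u0627',   # '‌ها', 'ها' (plural)
            '\u062a\u0631\u06cc\u0646', '\u062a\u0631')  # 'ترین', 'تر'

def toy_stem(word):
    """Strip the first matching suffix, if any (longest forms first)."""
    for suffix in SUFFIXES:
        if word.endswith(suffix):
            return word[: -len(suffix)]
    return word
```

So `toy_stem('کتاب‌ها')` drops the plural «ها» and returns «کتاب», mirroring the stemmer call above; words with no listed suffix pass through unchanged.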
>>> tagger = POSTagger(model='resources/postagger.model')
>>> tagger.tag(word_tokenize('ما بسیار کتاب می‌خوانیم'))
[('ما', 'PRO'), ('بسیار', 'ADV'), ('کتاب', 'N'), ('می‌خوانیم', 'V')]
>>> chunker = Chunker(model='resources/chunker.model')
>>> tagged = tagger.tag(word_tokenize('کتاب خواندن را دوست داریم'))
>>> tree2brackets(chunker.parse(tagged))
'[کتاب خواندن NP] [را POSTP] [دوست داریم VP]'
>>> parser = DependencyParser(tagger=tagger, lemmatizer=lemmatizer)
>>> parser.parse(word_tokenize('زنگ‌ها برای که به صدا درمی‌آید؟'))
<DependencyGraph with 8 nodes>
Installation
The latest stable version of Hazm can be installed through pip:
pip install hazm
But for testing or using Hazm with the latest updates you may use:
pip install https://github.com/sobhe/hazm/archive/master.zip --upgrade
We have also trained tagger and parser models. You may put these models in the resources folder of your project.
Extensions
Note: These are not official versions of Hazm, are not up to date on functionality, and are not supported by Sobhe.
Thanks
- to contributors: Mojtaba Khallash and Mohsen Imany.
- to the Virastyar project for the Persian word list.