
morfologik / Polimorfologik

Scripts for preprocessing morfologik data.


PoliMorfologik

Morfologik is a project that aims to generate Polish morphosyntactic dictionaries (hence the name) used for part-of-speech tagging and part-of-speech synthesis. The PoliMorfologik dictionary is a result of the PoliMorf project. It contains over 215 thousand lexemes and around 3.5 million word forms.

The dictionary was created by enriching the Polish ispell/hunspell dictionary with morphological information, which was possible because the structure of the original dictionary retained important grammatical distinctions. The conversion relied on a series of scripts, and the resulting dictionary was later augmented with manually entered information. Unfortunately, the original source dictionary did not contain sufficient structure to allow reliable detection of some information, such as the exact subgender of masculine substantives. This information was added manually and with heuristic methods; however, its reliability is low. Given that substantives make up about one third of the dictionary content (and almost half of them are masculine), this limitation is severe.

The tagset of the dictionary is inspired by the IPI PAN Tagset. However, Morfologik diverges from that tagset and from Morfeusz, as it never splits orthographic (“space-to-space”) words into smaller dictionary words (i.e. so-called agglutination is not considered). Moreover, due to the lack of information in the ispell dictionary, some forms are not completely annotated and are marked as irregular. There is, however, some additional markup on reflexive verbs, which is not present in the original IPI PAN Tagset. This was introduced for the purposes of the grammar checker LanguageTool, which used the dictionary extensively.
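
To make the tag structure more concrete, here is a minimal Python sketch of how a tag field in the IPI PAN style could be broken apart. Everything about the encoding is an assumption made for illustration, not something this README specifies: that attributes are separated by `:`, that alternative readings of one form are joined with `+`, that alternative values of one attribute are joined with `.`, and that the markers are spelled `irreg`, `refl` and `nonrefl`. The helper names `parse_tags` and `is_irregular` are hypothetical.

```python
from typing import List


def parse_tags(tag_field: str) -> List[List[List[str]]]:
    """Split a tag field into readings, attributes, and alternative values.

    Assumed (not confirmed by the README) encoding:
      '+' separates alternative readings of the same word form,
      ':' separates attributes within one reading,
      '.' separates alternative values of a single attribute.
    """
    readings = []
    for reading in tag_field.split("+"):
        attributes = [attribute.split(".") for attribute in reading.split(":")]
        readings.append(attributes)
    return readings


def is_irregular(tag_field: str) -> bool:
    """Return True if any reading carries the assumed 'irreg' marker."""
    return any(
        "irreg" in values
        for reading in parse_tags(tag_field)
        for values in reading
    )


if __name__ == "__main__":
    # Illustrative tag strings only; they are not taken from the dictionary.
    print(parse_tags("subst:sg:nom:m1+subst:sg:acc:m3"))
    print(parse_tags("verb:fin:sg:ter:imperf:refl.nonrefl"))
    print(is_irregular("verb:irreg"))
```

Returning nested lists rather than a fixed record keeps the sketch neutral about which attributes a given part of speech carries, which matters here because some forms are only partially annotated.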

The dictionaries can be used with the morfologik-stemming library and tools.

See src/LICENSE.txt for license information. Basically, it's BSD.
