🔓 Web extension for reading articles locked behind the paywalls of over 50 German newspapers, e.g. Frankfurter Allgemeine Zeitung, Leipziger Volkszeitung & Hamburger Abendblatt

Stars: ✭ 63 (+90.91%)

Mutual labels: german

tensorrt-examples

TensorRT Examples (TensorRT, Jetson Nano, Python, C++)

Stars: ✭ 31 (-6.06%)

Mutual labels: segmentation

sembei

🍘 Word embeddings without word segmentation 🍘

Stars: ✭ 14 (-57.58%)

Mutual labels: computational-linguistics

Visual-Transformer-Paper-Summary

Summary of Transformer applications for computer vision tasks.

Stars: ✭ 51 (+54.55%)

Mutual labels: segmentation

adaptive-segmentation-mask-attack

Pre-trained model, code, and materials from the paper "Impact of Adversarial Examples on Deep Learning Models for Biomedical Image Segmentation" (MICCAI 2019).

Stars: ✭ 50 (+51.52%)

Mutual labels: segmentation

Semantic-Aware-Attention-Based-Deep-Object-Co-segmentation

Semantic Aware Attention Based Deep Object Co-segmentation

Stars: ✭ 61 (+84.85%)

Mutual labels: segmentation

GENADEV OS

An AArch64 hobbyist OS for the Raspberry Pi 3 B+

Stars: ✭ 14 (-57.58%)

Mutual labels: german

Twelveish

🕛 Twelveish - Android Wear/Wear OS Watch Face

Stars: ✭ 29 (-12.12%)

Mutual labels: german

cluster tools

Distributed segmentation for bio-image-analysis

Stars: ✭ 26 (-21.21%)

Mutual labels: segmentation

XNet

CNN implementation for medical X-Ray image segmentation

Stars: ✭ 71 (+115.15%)

Mutual labels: segmentation


CISTEM

CISTEM is a stemming algorithm for the German language, developed by Leonie Weißweiler and Alexander Fraser. This repository contains official implementations in a variety of programming languages. At the moment, the following languages are available:

Python
Java
C++
C
JavaScript
Go
Haskell
Perl
Swift

The code for each language includes a method for stemming and one for segmentation; the latter returns both the stem and the stripped suffix.
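The relationship between the two methods can be sketched as follows. This is a toy illustration only: the suffix list, the minimum stem length, and the function bodies are assumptions made for this example and are not the CISTEM rule set. It only shows the contract described above, where segmentation returns the stem together with the stripped suffix and stemming returns the stem alone.

```python
from typing import Tuple

# Hypothetical suffix list for illustration only; NOT the CISTEM rules.
# Longer suffixes come first so they are tried before shorter ones.
TOY_SUFFIXES = ("ern", "en", "er", "es", "e", "n", "s")

# Assumed lower bound on stem length, chosen only for this sketch.
MIN_STEM_LENGTH = 3


def segment(word: str) -> Tuple[str, str]:
    """Return (stem, stripped_suffix) for a word, using the toy suffix list."""
    lowered = word.lower()
    for suffix in TOY_SUFFIXES:
        if lowered.endswith(suffix) and len(lowered) - len(suffix) >= MIN_STEM_LENGTH:
            return lowered[: -len(suffix)], suffix
    # No suffix matched: the whole word is the stem.
    return lowered, ""


def stem(word: str) -> str:
    """Return only the stem, i.e. the first element of segment()."""
    return segment(word)[0]
```

Defining `stem` in terms of `segment` keeps the two results consistent by construction, so the stem plus the stripped suffix always reassembles the lowercased input word.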

Performance

We performed a comparative analysis of six publicly available German stemmers, in which CISTEM achieved the best F-measure and state-of-the-art runtime.

Gold standards

The gold_standards folder contains the two gold standards we used for evaluation. Each file is a UTF-8 text file in which each line contains all the stems of one cluster, separated by single spaces. Note that we do not supply a reference stem for each cluster, because we measure stemming performance as the ability to group words with the same meaning, which is more relevant for information retrieval than the absolute stem. If you use these gold standards in your own research, please cite our paper (see the reference below).
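A minimal sketch of reading this format, assuming only what is stated above (one cluster per line, entries separated by single spaces). The function names and the sample text are made up for illustration and do not come from the repository.

```python
from typing import Dict, List


def parse_gold_standard(text: str) -> List[List[str]]:
    """Parse gold-standard text into clusters, one list of entries per line."""
    clusters = []
    for line in text.splitlines():
        line = line.strip()
        if line:  # skip empty lines
            clusters.append(line.split(" "))
    return clusters


def index_clusters(clusters: List[List[str]]) -> Dict[str, int]:
    """Map every entry to the index of the cluster (line) it appears in."""
    return {entry: idx for idx, cluster in enumerate(clusters) for entry in cluster}


def same_cluster(index: Dict[str, int], a: str, b: str) -> bool:
    """True if both entries are known and belong to the same cluster."""
    return a in index and b in index and index[a] == index[b]


# Invented sample in the described format; not taken from the real files.
SAMPLE = "haus hauses häuser\nkind kinder kindern\n"

clusters = parse_gold_standard(SAMPLE)
index = index_clusters(clusters)
```

To load a real file, the text could be read with `open(path, encoding="utf-8").read()`, where `path` points at one of the files in the gold_standards folder. Because no reference stem is given, a check like `same_cluster` compares group membership rather than the stem string itself.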

More information on how we evaluated runtimes and stemming quality can be found in our paper:

Leonie Weißweiler, Alexander Fraser (2017). Developing a Stemmer for German Based on a Comparative Analysis of Publicly Available Stemmers. In Proceedings of the German Society for Computational Linguistics and Language Technology (GSCL), to appear.


LeonieWeissweiler / CISTEM

