186 open source projects that are alternatives to or similar to Indian_ParallelCorpus

banglanmt
This repository contains the code and data of the paper titled "Not Low-Resource Anymore: Aligner Ensembling, Batch Filtering, and New Datasets for Bengali-English Machine Translation" published in Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020), November 16 - November 20, 2020.
Stars: ✭ 91 (+295.65%)
BSD
The Business Scene Dialogue corpus
Stars: ✭ 51 (+121.74%)
ilmulti
Tooling to play around with multilingual machine translation for Indian Languages.
Stars: ✭ 19 (-17.39%)
Filipino-Text-Benchmarks
Open-source benchmark datasets and pretrained transformer models in the Filipino language.
Stars: ✭ 22 (-4.35%)
Mutual labels:  corpus, low-resource-languages
Code Docstring Corpus
Preprocessed Python functions and docstrings for automated code documentation (code2doc) and automated code generation (doc2code) tasks.
Stars: ✭ 137 (+495.65%)
indic nlp resources
Resources to go with the Indic NLP Library
Stars: ✭ 55 (+139.13%)
Mutual labels:  indian-languages
folia
FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (including corpora) with linguistic annotations. A wide variety of linguistic annotations are supported, making FoLiA a useful format for NLP tasks and data interchange. Note that the actual Python library for proces…
Stars: ✭ 56 (+143.48%)
Mutual labels:  corpus
jrte-corpus
Japanese Realistic Textual Entailment Corpus (NLP 2020, LREC 2020)
Stars: ✭ 66 (+186.96%)
Mutual labels:  corpus
zero
Zero -- A neural machine translation system
Stars: ✭ 121 (+426.09%)
LanguageCodes
We present a list of languages with their codes, families, regions, etc. We also present a list of multilingual corpora (with URLs).
Stars: ✭ 70 (+204.35%)
Mutual labels:  corpus
named-entity-recognition-template
Build a deep learning model for predicting the named entities from text.
Stars: ✭ 51 (+121.74%)
Mutual labels:  corpus
text-classification-cn
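Chinese text classification in practice, based on the Sogou news corpus, using traditional machine learning methods as well as pretrained models.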
Stars: ✭ 81 (+252.17%)
Mutual labels:  corpus
PoetryCorpus
Poetic corpus of the Russian language
Stars: ✭ 40 (+73.91%)
Mutual labels:  corpus
OneStopEnglishCorpus
No description or website provided.
Stars: ✭ 38 (+65.22%)
Mutual labels:  corpus
ABD-NMT
Code for "Asynchronous bidirectional decoding for neural machine translation" (AAAI, 2018)
Stars: ✭ 32 (+39.13%)
DeepSentiPers
Repository for the experiments described in the paper named "DeepSentiPers: Novel Deep Learning Models Trained Over Proposed Augmented Persian Sentiment Corpus"
Stars: ✭ 17 (-26.09%)
Mutual labels:  corpus
EntityTargetedActiveLearning
No description or website provided.
Stars: ✭ 17 (-26.09%)
Mutual labels:  low-resource-languages
SSAN
How Does Selective Mechanism Improve Self-attention Networks?
Stars: ✭ 18 (-21.74%)
dynmt-py
Neural machine translation implementation using DyNet's Python bindings
Stars: ✭ 17 (-26.09%)
open-discourse
Open Discourse is the first fully comprehensive corpus of the plenary proceedings of the federal German Parliament (Bundestag).
Stars: ✭ 47 (+104.35%)
Mutual labels:  corpus
sentencepiece-jni
Java JNI wrapper for SentencePiece: unsupervised text tokenizer for Neural Network-based text generation.
Stars: ✭ 26 (+13.04%)
KWDLC
Kyoto University Web Document Leads Corpus
Stars: ✭ 64 (+178.26%)
Mutual labels:  corpus
thaigov-corpus
A project to collect news from the Thai government website
Stars: ✭ 19 (-17.39%)
Mutual labels:  corpus
mev-corpus
MEV Data Corpus
Stars: ✭ 77 (+234.78%)
Mutual labels:  corpus
SpiCE-Corpus
An open-access corpus of conversational bilingual speech in Cantonese and English
Stars: ✭ 33 (+43.48%)
Mutual labels:  corpus
RNNSearch
An implementation of attention-based neural machine translation using PyTorch
Stars: ✭ 43 (+86.96%)
thesis
My thesis on "Open Source Code and Low Resource Languages" for an MSc in Language Science and Technology at Saarland University
Stars: ✭ 20 (-13.04%)
Mutual labels:  low-resource-languages
OpenDialog
An Open-Source Package for Chinese Open-domain Conversational Chatbot (Chinese chit-chat dialogue system; one-click deployment of a WeChat chit-chat bot)
Stars: ✭ 94 (+308.7%)
Mutual labels:  corpus
pdf-corpus
Python script to quickly create hand-crafted PDF files
Stars: ✭ 17 (-26.09%)
Mutual labels:  corpus
pytorch basic nmt
A simple yet strong implementation of neural machine translation in PyTorch
Stars: ✭ 66 (+186.96%)
CBLUE
CBLUE (Chinese medical information processing benchmark): A Chinese Biomedical Language Understanding Evaluation Benchmark
Stars: ✭ 379 (+1547.83%)
Mutual labels:  corpus
PubMed-PICO-Detection
PubMed PICO Element Detection Dataset
Stars: ✭ 37 (+60.87%)
Mutual labels:  corpus
egret-wenda-corpus
A Public Corpus for Machine Learning
Stars: ✭ 41 (+78.26%)
Mutual labels:  corpus
fastmorph
Fast corpus search engine originally made for the Corpus of Written Tatar language
Stars: ✭ 14 (-39.13%)
Mutual labels:  corpus
minimal-nmt
A minimal NMT example to serve as a seq2seq+attention reference.
Stars: ✭ 36 (+56.52%)
thai-language
Computer tools for the Thai language
Stars: ✭ 20 (-13.04%)
Mutual labels:  corpus
NiuTrans.NMT
A Fast Neural Machine Translation System. It is developed in C++ and resorts to NiuTensor for fast tensor APIs.
Stars: ✭ 112 (+386.96%)
dialogue-datasets
A collection of open dialogue corpora and some useful data-processing utilities.
Stars: ✭ 24 (+4.35%)
Mutual labels:  corpus
TV4Dialog
No description or website provided.
Stars: ✭ 33 (+43.48%)
Mutual labels:  corpus
transformer
Neutron: A PyTorch-based implementation of Transformer and its variants.
Stars: ✭ 60 (+160.87%)
When-in-Rome
A meta-corpus of functional harmonic analysis.
Stars: ✭ 35 (+52.17%)
Mutual labels:  corpus
CLUEmotionAnalysis2020
CLUE Emotion Analysis Dataset (fine-grained sentiment analysis dataset)
Stars: ✭ 3 (-86.96%)
Mutual labels:  corpus
malay-dataset
Text corpus for Bahasa Malaysia, https://malaya.readthedocs.io/en/latest/Dataset.html
Stars: ✭ 189 (+721.74%)
Mutual labels:  corpus
kanji-frequency
Kanji usage frequency data collected from various sources
Stars: ✭ 92 (+300%)
Mutual labels:  corpus
cljs-corpus
A greppable archive of ClojureScript code
Stars: ✭ 37 (+60.87%)
Mutual labels:  corpus
MT-Preparation
Machine Translation (MT) Preparation Scripts
Stars: ✭ 15 (-34.78%)
fuzzing-corpus
My fuzzing corpus
Stars: ✭ 120 (+421.74%)
Mutual labels:  corpus
parallel-corpora-tools
Tools for filtering and cleaning parallel and monolingual corpora for machine translation and other natural language processing tasks.
Stars: ✭ 35 (+52.17%)
bible-corpus
A multilingual parallel corpus created from translations of the Bible.
Stars: ✭ 115 (+400%)
Mutual labels:  corpus
2018-dlsl
UPC Deep Learning for Speech and Language 2018
Stars: ✭ 18 (-21.74%)
nepali-translator
Neural Machine Translation on the Nepali-English language pair
Stars: ✭ 29 (+26.09%)
Mutual labels:  parallel-corpus
transformer-slt
Sign Language Translation with Transformers (COLING'2020, ECCV'20 SLRTP Workshop)
Stars: ✭ 92 (+300%)
textbox
Text collections made available by the CLiGS group.
Stars: ✭ 19 (-17.39%)
Mutual labels:  corpus
Attention-Visualization
Visualization for simple attention and Google's multi-head attention.
Stars: ✭ 54 (+134.78%)
Word-Level-Eng-Mar-NMT
Translating English sentences to Marathi using Neural Machine Translation
Stars: ✭ 37 (+60.87%)
xl-sum
This repository contains the code, data, and models of the paper titled "XL-Sum: Large-Scale Multilingual Abstractive Summarization for 44 Languages" published in Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021.
Stars: ✭ 160 (+595.65%)
Mutual labels:  low-resource-languages
EdgarAllanPoetry
Computer-generated poetry
Stars: ✭ 22 (-4.35%)
Mutual labels:  corpus
wordfish-python
Extract relationships between standardized terms from a corpus of interest with deep learning 🐟
Stars: ✭ 19 (-17.39%)
Mutual labels:  corpus
Species-Names-Corpus
Species names corpus: plant names and animal names.
Stars: ✭ 23 (+0%)
Mutual labels:  corpus
DataAugmentationNMT
Data Augmentation for Neural Machine Translation
Stars: ✭ 26 (+13.04%)