Alternatives and detailed information of OneStopEnglishCorpus

nishkalavallabhi / OneStopEnglishCorpus

License: CC-BY-SA-4.0

No description or website provided.

Projects that are alternatives of or similar to OneStopEnglishCorpus

Awesome Deeplearning Resources

Deep Learning and deep reinforcement learning research papers and some codes

Stars: ✭ 2,483 (+6434.21%)

Mutual labels: paper, corpus

folia

FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (including corpora) with linguistic annotations. A wide variety of linguistic annotations are supported, making FoLiA a useful format for NLP tasks and data interchange. Note that the actual Python library for proces…

Stars: ✭ 56 (+47.37%)

Mutual labels: corpus

neural network papers

Notes on papers I have read, with a personal rating for each paper and a brief summary of its insight

Stars: ✭ 152 (+300%)

Mutual labels: paper

adage

Data and code related to the paper "ADAGE-Based Integration of Publicly Available Pseudomonas aeruginosa..." Jie Tan, et al · mSystems · 2016

Stars: ✭ 61 (+60.53%)

Mutual labels: paper

Awesome-Polarization

List of awesome papers on Polarization Imaging

Stars: ✭ 31 (-18.42%)

Mutual labels: paper

Object-Detection-Confidence-Bias

Code for "The Box Size Confidence Bias Harms Your Object Detector" (https://arxiv.org/abs/2112.01901)

Stars: ✭ 22 (-42.11%)

Mutual labels: paper

influence boosting

Supporting code for the paper "Finding Influential Training Samples for Gradient Boosted Decision Trees"

Stars: ✭ 57 (+50%)

Mutual labels: paper

Wavelet-like-Auto-Encoder

No description or website provided.

Stars: ✭ 61 (+60.53%)

Mutual labels: paper

thai-language

Computer tools for the Thai language

Stars: ✭ 20 (-47.37%)

Mutual labels: corpus

named-entity-recognition-template

Build a deep learning model for predicting the named entities from text.

Stars: ✭ 51 (+34.21%)

Mutual labels: corpus

ZSL-ADA

Code accompanying the paper "A Generative Framework for Zero Shot Learning with Adversarial Domain Adaptation"

Stars: ✭ 18 (-52.63%)

Mutual labels: paper

Paper Note

📚 A record of papers I have read, with notes

Stars: ✭ 22 (-42.11%)

Mutual labels: paper

gemnet pytorch

GemNet model in PyTorch, as proposed in "GemNet: Universal Directional Graph Neural Networks for Molecules" (NeurIPS 2021)

Stars: ✭ 80 (+110.53%)

Mutual labels: paper

midi degradation toolkit

A toolkit for generating datasets of midi files which have been degraded to be 'un-musical'.

Stars: ✭ 29 (-23.68%)

Mutual labels: paper

my-bookshelf

Collection of books/papers that I've read/I'm going to read/I would remember that they exist/It is unlikely that I'll read/I'll never read.

Stars: ✭ 49 (+28.95%)

Mutual labels: paper

KWDLC

Kyoto University Web Document Leads Corpus

Stars: ✭ 64 (+68.42%)

Mutual labels: corpus

RTRT-Trans-Caustics

A reference implementation of "Rendering transparent objects with caustics using real-time ray tracing" using Unreal Engine 4.25.1.

Stars: ✭ 12 (-68.42%)

Mutual labels: paper

TAGCN

TensorFlow implementation of the paper "Topology Adaptive Graph Convolutional Networks" (Du et al., 2017)

Stars: ✭ 17 (-55.26%)

Mutual labels: paper

sensim

Sentence Similarity Estimator (SenSim)

Stars: ✭ 15 (-60.53%)

Mutual labels: paper

PubMed-PICO-Detection

PubMed PICO Element Detection Dataset

Stars: ✭ 37 (-2.63%)

Mutual labels: corpus


This repository hosts the dataset described in the following paper:

OneStopEnglish corpus: A new corpus for automatic readability assessment and text simplification
Sowmya Vajjala and Ivana Lučić
2018
Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications, pages 297–304. Association for Computational Linguistics.

Please cite the above paper if you use this corpus in your research.

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Description of this repo:

Texts-SeparatedByReadingLevel/: The corpus itself, in three sub-folders, one per reading level. The three versions of a text share a base file name and end in -ele.txt, -int.txt, or -adv.txt, depending on the sub-folder they are in.
Texts-Together-OneCSVperFile/: One CSV file per text, with three columns, one per reading level. Paragraph breaks are preserved.
Sentence-Aligned/: Three text files with pairwise sentence alignments (adv-int, int-ele, adv-ele). Sentences were aligned using cosine similarity.
Processed-AllLevels-AllFiles/: Sub-folders with output files from the Stanford Parser, Stanford CoreNLP, and UPenn's Discourse Connectives Tagger.
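
The folder layout above can be navigated with a few lines of Python. The following is a minimal sketch, not part of the repository: the function names are hypothetical, the mapping from suffixes to level names is an assumption based on the -ele/-int/-adv suffixes, and the UTF-8 encoding is a guess that may need adjusting for some files. Whether the CSV files carry a header row is not stated here, so the reader returns every row unchanged.

```python
import csv
from collections import defaultdict
from pathlib import Path

# Suffixes come from the description above; the level names are an assumption.
LEVEL_SUFFIXES = {
    "-ele.txt": "elementary",
    "-int.txt": "intermediate",
    "-adv.txt": "advanced",
}


def split_level(filename):
    """Return (base_name, level) for a corpus file name, or None if no suffix matches."""
    for suffix, level in LEVEL_SUFFIXES.items():
        if filename.endswith(suffix):
            return filename[: -len(suffix)], level
    return None


def group_by_text(corpus_dir):
    """Map each base name to {level: path}, searching all sub-folders."""
    groups = defaultdict(dict)
    for path in sorted(Path(corpus_dir).rglob("*.txt")):
        parsed = split_level(path.name)
        if parsed is not None:
            base, level = parsed
            groups[base][level] = path
    return dict(groups)


def read_parallel_csv(csv_path, encoding="utf-8"):
    """Read one per-text CSV and return its rows as lists of cell strings.

    The csv module keeps line breaks inside quoted cells, so paragraph
    breaks stored within a cell survive the round trip.
    """
    with open(csv_path, newline="", encoding=encoding) as handle:
        return list(csv.reader(handle))
```

For example, group_by_text("Texts-SeparatedByReadingLevel") would return one entry per text, each holding the paths of its available reading-level versions, which makes it easy to check that all three levels are present before pairing them.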

For enquiries, contact: [email protected]

