yuvalpinter / nytwit

License: GPL-3.0
New York Times Word Innovation Types dataset

NYTWIT

This repository hosts the New York Times Word Innovation Types dataset (NYTWIT), as presented in the COLING 2020 paper cited below.

Versions

| Version | Date | Diff | Details |
|---------|------|------|---------|
| V1.1 | April 24, 2020 | 73 labels | Re-annotation of mostly blends and compounds |
| V1 | March 7, 2020 | N/A | Initial |

Citation

If you use our dataset, please cite the following:

@inproceedings{nytwit,
    title = "{NYTWIT}: A Dataset of Novel Words in the {N}ew {Y}ork {T}imes",
    author = "Pinter, Yuval  and
      Jacobs, Cassandra L.  and
      Bittker, Max",
    booktitle = "Proceedings of the 28th International Conference on Computational Linguistics",
    month = dec,
    year = "2020",
    address = "Barcelona, Spain (Online)",
    publisher = "International Committee on Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2020.coling-main.572",
    pages = "6509--6515",
}