
concepticon / concepticon-data

Licence: other
The curation repository for the data behind Concepticon.

CLLD Concepticon


The data underlying the Concepticon of the CLLD project is maintained in this repository. Here, you can find

Concepticon Data

  • For an overview of the status of all currently linked conceptlists, see here.
  • For basic information on metadata, see here.
  • For information on how you can contribute to the project or benefit from the data sources we offer, see here.

Data Structure

  • conceptlists/ contains the conceptlists, whose entries link to IDs in concepticon.tsv. Each list is named after the first person who proposed it, the year of the reference publication from which we extracted it, and the number of concepts, with the three parts separated by dashes. Where two lists would otherwise have identical names, we append a letter to distinguish them. Every file needs a "GLOSS" column (some still have "ENGLISH" instead, which needs to be changed). Most, if not all, files also have a "NUMBER" column giving the number in the reference, which matters for ordering the entries as in the original source. Additional columns are largely up to the contributor, but we have tried to be consistent.
  • conceptlists.tsv contains metadata about the lists in conceptlists/.
  • references/references.bib is the BibTeX file with an entry for each conceptlist (the BibTeX key is identical to the name of the conceptlist file, without the file extension). The file also contains the references in which the conceptlists were published (linked through the "crossref" field).
  • sources/ contains PDF files of each conceptlist (only the list parts, not the full publications, for copyright reasons). The naming is the same as for the conceptlists, but with the extension ".pdf" instead of ".tsv".
  • concepticon.tsv is the backbone concept list. All concepts from the individual conceptlists are linked to entries in this file.
  • concept_set_meta/ contains lists of metadata, relating concept sets to additional information, e.g. on Wikipedia. These lists are described by accompanying metadata files following the recommendations of the Model for Tabular Data and Metadata on the Web.
  • app/ contains data for running the JavaScript-based Concepticon lookup tool.
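
The naming scheme and required columns described above can be checked mechanically. The following is a minimal sketch, not part of the repository's tooling: the helper names (parse_list_name, check_columns) are hypothetical, the file names are only illustrative, and it assumes that the disambiguating letter directly follows the concept count, which the description above does not spell out.

```python
import csv
import io
import re
from typing import List, NamedTuple, Optional

# Assumed pattern: <Author>-<Year>-<Count>, optionally followed by one
# lowercase letter. The position of the letter is an assumption.
NAME_PATTERN = re.compile(
    r"^(?P<author>.+)-(?P<year>\d{4})-(?P<count>\d+)(?P<suffix>[a-z]?)$"
)


class ListName(NamedTuple):
    author: str
    year: int
    count: int
    suffix: str


def parse_list_name(filename: str) -> Optional[ListName]:
    """Split a conceptlist file name into its parts, or return None."""
    stem = filename[:-4] if filename.endswith(".tsv") else filename
    match = NAME_PATTERN.match(stem)
    if match is None:
        return None
    return ListName(
        match["author"], int(match["year"]), int(match["count"]), match["suffix"]
    )


def check_columns(tsv_text: str) -> List[str]:
    """Report missing or outdated column names in a tab-separated conceptlist."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    columns = reader.fieldnames or []
    problems = []
    if "GLOSS" not in columns:
        if "ENGLISH" in columns:
            problems.append("uses ENGLISH instead of GLOSS")
        else:
            problems.append("missing GLOSS column")
    if "NUMBER" not in columns:
        problems.append("missing NUMBER column")
    return problems


if __name__ == "__main__":
    print(parse_list_name("Swadesh-1955-100.tsv"))
    print(check_columns("NUMBER\tENGLISH\n1\thand\n"))
```

Returning a list of problems rather than raising keeps the check usable across many files at once, since "NUMBER" is described as common but not strictly universal.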

Update policy

We try to release concepticon-data (as well as the Concepticon web app) at least once a year. Generally, new releases should only become more comprehensive, i.e. all data ever released should also be part of the newest release. Occasionally, though, we may have to correct an erratum, which may result in some data being removed or in changed object identifiers. So whenever a link to the web app breaks or a script using the concepticon-data API throws an error, consult the list of errata to see whether an error correction is the reason.
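
A script that depends on identifiers from an earlier release can guard against such corrections by collecting the IDs it can no longer resolve instead of failing on the first one. The sketch below uses only the standard library and makes several assumptions: the function names are hypothetical, the column name "ID" in concepticon.tsv is assumed rather than taken from the description above, and the sample rows are invented for illustration.

```python
import csv
import io
from typing import Dict, Iterable, List


def load_concept_sets(tsv_text: str) -> Dict[str, dict]:
    """Index the rows of a tab-separated backbone list by their ID column.

    The column name "ID" is an assumption about concepticon.tsv.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {row["ID"]: row for row in reader}


def find_unresolved(ids: Iterable[str], concept_sets: Dict[str, dict]) -> List[str]:
    """Return the IDs that are absent from the current release, in input order."""
    return [concept_id for concept_id in ids if concept_id not in concept_sets]


if __name__ == "__main__":
    # Invented sample data, standing in for the contents of concepticon.tsv.
    sample = "ID\tGLOSS\n1\tEXAMPLE ONE\n2\tEXAMPLE TWO\n"
    concept_sets = load_concept_sets(sample)
    missing = find_unresolved(["1", "3"], concept_sets)
    if missing:
        print("Not found in this release, check the errata:", ", ".join(missing))
```

Any ID reported this way is a candidate for comparison against the list of errata before assuming the script itself is at fault.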

pyconcepticon

pyconcepticon provides a Python package to programmatically access Concepticon data.
