All Projects → phoible → dev

phoible / dev

Licence: GPL-3.0 license
PHOIBLE data and development.

Programming Languages

TeX
3793 projects
r
7636 projects

Projects that are alternatives of or similar to dev

lingtypology
R package for linguistic cartography and typological databases search
Stars: ✭ 47 (-47.78%)
Mutual labels:  linguistics, typology
Onset
A language evolution simulator, using realistic phonetic changes.
Stars: ✭ 30 (-66.67%)
Mutual labels:  linguistics, phonology
wikipron
Massively multilingual pronunciation mining
Stars: ✭ 167 (+85.56%)
Mutual labels:  linguistics, phonology
Tossi
Chooses correct Korean particle morphs for arbitrary words.
Stars: ✭ 160 (+77.78%)
Mutual labels:  linguistics
Rime Cantonese
Rime Cantonese input schema | 粵語拼音輸入方案
Stars: ✭ 173 (+92.22%)
Mutual labels:  linguistics
feminizator.github.io
Феминизатор слов
Stars: ✭ 29 (-67.78%)
Mutual labels:  linguistics
corpusexplorer2.0
Korpuslinguistik war noch nie so einfach...
Stars: ✭ 16 (-82.22%)
Mutual labels:  linguistics
Pycantonese
Cantonese Linguistics and NLP in Python
Stars: ✭ 147 (+63.33%)
Mutual labels:  linguistics
event-embedding-multitask
*SEM 2018: Learning Distributed Event Representations with a Multi-Task Approach
Stars: ✭ 22 (-75.56%)
Mutual labels:  linguistics
poesy
Poetic processing, for Python.
Stars: ✭ 28 (-68.89%)
Mutual labels:  linguistics
pfootprint
Political Discourse Analysis Using Pre-Trained Word Vectors.
Stars: ✭ 20 (-77.78%)
Mutual labels:  linguistics
WonderfulPolishLanguage
This is a repository created for the list of resources for learning and exploring Wonderful Polish language.
Stars: ✭ 31 (-65.56%)
Mutual labels:  linguistics
Hangulize
Korean Alphabet Transcription
Stars: ✭ 184 (+104.44%)
Mutual labels:  linguistics
nyt-first-said
Tweets when words are published for the first time in the NYT
Stars: ✭ 222 (+146.67%)
Mutual labels:  linguistics
Prosodic
Prosodic: a metrical-phonological parser, written in Python. For English and Finnish, with flexible language support.
Stars: ✭ 162 (+80%)
Mutual labels:  linguistics
lambda-notebook
Lambda Notebook: Formal Semantics in Jupyter
Stars: ✭ 16 (-82.22%)
Mutual labels:  linguistics
Hangulize
Hangulize transcribes non-Korean words into Hangul
Stars: ✭ 152 (+68.89%)
Mutual labels:  linguistics
pylangacq
Language Acquisition Research Tools
Stars: ✭ 33 (-63.33%)
Mutual labels:  linguistics
linguistics problems
Natural language processing in examples and games
Stars: ✭ 23 (-74.44%)
Mutual labels:  linguistics
Awesome Linguistics
A curated list of anything remotely related to linguistics
Stars: ✭ 207 (+130%)
Mutual labels:  linguistics

DOI

PHOIBLE

PHOIBLE is a database of phonological inventories and distinctive features, encompassing more than 3000 phonological inventories (doculects), representing more than 2100 ISO 639-3 language codes. PHOIBLE data is published in browsable form online at PHOIBLE Online, which corresponds with the most recent release of this repository.

Data in machine-readable form is available in this repository. It is not guaranteed to exactly match what is published at PHOIBLE Online, due to the occasional discovery and correction of errors, and the addition of new languages to the database. For this reason, it is recommended that you make use of the most recent release in your own analyses, rather than working from the tip of the master branch.

Documentation for PHOIBLE is hosted at at http://phoible.github.io/, including notational conventions, departures from official IPA usage, citation information, etc.

How to use this repository

Most people will not need to look beyond the data folder of this repository, which contains a phoneme-level data file (one row per languoid-phoneme pair) and a BibTeX file of all the data sources. The rest of the repo contains scripts used in the development and testing of PHOIBLE, such as code to aggregate the raw data files from the various donor databases. These are probably not of general interest or utility. The raw-data directory contains the raw data from the various donor data sources, as well as the feature mapping tables. This is also probably not what you want, so if in doubt, stick to the data directory.

Citing PHOIBLE

If you are citing the database as a whole, or making use of the phonological distinctive feature systems in PHOIBLE, please cite as follows:

Moran, Steven & McCloy, Daniel (eds.) 2019. PHOIBLE 2.0. Jena: Max Planck Institute for the Science of Human History. (Available online at http://phoible.org). DOI: 10.5281/zenodo.2626687

If you are citing phoneme inventory data for a particular language or languages, please use the name of the language as the title, and include the original data source as an element within PHOIBLE. For example:

UCLA Phonological Segment Inventory Database. 2019. Lelemi sound inventory (UPSID). In: Moran, Steven & McCloy, Daniel (eds.) PHOIBLE 2.0. Jena: Max Planck Institute for the Science of Human History. (Available online at http://phoible.org/inventories/view/441)

If you are using the raw data from this repository but are not using a labeled release, we recommend citing using the last commit hash at the time of your most recent cloning/forking of the repository, so that others can reproduce your work starting from the same snapshot of the repository that you are using. For example:

Moran, Steven & McCloy, Daniel (eds.) 2019. PHOIBLE. https://github.com/phoible/phoible/commit/444a46c9a94641d6c99f5c8bbe85b8ae1c6ce65f

History

PHOIBLE was originally developed as an SQL database and RDF knowledgebase for Moran’s dissertation, which explains many of the technical details and developmental challenges:

Moran, Steven. 2012. Phonetics Information Base and Lexicon. PhD thesis, University of Washington. http://hdl.handle.net/1773/22452

Here is a brief list of some publications that we have used PHOIBLE data for:

  • Blasi, Damián, Steven Moran, Scott Moisik, Paul Widmer, Dan Dediu, & Balthasar Bickel. 2019. Human sound systems are shaped by post-Neolithic changes in bite configuration. Science 363(6432), eaav3218. doi:10.1126/science.aav3218

  • Cysouw, Michael, Dan Dediu and Steven Moran. 2012. Still No Evidence for an Ancient Language Expansion from Africa. Science, 335(6069):657. doi:10.1126/science.1208841

  • Moran, Steven. 2012. Using Linked Data to Create a Typological Knowledge Base. In Christian Chiarcos, Sebastian Nordhoff and Sebastian Hellmann (eds), Linked Data in Linguistics: Representing and Connecting Language Data and Language Metadata. Springer, Heidelberg.

  • Moran, Steven, Daniel McCloy, and Richard Wright. 2012. Revisiting Population Size vs. Phoneme Inventory Size. Language 88(4): 877-893. doi:10.1353/lan.2012.0087

  • Moran, Steven and Damián Blasi. 2014. Cross-linguistic Comparison of Complexity Measures in Phonological Systems. In Frederick J. Newmeyer and Laurel Preston (eds), Measuring Grammatical Complexity. Oxford UP, Oxford.

A more complete list of research papers using PHOIBLE can be found on Google Scholar.

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].