Top 7 corpus-linguistics open source projects
kontext
An advanced, extensible web front-end for the Manatee-open corpus search engine
✭ 50
typescript
python
HTML
javascript
shell
PEG.js
user-interface
corpora
corpus-linguistics
corpus-tools
goclassy
An asynchronous concurrent pipeline for classifying Common Crawl based on fastText's pipeline.
✭ 81
go
nlp
corpus-linguistics
fasttext
common-crawl
language-classification
kanji-frequency
Kanji usage frequency data collected from various sources
✭ 92
javascript
HTML
Less
coffeescript
shell
data
japanese
corpus
data-visualization
cjk
kanji
japanese-language
corpus-linguistics
frequency-lists
cjk-characters
kanji-frequency
nerus
Large silver-standard Russian corpus with NER, morphology, and syntax markup
✭ 47
python
Jupyter Notebook
Makefile
nlp
syntax
morphology
russian
corpus-linguistics
ner
CogNet
CogNet: a large-scale, high-quality cognate database for 338 languages, 1.07M words, and 8.1 million cognates
✭ 26
wordnet
corpus-linguistics
language-resources
cognate
bilingual-lexicon-extraction
low-resource-languages
cross-lingual-simialrity
multilinguality
cross-lingual-transfer
bilingual-lexicon-induction
ungoliant
🕷️ The pipeline for the OSCAR corpus
✭ 69
rust
nlp
crawler
corpus-linguistics
fasttext
oscar
commoncrawl
common-crawl
language-classification
corpusexplorer2.0
Corpus linguistics has never been so easy...
✭ 16
C#
Rich Text Format
Vue
smalltalk
javascript
SCSS
visualization
nlp
data-science
natural-language-processing
text-mining
data-mining
sdk
big-data
text-analysis
journalism
linguistics
tagger
text-processing
corpus-linguistics
cooccurrence
natural-language-understanding
data-minig
datajournalism
corpus-processing
cleaning-data