GitPlanet
2 open-source projects by oscar-corpus
1. goclassy
An asynchronous concurrent pipeline for classifying Common Crawl based on fastText's pipeline.
✭ 81
go
nlp
corpus-linguistics
fasttext
common-crawl
language-classification
2. ungoliant
🕷️ The pipeline for the OSCAR corpus
✭ 69
rust
nlp
crawler
corpus-linguistics
fasttext
oscar
commoncrawl
common-crawl
language-classification