INL / Blacklab
A corpus retrieval engine based on Apache Lucene
== What is BlackLab? ==
[http://inl.github.io/BlackLab/ BlackLab] is a corpus retrieval engine built on top of [http://lucene.apache.org/ Apache Lucene]. It allows fast, complex searches with accurate hit highlighting on large bodies of tagged and annotated text. It was developed at the Institute of Dutch Lexicology (INL) to provide a fast and feature-rich search interface on our historical and contemporary text corpora.
We're also working on BlackLab Server, a web service interface to BlackLab, so you can access it from any programming language. BlackLab Server is included in the repository as well.
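Because BlackLab Server is a web service, a client only needs to build a URL and issue an HTTP request. The Java sketch below shows the URL-building half under stated assumptions: the base URL, the corpus name <code>mycorpus</code>, the <code>hits</code> path, and the <code>patt</code> and <code>outputformat</code> parameters are illustrative and should be checked against the BlackLab Server documentation for your version. The class and method names are hypothetical.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of BlackLab itself.
final class BlackLabQuery {

    // Builds a search URL for a BlackLab Server instance.
    // ASSUMPTION: the server exposes <base>/<corpus>/hits and accepts
    // "patt" (a Corpus Query Language pattern) and "outputformat".
    static String buildHitsUrl(String baseUrl, String corpus, String cqlPattern) {
        // Percent-encode the pattern so brackets, quotes and '=' are safe in a query string.
        String encoded = URLEncoder.encode(cqlPattern, StandardCharsets.UTF_8);
        return baseUrl + "/" + corpus + "/hits?patt=" + encoded + "&outputformat=json";
    }

    public static void main(String[] args) {
        // Placeholder host and corpus name; substitute your own deployment.
        String url = buildHitsUrl(
                "http://localhost:8080/blacklab-server",
                "mycorpus",
                "[lemma=\"search\"]");
        System.out.println(url);
    }
}
```

Any HTTP client in any language can then fetch the resulting URL; the sketch stops at URL construction so that it runs without a live server.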
BlackLab and BlackLab Server are licensed under the [http://www.apache.org/licenses/LICENSE-2.0 Apache License 2.0].
More information at the [http://inl.github.io/BlackLab/ official project site].