bookworm - 📚 social networks from novels
Stars: ✭ 72 (+84.62%)
Rmdl - RMDL: Random Multimodel Deep Learning for Classification
Stars: ✭ 375 (+861.54%)
Easyocr - Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, and Cyrillic.
Stars: ✭ 13,379 (+34205.13%)
perke - A keyphrase extractor for Persian
Stars: ✭ 60 (+53.85%)
Wordtokenizers.jl - High performance tokenizers for natural language processing and other related tasks
Stars: ✭ 63 (+61.54%)
Gensim - Topic Modelling for Humans
Stars: ✭ 12,763 (+32625.64%)
MetQy - Repository for R package MetQy (read related publication here: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6247936/)
Stars: ✭ 17 (-56.41%)
MixGCF - MixGCF: An Improved Training Method for Graph Neural Network-based Recommender Systems, KDD2021
Stars: ✭ 73 (+87.18%)
beir - A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.
Stars: ✭ 738 (+1792.31%)
conferencias matutinas amlo - CSVs of the stenographic transcripts of the morning press conferences of President Andres Manuel López Obrador (Mañaneras AMLO)
Stars: ✭ 25 (-35.9%)
iis - Information Inference Service of the OpenAIRE system
Stars: ✭ 16 (-58.97%)
website-to-json - Converts a website to JSON using jQuery selectors
Stars: ✭ 37 (-5.13%)
solr - Apache Solr open-source search software
Stars: ✭ 651 (+1569.23%)
BERT-QE - Code and resources for the paper "BERT-QE: Contextualized Query Expansion for Document Re-ranking".
Stars: ✭ 43 (+10.26%)
HAR - Code for WWW2019 paper "A Hierarchical Attention Retrieval Model for Healthcare Question Answering"
Stars: ✭ 22 (-43.59%)
ConvDR - Code repo for SIGIR 2021 paper "Few-Shot Conversational Dense Retrieval"
Stars: ✭ 36 (-7.69%)
3d model retriever - Experimenting with a newly published deep learning paper and how it can be used for content-based 3D model retrieval. (info retrieval for CAD)
Stars: ✭ 45 (+15.38%)
naacl2018-fever - Fact Extraction and VERification baseline published in NAACL2018
Stars: ✭ 109 (+179.49%)
EasyMiner - Easy association rule mining and classification on the web
Stars: ✭ 14 (-64.1%)
KaliIntelligenceSuite - Kali Intelligence Suite (KIS) shall aid in the fast, autonomous, central, and comprehensive collection of intelligence by executing standard penetration testing tools. The collected data is internally stored in a structured manner to allow the fast identification and visualisation of the collected information.
Stars: ✭ 58 (+48.72%)
Asclepius - Open Price Comparison for US Hospitals
Stars: ✭ 20 (-48.72%)
imbalanced-ensemble - Class-imbalanced / Long-tailed ensemble learning in Python. Modular, flexible, and extensible. | A modular, flexible, and easily extensible library for class-imbalanced / long-tailed machine learning
Stars: ✭ 199 (+410.26%)
Medium-Stats-Analysis - Exploring data and analyzing metrics for user-specific Medium Stats
Stars: ✭ 27 (-30.77%)
dh-core - Functional data science
Stars: ✭ 123 (+215.38%)
PyDREAM - Python Implementation of Decay Replay Mining (DREAM)
Stars: ✭ 22 (-43.59%)
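TextClassification - Text classification of Sina News using scikit-learn. The dataset contains 1 million documents across 10 categories, split 1:1 between test and training sets. The classifiers are SVM and Bayes, with Bayes as the baseline.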
Stars: ✭ 86 (+120.51%)
lex-glue - LexGLUE: A Benchmark Dataset for Legal Language Understanding in English
Stars: ✭ 98 (+151.28%)
simon-frontend - 💹 SIMON is a powerful, flexible, open-source, and easy-to-use machine learning knowledge discovery platform 💻
Stars: ✭ 114 (+192.31%)
LuceneTutorial - A simple tutorial of Lucene for LIS 501 Introduction to Text Mining students at the University of Wisconsin-Madison (Fall 2021).
Stars: ✭ 62 (+58.97%)
EMNLP2020 - This is the official PyTorch code and datasets for the paper "Where Are the Facts? Searching for Fact-checked Information to Alleviate the Spread of Fake News", EMNLP 2020.
Stars: ✭ 55 (+41.03%)
sciblox - sciblox: Easier Data Science and Machine Learning
Stars: ✭ 48 (+23.08%)
teanaps - An open-source Python library for natural language processing and text analysis.
Stars: ✭ 91 (+133.33%)
hierarchical-clustering - A Python implementation of divisive and hierarchical clustering algorithms. The algorithms were tested on the Human Gene DNA Sequence dataset and dendrograms were plotted.
Stars: ✭ 62 (+58.97%)
patzilla - PatZilla is a modular patent information research platform and data integration toolkit with a modern user interface and access to multiple data sources.
Stars: ✭ 71 (+82.05%)
ImageRetrieval - Content Based Image Retrieval Techniques (e.g. knn, svm using MatLab GUI)
Stars: ✭ 51 (+30.77%)
PaperWeeklyAI - 📚「@MaiweiAI」Studying papers in the fields of computer vision, NLP, and machine learning algorithms every week.
Stars: ✭ 50 (+28.21%)
ProQA - Progressively Pretrained Dense Corpus Index for Open-Domain QA and Information Retrieval
Stars: ✭ 44 (+12.82%)
query-wellformedness - 25,100 queries from the Paralex corpus (Fader et al., 2013) annotated with human ratings of whether they are well-formed natural language questions.
Stars: ✭ 80 (+105.13%)
Apriori-and-Eclat-Frequent-Itemset-Mining - Python implementation of the Apriori and Eclat algorithms, two of the best-known basic algorithms for mining frequent item sets in a set of transactions.
Stars: ✭ 36 (-7.69%)
gpl - Powerful unsupervised domain adaptation method for dense retrieval. Requires only an unlabeled corpus and yields massive improvement: "GPL: Generative Pseudo Labeling for Unsupervised Domain Adaptation of Dense Retrieval" https://arxiv.org/abs/2112.07577
Stars: ✭ 216 (+453.85%)
rust-stemmers - A Rust implementation of some popular Snowball stemming algorithms
Stars: ✭ 85 (+117.95%)
sugarcube - Monoidal data processes.
Stars: ✭ 32 (-17.95%)
Data-Analyst-Nanodegree - This repo consists of the projects that I completed as part of Udacity's Data Analyst Nanodegree curriculum.
Stars: ✭ 13 (-66.67%)
Semantic-Bus - object flow treatment, data transformation
Stars: ✭ 49 (+25.64%)
scikit-hubness - A Python package for hubness analysis and high-dimensional data mining
Stars: ✭ 41 (+5.13%)
techdocs - Accord Project Documentation
Stars: ✭ 48 (+23.08%)
xforest - A super-fast and scalable Random Forest library based on a fast histogram decision tree algorithm and a distributed bagging framework. It can be used for binary classification, multi-label classification, and regression tasks. This library provides both Python and command-line interfaces to users.
Stars: ✭ 20 (-48.72%)
COVID19-IRQA - No description or website provided.
Stars: ✭ 32 (-17.95%)
pqlite - ⚡ A fast embedded library for approximate nearest neighbor search
Stars: ✭ 141 (+261.54%)
FinBERT-QA - Financial Domain Question Answering with pre-trained BERT Language Model
Stars: ✭ 70 (+79.49%)