scikit-hubness
A Python package for hubness analysis and high-dimensional data mining
Stars: ✭ 41 (+141.18%)
dh-core
Functional data science
Stars: ✭ 123 (+623.53%)
software-analytics
A repository with my data analysis results of software artifacts
Stars: ✭ 37 (+117.65%)
Suod
(MLSys '21) An Acceleration System for Large-scale Unsupervised Heterogeneous Outlier Detection (Anomaly Detection)
Stars: ✭ 245 (+1341.18%)
imbalanced-ensemble
Class-imbalanced / long-tailed ensemble learning in Python. Modular, flexible, and extensible.
Stars: ✭ 199 (+1070.59%)
Matminer
Data mining for materials science
Stars: ✭ 251 (+1376.47%)
xforest
A super-fast and scalable Random Forest library based on a fast histogram decision tree algorithm and a distributed bagging framework. It can be used for binary classification, multi-label classification, and regression tasks. The library provides both a Python and a command-line interface.
Stars: ✭ 20 (+17.65%)
hdnom
Benchmarking and Visualization Toolkit for Penalized Cox Models
Stars: ✭ 36 (+111.76%)
Lasio
Python library for reading and writing well data using Log ASCII Standard (LAS) files
Stars: ✭ 234 (+1276.47%)
Statistical Learning
Lecture Slides and R Sessions for Trevor Hastie and Rob Tibshirani's "Statistical Learning" Stanford course
Stars: ✭ 223 (+1211.76%)
KaliIntelligenceSuite
Kali Intelligence Suite (KIS) shall aid in the fast, autonomous, central, and comprehensive collection of intelligence by executing standard penetration testing tools. The collected data is internally stored in a structured manner to allow the fast identification and visualisation of the collected information.
Stars: ✭ 58 (+241.18%)
Semantic-Bus
Object flow processing and data transformation
Stars: ✭ 49 (+188.24%)
Apriori-and-Eclat-Frequent-Itemset-Mining
Python implementation of the Apriori and Eclat algorithms, two of the best-known basic algorithms for mining frequent item sets in a set of transactions.
Stars: ✭ 36 (+111.76%)
kenchi
A scikit-learn compatible library for anomaly detection
Stars: ✭ 36 (+111.76%)
Tweetfeels
Real-time sentiment analysis in Python using Twitter's streaming API
Stars: ✭ 249 (+1364.71%)
PyDREAM
Python Implementation of Decay Replay Mining (DREAM)
Stars: ✭ 22 (+29.41%)
MetQy
Repository for the R package MetQy (related publication: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6247936/)
Stars: ✭ 17 (+0%)
Chirp
Interface to manage and centralize Google Alert information
Stars: ✭ 227 (+1235.29%)
Medium-Stats-Analysis
Exploring data and analyzing metrics for user-specific Medium Stats
Stars: ✭ 27 (+58.82%)
DiVE
An interactive 3D web viewer that shows up to a million data points on one screen. Provides interaction for viewing high-dimensional data that has been previously embedded in 3D or 2D. Based on graphosaurus.js and three.js. For a Linux release of a complete embedding+visualization pipeline please visit https://github.com/sonjageorgievska/Em…
Stars: ✭ 26 (+52.94%)
Prefixspan Py
The shortest yet efficient Python implementation of the sequential pattern mining algorithm PrefixSpan, the closed sequential pattern mining algorithm BIDE, and the generator sequential pattern mining algorithm FEAT.
Stars: ✭ 214 (+1158.82%)
PaperWeeklyAI
📚「@MaiweiAI」Studying papers in the fields of computer vision, NLP, and machine learning algorithms every week.
Stars: ✭ 50 (+194.12%)
sciblox
sciblox - Easier Data Science and Machine Learning
Stars: ✭ 48 (+182.35%)
Asclepius
Open Price Comparison for US Hospitals
Stars: ✭ 20 (+17.65%)
ppmlhdfe
Poisson pseudo-likelihood regression with multiple levels of fixed effects
Stars: ✭ 46 (+170.59%)
sugarcube
Monoidal data processes.
Stars: ✭ 32 (+88.24%)
hierarchical-clustering
A Python implementation of divisive and hierarchical clustering algorithms. The algorithms were tested on the Human Gene DNA Sequence dataset and dendrograms were plotted.
Stars: ✭ 62 (+264.71%)
Qminer
Analytic platform for real-time large-scale streams containing structured and unstructured data.
Stars: ✭ 206 (+1111.76%)
Data-Analyst-Nanodegree
This repo consists of the projects that I completed as part of the Udacity Data Analyst Nanodegree curriculum.
Stars: ✭ 13 (-23.53%)
Awesome Datascience
📝 An awesome Data Science repository to learn from and apply to real-world problems.
Stars: ✭ 17,520 (+102958.82%)
teanaps
An open-source Python library for natural language processing and text analysis.
Stars: ✭ 91 (+435.29%)
Orange3
🍊 📊 💡 Orange: Interactive data analysis
Stars: ✭ 3,152 (+18441.18%)
conferencias matutinas amlo
CSVs of the stenographic transcripts of the morning press conferences of President Andrés Manuel López Obrador (Mañaneras AMLO)
Stars: ✭ 25 (+47.06%)
bsu
🎓 Repository for university labs at FAMCS, BSU
Stars: ✭ 91 (+435.29%)
Reaper
Social media scraping / data collection tool for the Facebook, Twitter, Reddit, YouTube, Pinterest, and Tumblr APIs
Stars: ✭ 240 (+1311.76%)
loon
A Toolkit for Interactive Statistical Data Visualization
Stars: ✭ 45 (+164.71%)
Datascience
Curated list of Python resources for data science.
Stars: ✭ 3,051 (+17847.06%)
xgboost-smote-detect-fraud
Can we predict accurately on skewed data? Which sampling techniques can be used? Which models and techniques suit this scenario? Find the answers in this code pattern!
Stars: ✭ 59 (+247.06%)
Deepgraph
Analyze Data with Pandas-based Networks.
Stars: ✭ 232 (+1264.71%)
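TextClassification
Text classification of Sina News articles implemented with scikit-learn. The dataset contains 1 million documents in 10 classes, split 1:1 between test and training sets. The classifiers are SVM and Bayes, with Bayes serving as the baseline.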
Stars: ✭ 86 (+405.88%)
Automlpipeline.jl
A package that makes it trivial to create and evaluate machine learning pipeline architectures.
Stars: ✭ 223 (+1211.76%)
website-to-json
Converts a website to JSON using jQuery selectors
Stars: ✭ 37 (+117.65%)
Amazing Feature Engineering
Feature engineering is the process of using domain knowledge to extract features from raw data via data mining techniques. These features can be used to improve the performance of machine learning algorithms. Feature engineering can be considered as applied machine learning itself.
Stars: ✭ 218 (+1182.35%)
Gwu data mining
Materials for GWU DNSC 6279 and DNSC 6290.
Stars: ✭ 217 (+1176.47%)
iis
Information Inference Service of the OpenAIRE system
Stars: ✭ 16 (-5.88%)
perke
A keyphrase extractor for Persian
Stars: ✭ 60 (+252.94%)
simon-frontend
💹 SIMON is a powerful, flexible, open-source, and easy-to-use machine learning knowledge discovery platform 💻
Stars: ✭ 114 (+570.59%)
EasyMiner
Easy association rule mining and classification on the web
Stars: ✭ 14 (-17.65%)