caseykneale / Chemometricstools.jl

Licence: other
A collection of tools for chemometrics and machine learning written in Julia.

ChemometricsTools.jl

This package contains a collection of tools for performing fundamental and advanced chemometric analyses in Julia. It is currently richer than any other free chemometrics package available in any other language. If you are unfamiliar with chemometrics, it could inelegantly be described as the marriage between data science and chemistry. Traditionally it is the symbiosis of applied linear algebra and statistics, disciplined by the physics and meaning of chemical measurements. This is somewhat orthogonal to most specializations of machine learning, where "add more layers" is the modus operandi. Sometimes chemometricians also weigh the pros and cons of black box modelling and break out pure machine learning methods, so some of those techniques are in this package too.

Tutorials/Demonstrations:

Shootouts/Modeling Examples:

Package Status => Closer to Acceptability (v 0.5.8)

ChemometricsTools has been accepted as an official Julia package! Yep, so you can Pkg.add("ChemometricsTools") to install it. A lot of features have been added since the first public release (v0.2.3). In v0.5.8 almost all of the available functionality can be used/abused. If you find a bug or want a new feature, don't be shy: file an issue. In v0.5.1 Plots was removed as a dependency, new plot recipes were added, and now the package compiles much faster! Multilinear modeling, univariate modeling, and DOE functions are now available. I am making headway into the release plan for v0.6.0. Convenience functions, documentation, bug fixes, refactoring, and clean up are in progress, so bear with me. The git repo's master branch typically has the most advanced version, but the features on it may be less reliable because I like to do development on it.
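For reference, the two install routes mentioned above look roughly like this from a Julia session. The registry install uses the package name given in this README; tracking the master branch by name via `PackageSpec` is a standard Pkg feature, but whether master is usable at any given moment is not guaranteed.

```julia
using Pkg

# Registered release (the route described above).
Pkg.add("ChemometricsTools")

# Alternative: track the development branch instead of a tagged release.
# Newer features, but possibly less reliable.
# Pkg.add(PackageSpec(name = "ChemometricsTools", rev = "master"))

using ChemometricsTools
```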

Seeking Collaborators

My time and effort for building this package are constrained. I would really like to find some collaborators to help flesh this package out, use it, and find bugs. Even if your interests lean more towards machine learning/statistics, I'd love to hear from you. Please file an issue if you are interested, or send me a message on Julia Discourse (ckneale)!

Version Release Strategy

  • < 0.3.0 : Mapping functionality, prototyping
  • < 0.5.0 : Testing via actual usage on real data, look for missing essentials
  • < 0.6.0 : Bake in convenience functions for ease of use. Flesh out Documentation.
  • < 0.7.5 : Public input (find those bugs!). Adequate Unit Tests.
  • < 1.0.0 : Focus on performance, stability, generalizability, lock down the package syntax.

Package Highlights

Transforms/Pipelines

Two design choices introduced in this package are "Transformations" and "Pipelines". We can use transformations to treat data from multiple sources the same way. This helps mitigate user error for cases where test data is scaled based on training data, calibration transfer, etc.

Multiple transformations can easily be chained together and stored using "Pipelines". Pipelines aren't "pipes" like those present in Bash, R, and base Julia. They are flexible, yet immutable, convenience objects that allow sequential preprocessing and data transformations to be reused, chained, or automated for reliable analytic throughput.
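To make the idea concrete, here is a minimal sketch in plain Julia of a stored transformation and an immutable pipeline. It is not the ChemometricsTools API: `CenterSketch`, `ScaleSketch`, `PipelineSketch`, `fit_center`, `fit_scale`, and `fit_pipeline` are hypothetical names used only for illustration. The point is that parameters are learned once from training data and then applied unchanged to any other data.

```julia
using Statistics

# Hypothetical illustration types; NOT the ChemometricsTools API.
struct CenterSketch
    mu::Matrix{Float64}      # 1 x variables row of training means
end
(t::CenterSketch)(X::AbstractMatrix) = X .- t.mu

struct ScaleSketch
    sigma::Matrix{Float64}   # 1 x variables row of training standard deviations
end
(t::ScaleSketch)(X::AbstractMatrix) = X ./ t.sigma

# "Fitting" a transform means learning its parameters from training data.
fit_center(X::AbstractMatrix) = CenterSketch(mean(X, dims = 1))
fit_scale(X::AbstractMatrix) = ScaleSketch(std(X, dims = 1))

# An immutable container of already-fitted transforms.
struct PipelineSketch{T<:Tuple}
    steps::T
end

# Fit each step on the output of the previous step, then store them in order.
function fit_pipeline(X::AbstractMatrix, fitters...)
    steps = ()
    for fitter in fitters
        t = fitter(X)
        X = t(X)
        steps = (steps..., t)
    end
    return PipelineSketch(steps)
end

# Applying the pipeline replays every stored step in sequence.
(p::PipelineSketch)(X::AbstractMatrix) = foldl((acc, t) -> t(acc), p.steps; init = X)

train = [1.0 10.0; 2.0 20.0; 3.0 30.0]
test = [2.0 20.0; 4.0 40.0]

pipe = fit_pipeline(train, fit_center, fit_scale)
train_pp = pipe(train)   # centered and scaled with training statistics
test_pp = pipe(test)     # same statistics reused, nothing re-estimated
```

Because the test data never influences the stored means or standard deviations, the usual mistake of rescaling test data by its own statistics cannot happen by accident.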

Model training

ChemometricsTools offers easy-to-use iterators for K-fold validation and moving window sampling/training. More advanced sampling methods, like Kennard-Stone, are just a function call away. Convenience functions for interval selections, weighting regression ensembles, etc. are also available. These allow ensemble models like SIPLS, P-DS, P-OSC, etc. to be built quickly. With the tools included both in this package and Base Julia, nothing should stand in your way.
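For readers new to these sampling schemes, the sketch below shows in plain Julia what a K-fold split and a Kennard-Stone selection compute. The function names `kfold_indices` and `kennard_stone_sketch` are hypothetical and are not the package's own functions; the package's iterators may differ in interface and details.

```julia
using LinearAlgebra

# Hypothetical helper: returns k (train_indices, holdout_indices) pairs
# using contiguous folds over observations 1:n.
function kfold_indices(n::Int, k::Int)
    bounds = round.(Int, range(0, n; length = k + 1))
    return [(vcat(1:bounds[i], (bounds[i + 1] + 1):n),
             collect((bounds[i] + 1):bounds[i + 1])) for i in 1:k]
end

# Hypothetical sketch of Kennard-Stone sampling (assumes m >= 2):
# start from the two most distant observations, then repeatedly add the
# observation whose nearest already-selected neighbour is farthest away.
function kennard_stone_sketch(X::AbstractMatrix, m::Int)
    n = size(X, 1)
    D = [norm(X[i, :] .- X[j, :]) for i in 1:n, j in 1:n]
    farthest = argmax(D)                      # CartesianIndex of the largest distance
    selected = [farthest[1], farthest[2]]
    while length(selected) < m
        remaining = setdiff(1:n, selected)
        nearest = [minimum(D[r, selected]) for r in remaining]
        push!(selected, remaining[argmax(nearest)])
    end
    return selected
end

folds = kfold_indices(10, 3)
X = reshape([0.0, 1.0, 2.0, 10.0], :, 1)      # four observations, one variable
picked = kennard_stone_sketch(X, 3)
```

Kennard-Stone tends to pick observations that span the measured space evenly, which is why it is popular for choosing calibration sets.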

Regression Modeling

This package features dozens of regression performance metrics, and a few built-in plots (Bland-Altman, QQ, interval overlays, etc.) are included. The list of regression methods currently includes: CLS, Ridge, Kernel Ridge, LS-SVM, PCR, PLS(1/2), ELMs, Regression Trees, Random Forest, Monotone Regression... More to come. Chemometricians love regressions! I've also added some convenience functions for univariate calibrations and standard addition experiments, along with some automated plot functions for them.
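As a small illustration of one listed method and one metric, here is closed-form ridge regression with an RMSE check in plain Julia. This is a generic textbook sketch with hypothetical names (`ridge_sketch`, `rmse`), not the package's implementation, and it fits no intercept.

```julia
using LinearAlgebra, Statistics

# Closed-form ridge regression: solves (X'X + λI) b = X'y for b.
# No intercept term; center X and y first if one is needed.
ridge_sketch(X, y, λ) = (X' * X + λ * I) \ (X' * y)

# Root mean squared error between observed and predicted values.
rmse(y, yhat) = sqrt(mean((y .- yhat) .^ 2))

X = [1.0 0.0; 0.0 1.0; 1.0 1.0]
y = X * [2.0, 3.0]                 # noise-free response with known coefficients

b0 = ridge_sketch(X, y, 0.0)       # λ = 0 reduces to ordinary least squares
b1 = ridge_sketch(X, y, 1.0)       # λ > 0 shrinks the coefficients toward zero
err0 = rmse(y, X * b0)
```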

Classification Modeling

The package provides in-house classification encodings (one cold/one hot) and easy-to-retrieve global or multiclass performance statistics. ChemometricsTools currently includes: LDA/PCA with Gaussian discriminants, Hierarchical LDA, SIMCA, multinomial softmax/logistic regression, PLS-DA, K-NN, Gaussian Naive Bayes, Classification Trees, Random Forest, Probabilistic Neural Networks, Linear Perceptrons, and more to come. You can also conveniently dump classification statistics to LaTeX/CSV reports!
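If one-hot encoding is unfamiliar, the plain-Julia sketch below shows the round trip between class labels and an indicator matrix, plus a simple accuracy statistic. The names `one_hot_sketch`, `decode_sketch`, and `accuracy` are hypothetical; the package's own encoders and statistics may be named and structured differently.

```julia
# Hypothetical helper: one row per observation, one column per class,
# with a 1 marking the observation's class and 0 elsewhere.
function one_hot_sketch(labels::AbstractVector)
    classes = sort(unique(labels))
    H = [Int(l == c) for l in labels, c in classes]
    return H, classes
end

# Inverse mapping: the column holding the largest value gives the class.
decode_sketch(H::AbstractMatrix, classes) =
    [classes[argmax(H[i, :])] for i in 1:size(H, 1)]

# Fraction of predictions that match the true labels.
accuracy(truth, predicted) = count(truth .== predicted) / length(truth)

labels = ["b", "a", "c", "a"]
H, classes = one_hot_sketch(labels)
recovered = decode_sketch(H, classes)
acc = accuracy(labels, ["b", "a", "a", "a"])
```

The same decode step works on the score matrix produced by a classifier, since it only looks for the largest value in each row.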

Multiway/Multilinear Modeling

I've been working to fill an obvious gap in the available tooling. Standard methods for Tucker decomposition (HOSVD and HOOI) have been included, along with some preprocessing methods and even an early look at multilinear PLS. There's a lot that could be done here, so please feel free to contribute!
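For orientation, here is a compact plain-Julia sketch of a (truncated) HOSVD, the simpler of the two Tucker methods mentioned. All function names (`unfold`, `mode_product`, `hosvd_sketch`, `reconstruct`) are hypothetical and this is a textbook version, not the package's implementation.

```julia
using LinearAlgebra

# Mode-n unfolding: mode n becomes the rows, all other modes are flattened into columns.
function unfold(T::AbstractArray, n::Int)
    perm = vcat(n, setdiff(1:ndims(T), n))
    return reshape(permutedims(T, perm), size(T, n), :)
end

# Mode-n product of tensor T with matrix M, where size(M, 2) == size(T, n).
function mode_product(T::AbstractArray, M::AbstractMatrix, n::Int)
    perm = vcat(n, setdiff(1:ndims(T), n))
    dims = collect(size(T))[perm]
    dims[1] = size(M, 1)
    folded = reshape(M * unfold(T, n), dims...)
    return permutedims(folded, invperm(perm))
end

# HOSVD: the factor for each mode is the leading left singular vectors of that
# mode's unfolding; the core is T projected onto every factor.
function hosvd_sketch(T::AbstractArray, ranks)
    factors = [svd(unfold(T, n)).U[:, 1:ranks[n]] for n in 1:ndims(T)]
    core = T
    for n in 1:ndims(T)
        core = mode_product(core, factors[n]', n)
    end
    return core, factors
end

# Rebuild the tensor by multiplying the core with each factor along its mode.
reconstruct(core, factors) =
    foldl((acc, n) -> mode_product(acc, factors[n], n), 1:length(factors); init = core)

T = reshape(collect(1.0:24.0), 2, 3, 4)
core, factors = hosvd_sketch(T, (2, 3, 4))     # full ranks: exact decomposition
T_full = reconstruct(core, factors)

core_small, factors_small = hosvd_sketch(T, (1, 2, 2))   # truncated: an approximation
T_small = reconstruct(core_small, factors_small)
```

HOOI refines the truncated factors iteratively, typically starting from the HOSVD result.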

Specialized tools?

This package has tools for specialized fields of analysis. For instance, fractional derivatives for the electrochemists (and the adventurous), a handful of smoothing methods for spectroscopists, curve resolution (unimodal and nonnegativity constraints available) for forensics, process fault detection methods, etc. There are certainly plans for other tools for analyzing chemical data that packages in other languages have seemingly left out. Stay tuned.

Where's the Data?

Please check out ChemometricsData.jl for access to more publicly available datasets.

Right now the 2002 International Diffuse Reflectance Conference Pharmaceutical NIR, iris, Tecator (aka 'meat'), and ball gear fault detection (NASA) datasets are included in this package. But this will eventually be factored out into ChemometricsData.jl.

I'd love for a collaborator to contribute some spectra, chromatograms, etc. Please reach out to me if you wish to collaborate/contribute. In the meantime you can load your own datasets using the full extent of the Julia ecosystem (XLSX.jl, CSV.jl, JSON.jl, MATLAB.jl, LibPQ.jl, Feather.jl, Arrow.jl, etc.).

What about Time Series? Cluster modeling?

Well, I'd love to hammer in some time series methods. That was originally part of the plan. Then I realized OnlineStats.jl already has the essentials for online learning covered, and there are many efforts for actual time series modelling in the works (see TimeSeries.jl: https://github.com/JuliaStats/TimeSeries.jl).

Similarly, if clustering methods are important to you, check out Clustering.jl. I may add a few supportive odds and ends in here (or contribute to the packages directly), but most of the Julia 1.0+ ecosystem is really reliable, well made, and community supported.

ToDo:

  • Clean up.
  • Performance improvements.
  • Syntax improvements.
  • Documentation improvements.
  • Unit tests.

Maybes:

  • Design of Experiment tools (Partial Factorial design, D/I-optimal, etc...)?
  • Convenience functions for propagation of error, multiequilibria, kinetics?
  • Electrochemical simulations and optical simulations (maybe separate packages...)?