JuliaML / PenaltyFunctions.jl

License: MIT
Julia package of regularization functions for machine learning

Programming language: julia (2034 projects)

Projects that are alternatives to or similar to PenaltyFunctions.jl

introduction-to-machine-learning
A document covering machine learning basics. 🤖📊
Stars: ✭ 17 (-32%)
Mutual labels:  regularization
traj-pred-irl
Official implementation codes of "Regularizing neural networks for future trajectory prediction via IRL framework"
Stars: ✭ 23 (-8%)
Mutual labels:  regularization
Deeplearning
Python code for the book Deep Learning (the "flower book"): mathematical derivations, principle analysis, and source-level implementations.
Stars: ✭ 4,020 (+15980%)
Mutual labels:  regularization
Deep-Learning-Specialization-Coursera
The Deep Learning Specialization course on Coursera. The course covers Neural Networks, Deep Learning, Hyperparameter Tuning, Regularization, Optimization, Data Processing, Convolutional Neural Networks, and Sequence Models.
Stars: ✭ 75 (+200%)
Mutual labels:  regularization
Machine Learning From Scratch
Machine Learning models from scratch with a better visualisation
Stars: ✭ 15 (-40%)
Mutual labels:  regularization
mixup
speechpro.com/
Stars: ✭ 23 (-8%)
Mutual labels:  regularization
pyunfold
Iterative unfolding for Python
Stars: ✭ 23 (-8%)
Mutual labels:  regularization
SparseRegression.jl
Statistical Models with Regularization in Pure Julia
Stars: ✭ 37 (+48%)
Mutual labels:  regularization
Regularization-Pruning
[ICLR'21] PyTorch code for our paper "Neural Pruning via Growing Regularization"
Stars: ✭ 44 (+76%)
Mutual labels:  regularization
AMP-Regularizer
Code for our paper "Regularizing Neural Networks via Adversarial Model Perturbation", CVPR2021
Stars: ✭ 26 (+4%)
Mutual labels:  regularization
AI Learning Hub
AI Learning Hub for Machine Learning, Deep Learning, Computer Vision and Statistics
Stars: ✭ 53 (+112%)
Mutual labels:  regularization
numpy-neuralnet-exercise
Implementation of key concepts of neuralnetwork via numpy
Stars: ✭ 49 (+96%)
Mutual labels:  regularization
tulip
Scalable input gradient regularization
Stars: ✭ 19 (-24%)
Mutual labels:  regularization
manifold mixup
Tensorflow implementation of the Manifold Mixup machine learning research paper
Stars: ✭ 24 (-4%)
Mutual labels:  regularization
Distiller
Neural Network Distiller by Intel AI Lab: a Python package for neural network compression research. https://intellabs.github.io/distiller
Stars: ✭ 3,760 (+14940%)
Mutual labels:  regularization
L0Learn
Efficient Algorithms for L0 Regularized Learning
Stars: ✭ 74 (+196%)
Mutual labels:  regularization
Machine-Learning-Andrew-Ng
Machine Learning - Coursera - Andrew Ng - Python and MATLAB implementations
Stars: ✭ 127 (+408%)
Mutual labels:  regularization
SSE-PT
Code and datasets for the RecSys'20 paper "SSE-PT: Sequential Recommendation Via Personalized Transformer" and the NeurIPS'19 paper "Stochastic Shared Embeddings: Data-driven Regularization of Embedding Layers"
Stars: ✭ 103 (+312%)
Mutual labels:  regularization
deep-learning-notes
🧠👨‍💻Deep Learning Specialization • Lecture Notes • Lab Assignments
Stars: ✭ 20 (-20%)
Mutual labels:  regularization
FSCNMF
An implementation of "Fusing Structure and Content via Non-negative Matrix Factorization for Embedding Information Networks".
Stars: ✭ 16 (-36%)
Mutual labels:  regularization

PenaltyFunctions

PenaltyFunctions.jl is a Julia package that provides generic implementations for a diverse set of penalty functions that are commonly used for regularization purposes in Machine Learning.


Introduction

Many popular models in Machine Learning are parameterized by a set of real-valued coefficients θ (theta), usually stored as an array. If our data set has k features, then θ would typically be a vector of k or k+1 numeric elements. Each individual feature x_i of our data set is assigned a corresponding coefficient θ_i, which quantifies that feature's influence on the prediction. The concrete values of the coefficient vector are learned by an optimization algorithm, which tries to select the "best" set of coefficients for the given data and model. Without any restriction on their values, the optimization algorithm can choose the coefficients however it likes, which may result in overly complex prediction functions. This freedom is known to cause the optimization algorithm to overfit to the noise in the training data. This is where penalties come in!

A penalty is a function of the coefficients and only the coefficients. It assigns the given set of coefficients a cost without any regard for their influence on the predictive power of the prediction function. This cost is then added to the overall cost of the prediction function, so the optimization algorithm is encouraged to choose "simpler" coefficients. What exactly "simpler" means depends on the chosen penalty. In general terms: penalties help to reduce the possibility of overfitting.
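
To make this concrete, here is a minimal sketch of a penalized objective. The data and squared-error loss are illustrative assumptions; only value and L2Penalty come from this package:

using PenaltyFunctions

# hypothetical data and squared-error loss (not part of this package)
X, y = randn(100, 3), randn(100)
loss(θ) = sum(abs2, X * θ - y) / length(y)

# penalized objective: data fit plus λ times the penalty on the coefficients
λ = 0.1
objective(θ) = loss(θ) + λ * value(L2Penalty(), θ)

objective(randn(3))  # total cost for a candidate coefficient vector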

Available penalties

This package implements a number of carefully crafted penalty functions, as well as an API to query their properties (e.g. convexity). Furthermore, we expose methods to compute their values and derivatives for a single value, coefficient vectors, and even arrays of arbitrary dimensionality. The provided penalty functions fall into one of two main families, namely Element Penalties and Array Penalties.
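
As a minimal sketch of how the two families are told apart (assuming only the abstract type names introduced in the sections below):

using PenaltyFunctions

L1Penalty() isa ElementPenalty         # true; applied to θ element-wise
NuclearNormPenalty() isa ArrayPenalty  # true; evaluated on all of θ at once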

Element penalties

The first family of penalty functions contains all those that apply to the individual elements of θ element-wise. The resulting cost of a coefficient array is then the sum of the element-wise results.

[Figure: the cost values of various penalties as a function of a single coefficient (univariate parameter), and cross sections of the cost surfaces for two coefficients (bivariate parameter).]

Every penalty in this family is a subtype of ElementPenalty. From an implementation perspective, these penalties are defined using their element-wise functions. The following table lists the implemented types and their definitions.

Penalty                         Value on element
NoPenalty()                     g(θ) = 0
L1Penalty()                     g(θ) = abs(θ)
L2Penalty()                     g(θ) = 0.5 * θ ^ 2
ElasticNetPenalty(α = 0.5)      g(θ) = (1 - α) * abs(θ) + α * 0.5 * θ ^ 2
SCADPenalty(a = 3.7, γ = 1.0)   L1Penalty that blends to a constant
MCPPenalty(γ = 2.0)             g(θ) = abs(θ) < γ ? abs(θ) - θ ^ 2 / 2γ : γ / 2
LogPenalty(η = 1.0)             g(θ) = log(1 + η * abs(θ))

The total cost for an array of coefficients is then defined as sum(g, θ).
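
For instance, this definition can be checked directly, using sum(abs, x) as the hand-written equivalent for L1Penalty; the complete element-penalty API walkthrough follows:

using PenaltyFunctions

x = randn(5)
value(L1Penalty(), x) ≈ sum(abs, x)  # true: array cost is the sum of element costs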

using PenaltyFunctions
p = L1Penalty()
x = randn(5)
s = randn(5)
buffer = zeros(5)

# value
value(p, x[1])        # evaluate on element
value(p, x)           # evaluate on array
value.(p, x)          # broadcast is supported as well
value(p, x[1], s[1])  # evaluate on element, scaled by scalar
value(p, x, s[1])     # evaluate on array, scaled by scalar
value(p, x, s)        # evaluate on array, element-wise scaling

# value via calling the Penalty object
p = L1Penalty()
p([1,2,3])

# derivatives and gradients
deriv(p, x[1])        # derivative
deriv(p, x[1], s[1])  # scaled derivative
grad(p, x)            # gradient
grad(p, x, s[1])      # scaled gradient
grad(p, x, s)         # element-wise scaled gradient
grad!(buffer, p, x)       # overwrite buffer with gradient
grad!(buffer, p, x, s[1]) # overwrite buffer with scaled gradient
grad!(buffer, p, x, s)    # overwrite buffer with element-wise scaled gradient

# prox operator
prox(p, x[1], s[1]) # prox on element
prox(p, x, s[1])    # prox on array, scaled by scalar
prox(p, x, s)       # prox on array, element-wise scaling
prox!(p, x, s[1])   # overwrite x, scaled by scalar
prox!(p, x, s)      # overwrite x, element-wise scaling
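
As a usage sketch, the prox operator is the building block of proximal-gradient methods. The following ISTA-style loop solves a lasso-type problem with the prox! method shown above; the synthetic data, step size, and loop structure are illustrative assumptions, not part of the package:

using PenaltyFunctions

# hypothetical synthetic regression problem (not part of this package)
A, b = randn(100, 5), randn(100)
θ = zeros(5)
λ, η = 0.1, 0.01  # penalty strength and step size

p = L1Penalty()
for _ in 1:500
    g = A' * (A * θ - b) / length(b)  # gradient of the least-squares loss
    θ .-= η .* g                      # gradient descent step
    prox!(p, θ, η * λ)                # proximal step (soft-thresholding for L1)
end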

Array penalties

The second family of penalty functions contains all those that need to be evaluated on the entire coefficient array θ at once. Every penalty that belongs to this family is a subtype of ArrayPenalty. The following table outlines the implemented types and their definitions.

Penalty                Value on array
NuclearNormPenalty()   sum of the singular values of x
MahalanobisPenalty(C)  g(x) = x' * C' * C * x
GroupLassoPenalty()    g(x) = vecnorm(x) (the Euclidean norm)
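
As a minimal sketch, an ArrayPenalty is evaluated with the same value function used above; the svdvals computation is shown only as a hand-written cross-check of the definition:

using PenaltyFunctions, LinearAlgebra

X = randn(4, 3)
value(NuclearNormPenalty(), X)  # penalty value: sum of singular values of X
sum(svdvals(X))                 # the same quantity computed directly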

Installation

The package is registered in METADATA.jl and can be installed with Julia's package manager:

Pkg.add("PenaltyFunctions")

License

This code is free to use under the terms of the MIT "Expat" license.
