Variational autoencoder for anomaly detection
PyTorch/TF1 implementation of a Variational Autoencoder for anomaly detection, following the paper
Variational Autoencoder based Anomaly Detection using Reconstruction Probability by Jinwon An and Sungzoon Cho.
How to install
pip install vae-anomaly-detection
How To Train a Model
- Define your dataset in dataset.py and return it from the function get_dataset
- If needed, change the encoder and decoder inside VAE.py to fit your data layout
- In a terminal, run python train.py and specify at least the required --input-size (pass -h to see all optional parameters)
- The trained model, its parameters and the Tensorboard log go into the folder run/{id}, where {id} is an integer starting from 0
- After training, run tensorboard --logdir=run to inspect the training results
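Since each training run gets its own numbered folder, the newest run is the folder with the largest integer name. The following is a minimal sketch for locating it, assuming only the run/{id} layout described above; latest_run_dir is a hypothetical helper and not part of the package.

```python
from pathlib import Path
from typing import Optional


def latest_run_dir(root: str = "run") -> Optional[Path]:
    """Return the sub-folder of `root` with the largest integer name, or None."""
    root_path = Path(root)
    if not root_path.is_dir():
        return None
    # Keep only directories whose name is a plain non-negative integer,
    # e.g. run/0 or run/12; anything else (notes, checkpoints, ...) is ignored.
    numbered = [p for p in root_path.iterdir() if p.is_dir() and p.name.isdecimal()]
    if not numbered:
        return None
    # Compare numerically so that run/10 sorts after run/2.
    return max(numbered, key=lambda p: int(p.name))
```

Calling latest_run_dir() after training would then give the folder to read the saved model from, provided the layout matches the assumption above.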
Make your model
Subclass VAEAnomalyDetection
and define your encoder and decoder as in VAEAnomalyTabular:
from torch import nn


class VAEAnomalyTabular(VAEAnomalyDetection):

    def make_encoder(self, input_size, latent_size):
        """
        Simple encoder for tabular data.
        If you want to feed images to a VAE, write another encoder function with Conv2d instead of Linear layers.
        :param input_size: number of input variables
        :param latent_size: number of output variables, i.e. the size of the latent space, since this is the encoder of a VAE
        :return: the untrained encoder model
        """
        return nn.Sequential(
            nn.Linear(input_size, 500),
            nn.ReLU(),
            nn.Linear(500, 200),
            nn.ReLU(),
            # times 2 because this is the concatenated vector of latent mean and variance
            nn.Linear(200, latent_size * 2)
        )

    def make_decoder(self, latent_size, output_size):
        """
        Simple decoder for tabular data.
        :param latent_size: size of the input latent space
        :param output_size: number of output parameters. Must have the same value as input_size
        :return: the untrained decoder model
        """
        return nn.Sequential(
            nn.Linear(latent_size, 200),
            nn.ReLU(),
            nn.Linear(200, 500),
            nn.ReLU(),
            # times 2 because this is the concatenated vector of reconstructed mean and variance
            nn.Linear(500, output_size * 2)
        )
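The "times 2" on the last layer of each network means its output packs two vectors side by side. Below is a minimal sketch of how such an output could be split and sampled with the reparameterization trick. The helper name split_and_sample and the softplus used to keep the scale positive are assumptions made for illustration, and may differ from what VAEAnomalyDetection does internally.

```python
import torch
import torch.nn.functional as F


def split_and_sample(encoder_out: torch.Tensor):
    """Split a (batch, 2 * latent_size) tensor and draw one latent sample per row."""
    # First half is read as the mean, second half as an unconstrained scale parameter.
    mu, raw_scale = torch.chunk(encoder_out, 2, dim=-1)
    # Softplus maps any real number to a positive standard deviation (assumed choice).
    std = F.softplus(raw_scale)
    # Reparameterization: z = mu + std * eps keeps the sample differentiable
    # with respect to mu and std, because the randomness lives only in eps.
    eps = torch.randn_like(std)
    z = mu + std * eps
    return z, mu, std


encoder_out = torch.randn(4, 2 * 32)  # stand-in for encoder(x) with latent_size=32
z, mu, std = split_and_sample(encoder_out)
print(z.shape)  # torch.Size([4, 32])
```

The same split applies to the decoder output, where the two halves describe the reconstructed mean and spread of each input feature.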
How to make predictions:
Once the model is trained (suppose, for simplicity, that it is saved under run/0/), load it and predict with this code snippet:
import torch

# VAEAnomalyTabular must be imported from the module where you defined it
# load X_test here

# input_size and latent_size can also be read from run/0/train_config.yaml
model = VAEAnomalyTabular(input_size=50, latent_size=32)
# load the saved parameters from a run
model.load_state_dict(torch.load('run/0/model.pth'))
outliers = model.is_anomaly(X_test)
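The paper referenced above flags an input as anomalous when its reconstruction probability is low. The following is a rough sketch of that idea on plain tensors, assuming Gaussian decoder outputs and working in log space. The function reconstruction_log_prob and the threshold alpha are hypothetical, and this does not reproduce how is_anomaly is implemented in the package.

```python
import torch
from torch.distributions import Normal


def reconstruction_log_prob(x, recon_mu, recon_std):
    """Mean log-likelihood of x under L sampled Gaussian reconstructions.

    x:         (batch, features)
    recon_mu:  (L, batch, features) decoder means, one per latent sample
    recon_std: (L, batch, features) decoder standard deviations (positive)
    """
    # x is broadcast over the leading L dimension.
    log_p = Normal(recon_mu, recon_std).log_prob(x)
    # Sum over features (treated as independent), then average over the L samples.
    return log_p.sum(dim=-1).mean(dim=0)


# Toy data: the decoder "expects" values near 0, so the second row is far off.
x = torch.tensor([[0.0, 0.0], [5.0, 5.0]])
recon_mu = torch.zeros(3, 2, 2)   # L = 3 latent samples
recon_std = torch.ones(3, 2, 2)
scores = reconstruction_log_prob(x, recon_mu, recon_std)

alpha = -10.0  # hypothetical threshold, normally tuned on validation data
print((scores < alpha).tolist())  # [False, True]
```

In practice the threshold would be chosen from scores on data known to be normal, for example as a low percentile of those scores.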