Anomaly detection 101

Detecting anomalies or outliers is about making a hypothesis about the normal (aka usual) behaviour. This assumption is called the null hypothesis (H0), e.g. that the given data elements are normally distributed. A statistical test at a chosen significance level then either rejects H0 or fails to reject it.
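As a rough illustration of this idea, the sketch below tests the H0 "the sample is normally distributed" with the Shapiro-Wilk test from scipy.stats. The helper name rejects_normality, the significance level of 0.05 and both toy samples are assumptions made for this example only.

```python
import math

from scipy import stats


def rejects_normality(sample, alpha=0.05):
    """Return True if H0 (sample is normally distributed) is rejected.

    alpha is the significance level; 0.05 is only a common default,
    not a value taken from the post.
    """
    _, p_value = stats.shapiro(sample)
    return bool(p_value < alpha)


# Toy data (assumed for illustration): evenly spaced normal quantiles
# and a strongly right-skewed exponential curve.
normal_like = stats.norm.ppf([(i + 0.5) / 200 for i in range(200)])
skewed = [math.exp(i / 40) for i in range(200)]

print("normal-like sample rejected:", rejects_normality(normal_like))
print("skewed sample rejected:", rejects_normality(skewed))
```

A small p-value only says the data is unlikely under H0. Failing to reject H0 does not prove that the data is normal.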

Classes of anomaly detection

Anomaly detection falls into five primary classes:

  • Spectral (aka Reconstruction), e.g. as autoencoders (AE) do.
  • Probabilistic, e.g. by checking whether the test data is plausible under a learned Gaussian process, as provided by the Python library sklearn.gaussian_process
  • Distance, e.g. by checking the Euclidean distance between the centers of mass of the training and the test data, as provided by the Python library scipy.spatial
  • Classification, e.g. by checking whether the test data falls into (specific) learned classes, as provided by the Python library sklearn.naive_bayes
  • Information Theory (aka Hypothesis testing), as possible with the Python library scipy.stats, e.g. by testing whether
    • a normal distribution is given or
    • the same variance is given, e.g. via a Pearson test

Python library

A nice Python library for all of this is PyOD.
