Anomaly Detection¶

We are looking into how we can tell whether a point is behaving irregularly. Being able to take a time range and pick out the specific points worth inspecting will reduce the amount of data facilities needs to review when troubleshooting a given building. We decided that a way to do this in the early stages would be to use k-means clustering, an easy-to-implement algorithm that we can use through sklearn. Some research that Kiya looked into can be seen here.

We chose this approach because:

We don’t know enough about our data yet to develop a model-based detection system.
Clustering is unsupervised, meaning we don’t need to know in advance which points are anomalous.
Clustering doesn’t require every point to behave exactly the same way, but it does assume that similar points share some patterns.
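To make the idea concrete, here is a minimal sketch of how clustering could surface irregular points. It assumes each row of the input describes one point by a few numeric features, and it flags rows that sit unusually far from the centroid of their own cluster. The helper name `flag_anomalies`, the synthetic data, and the 95th-percentile cutoff are all assumptions made for illustration; they are not taken from `anomaly_detection.py`.

```python
import numpy as np
from sklearn.cluster import KMeans


def flag_anomalies(features, n_clusters=2, percentile=95.0, random_state=0):
    """Cluster rows of `features` and flag rows far from their own centroid.

    Hypothetical helper: the distance-percentile rule is an assumption,
    not the project's confirmed method.
    """
    features = np.asarray(features, dtype=float)
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=random_state)
    labels = model.fit_predict(features)
    # Distance from each row to the centroid of the cluster it was assigned to.
    assigned_centroids = model.cluster_centers_[labels]
    distances = np.linalg.norm(features - assigned_centroids, axis=1)
    # Rows beyond the chosen percentile of distances are treated as irregular.
    threshold = np.percentile(distances, percentile)
    return labels, distances, distances > threshold


# Synthetic stand-in data: two tight groups of "normal" rows plus one odd row.
rng = np.random.default_rng(0)
group_a = rng.normal(loc=0.0, scale=0.5, size=(50, 2))
group_b = rng.normal(loc=10.0, scale=0.5, size=(50, 2))
odd_row = np.array([[5.0, 5.0]])
data = np.vstack([group_a, group_b, odd_row])

labels, distances, flags = flag_anomalies(data)
print("flagged row indices:", np.flatnonzero(flags))
```

A percentile cutoff always flags a fixed share of rows, so a few ordinary rows are flagged alongside the odd one. A fixed distance threshold is an alternative if a sensible scale for the features is known.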

Kmeans¶

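K-means is a clustering algorithm that, given a set of m-dimensional data points and a desired number of clusters, n, assigns each point to one of n clusters. It does this by repeatedly assigning every point to its nearest cluster center (centroid) and then moving each centroid to the mean of the points assigned to it, until the assignments stop changing.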
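The assign-then-update loop can be sketched in plain Python as follows. This is for illustration only: in practice we would use sklearn's `KMeans`, which also takes care of choosing the starting centroids. The function name `kmeans`, the toy points, and the hand-picked starting centroids are assumptions for this example.

```python
import math


def kmeans(points, centroids, max_iter=100):
    """Illustrative k-means loop starting from the given centroids."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                # Leave the centroid of an empty cluster where it was.
                new_centroids.append(old)
        if new_centroids == centroids:
            break  # centroids stopped moving, so assignments are stable
        centroids = new_centroids
    return labels, centroids


# Toy 2-dimensional data with two obvious groups (m = 2, n = 2).
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.0), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
labels, centroids = kmeans(points, centroids=[points[0], points[3]])
print(labels)
print(centroids)
```

The result of k-means depends on where the centroids start, which is why library implementations typically run several initializations and keep the best one.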

Insert an example here with a couple pictures¶

https://docs.google.com/presentation/d/19NAHDsxQbjwuffGsPYSBg3DXJbbPDzCakr4zOrQdhdE/edit#slide=id.g31e789b1e2_0_1

Insert how to run it here¶

analysis/anomaly_detection/anomaly_detection.py

Follow the formatting in plot_cluster to get the data shown in the presentation above.