I would like to try implementing the BFR (Bradley-Fayyad-Reina) algorithm for cluster analysis. It maintains three sets of data:
1. The retained set (RS): the data points that are not recognized as belonging to any cluster and must be kept in the buffer;
2. The discard set (DS): the data points that are close enough to an existing cluster to be discarded once that cluster's summary statistics have been updated;
3. The compression set (CS): the summary statistics of groups of points that are close to one another but not close to any existing cluster.
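To make the "summary statistics" concrete, here is a minimal sketch in plain Python (not TMVA). It assumes the usual textbook summary of a count, a per-dimension sum and a per-dimension sum of squares, which only yields per-dimension variances; a full sample covariance matrix would need the sum of outer products instead. The class name `ClusterSummary` and its methods are hypothetical, chosen only for illustration.

```python
class ClusterSummary:
    """Hypothetical running summary of a cluster: N, SUM and SUMSQ per dimension."""

    def __init__(self, dim):
        self.n = 0                    # number of points absorbed so far
        self.total = [0.0] * dim      # per-dimension sum of coordinates
        self.total_sq = [0.0] * dim   # per-dimension sum of squared coordinates

    def add(self, point):
        """Absorb a point; afterwards the point itself no longer needs to be stored."""
        self.n += 1
        for i, x in enumerate(point):
            self.total[i] += x
            self.total_sq[i] += x * x

    def centroid(self):
        """Mean of the absorbed points, per dimension."""
        return [s / self.n for s in self.total]

    def variance(self):
        """Population variance per dimension: E[x^2] - (E[x])^2."""
        return [sq / self.n - (s / self.n) ** 2
                for s, sq in zip(self.total, self.total_sq)]


summary = ClusterSummary(dim=2)
for p in [(1.0, 2.0), (3.0, 6.0)]:
    summary.add(p)
print(summary.centroid(), summary.variance())
```

Because only these three quantities are stored, memory use per cluster is independent of how many points have been absorbed, which is what allows points in the discard set to be dropped.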
Each data point is then assigned to one of these sets on the basis of its Mahalanobis distance from the center of each cluster, computed with respect to that cluster's sample covariance matrix.
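The assignment step could be sketched as follows, again in plain Python rather than TMVA. This is a simplification under stated assumptions: the covariance matrix is taken to be diagonal (per-dimension variances only, all strictly positive), and the acceptance `threshold` is a hypothetical tuning parameter, not a value given above. The function names are illustrative only.

```python
import math


def mahalanobis_diag(point, centroid, variance):
    """Mahalanobis distance assuming a diagonal covariance matrix.

    Assumes every entry of `variance` is strictly positive.
    """
    return math.sqrt(sum((x - c) ** 2 / v
                         for x, c, v in zip(point, centroid, variance)))


def assign(point, clusters, threshold):
    """Return the index of the nearest cluster if it lies within `threshold`.

    `clusters` is a list of (centroid, variance) pairs. A return value of None
    means the point is not close enough to any cluster and would stay in the
    retained set.
    """
    best_index, best_dist = None, float("inf")
    for i, (centroid, variance) in enumerate(clusters):
        d = mahalanobis_diag(point, centroid, variance)
        if d < best_dist:
            best_index, best_dist = i, d
    return best_index if best_dist <= threshold else None


clusters = [([0.0, 0.0], [1.0, 1.0]), ([10.0, 10.0], [4.0, 4.0])]
print(assign([1.0, 1.0], clusters, threshold=3.0))  # close to the first cluster
print(assign([5.0, 5.0], clusters, threshold=2.0))  # too far from both clusters
```

With a full covariance matrix the per-dimension division would be replaced by a multiplication with the inverse covariance, which requires a matrix inversion or decomposition per cluster.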
I would appreciate some direction on how to get started with this using the TMVA functions.